Learning with errors

Linear regression

Suppose you have a linear regression with a couple predictors and no intercept term:

β₁x₁ + β₂x₂ = y + ε

where the x‘s are inputs, the β are fixed but unknown, y is the output, and ε is random error.

Given n observations (x₁, x₂, y + ε), linear regression estimates the parameters β₁ and β₂.

I haven’t said, but I implicitly assumed all the above numbers are real. Of course they’re real. It would be strange if they weren’t!

Well, we’re about to do something strange. We’re going to pick a prime number p and do our calculations modulo p except for the addition of the error ε. Our inputs (x₁, x₂) are going to be pairs of integers. Someone is going to compute

r = β₁x₁ + β₂x₂ mod p

where β₁ and β₂ are secret integers. Then they’re going to tell us

r/p + ε

where ε is a random variable on the interval [0, 1]. We give them n pairs (x₁, x₂) and they give back n values of r/p with noise added. Our job is to infer the βs.

This problem is called learning with errors or LWE. It’s like linear regression, but much harder when the problem size is bigger. Instead of just two inputs, we could have m inputs with m secret coefficients where m is large. Depending on the number of variables m, the number of equations n, the modulus p, and the probability distribution on ε, the problem may be possible to solve but computationally very difficult.

Why is it so difficult? Working mod p is discontinuous. A little bit of error might completely change our estimation of the solution. If n is large enough, we could recover the coefficients anyway, using something like least squares. But how would we carry that out? If m and p are small we can just try all p^m possibilities, but that’s not going to be practical if m and p are large.

In linear regression, we assume there is some (approximately) linear process out in the real world that we’re allowed to reserve with limited accuracy. Nobody is playing a game with us, that is just how data come to us. But with LWE, we are playing a game that someone has designed to be hard. Why? For cryptography. In particular, quantum-resistant cryptography.

Post Quantum Cryptography

Variations on LWE are the basis for several proposed encryption algorithms that believed to be secure even if an adversary has access to a quantum computer.

The public key encryption systems in common use today would all be breakable if quantum computing becomes practical. They depend on mathematical problems like factoring and discrete logarithms being computationally difficult, which they appear to be with traditional computing resources. But we know that these problems could be solved in polynomial time on a quantum computer with Shor’s algorithm. But LWE is a hard problem, even on a quantum computer. Or so we suspect.

The US government’s National Institute of Standards and Technology (NIST) is holding a competition to identify quantum-resistant encryption algorithms. Last month they announced 26 algorithms that made it to the second round. Many of these algorithms depend on LWE or variations.

One variation is LWR (learning with rounding) which uses rounding rather than adding random noise. There are also ring-based counterparts RLWE and RLWR which add random errors and use rounding respectively. And there are polynomial variations such as poly-LWE which uses a polynomial-based learning with errors problem. The general category for these methods is lattice methods.

Lattice methods

Of the public-key algorithms that made it to the second round of the NIST competition, 9 out of 17 use lattice-based cryptography:

CRYSTALS-KYBER
FrodoKEM
LAC
NewHope
NTRU
NTRU Prime
Round5
SABER
Three Bears

Also, two of the nine digital signature algorithms are based on lattice problems:

CRYSTALS-DILITHIUM
FALCON

Based purely on the names, and not on the merits of the algorithms, I hope the winner is one of the methods with a science fiction allusion in the name.

Regression, modular arithmetic, and PQC

Linear regression