Let *p* be an odd prime number. If the equation

*x*² = *n* mod *p*

has a solution then *n* is a square mod *p*, or in classical terminology, *n* is a quadratic residue mod *p*. Half of the numbers between 0 and *p* are quadratic residues and half are not. The residues are distributed somewhat randomly. For large *p*, quadratic residues are distributed enough like random numbers to be useful in cryptographic applications. For example, see this post. But the distribution isn’t entirely random as we’ll demonstrate here.

First let’s look at a couple illustrations, using *p* = 101 and *p* = 103. We’ll use the following Python code to plot the residues.

import matplotlib.pyplot as plt for p in [101, 103]: s = {x**2 % p for x in range(1, p)} for q in s: plt.vlines(q, 0, 1) plt.title("Quadratic residues mod {}".format(p)) plt.savefig("residues_mod_{}.svg".format(p)) plt.close()

Here are the plots.

## Symmetry and anti-symmetry

If you look closely at the first image, you’ll notice it appears to be symmetric about the middle. And if you look closely at the second image, you’ll notice it appears to be anti-symmetric, i.e. the left side has bars where the right side has gaps and vice versa.

The following code shows that these observations are correct.

p = 101 s = {x**2 % p for x in range(1, p)} for q in s: assert(p - q in s) p = 103 s = {x**2 % p for x in range(1, p)} for q in s: assert(p - q not in s)

Both of these observations hold more generally. If *p* = 1 mod 4, the residues are symmetric: *n* is a quadratic residue if and only if *p* – *n* is a quadratic residue. And if *p* = 3 mod 4, the residues are anti-symmetric: *n* is a quadratic residue if and only if *p* – *n* is *not* a quadratic residue. Both of these statements are consequences of the quadratic reciprocity theorem.

## Quadratic excess

If you look again at the plot of residues for *p* = 103, you’ll notice that not only is it anti-symmetric, it is also darker on the left side. We can verify with the following code

print(len( [q for q in s if q < 51] )) print(len( [q for q in s if q > 51] ))

that there are 28 residues on the left and and 23 on the right. This is a general pattern called quadratic excess. The quadratic excess of an odd prime *p*, denoted *E*(*p*) is the number of quadratic residues to the left of the middle minus the number to the right of the middle. If *p* = 1 mod 4, *E*(*p*) is zero by symmetry, but if *p* = 3 mod 4, *E*(*p*) is always positive.

Here is a plot of quadratic excess for primes less than 3000 and congruent to 3 mod 4.

## Statistical properties

In this post I’ve said a couple things that are in tension. At the top I said that quadratic residues act sufficiently like random bits to be useful in cryptography. Then later on I showed that they have major departures from random behavior. That would be a problem if you looked at the entire sequence of residues, but if *p* has thousands of digits, and you’re looking at a small sample of the reciprocity values, that’s a different situation.

Suppose you pick two large enough primes, *p* and *q*, and keep them secret but tell me their product *N* = *pq*. Then you pick a random value *n* and ask me whether it’s a quadratic residue mod *N*, the best I can do is say “Maybe, with probability 1/2.” There are crypto systems that depend on this being a computationally hard problem to solve, unless the factors *p* and *q* are known.

However, you need to be a little careful. If *n* is smaller than √*N*, I could test whether *n* is a square, working in the integers, not the integers mod *N*. If it is a square integer, then it’s a square mod *N*. If it’s not a square integer, it might be a square mod *N*, but I wouldn’t know.

There’s a related issue I may explore in the future. Suppose instead of just asking whether a particular number is a quadratic residue, you take a range of numbers and return 0 or 1 depending on whether each number is a residue. Can I distinguish the sequence from a random sequence using statistical tests? If I took a sequence of length greater than *N*/2 then I could tell by looking at symmetry or antisymmetry. But what if the range is small relative to *N*? Maybe *N* is a thousand-digit numnber but you only give me a sequence of a million bits. Would the distribution be distinguishable from uniform random bits? Is there any sort of autocorrelation that might betray the sequence as not random?

This the basis for the Blum Blum Shub random number generator:

https://en.wikipedia.org/wiki/Blum_Blum_Shub