Gauss’ golden theorem: quadratic reciprocity

Suppose you have an odd prime p and an integer a, with a not a multiple of p. Does a have a square root mod p? That is, does there exist an integer x such that x² is congruent to a mod p? Half the time the answer is yes, and half the time it’s no. (We will remove the restriction that p is prime near the end of the post.)

In principle it’s easy to tell whether a has a square root: square all the integers from 1 to p-1 and see whether one of them is congruent to a. For example, let p = 7. Square the numbers 1 through 6 and take the remainders by 7. You get 1, 4, 2, 2, 4, 1. So the numbers 1, 2, and 4 have square roots mod 7, and the numbers 3, 5, and 6 do not. In application, p might be very large, and so such brute force isn’t practical.

Legendre symbols

Before we go any further it will help to introduce some notation. The Legendre symbol


looks like a fraction, but it’s not. As we’ll see, it behaves something like a fraction, which was the motivation for the notation. But the symbol is defined to be 1 if a has a square root mod p and -1 otherwise.

One key fact is for any numbers m and n that are not multiples of p,

\left(\frac{m}{p}\right) \left(\frac{n}{p}\right) = \left(\frac{mn}{p}\right)

So Legendre symbols multiply like fractions on top but not on bottom. This tells us that we can tell whether a has a square root mod p by factoring a and determining whether each of its factors has a square root. For example, suppose we want to know whether 108 has a square root modulo some prime p. Since 108 = 2² 3³ the question boils down to whether 3 has a square root mod p, because obviously 2² 3² has a square root, namely 2×3 = 6.

Quadratic reciprocity

The most important result we need is Gauss’ quadratic reciprocity theorem, what he called his golden theorem. Gauss published six proofs of this theorem, and left two more proofs unpublished.

One thing you can do with a fraction is take its reciprocal, and the quadratic reciprocity theorem takes the “reciprocal” of the Legendre symbol. The quadratic reciprocity theorem says that if p and q are odd primes, then

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \frac{q-1}{2}}

Algorithm for computing Legendre symbols

Suppose pq. The quadratic reciprocity theorem says we can reduce the problem of determining whether p has a square root mod q to the opposite problem, determining whether q has a square root mod p. Why is that progress? Because we can reduce q mod p, and so now we’re dealing with smaller numbers. For example, quadratic reciprocity tells us that 7 has a square root mod 61 if and only if 61 has a square root mod 7. But 61 is congruent to 5 mod 7, so we’ve reduced the question to whether 5 has a square root mod 7, and we know from above that it does not.

You can repeatedly apply the techniques of factoring the top of the Legendre symbol and applying quadratic reciprocity to steadily decrease the size of the numbers you’re dealing with until you get either a 1 or a 2 on top. (Why not apply quadratic reciprocity one more time when you get a 2 on top? Because the theorem applies to odd primes.)

If you get a 1 on top, the answer is obvious: 1 has a square root, namely itself, modulo anything. If you get a 2 on top, you need the result

\left(\frac{2}{p}\right) = (-1)^{\frac{p^2 - 1}{8}}

Jacobi symbols

You might think you could compute the Legendre symbol in Mathematica with a function called LegendreSymbol, but there’s no such function. Instead, you call JacobiSymbol. The Jacobi symbol is a generalization of the Legendre symbol, using the same notation but allowing the “denominator” to be any odd positive number. As before the symbol is defined to be 1 if the top has a square root modulo the bottom, and -1 otherwise.

If m has prime factors pi with exponents ei, then the Jacobi symbol can be computed by

\left(\frac{n}{m}\right) = \prod \left(\frac{n}{p_i} \right )^{e_i}

Technically the symbol on the left is a Jacobi symbol and the symbols on the right are Legendre symbols. But the distinction doesn’t matter because when m is an odd prime, the Jacobi symbol equals the Legendre symbol.

In Python, you can compute Legendre symbols with sympy.ntheory.legendre_symbol and Jacobi symbols with sympy.ntheory.jacobi_symbol.

Finding square roots

So far we’ve described how to determine whether a number has a square root mod m where m is odd. Once you know a square root exists, how do you find it?

These notes show how to solve quadratic congruences, whether m is odd or not.

Almost prime generators and almost integers

Here are two apparently unrelated things you may have seen before. The first is an observation going back to Euler that the polynomial

n^2 - n + 41

produces a long sequence of primes. Namely, the values are prime for n = 1, 2, 3, …, 40.

The second is that the number

e^{\pi \sqrt{163}}

is extraordinarily close to an integer. This number is known as Ramanujan’s constant. It differs from its nearest integer by 3 parts in 1030. Ramanujan’s constant equals


There is a connection between these two facts: The polynomial

n^2 - n + k

returns primes for n = 1, 2, 3, …, k-1 primes if 4k – 1 is a Heegner number, and

e^{\pi \sqrt{d}}

is almost an integer if d is a (large) Heegner number.

Source: The Book of Numbers by Conway and Guy.

Heegner numbers

So what’s a Heegner number and how many are there? An integer d is a Heegner number if the ring generated by appending √-d to the integers has unique factorization. There are nine such numbers:

1, 2, 3, 7, 11, 19, 43, 67, 163.

There’s deeper stuff going on here than I understand—modular forms, the j-function, etc.—so this post won’t explain everything. There’s something unsatisfying about saying something is “almost” an integer without quantifying. There’s a way to be more precise, but we won’t go there. Instead, we’ll just play with the results.

Mathematica computation

First we look at the claim that n² – n + k produces primes for n = 1 through k – 1 if 4k – 1 is a Heegner number. The Heegner numbers of the form 4k + 1 are 2, 3, 5, 11, and 17. The following code shows that the claim is true for these values of k.

k = {2, 3, 5, 11, 17}
claim[x_] := AllTrue[
  Table[n^2 - n + x, {n, x - 1}], 
AllTrue[k, claim]

This returns True, so the claim is true.

As for exp(π √d) being close to an integer, this apparently only true for the last three Heegner numbers.

h = {1, 2, 3, 7, 11, 19, 43, 67, 163}
For[i = 1, i < 10, i++, 
        Exp[ Pi Sqrt[ h[[i]] ] ], 

(The function AccountingForm suppresses scientific notation, making it easier to see where the decimal portion of the number starts.)

Here are the results:


I manually edited the output to align the decimal points and truncate the decimal places beyond that needed to show that the last number is not an integer.

Partition numbers and Ramanujan’s approximation

The partition function p(n) counts the number of ways n unlabeled things can be partitioned into non-empty sets. (Contrast with Bell numbers that count partitions of labeled things.)

There’s no simple expression for p(n), but Ramanujan discovered a fairly simple asymptotic approximation:

p(n) \sim \frac{1}{4n\sqrt{3}} \exp(\pi \sqrt{2n/3})

How accurate is this approximation? Here’s a little Matheamtica code to see.

p[n_] := PartitionsP[n]
approx[n_] := Exp[ Pi Sqrt[2 n/3]] / (4 n Sqrt[3])
relativeError[n_] := Abs[p[n] - approx[n]]/p[n]
ListLinePlot[Table[relativeError[n], {n, 100}]]

So for values of n around 100, the approximation is off by about 5%.

Since it’s an asymptotic approximation, the relative error decreases (albeit slowly, apparently) as n increases. The relative error for n = 1,000 is about 1.4% and the relative error for n = 1,000,000 is about 0.044%.

Update: After John Baez commented on the oscillation in the relative error I decided to go back and look at it more carefully. Do the oscillations end or do they just become too small to see?

To answer this, let’s plot the difference in consecutive terms.

relerr[a_, b_] := Abs[a - b]/b
t = Table[relerr[p[n], approx[n]], {n, 300}];
ListLinePlot[Table[ t[[n + 1]] - t[[n]], {n, 60}]]

first differences of relative error

The plot crosses back and forth across the zero line, indicating that the relative error alternately increases and decreases, but only up to a point. Past n = 25 the plot stays below the zero line; the sign changes in the first differences stop.

But now we see that the first differences themselves alternate! We can investigate the alternation in first differences by plotting second differences [1].

    Table[ t[[n + 2]] - 2 t[[n + 1]] + t[[n]], 
    {n, 25, 120}]

first differences of relative error

So it appears that the second differences keep crossing the zero line for a lot longer, so far out that it’s hard to see. In fact the second differences become positive and stay positive after n = 120. But the second differences keep alternating, so you could look at third differences …

Related posts: Special numbers


[1] The code does a small algebraic simplification that might make some people scratch their heads. All it does is simplify

(tn+2tn+1) – (tn+1tn).

Reciprocals of primes

Here’s an interesting little tidbit:

For any prime p except 2 and 5, the decimal expansion of 1/p repeats with a period that divides p-1.

The period could be as large as p-1, but no larger. If it’s less than p-1, then it’s a divisor of p-1.

Here are a few examples.

1/3 = 0.33…
1/7 = 0.142857142857…
1/11= 0.0909…
1/13 = 0.076923076923…
1/17 = 0.05882352941176470588235294117647…
1/19 = 0.052631578947368421052631578947368421…
1/23 = 0.04347826086956521739130434782608695652173913…

Here’s a table summarizing the periods above.

| prime | period |
|     3 |      1 |
|     7 |      6 |
|    11 |      2 |
|    13 |      6 |
|    17 |     16 |
|    19 |     18 |
|    23 |     22 |

For a proof of the claims above, and more general results, see Periods of Fractions.

Probability of coprime sets

The latest blog post from Gödel’s Lost Letter and P=NP looks at the problem of finding relatively prime pairs of large numbers. In particular, they want a deterministic algorithm. They mention in passing that the probability of a set of k large integers being relatively prime (coprime) is 1/ζ(k) where ζ is the Riemann zeta function. This blog post will look closer at that statement.

How large is large?

What does it mean that the numbers are large? That they are large enough that the asymptotic result given by the zeta function is accurate enough. We’ll explore this with simulation code shortly.

The exact statement is that in the limit as N goes to infinity, the probability of k integers chosen uniformly from 1 to N being coprime converges to 1/ζ(k). I won’t go into the proof here, but in a nutshell, it falls out of Euler’s product formula for the zeta function.

What does it mean for a set of numbers to be coprime?

The probability above seemed wrong to me at first. The function ζ(k) decreases as k increases, and so its reciprocal increases. That means it’s more likely that a set of numbers is coprime as the size of the set increases. But the larger the set of numbers, the more likely it is that two will have a common factor, so shouldn’t the probability that they’re coprime go down with k, not up?

The resolution to the apparent contradiction is that I had the wrong idea of coprime in mind. Two integers are coprime if they have no prime factors in common. How do you generalize that to more than two integers? You could define a set of numbers to be coprime if every pair of numbers in the set is coprime. That’s the definition I had in mind. But that’s not the standard definition. The right definition here is that the greatest common divisor of the set is 1. For example, (6, 14, 21) would be coprime because no single prime divides all three numbers, even though no pair of numbers from the set is coprime.

Python simulation

Let’s write some Python code to try this out. We’ll randomly generate some sets of large numbers and see how often they’re coprime.

How do you generate large random integers in Python? You can use the function getrandbits to generate numbers as large as you like. In the code below we’ll generate 128-bit integers.

How do you compute the greatest common divisor in Python? There’s a library function gcd, but it only takes two integers. We’ll use the reduce function to fold gcd over a list of integers.

How do you compute the zeta function in Python? SciPy doesn’t have an implementation of ζ(x) per se, but it does have an implementation of ζ(x)-1. Why not just implement ζ itself? Because for large x, ζ(x) is close to 1, so more precision can be saved by computing ζ – 1.

Putting these pieces together, here’s code to randomly generate triples of 128-bit integers, see how often they’re coprime, and compare the result to 1/ζ(3).

    from random import getrandbits
    from fractions import gcd
    from functools import reduce
    from scipy.special import zetac
    def riemann_zeta(x):
        return zetac(x) + 1
    k = 3
    size = 128
    N = 10000
    coprime_count = 0
    for _ in range(N):
        x = [getrandbits(size) for _ in range(k)]
        if reduce(gcd, x) == 1:
            coprime_count += 1

When I ran this I got


Simulation agrees with theory to two decimal places, which is just what we should expect from 10,000 repetitions. (We’d expect error on the order of 1/√N.)

Here’s what I got when I changed k to 4:


And here’s what I got for k set to 5:


Related posts

Sums of palindromes

Every positive integer can be written as the sum of three palindromes, numbers that remain the same when their digits are reverse. For example, 389 = 11 + 55 + 323. This holds not just for base 10 but for any base b ≥ 5.

[Update: a new paper on arXiv extends this to b = 3 and 4 as well. Base 2 requires four palindromes.]

The result and algorithms for finding the palindromes was published online last August and is in the most recent print issue of Mathematics of Computation.

Javier Cilleruelo, Florian Luca and Lewis Baxter. Every positive integer is a sum of three palindromes. DOI:

Fibonacci / Binomial coefficient identity

Here’s an identity relating Fibonacci numbers and binomial coefficients.

F_{n+1} = {n \choose 0} + {n-1 \choose 1} + {n-2\choose 2} + \cdots

You can think of it as a finite sum or an infinite sum: binomial coefficients are zero when the numerator is an integer and the denominator is a larger integer. See the general definition of binomial coefficients.

Source: Sam E. Ganis. Notes on the Fibonacci Sequence. The American Mathematical Monthly, Vol. 66, No. 2 (Feb., 1959), pp. 129-130

More Fibonacci number posts:

New prime number record: 50th Mersenne prime

A new record for the largest known prime was announced yesterday:

2^{77,232,917} - 1

This number has 23,249,425 digits when written in base 10.

In base 2, 2p – 1 is a sequence of p ones. For example, 31 = 25 -1  which is 11111 in binary. So in binary, the new record prime is a string of 77,232,917 ones.

77,232,917 ones

You can convert a binary number to hexadecimal (base 16) by starting at the right end and converting blocks of 4 bits into hexadecimal. For example, to convert 101101111 to hex, we break it into three blocks: 1, 0110, and 1111. These blocks convert to 1, 6, and F, and so 101101111 in binary corresponds to 16F in hex.

Now 77,232,917  = 19,308,229 * 4 + 1, so if we break our string of 77,232,917 ones into groups of four, we have a single bit followed by 19,308,229 groups of 4. This means that in hexadecimal, the new prime record is 1FFF…FFF, a 1 followed by 19,308,229 F’s.

19,308,229 Fs

The new record is the 50th Mersenne prime. A Mersenne prime is a prime number that is one less than a power of 2, i.e. of the form 2p – 1. It turns out p must be prime before 2p – 1 can be prime. In the case of the new record, 77,232,917 is prime.

It is unknown whether there are infinitely many Mersenne primes. But now we know there are at least 50 of them.

All the recent prime number records have been Mersenne primes because there is an algorithm for testing whether a number of the special form 2p – 1 is prime, the Lucas-Lehmer test.

Mersenne primes are closely related to perfect numbers. A number n is perfect if the sum of the numbers less than n that divide n equals n. For example, 28 is divisible by 1, 2, 4, 7, and 14, and 1 + 2 + 4 + 7 + 14 = 28.

Finding a new Mersenne prime means finding a new perfect number. If m is a Mersenne prime, then nm(m + 1)/2 is a perfect number. Conversely, if n is an even perfect number, n has the form m(m + 1)/2, Nobody knows whether odd perfect numbers exist. Since we don’t know whether there are infinitely many Mersenne primes, we don’t know whether there are infinitely many perfect numbers. But there are at least 50 of them.

Related posts:

Distribution of Fibonacci numbers mod m

The last digits of Fibonacci numbers repeat with period 60. This is something I’ve written about before.

The 61st Fibonacci number is 2504730781961. The 62nd is 4052739537881. Since these end in 1 and 1, the 63rd Fibonacci number must end in 2, etc. and so the pattern starts over.

It’s not obvious that the cycle should have length 60, but it is fairly easy to see that there must be a cycle.

In any base, the last digits must cycle. The length of these cycles varies erratically:

In this post I want to as a different question: how often do Fibonacci numbers take on each of the possible last digits in each base? In other words, how are the Fibonacci numbers distributed mod m for various values of m?

I haven’t tried to prove any theorems. I’ve just run some examples using the following Python code.

    def histogram(base):

        prev = 1
        F = 2
        counts = [0]*base
        counts[F] += 1
        while F != 1 or prev != 1:
            temp = F
            F = (F + prev) % base
            counts[F] += 1    
            prev = temp

        return counts

In base 10, the last digits of Fibonacci numbers have period 60. Do all digits appear in the cycle? Do they all appear equally often?

Yes and no. Here are the results for base 10:

    >>> histogram(10) 
    >>> [4, 8, 4, 8, 4, 8, 4, 8, 4, 8]

Each even digits appears 4 times and each odd digit appears 8 times.

What happens if we look at the last two digits, i.e. if we look at Fibonacci numbers mod 100?

    >>> histogram(100)
    >>> [2, 6, 2, 2, …, 2, 6, 2, 2]

Each two-digit number n appears six times if n = 1 mod 4. Otherwise it appears two times.

What about the last three digits? Now we see some zeros. For example, no Fibonacci number is congruent to 4 or 6 mod 1000. (Or mod 8.)

    >>> histogram(1000)
    >>> [2, 3, 2, 1, 0, 3, 0, 1, …, 2, 3, 2, 1, 0, 3, 0, 1]

Here’s something curious. The Fibonacci numbers are exactly evenly distributed mod 5.

    >>> histogram(5)
    >>> [4, 4, 4, 4, 4]

The same is apparently true for all powers of 5. Not only are all the frequencies the same, they’re all 4’s. I’ve tested this for powers of 5 up to 510. And conversely, it seems the Fibonacci numbers are not evenly distributed mod m unless m is a power of 5. I’ve tested this for m up to 100,000.

Conjecture: The Fibonacci numbers are uniformly distributed mod m if and only if m is a power of 5.

Update: The conjecture is correct, and was proved in 1972 by Niederreiter.


Recent exponential sums

The exponential sum of the day draws a line between consecutive partial sums of

\sum_{n=0}^N \exp\left( 2\pi i \left( \frac{n}{m} + \frac{n^2}{d} + \frac{n^3}{y} \right ) \right )

where md, and y are the current month, day, and two-digit year. The four most recent images show how different these plots can be.

exponential sums

These images are from 10/30/17, 10/31/17, 11/1/17, and 11/2/17.

Consecutive dates often produce very different images for a couple reasons. First, consecutive integers are relatively prime. From a number theoretic perspective, 30 and 31 are very different, for example. (This touches on the motivation for p-adic numbers: put a different metric on integers, one based on their prime factorization.)

The other reason consecutive dates produce qualitatively different images is that you might roll over a month or year, such as going from October (10) to November (11). You’ll see a big change when we roll over from 2017 to 2018.

The partial sums are periodic [1] with period lcm(mdy). The image for 10/31/17 above has the most points because 10, 31, and 17 are relatively prime. It’s also true that 11, 2, and 17 are relatively prime, but these are smaller numbers.

You could think of the month, day, and year components of the sum as three different gears. The sums repeat when all three gears return to their initial positions. In the image for yesterday, 11/1/17, the middle gear is effectively not there.

[1] The terms of the sums are periodic. The partial sums are periodic if the full sum is zero. So if the partial sums are periodic, the lcm is a period.

Poisson distribution and prime numbers

Let ω(n) be the number of distinct prime factors of x. A theorem of Landau says that for N large, then for randomly selected positive integers less than N, ω-1 has a Poisson(log log N) distribution. This statement holds in the limit as N goes to infinity.

Apparently N has to be extremely large before the result approximately holds. I ran an experiment for N = 10,000,000 and the fit is not great.

Here’s the data:

|    | Observed | Predicted |
|  1 |   665134 |  620420.6 |
|  2 |  2536837 | 1724733.8 |
|  3 |  3642766 | 2397330.6 |
|  4 |  2389433 | 2221480.4 |
|  5 |   691209 | 1543897.1 |
|  6 |    72902 |  858389.0 |
|  7 |     1716 |  397712.0 |
|  8 |        1 |  157945.2 |
|  9 |        0 |   54884.8 |
| 10 |        0 |   16952.9 |

And here’s a plot:

bar chart of actual vs predicted

I found the above theorem in these notes. It’s possible I’ve misunderstood something or there’s an error in the notes. I haven’t found a more formal reference on the theorem yet.

Update: According to the results in the comments below, it seems that the notes I cited may be wrong. The notes say “distinct prime factors”, i.e. ω(n), while numerical results suggest they meant to say Ω(n), the number of prime factors counted with multiplicity. I verified the numbers given below. Here’s a plot.


Here’s the Python code I used to generate the table. (To match the code for the revised graph, replace omega which calculated ω(n) with the SymPy function primeomega which calculates Ω(n).)

from sympy import factorint
from numpy import empty, log, log2
from scipy.stats import poisson

N = 10000000

def omega(n):
    return len(factorint(n))

table = empty(N, int)
table[0] = table[1] = 0
for n in range(2, N):
    table[n] = omega(n)

# upper bound on the largest value of omega(n) for n < N.
u = int(log2(N))

for n in range(1, u+1):
    print(n, len(table[table==n]), poisson(log(log(N))).pmf(n-1)*N)

Related posts

Finding numbers in pi

You can find any integer you want as a substring of the digits in π. (Probably. See footnote for details.) So you could encode a number by reporting where it appears.

If you want to encode a single digit, the best you can do is break even: it takes at least one digit to specify the location of another digit. For example, 9 is the 5th digit of π. But 7 doesn’t appear until the 13th digit. In that case it would take 2 digits to encode 1 digit.

What if we work in another base b? Nothing changes as long as we also describe positions in base b. But there’s a possibility of compression if we look for digits of base b but describe their position using base p numbers where p < b. For example, 15 is the 2nd base-100 “digit” in π.

Blocks of digits

For the rest of this post we’ll describe blocks of k digits by their position as a base 10 number. That is, we’ll use p = 10 and b = 10k in the notation above.

There are ten 2-digit numbers that can be described by a 1-digit position in π: 14, 15, 32, …, 92. There are 57 more 2-digit numbers that can be described by a 2-digit position. The other 33 2-digit numbers require 3 digits to describe their position.

So if we encode a 2-digit number by its position in pairs of digits in π, there’s a 1/10 chance that we’ll only need 1 digit, i.e. a 10% chance of compression. The chance that we’ll break even by needing two digits is 57/100, and the chance that we’ll need three digits, i.e. more digits than what we’re encoding, is 33/100. On average, we expect to use 2.23 digits to encode 2 digits.

If we look at 3-digit numbers, there are 10 that can be described by a 1-digit position, 83 that require 2 digits, 543 that require 3 digits, and 364 that require 4 digits. So on average we’d need 3.352 digits to encode 3 digits. Our “compression” scheme makes things about 8% larger on average.

With 4-digit numbers, we need up to 5 digits to describe the position of each, and on average we need 4.2611 digits.

With 5-digit numbers, we need at least 7 digits to describe each position. As I write this, my program is still running. The positions of 99,994 five-digit numbers can be described by numbers with at most 6 digits. The remaining 6 need at least 7 digits. Assuming they need exactly seven digits, we need an average of 5.2623 digits to encode 5-digit numbers by their position in π.

Compression scheme efficiency

If we’re assuming the numbers we’re wishing to encode are uniformly distributed, the representation as a location in π will take more digits than the number itself, on average. But all compression algorithms expand their contents on average if by “average” you mean picking all possible inputs with the same probability. Uniform randomness is not compressible. It takes n bits to describe n random bits.

Compression methods are practical because their inputs are not completely random. For example, English prose compresses nicely because it’s not random.

If we had some reason to suspect that a number came from a block of digits in π, and one not too far our, then the scheme here could be a form of compression. The possibility of efficient compression comes from an assumption about our input.


You could extend the ideas in this post theoretically or empirically, i.e. you could write a theorem or a program.

Suppose you look at blocks of k base-b numbers whose position is described as a base b number. The case b = 2 seems particularly interesting. What is the compression efficiency of the method as k varies? What is it for particular k and what is it in the limit as k does to infinity?

You could look at this empirically for digits of π, or some other number, or you could try to prove it theoretically for digits of a normal number.

Footnote on normal numbers

It might not be true that you can find any integer in the digits of π, though we know it’s true for numbers of the size we’re considering here. You can find any number in the digits of π if it is a “normal number.” Roughly speaking, a number is normal if you can find any pattern in its digits with the probability you’d expect. That is, 1/10 of the digits in its decimal expansion will be 7’s, for example. Or if you grouped the digits in pairs, thinking of the number as being in base 100, 1/100 of the pairs to be 42’s, etc.

Almost all numbers are normal, so in a sense, the probability that π is normal is 1. Even though almost all numbers are normal, nobody has been able to prove that that any familiar number is normal: π, e, √2, etc.

It’s possible that every integer appears in the decimal expansion of π, but not necessarily with the expected probability. This would be a weaker condition than normality. Maybe this has been proven for π, but I don’t know. It would be strange if every number appeared in the digits of π but with some bias.

Empirically testing the Chowla conjecture

Terry Tao’s most recent blog post looks at the Chowla conjecture theoretically. This post looks at the same conjecture empirically using Python. (Which is much easier!)

The Liouville function λ(n) is (-1)Ω(n) where Ω(n) is the number of prime factors of n counted with multiplicity. So, for example, Ω(9) = 2 because even though 9 has only one distinct prime factor, that factor counts twice.

Given some set of k strides h1, h2, …, hk, define

f(n) = λ(n + h1) λ(n + h1) … λ(n + hk).

The Chowla conjecture says that the average of the first N values of f(n) tends to zero as N goes to infinity. This has been proven for k = 1 but not for larger k.

If f(n) acts like a Bernoulli random variable, our experiments might increase our confidence in the conjecture, but they wouldn’t prove anything. Unexpected results wouldn’t prove anything either, but a departure from random behavior might help find a counterexample if the conjecture is false.

We’re going to be evaluating the Liouville function repeatedly at the same arguments, so it will save some compute time if we tabulate its values. This will also let us use some compact Python notation to average f. We’ll need to tabulate the function up to Nhk.

In the code below, maxstride is an upper bound on the strides hk we may look at. SymPy has a function primeomega that calculates Ω(n) so we might as well use it. If you wanted to use a very large value of N, you might want to fill the array of Liouville function values using a more direct approach that avoids all the factoring implicit in calls to primeomega.

    from sympy import primeomega
    from numpy import empty

    N = 10000
    maxstride = 100

    liouville = empty(N + maxstride)
    liouville[0] = 1
    for n in range(1, len(liouville)):
        liouville[n] = (-1)**primeomega(n)

The following code looks at the Chowla conjecture for h1 = 0 and h2 ranging over 1, 2, …, 99.

    average = empty(maxstride-1)
    for s in range(1, maxstride):
        average[s-1] = (liouville[0:N] * liouville[s:N+s]).sum()/N

If the Liouville function is distributed like a random variable taking on -1 and 1 with equal probability, we’d expect the averages to vary like samples form a normal distribution with mean 0 and variance 1/(2N).

    print( (2*N)**-0.5 )

This returns


and so the means are indeed near zero, and the standard deviation of the samples is about what we’d expect.

What if we replace Ω(n), the number of prime factors with multiplicity, with ω(n), the number of distinct prime factors? The averages above are still small, but the sample variance is about twice as big as before. I’ve seen this pattern with several different large values of N, so I suspect there’s something going on here.

(I didn’t see a function in SymPy corresponding to ω(n), but you can create your own with len(factorint(n)).

Exponential sums make pretty pictures

Exponential sums are a specialized area of math that studies series with terms that are complex exponentials. Estimating such sums is delicate work. General estimation techniques are ham-fisted compared to what is possible with techniques specialized for these particular sums. Exponential sums are closely related to Fourier analysis and number theory.

Exponential sums also make pretty pictures. If you make a scatter plot of the sequence of partial sums you can get surprising shapes. This is related to the trickiness of estimating such sums: the partial sums don’t simply monotonically converge to a limit.

The exponential sum page at UNSW suggests playing around with polynomials with dates in the denominator. If we take that suggestion with today’s date, we get the curve below:

f(n) = n/10 + n**2/7 + n**3/17

These are the partial sums of exp(2πi f(n)) where f(n) = n/10 + n²/7 + n³/17.

[Update: You can get an image each day for the current day’s date here.]

Here’s the code that produced the image.

    import matplotlib.pyplot as plt
    from numpy import array, pi, exp, log

    N = 12000
    def f(n):
        return n/10 + n**2/7 + n**3/17 

    z = array( [exp( 2*pi*1j*f(n) ) for n in range(3, N+3)] )
    z = z.cumsum()

    plt.plot(z.real, z.imag, color='#333399')

If we use logarithms, we get interesting spirals. Here f(n) = log(n)4.1.

f(n) = log(n)**4.1

And we can mix polynomials with logarithms. Here f(n) = log(n) + n²/100.

f(n) = log(n) + n**2/100

In this last image, I reduced the number of points from 12,000 to 1200. With a large number of points the spiral nature dominates and you don’t see the swirls along the spiral as clearly.


Pascal’s triangle and Fermat’s little theorem

I was listening to My Favorite Theorem when Jordan Ellenberg said something in passing about proving Fermat’s little theorem from Pascal’s triangle. I wasn’t familiar with that, and fortunately Evelyn Lamb wasn’t either and so she asked him to explain.

Fermat’s little theorem says that for any prime p, then for any integer a,

ap = a (mod p).

That is, ap and a have the same remainder when you divide by p. Jordan Ellenberg picked the special case of a = 2 as his favorite theorem for the purpose of the podcast. And it’s this special case that can be proved from Pascal’s triangle.

The pth row of Pascal’s triangle consists of the coefficients of (xy)p when expanded using the binomial theorem. By setting x = y = 1, you can see that the numbers in the row must add up to 2p. Also, all the numbers in the row are divisible by p except for the 1’s on each end. So the remainder when 2p is divided by p is 2.

It’s easy to observe that the numbers in the pth row, except for the ends, are divisible by p. For example, when p = 5, the numbers are 1, 5, 10, 10, 5, 1. When p = 7, the numbers are 1, 7, 28, 35, 35, 28, 7, 1.

To prove this you have to show that the binomial coefficient C(p, k) is divisible by p when 0 < k < p. When you expand the binomial coefficient into factorials, you see that there’s a factor of p on top, and nothing can cancel it out because p is prime and the numbers in the denominator are less than p.

By the way, you can form an analogous proof for the general case of Fermat’s little theorem by expanding

(1 + 1 + 1 + … + 1)p

using the multinomial generalization of the binomial theorem.