Riemann hypothesis, the fine structure constant, and the Todd function

This morning Sir Michael Atiyah gave a presentation at the Heidelberg Laureate Forum with a claimed proof of the Riemann hypothesis. The Riemann hypothesis (RH) is the most famous open problem in mathematics, and yet Atiyah claims to have a simple proof.

Photo via https://twitter.com/QuinteScience/status/1044134956996468736

Simple proofs of famous conjectures

If anyone else claimed a simple proof of RH they’d immediately be dismissed as a crank. In fact, many people have sent me simple proofs of RH just in the last few days in response to my blog post, and I imagine they’re all cranks [1]. But Atiyah is not a crank. He won the Fields Medal in 1966 and the Abel prize in 2004. In other words, he was in the top echelon of mathematicians 50 years ago and has keep going from there. There has been speculation that although Atiyah is not a crank, he has gotten less careful with age. (He’s 89 years old.)

QuinteScience, source of the image above, quoted Atiyah as saying

Solve the Riemann hypothesis and you’ll become famous. But if you’re already famous, you run the risk of becoming infamous.

If Atiyah had a simple self-contained proof of RH that would be too much to believe. Famous conjectures that have been open for 150 years don’t have simple self-contained proofs. It’s logically possible, but practically speaking it’s safe to assume that the space of possible simple proofs has been very thoroughly explored by now.

But Atiyah’s claimed proof is not self-contained. It’s really a corollary, though I haven’t seen anyone else calling it that. He is claiming that a proof of RH follows easily from his work on the Todd function, which hasn’t been published. If his proof is correct, the hard work is elsewhere.

Andrew Wiles’ proof of Fermat’s last theorem was also a corollary. He proved a special case of the Taniyama–Shimura conjecture, and at end of a series of lectures noted, almost as an afterthought, that his work implied a proof to Fermat’s last theorem. Experts realized this was where he was going before he said it. Atiyah has chosen the opposite approach, presenting his corollary first.

Connections with physics

Atiyah has spoken about connections between mathematics and physics for years. Maybe he was alluding to his work on the the fine structure constant which he claims yields RH as a corollary. And he is not the only person talking about connections between the Riemann hypothesis specifically and physics. For example, there was a paper in Physical Review Letters last year by Bender, Brody, and Müller stating a possible connection. I don’t know whether this is related to Atiyah’s work.

Fine structure constant

The fine structure constant is a dimensionless physical constant α, given by

\alpha = \frac{e^2}{4 \pi \varepsilon_0 \hbar c}

where e is the elementary charge, ħ is the reduced Planck constant, and c is the speed of light in a vacuum. Its value is roughly 1/137.

The Todd function

The Todd function T is a function introduced by Atiyah, named after his teacher J. A. Todd. We don’t know much about this function, except that it is key to Atiyah’s proof. Atiyah says the details are to be found in his manuscript The Fine Structure Constant which has been submitted to the Proceedings of the Royal Society.

Atiyah says that his manuscript shows that on the critical line of the Riemann zeta function, the line with real part 1/2, the Todd function has a limit ж and that the fine structure constant α is exactly 1/ж. That is,

limy → ∞ T(1/2 + yi) = ж = 1/α.

Now I don’t know what he means by proving that a physical constant has an exact mathematical value; the fine structure constant is something that is empirically measured. Perhaps he means that in some mathematical model of physics, the fine structure constant has a precise mathematical value, and that value is the limit of his Todd function.

Or maybe it’s something like Koide’s coincidence where a mathematical constant is within the error tolerance of a physical constant, an interesting but not necessarily important observation.

Taking risks

Michael Atiyah is taking a big risk. I’ve seen lots of criticism of Atiyah online. As far as I know, none of the critics have a Fields Medal or Abel Prize in their closet.

Atiyah’s proof is probably wrong, just because proofs of big theorems are usually wrong. Andrew Wiles’ proof of Fermat’s Last Theorem had a flaw that took a year to patch. We don’t know who Atiyah has shown his work to. If he hasn’t shown it to anyone, then it is almost certainly flawed: nobody does flawless work alone. Maybe his proof has a patchable flaw. Maybe it is flawed beyond repair, but contains interesting ideas worth pursuing further.

The worst case scenario is that Atiyah’s work on the fine structure constant and the Todd function is full of holes. He has made other big claims in the last few years that didn’t work out. Some say he should quit doing mathematics because he has made big mistakes.

I’ve made big mistakes too, and I’m not quitting. I make mistakes doing far less ambitious work than trying to prove the Riemann hypothesis. I doubt I’ll ever produce anything as deep as a plausible but flawed proof of the Riemann hypothesis.

Related posts

[1] I don’t call someone a crank just because they’re wrong. My idea of a crank is someone without experience in an area, who thinks he has found a simple solution to a famous problem, and who believes there is a conspiracy to suppress their work. They’re not just wrong, they can’t conceive that they might be wrong.

Three applications of Euler’s theorem

Fermat’s little theorem says that if p is a prime and a is not a multiple of p, then

ap-1 = 1 (mod p).

Euler’s generalization of Fermat’s little theorem says that if a is relatively prime to m, then

aφ(m) = 1 (mod m)

where φ(m) is Euler’s so-called totient function. This function counts the number of positive integers less than m and relatively prime to m. For a prime number p, φ(p) = p-1, and to Euler’s theorem generalizes Fermat’s theorem.

Euler’s totient function is multiplicative, that is, if a and b are relatively prime, then φ(ab) = φ(a) φ(b). We will need this fact below.

This post looks at three applications of Fermat’s little theorem and Euler’s generalization:

  • Primality testing
  • Party tricks
  • RSA public key encryption

Primality testing

The contrapositive of Fermat’s little theorem is useful in primality testing: if the congruence

ap-1 = 1 (mod p)

does not hold, then either p is not prime or a is a multiple of p. In practice, a is much smaller than p, and so one can conclude that p is not prime.

Technically this is a test for non-primality: it can only prove that a number is not prime. For example, if 2p-1 is not congruent to 1 (mod p) then we know p is not a prime. But if 2p-1 is congruent to 1 (mod p) then all we know is that we haven’t failed the test; it’s still conceivable that p is prime. So we try another value of a, say 3, and see whether 3p-1 is congruent to 1 (mod p).

If we haven’t disproved that p is prime after several attempts, we have reason to believe p is probably prime.[1]. There are pseudoprimes, a.k.a. Carmichael numbers that are not prime but pass the primality test above for all values of a. But these numbers are much less common than primes.

By the way, if p is a huge number, say with hundreds or thousands of digits, doesn’t it seem odd that we would want to compute numbers to the power p? Actually computing ap would be impossible. But because we’re computing mod p, this is actually easy. We can apply the fast exponentiation algorithm and take remainders by p at every step, so we’re never working with numbers more than twice as long as p.

Fifth root party trick

A few days ago I wrote about the fifth root party trick. If someone raises a two-digit number to the fifth power, you can quickly tell what the number was. Part of what makes the trick work is that in base 10, any number n and its fifth power end in the same digit. You can prove this by trying all 10 possible last digits, but if you want to generalize the trick to other bases, it helps to use Euler’s theorem. For example, you could use 9th powers in base 15.

Euler’s theorem shows why raising a to the power  φ(m) + 1 in base m keeps the last digit the same, but only if a is relatively prime to m. To extend the fifth root trick to other bases you’ll have a little more work to do.

RSA encryption

The original [2] RSA public key cryptography algorithm was a clever use of Euler’s theorem.

Search for two enormous prime numbers p and q [3]. Keep p and q private, but make npq public. Pick a private key d and solve for a public key e such that de = 1 (mod φ(n)).

Since you know p and q, you can compute φ(n) = (p – 1)(q – 1), and so you can compute the public key e. But someone who doesn’t know p and q, but only their product n, will have a hard time solving for d from knowing e. Or at least that’s the hope! Being able to factor n is sufficient to break the encryption scheme, but it’s not logically necessary. Maybe recovering private keys is much easier than factoring, though that doesn’t seem to be the case.

So where does Euler come in? Someone who has your public key e and wants to send you a message m computes

me (mod n)

and sends you the result [4]. Now, because you know d, you can take the encrypted message me and compute

(me)d = mφ(n) = 1 (mod n)

by Euler’s theorem.

This is the basic idea of RSA encryption, but there are many practical details to implement the RSA algorithm well. For example, you don’t want p and q to be primes that make pq easier than usual to factor, so you want to use safe primes.

Related posts

[1] Saying that a number is “probably prime” makes sense the first time you see it. But then after a while it might bother you. This post examines and resolves the difficulties in saying that a number is “probably prime.”

[2] The original RSA paper used Euler’s totient function φ(n) = (p – 1)(q – 1), but current implementations use Carmichael’s totient function λ(n) = lcm(p – 1, q – 1). Yes, same Carmichael as Carmichael numbers mentioned above, Robert Daniel Carmichael (1879–1967).

[3] How long does it take to find big primes? See here. One of the steps in the process it to weed out composite numbers that fail the primality test above based on Fermat’s little theorem.

[4] This assumes the message has been broken down into segments shorter than n. In practice, RSA encryption is used to send keys for non-public key (symmetric) encryption methods because these methods are more computationally efficient.

News regarding ABC conjecture and Riemann Hypothesis

There have been a couple news stories regarding proofs of major theorems. First, an update on Shinichi Mochizuki’s proof of the abc conjecture, then an announcement that Sir Michael Atiyah claims to have proven the Riemann hypothesis.

Shinichi Mochizuki’s proof of the abc conjecture

Quanta Magazine has a story today saying that two mathematicians have concluded that Shinichi Mochizuki’s proof of the ABC conjecture is flawed beyond repair. The story rightly refers to a “clash of Titans” because Shinichi Mochizuki and his two critics Peter Scholze and Jakob Stix are all three highly respected.

I first wrote about the abc conjecture when it came out in 2012. In find the proposed proof fascinating, not because I understand it, but because nobody understands it. The proof is 500 pages of dense, sui generis mathematics. Top mathematicians have been trying to digest the proof for six years now. Scholze and Stix believe they’ve found an irreparable hole in the proof, but Mochizuki has not conceded defeat.

Sometimes when a flaw is found in a proof, the flaw can later be fixed, as was the case with Andrew Wiles’ proof of Fermat’s last theorem. Other times the proof cannot be salvaged entirely, but interesting work comes out of it, as was the case with the failed attempts to prove FLT before Wiles. What will happen with Mochizuki’s proof of the abc conjecture if it cannot be repaired? I can imagine two outcomes.

  1. Because the proof is so far out of the mainstream, there must be useful pieces of it that can be salvaged.
  2. Because the proof is so far out of the mainstream, nobody will build on the work and it will be a dead end.

Michael Atiyah and the Riemann hypothesis

This morning I heard rumors that Michael Atiyah claims to have proven the Riemann hypothesis. The Heidelberg Laureate Forum twitter account confirmed that Atiyah is scheduled to announce his work at the forum on Monday.

I was a fan of Atiyah’s work when I was a grad student, and I got to talk to him at the 2013 and 2014 Heidelberg Laureate Forums. I hope that he really has a proof, but all the reaction I’ve seen has been highly skeptical. He claims to have a relatively simple proof, and long-standing conjectures rarely have simple proofs.

It would be great for the Riemann hypothesis to be settled, and it would be great for Atiyah to be the one who settles it.

Whereas Mochizuki’s proof is radically outside the mainstream, Atiyah’s proof appears to be radically inside the mainstream. He says he has a “radically new approach … based on work of von Neumann (1936), Hirzebruch (1954) and Dirac (1928).” It doesn’t seem likely that someone could prove the Riemann hypothesis using such classical mathematics. But maybe he found a new approach by using approaches that are not fashionable.

I hope that if Atiyah’s proof doesn’t hold up, at least something new comes out of it.

Update (9/23/2018): I’ve heard a rumor that Atiyah has posted a preprint to arXiv, but I don’t see it there. I did find a paper online that appeared to be his that someone posted to Dropbox.  It gives a very short proof of RH by contradiction, based on a construction using the Todd function. Apparently all the real work is in his paper The Fine Structure Constant which has been submitted to Proceedings of the Royal Society. I have not been able to find a preprint of that article.

Atiyah gave a presentation of his proof in Heidelberg this morning. From what I gather the presentation didn’t remove much mystery because what he did was show that RH follows as a corollary of his work on the Todd function that hasn’t been made public yet.

Update: More on Atiyah’s proof here.

What are these theorems?

The abc conjecture claims a deep relationship between addition and multiplication of integers. Specifically, for every ε > 0, there are only finitely many coprime triples (abc) such that abc and c > rad(abc)1 + ε.

Here coprime means ab, and c share no common divisor larger than 1. Also, rad(abc) is the product of the distinct prime factors of ab, and c.

This is a very technical theorem (conjecture?) and would have widespread consequences in number theory if true. This is sort of the opposite of Fermat’s Last Theorem: the method of proof had widespread application, but the result itself did not. With the abc conjecture, it remains to be seen whether the method of (attempted?) proof has application.

The Riemann hypothesis concerns the Riemann zeta function, a function ζ of complex values that encodes information about primes. The so-called trivial zeros of ζ are at negative integers. The Riemann hypothesis says that the rest of the zeros are all lined up vertically with real part 1/2 and varying imaginary parts.

Many theorems have been proved conditional on the Riemann hypothesis. If it is proven, a lot of other theorems would immediately follow.

Related posts

An empirical look at the Goldbach conjecture

The Goldbach conjecture says that every even number bigger than 2 is the sum of two primes. I imagine he tried out his idea on numbers up to a certain point and guessed that he could keep going. He lived in the 18th century, so he would have done all his calculation by hand. What might he have done if he could have written a Python program?

Let’s start with a list of primes, say the first 100 primes. The 100th prime is p = 541. If an even number less than p is the sum of two primes, it’s the sum of two primes less than p. So by looking at the sums of pairs of primes less than p, we’ll know whether the Goldbach conjecture is true for numbers less than p. And while we’re at it, we could keep track not just of whether a number is the sum of two primes, but also how many ways it is a sum of two primes.

    from sympy import prime
    from numpy import zeros
    N = 100
    p = prime(N)
    primes = [prime(i) for i in range(1, N+1)]
    sums = zeros(p, int)
    for i in range(N):
        # j >= i so we don't double count
        for j in range(i, N):
            s = primes[i] + primes[j]
            if s >= p:
            sums[s] += 1
    # Take the even slots starting with 4
    evens = sums[4::2]
    print( min(evens), max(evens) )

This prints 1 and 32. The former means that every even number greater than 4 and less than p was hit at least once, that every number under consideration was the sum of two primes. The latter means that at least one number less than p can be written as a sum of two primes 32 different ways.

According to the Wikipedia article on the Goldbach conjecture, Nils Pipping manually verified the Goldbach conjecture for even numbers up to 100,000 in 1938, an amazing feat.

There are 9,952 primes less than 100,000 and so we would need to take N = 9592 in our program to reproduce Pipping’s result. This took about seven minutes.

Update: As suggested in the comments, nearly all of the time is being spent generating the list of primes. When I changed the line

    primes = [prime(i) for i in range(1, N+1)]


    primes = [x for x in primerange(1, p)]

the runtime dropped from 7 minutes to 18 seconds.

Pi primes

I was reading a recent blog post by Evelyn Lamb where she mentioned in passing that 314159 is a prime number and that made me curious how many such primes there are.

Let’s look at numbers formed from the digits of π to see which ones are prime.

Obviously 3 and 31 are prime. 314 is even. 3141 is divisible by 9 because its digits sum to 9, and 31415 is clearly divisible by 5. And now we know that 314159 is prime. What’s the next prime in the sequence? Here’s a little Python code to find out.

    from sympy import pi, isprime

    M = 1000
    digits = "3" + str(pi.evalf(M+1))[2:]
    for i in range(1, M+1):
        n = int(digits[:i])
        if isprime(n):

This looks at numbers formed from the first digit up to the thousandth digit in the representation of π. The only other prime it finds is


After writing running the Python code above, I wondered whether this sequence is in OEIS, and indeed it is, sequence A005042. The OEIS says the next number in the sequence is 16,208 digits long.

Expected number of primes

I expected to find more primes out of the first 1,000 digits of π and was surprised that the next prime is so long.

How many primes whould we expect if we look at random numbers of length 1 through 1000? (The numbers formed from the digits of π are not random, but I want to compare them to random selections.)

From the prime number theorem, the probability that a d-digit number is prime is approximately 1/d log 10. So if we pick numbers of length 1, 2, 3, … N, the number of primes we’d expect to find is

\frac{1}{\log 10}\left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots +\frac{1}{N} \right ) = \frac{H_N}{\log 10}

where HN is the Nth harmonic number. Harmonic numbers grow very slowly, and so this gives us an idea why the number of primes we find is so small. In particular, it says if we look out 1,000 digits, we’d expect to find 3.25 primes. And so finding 4 primes isn’t surprising.

By the way, HN is roughly log N, so the expected number of primes is roughly log N / log 10, or simply log10 N. [1]

Other numbers

To see if our prediction holds, let’s try a few more irrational numbers.

First, let’s look at e. Then we get these four prime numbers:


Next, let’s try the square root of 2. Then we get these three prime numbers:


And finally, sin(2018) gives us two primes:


We might expect to find more primes when the leading digit is small because each time we select a number with d digits we’re looking lower in the range where primes are more dense. That is consistent with the small number of examples in this post.

Related posts

[1] When I write “log” without a subscript, I always mean natural log, log base e. If you’re not accustomed to this, see why this is conventional.

Distribution of prime powers

The prime number theorem says that π(x), the number of primes less than or equal to x, is asymptotically x / log x. So it’s easy to estimate the number of primes below some number N. But what if we want to estimate the number of prime powers less than N? This is a question that comes up in finite fields, for example, since there is a finite field with n elements if and only if n is a prime power. It’s also important in finite simple groups because these groups are often indexed by prime powers.

Riemann’s prime-power counting function Π(x) counts the number of prime powers less than or equal to x. Clearly Π(x) > π(x) for x ≥ 4 since every prime is a prime power, and 4 is a prime power but not a prime. Is  Π(x) much bigger than π(x)? What is its limiting distribution, i.e. what is the analog of the prime number theorem for prime powers?

Numerical examples

It turns out Π(x) equals π(x) asymptotically. That is, even though Π(x) is always bigger than π(x), their ratio converges to 1 as x increases.

Why is that? Let’s first look at N = 1,000,000. The number of primes less than one million happens to be 78,498.  The number of prime powers less than N is 78,734. So the latter includes only 236 prime powers with exponent greater than 1.

If we increase N to 1,000,000,000, there are 50,847,534 primes less than N and 50,851,223 prime powers, a difference of 3,689. Said another way, 99.99% of the prime powers less than a billion have exponent 1.

Equation for Π(x)

The number of prime powers less than N with exponent 2 equals the number of primes less than the square root of N. And the number of prime powers less than N with exponent 3 equals the number of primes less than the cube root of N. The number of prime powers with exponent 4 equals the number of primes less than the fourth root of N. Etc.

Even if N is large, these counts start getting small pretty soon. How soon? We’re taking roots of order r until the rth root of is less than 2, because then there are no more primes less than that root. That means we keep going until r > log2 N. And so we have the following equation:

\Pi(x) = \sum_{r=1}^{\lfloor \log_2 x\rfloor} \pi(x^{1/r})

Mathematica and Python code

I looked in Mathematica and SymPy for a function to compute Π(x) and didn’t see one. Maybe I missed something. But in either case it’s easy to implement our own using the equation above.

In Mathematica:

pp[n_] := Sum[PrimePi[n^(1/r)], {r, 1, Log2[n]}]

In Python:

from sympy import primepi
from math import log2

def pp(n):
    top = int(log2(n))
    return sum(
        [primepi(n**(1/r)) for r in range(1, 1+top)]

Distribution of zeros of the Riemann zeta

A recent video by Quanta Magazine says that the eigenvalues of random matrices and the zeros of the Riemann zeta function appear to have the same distribution.

I assume by “random matrices” the video is referring specifically to Gaussian orthogonal ensembles. [Update: See the first comment below for more details. You need some assumptions on the distribution—fat tails are out—but you don’t need to assume a GOE.]

By zeros of the Riemann zeta function, they mean the imaginary parts of the zeros in the critical strip. You can download the first 100,000 zeros of the Riemann zeta function here. So, for example, the first zero is 14.134725142, which actually means 0.5 + 14.134725142 i.

Here’s the histogram of random matrix eigenvalues from my previous post:

And here’s a histogram of the spacing between the first two thousand zeros of the Riemann zeta function:

The distribution of random matrix eigenvalues is known exactly. I believe the distribution of the Riemann zeta function zeros is conjectural but persuasive. Here’s a plot using the first 100,000 zeros.

How long does it take to find large primes?

Descriptions of cryptography algorithms often start with something like “Let p be a large prime” or maybe more specifically “Let p be a 256 bit prime.” How hard is it to find large prime numbers?

Let’s look at the question in base b, so we can address binary and base 10 at the same time. The prime number theorem says that for large n, the number of primes less than bn is approximately

bn / n log b

and so the probability that an n-digit number base b is prime is roughly 1 / n log b. If you select an odd b-digit number, you double your chances.

If you generate odd n-digit numbers at random, how long would it be until you find a prime? You might get lucky and find one on your first try. Or you might have extraordinarily bad luck and find a composite every time until you give up.

The number of tries until you find a prime is a geometric random variable with

p = 2 / n log b.

The expected value is 1 / p.

So, for example, if you wanted to find a 256-bit prime, you would need to test on average 256 log(2) / 2 = 89 candidates. And if you wanted to find a 100-digit prime, you’d need to test on average 100 log(10) / 2 = 115 candidates.

You could get more accurate bounds by using a refined variation of the prime number theorem. Also, we implicitly looked at the problem of finding a number with at most n digits rather than exactly n digits. Neither of these would make much difference to our final result. To back up this claim, let’s compare our estimate to a simulation that actually looks for primes.


The histogram above was based on finding 10,000 primes, each with 100 digits, and counting how many candidates were evaluated each time. The plot matches a geometric distribution well. The mean from the simulation was 113.5, close to the 115 estimated above.

For cryptographic applications, you often don’t want just any prime. You might want a safe prime or a prime satisfying some other property, in which case you’ll need to test more primes. But your probability of success will likely still be geometrically distributed.

It’s plausible that for large pp being prime and 2p + 1 being prime are independent events, at least approximately, and a numerical experiment suggests that’s true. I generated 10,000 primes, each 100-digits long, and 48 were safe primes, about what you’d expect from theory. It is not known whether the number of safe primes is infinite, but safe primes with hundreds of thousands of digits have been found, so apparently they’re infinite enough for application.

Related posts

John Conway’s subprime fib sequence

Here’s how to construct what John Horton Conway calls his “subprime fib” sequence. Seed the sequence with any two positive integers. Then at each step, sum the last two elements of the sequence. If the result is prime, append it to the sequence. Otherwise divide it by its smallest prime factor and append that.

If we start the sequence with (1, 1) we get the sequence

1, 1, 2, 3, 5, 4, 3, 7, 5, 6, 11, 17, 14, 31, 15, 23, 19, 21, 20, 41, 61, 51, 56, 107, 163, 135, 149, 142, 97, 239, 168, 37, 41, 39, 40, 79, 17, 48, 13, 61, 37, 49, 43, 46, 89, 45, 67, 56, 41, 97, 69, 83, 76, 53, 43, 48, 13,

We know it will keep repeating because 48 followed by 13 has occurred before, and the “memory” of the sequence only stretches back two steps.

If we start with (6, 3) we get

6, 3, 3, 3, …

It’s easy to see that if the sequence ever repeats a value n then it will produce only that value from then on, because n + n = 2n is obviously not prime, and its smallest prime factor is 2.

Conway believes that this sequence will always get stuck in a cycle but this has not been proven.

Here’s Python code to try out Conway’s conjecture:

from sympy import isprime, primefactors

def subprime(a, b):

    seq = [a, b]
    repeated = False
    while not repeated:
        s = seq[-1] + seq[-2]
        if not isprime(s):
            s //= primefactors(s)[0]
        repeated = check_repeat(seq, s)
    return seq

def check_repeat(seq, s):
    if not s in seq:
        return False
    if seq[-1] == s:
        return True
    # all previous occurances of the last element in seq
    indices = [i for i, x in enumerate(seq) if x == seq[-1]]
    for i in indices[:-1]:
        if seq[i+1] == s:
            return True
    return False

I’ve verified by brute force that the sequence repeats for all starting values less than 1000.

Related posts:

Gauss’ golden theorem: quadratic reciprocity

Suppose you have an odd prime p and an integer a, with a not a multiple of p. Does a have a square root mod p? That is, does there exist an integer x such that x² is congruent to a mod p? Half the time the answer is yes, and half the time it’s no. (We will remove the restriction that p is prime near the end of the post.)

In principle it’s easy to tell whether a has a square root: square all the integers from 1 to p-1 and see whether one of them is congruent to a. For example, let p = 7. Square the numbers 1 through 6 and take the remainders by 7. You get 1, 4, 2, 2, 4, 1. So the numbers 1, 2, and 4 have square roots mod 7, and the numbers 3, 5, and 6 do not. In application, p might be very large, and so such brute force isn’t practical.

Legendre symbols

Before we go any further it will help to introduce some notation. The Legendre symbol


looks like a fraction, but it’s not. As we’ll see, it behaves something like a fraction, which was the motivation for the notation. But the symbol is defined to be 1 if a has a square root mod p and -1 otherwise.

One key fact is for any numbers m and n that are not multiples of p,

\left(\frac{m}{p}\right) \left(\frac{n}{p}\right) = \left(\frac{mn}{p}\right)

So Legendre symbols multiply like fractions on top but not on bottom. This tells us that we can tell whether a has a square root mod p by factoring a and determining whether each of its factors has a square root. For example, suppose we want to know whether 108 has a square root modulo some prime p. Since 108 = 2² 3³ the question boils down to whether 3 has a square root mod p, because obviously 2² 3² has a square root, namely 2×3 = 6.

Quadratic reciprocity

The most important result we need is Gauss’ quadratic reciprocity theorem, what he called his golden theorem. Gauss published six proofs of this theorem, and left two more proofs unpublished.

One thing you can do with a fraction is take its reciprocal, and the quadratic reciprocity theorem takes the “reciprocal” of the Legendre symbol. The quadratic reciprocity theorem says that if p and q are odd primes, then

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \frac{q-1}{2}}

Algorithm for computing Legendre symbols

Suppose pq. The quadratic reciprocity theorem says we can reduce the problem of determining whether p has a square root mod q to the opposite problem, determining whether q has a square root mod p. Why is that progress? Because we can reduce q mod p, and so now we’re dealing with smaller numbers. For example, quadratic reciprocity tells us that 7 has a square root mod 61 if and only if 61 has a square root mod 7. But 61 is congruent to 5 mod 7, so we’ve reduced the question to whether 5 has a square root mod 7, and we know from above that it does not.

You can repeatedly apply the techniques of factoring the top of the Legendre symbol and applying quadratic reciprocity to steadily decrease the size of the numbers you’re dealing with until you get either a 1 or a 2 on top. (Why not apply quadratic reciprocity one more time when you get a 2 on top? Because the theorem applies to odd primes.)

If you get a 1 on top, the answer is obvious: 1 has a square root, namely itself, modulo anything. If you get a 2 on top, you need the result

\left(\frac{2}{p}\right) = (-1)^{\frac{p^2 - 1}{8}}

Jacobi symbols

You might think you could compute the Legendre symbol in Mathematica with a function called LegendreSymbol, but there’s no such function. Instead, you call JacobiSymbol. The Jacobi symbol is a generalization of the Legendre symbol, using the same notation but allowing the “denominator” to be any odd positive number. As before the symbol is defined to be 1 if the top has a square root modulo the bottom, and -1 otherwise.

If m has prime factors pi with exponents ei, then the Jacobi symbol can be computed by

\left(\frac{n}{m}\right) = \prod \left(\frac{n}{p_i} \right )^{e_i}

Technically the symbol on the left is a Jacobi symbol and the symbols on the right are Legendre symbols. But the distinction doesn’t matter because when m is an odd prime, the Jacobi symbol equals the Legendre symbol.

In Python, you can compute Legendre symbols with sympy.ntheory.legendre_symbol and Jacobi symbols with sympy.ntheory.jacobi_symbol.

Finding square roots

So far we’ve described how to determine whether a number has a square root mod m where m is odd. Once you know a square root exists, how do you find it?

These notes show how to solve quadratic congruences, whether m is odd or not.

Almost prime generators and almost integers

Here are two apparently unrelated things you may have seen before. The first is an observation going back to Euler that the polynomial

n^2 - n + 41

produces a long sequence of primes. Namely, the values are prime for n = 1, 2, 3, …, 40.

The second is that the number

e^{\pi \sqrt{163}}

is extraordinarily close to an integer. This number is known as Ramanujan’s constant. It differs from its nearest integer by 3 parts in 1030. Ramanujan’s constant equals


There is a connection between these two facts: The polynomial

n^2 - n + k

returns primes for n = 1, 2, 3, …, k-1 primes if 4k – 1 is a Heegner number, and

e^{\pi \sqrt{d}}

is almost an integer if d is a (large) Heegner number.

Source: The Book of Numbers by Conway and Guy.

Heegner numbers

So what’s a Heegner number and how many are there? An integer d is a Heegner number if the ring generated by appending √-d to the integers has unique factorization. There are nine such numbers:

1, 2, 3, 7, 11, 19, 43, 67, 163.

There’s deeper stuff going on here than I understand—modular forms, the j-function, etc.—so this post won’t explain everything. There’s something unsatisfying about saying something is “almost” an integer without quantifying. There’s a way to be more precise, but we won’t go there. Instead, we’ll just play with the results.

Mathematica computation

First we look at the claim that n² – n + k produces primes for n = 1 through k – 1 if 4k – 1 is a Heegner number. The Heegner numbers of the form 4k + 1 are 2, 3, 5, 11, and 17. The following code shows that the claim is true for these values of k.

k = {2, 3, 5, 11, 17}
claim[x_] := AllTrue[
  Table[n^2 - n + x, {n, x - 1}], 
AllTrue[k, claim]

This returns True, so the claim is true.

As for exp(π √d) being close to an integer, this apparently only true for the last three Heegner numbers.

h = {1, 2, 3, 7, 11, 19, 43, 67, 163}
For[i = 1, i < 10, i++, 
        Exp[ Pi Sqrt[ h[[i]] ] ], 

(The function AccountingForm suppresses scientific notation, making it easier to see where the decimal portion of the number starts.)

Here are the results:


I manually edited the output to align the decimal points and truncate the decimal places beyond that needed to show that the last number is not an integer.

Partition numbers and Ramanujan’s approximation

The partition function p(n) counts the number of ways n unlabeled things can be partitioned into non-empty sets. (Contrast with Bell numbers that count partitions of labeled things.)

There’s no simple expression for p(n), but Ramanujan discovered a fairly simple asymptotic approximation:

p(n) \sim \frac{1}{4n\sqrt{3}} \exp(\pi \sqrt{2n/3})

How accurate is this approximation? Here’s a little Matheamtica code to see.

p[n_] := PartitionsP[n]
approx[n_] := Exp[ Pi Sqrt[2 n/3]] / (4 n Sqrt[3])
relativeError[n_] := Abs[p[n] - approx[n]]/p[n]
ListLinePlot[Table[relativeError[n], {n, 100}]]

So for values of n around 100, the approximation is off by about 5%.

Since it’s an asymptotic approximation, the relative error decreases (albeit slowly, apparently) as n increases. The relative error for n = 1,000 is about 1.4% and the relative error for n = 1,000,000 is about 0.044%.

Update: After John Baez commented on the oscillation in the relative error I decided to go back and look at it more carefully. Do the oscillations end or do they just become too small to see?

To answer this, let’s plot the difference in consecutive terms.

relerr[a_, b_] := Abs[a - b]/b
t = Table[relerr[p[n], approx[n]], {n, 300}];
ListLinePlot[Table[ t[[n + 1]] - t[[n]], {n, 60}]]

first differences of relative error

The plot crosses back and forth across the zero line, indicating that the relative error alternately increases and decreases, but only up to a point. Past n = 25 the plot stays below the zero line; the sign changes in the first differences stop.

But now we see that the first differences themselves alternate! We can investigate the alternation in first differences by plotting second differences [1].

    Table[ t[[n + 2]] - 2 t[[n + 1]] + t[[n]], 
    {n, 25, 120}]

first differences of relative error

So it appears that the second differences keep crossing the zero line for a lot longer, so far out that it’s hard to see. In fact the second differences become positive and stay positive after n = 120. But the second differences keep alternating, so you could look at third differences …

Related posts: Special numbers


[1] The code does a small algebraic simplification that might make some people scratch their heads. All it does is simplify

(tn+2tn+1) – (tn+1tn).

Reciprocals of primes

Here’s an interesting little tidbit:

For any prime p except 2 and 5, the decimal expansion of 1/p repeats with a period that divides p-1.

The period could be as large as p-1, but no larger. If it’s less than p-1, then it’s a divisor of p-1.

Here are a few examples.

1/3 = 0.33…
1/7 = 0.142857142857…
1/11= 0.0909…
1/13 = 0.076923076923…
1/17 = 0.05882352941176470588235294117647…
1/19 = 0.052631578947368421052631578947368421…
1/23 = 0.04347826086956521739130434782608695652173913…

Here’s a table summarizing the periods above.

| prime | period |
|     3 |      1 |
|     7 |      6 |
|    11 |      2 |
|    13 |      6 |
|    17 |     16 |
|    19 |     18 |
|    23 |     22 |

For a proof of the claims above, and more general results, see Periods of Fractions.

Probability of coprime sets

The latest blog post from Gödel’s Lost Letter and P=NP looks at the problem of finding relatively prime pairs of large numbers. In particular, they want a deterministic algorithm. They mention in passing that the probability of a set of k large integers being relatively prime (coprime) is 1/ζ(k) where ζ is the Riemann zeta function. This blog post will look closer at that statement.

How large is large?

What does it mean that the numbers are large? That they are large enough that the asymptotic result given by the zeta function is accurate enough. We’ll explore this with simulation code shortly.

The exact statement is that in the limit as N goes to infinity, the probability of k integers chosen uniformly from 1 to N being coprime converges to 1/ζ(k). I won’t go into the proof here, but in a nutshell, it falls out of Euler’s product formula for the zeta function.

What does it mean for a set of numbers to be coprime?

The probability above seemed wrong to me at first. The function ζ(k) decreases as k increases, and so its reciprocal increases. That means it’s more likely that a set of numbers is coprime as the size of the set increases. But the larger the set of numbers, the more likely it is that two will have a common factor, so shouldn’t the probability that they’re coprime go down with k, not up?

The resolution to the apparent contradiction is that I had the wrong idea of coprime in mind. Two integers are coprime if they have no prime factors in common. How do you generalize that to more than two integers? You could define a set of numbers to be coprime if every pair of numbers in the set is coprime. That’s the definition I had in mind. But that’s not the standard definition. The right definition here is that the greatest common divisor of the set is 1. For example, (6, 14, 21) would be coprime because no single prime divides all three numbers, even though no pair of numbers from the set is coprime.

Python simulation

Let’s write some Python code to try this out. We’ll randomly generate some sets of large numbers and see how often they’re coprime.

How do you generate large random integers in Python? You can use the function getrandbits to generate numbers as large as you like. In the code below we’ll generate 128-bit integers.

How do you compute the greatest common divisor in Python? There’s a library function gcd, but it only takes two integers. We’ll use the reduce function to fold gcd over a list of integers.

How do you compute the zeta function in Python? SciPy doesn’t have an implementation of ζ(x) per se, but it does have an implementation of ζ(x)-1. Why not just implement ζ itself? Because for large x, ζ(x) is close to 1, so more precision can be saved by computing ζ – 1.

Putting these pieces together, here’s code to randomly generate triples of 128-bit integers, see how often they’re coprime, and compare the result to 1/ζ(3).

    from random import getrandbits
    from fractions import gcd
    from functools import reduce
    from scipy.special import zetac
    def riemann_zeta(x):
        return zetac(x) + 1
    k = 3
    size = 128
    N = 10000
    coprime_count = 0
    for _ in range(N):
        x = [getrandbits(size) for _ in range(k)]
        if reduce(gcd, x) == 1:
            coprime_count += 1

When I ran this I got


Simulation agrees with theory to two decimal places, which is just what we should expect from 10,000 repetitions. (We’d expect error on the order of 1/√N.)

Here’s what I got when I changed k to 4:


And here’s what I got for k set to 5:


Related posts

Sums of palindromes

Every positive integer can be written as the sum of three palindromes, numbers that remain the same when their digits are reverse. For example, 389 = 11 + 55 + 323. This holds not just for base 10 but for any base b ≥ 5.

[Update: a new paper on arXiv extends this to b = 3 and 4 as well. Base 2 requires four palindromes.]

The result and algorithms for finding the palindromes was published online last August and is in the most recent print issue of Mathematics of Computation.

Javier Cilleruelo, Florian Luca and Lewis Baxter. Every positive integer is a sum of three palindromes. DOI: https://doi.org/10.1090/mcom/3221