Iterated Mersenne primes

A Mersenne number is a number of the form 2k − 1. A Mersenne prime is a Mersenne number which is also a prime.

It turns out that if 2k − 1 is prime then k must be prime, so Mersenne numbers have the form 2p − 1 is prime. What about the converse? If p is prime, is 2k − 1 also prime? No, because, for example, 211 −  1 = 2047 = 23 × 89.

If p is not just a prime but a Mersenne prime, then is 2p − 1 a prime? Sometimes, but not always. The first counterexample is p = 8191.

There is an interesting chain of iterated Mersenne primes:

\begin{align*} M_1 &= 2 \\ M_2 &= 2^{M_1} - 1 \\ M_3 &= 2^{M_2} - 1 \\ M_4 &= 2^{M_3} - 1 \\ M_{12} &= 2^{M_4} - 1 \\ \end{align*}

This raises the question of whether m = 2M12 − 1 is prime. Direct testing using available methods is completely out of the question. The only way we’ll ever know is if there is some theoretical result that settles the question.

Here’s an easier question. Suppose m is prime. Where would it fall on the list of Mersenne primes if conjectures about the distribution of Mersenne primes are true?

This post reports

It has been conjectured that as x increases, the number of primes px such that 2p – 1 is also prime is asymptotically

eγ log x / log 2

where γ is the Euler-Mascheroni constant.

If that conjecture is true, the number of primes less than M12 that are the exponents of Mersenne primes would be approximately

eγ log M12 / log 2 = 226.2.

So if m is a Mersenne prime, it may be the 226th Mersenne prime, or Mn for some n around 226, if the conjectured distribution of Mersenne primes is correct.

We’ve discovered a dozen Mersenne primes since the turn of the century and we’re up to 51 discovered so far. We’re probably not going to get up to the 226th Mersenne prime, if there even is a 226th Mersenne prime, any time soon.


Double super factorial

I saw someone point out recently that

10! = 7! × 5! × 3! × 1!

Are there more examples like this?

What would you call the pattern on the right? I don’t think there’s a standard name, but here’s why I think it should be called double super factorial or super double factorial.

Super factorial

The factorial of a positive number n is the product of the positive numbers up to and including n. The super factorial of n is the product of the factorials of the positive numbers up to and including n. So, for example, 7 super factorial would be

7! × 6! × 5! × 4! × 3! × 2! × 1!

Double factorial

The double factorial of a positive number n is the product of all the positive numbers up to n with the same parity of n. So, for example, the double factorial of 7 would be

7!! = 7 × 5 × 3 × 1.

Double superfactorial

The pattern at the top of the post is like super factorial, but it only includes odd terms, so it’s like a cross between super factorial and double factorial, hence double super factorial.

Denote the double super factorial of n as dsf(n), the product of the factorials of all numbers up to n with the same parity as n. That is,

dsf(n) = n! × (n − 2)! × (n − 4)! × … × 1

where the 1 at the end is 1! if n is odd and 0! if n is even. In this notation, the observation at the top of the post is

10! = dsf(7).

Super double factorial

We can see by re-arranging terms that a double super factorial is also a super double factorial. For example, look at

dsf(7) = 7! × 5! × 3! × 1!

If we separate out the first term in each factorial we have

(7 × 5 × 3 × 1)(6! × 4! × 2!) = 7!! dsf(6)

We can keep going and show in general that

dsf(n) = n!! × (n − 1)!! × (n − 2)!! … × 1

We could call the right hand side super double factorial, sdf(n). Just as a super factorial is a product of factorials, a super double factorials is a product of double factorials. Therefore

dsf(n) = sdf(n).

Factorials that equal double super factorials

Are there more solutions to

n! = dsf(m).

besides n = 10 and m = 7? Yes, here are some.

0! = dsf(0)
1! = dsf(1)
2! = dsf(2)
3! = dsf(3)
6! = dsf(5)

There are no solutions to

n! = dsf(m)

if n > 10. Here’s a sketch of a proof.

Bertrand’s postulate says that for n > 1 there is always a prime p between n and 2n. Now p divides (2n)! but p cannot divide dsf(n) because dsf(n) only has factors less than or equal to n.

If we can show that for some N, n > N implies (2n)! < dsf(n) then there are no solutions to

n! = dsf(m)

for n > 2N because there is a prime p between N and 2N that divides the left side but not the right. In fact N = 12. We can show empirically there are no solutions for n = 11 up to 24, and the proof shows there are no solutions for n > 24.

Finite differences and Pascal’s triangle

The key to solving a lot of elementary what-number-comes-next puzzles is to take first or second differences. For example, if asked what the next item in the series

14, 29, 50, 77, 110, …

the answer (or at lest the answer the person posing the question is most likely looking for) is 149. You might discover this by first looking at the differences of the items:

15, 21, 27, 33, …

The differences all differ by 6, i.e. the second difference of the series is constant. From there you can infer that the next item in the original series will be 39 more than the previous, i.e. it will be 149.

We can apply the same technique for exploring series that are not artificial puzzles. For example, a one-page article by Harlan Brothers [1] asks what would happen if you looked at the products of elements in each row of Pascal’s triangle.

The products grow very quickly, which suggests we work on a log scale. Define

s(n) = \log \prod{k=0}^n \binom{n}{k}  = \sum_{k=0}^n \log \binom{n}{k}

Let’s use a little Python script to look at the first 10 elements in the series.

    from scipy.special import binom
    from numpy import vectorize, log

    def s(n):
        return sum([log(binom(n, k)) for k in range(n+1)])
    s = vectorize(s)

    n = range(1, 11)
    x = s(n)

This prints


Following the strategy at the top of the post, let’s look at the first differences of the sequence with [2]

    y = x[1:] - x[:-1]

This prints


The first differences are increasing by about 0.9, i.e. the second differences are roughly constant. And if we look at the third differences, we find that they’re small and getting smaller the further out you go.

We can easily look further out in the sequence by changing range(1, 11) to range(1, 101). When we do, we find that the second difference are

…, 0.99488052, 0.9949324. 0.99498325

If we look even further out, looking at a thousand terms, the last of the second differences is

…, 0.99949883, 0.99949933, 0.99949983

We might speculate that the second differences are approaching 1 as n → ∞. And this is exactly what is proved in [1], though the author does not work on the log scale. The paper shows that the ratio of the ratio of consecutive lines converges to e. This is equivalent on a log scale to saying the second differences converge to 1.

[1] Harlan J. Brothers. Math Bite: Finding e in Pascal’s Triangle. Mathematics Magazine , Vol. 85, No. 1 (February 2012), p. 51

[2] In Python, array elements are numbered starting at 0, and x[1:] represents all but the first elements of x. The index −1 is a shorthand for the last element, so x{:-1] means all the elements of x up to (but not including) the last.

Additive functions

A function f from positive integers to real numbers is defined to be additive if for relatively prime numbers m and n,

f(mn) = f(m) + f(n).

The function f is called completely addititive if the above holds for all positive integers m and n, i.e. we drop the requirement that m and n are relatively prime.

Example: total prime factors

One example of an additive function is the function Ω(n) defined to be the number of prime factors of n, counted with multiplicity. For example, Ω(12) = 3 because 12 = 2 × 2 × 3. The numbers 10 and 63 are relatively prime, and

Ω(630) = 5 = Ω(10) + Ω(63).

Example: distinct prime factors

Another example of an additive function is ω(n) defined to be the number of distinct prime factors of n, i.e. not counting with multiplicity. So, for example, ω(12) = 2.

This function is additive but not completely additive because, for example,

ω(20) = 2 ≠ ω(2) + ω(10)  = 3

A theorem of Erdős

Here is a remarkable theorem due to Paul Erdős [1]. Suppose f is an additive function such that

f(n + 1) − f(n)

converges to zero as n goes to infinity. Then

f(n) = c log(n)

for some constant c. And since a multiple of a logarithm is a logarithm to a different base, we can restate the conclusion by simply saying f is a logarithm.

Logarithms are completely additive functions, so even though we only assumed f was additive, this combined with the limit condition proves that in fact f is completely additive.

Related posts

[1] Paul Erdős, “On the distribution function of additive functions,” Ann. of Math., Vol. 47 (1946), pp. 1–20.

Advanced questions about a basic diagram

Unit circle trig diagram

I saw a hand-drawn version of the diagram above yesterday and noticed that the points were too evenly distributed. That got me to thinking: is there any objective way to say that this famous diagram is in some sense complete? If you were to make a diagram with more points, what would they be?

Simple numbers

The numbers on the diagram are all simple. Once we’re more precise about what it means for these numbers to be “simple,” we can answer the questions above.

The angles in the diagram are all rational parts of a circle, that is, rational multiples of 2π. For the rest of the post, I’ll say “rational angle” to mean a rational multiple of 2π.

The sines and cosines all involve only one square root, i.e. no nested roots. A more useful way to express this is that all the values are the roots of a quadratic polynomial with integer coefficients.


Could we add more rational angles whose sines and cosines are roots of quadratics? Maybe the chart would be too cluttered to put in a textbook, but would it be possible in principle? Could there be some chart analogous to the one above that has, for example, (1 + √7)/5 as one of the labels?

The angles in the common unit circle diagram, integer multiples of π/4 and π/6, are the only rational angles with sines and cosines that are roots of a quadratic polynomial with integer coefficients. That is, these are the only rational angles that have sines and cosines that are algebraic of degree 2. In that sense the diagram is complete.

The number (1 + √7)/5 is algebraic of degree 2 [1] but isn’t on our exhaustive list of possible algebraic values of degree 2. So even if you were to try numbers of the form pπ/q for very large integers p and q, you’ll never get a sine or cosine equal to (1 + √7)/5.

In 1933 Lehmer [2] showed how to classify all rational angles whose sines or cosines are algebraic of given degree. His theorem proves that the only rational angles whose sine is algebraic of degree 2 are integer multiples of π/4 and π/6.

Interestingly, there is another rational angle whose cosine is algebraic of degree 2:

cos(π/5) = (1 + √5)/4

So we could extend the unit circle diagram to include multiples of π/5, but only the cosine would be algebraic of degree 2. The sines are more complicated. For example,

sin(π/5) = √(5/8 + √(5)/8)

which is algebraic of degree 4.

Higher degrees

There are no rational angles whose sine is algebraic of degree 3, so going up to degree 3 wouldn’t help.

If we go up to degree 4 then we could add multiples of π/5, π/8, and π/12. These all have sines and cosines that are algebraic of degree 4.

Related posts

[1] (1 + √7)/5  is a root of 25x² − 10x = 6 = 0.

[2] D. H. Lehmer. A Note on Trigonometric Algebraic Numbers. The American Mathematical Monthly , March 1933, Vol. 40, No. 3, pp. 165–166

Factoring pseudoprimes

Fermat’s little theorem says that if p is a prime number, then for any positive integer b < p we hve

bp−1 = 1 (mod p).

This theorem gives a necessary but not sufficient condition for a number to be prime.

Fermat’s primality test

The converse of Fermat’s little theorem is not always true, but it’s often true. That is, if there exists some base 1 < b < n such that

bn−1 = 1 (mod n)

then n is likely to be prime. There are examples where the equation above holds for a pair (b, n) even though n is not prime, and in that case n is called a pseudoprime to the base b.

If you’re searching for large primes, say for use in encryption, then you’d begin by applying Fermat’s little theorem with a few small values of b. This is because although Fermat’s test can’t prove that a number is prime, it can prove that a number is not prime.

For a small example, suppose you wanted to test whether 50621 is prime. You could start by applying Fermat’s test with b = 2 as in the following Python code.

>>> n = 50621
>>> 2**(n−1) % n

Since the result is not 1, we know 50621 is not prime. This doesn’t tell us what the factors of 50621 are, but we know that it has nontrivial factors. We say 2 is a witness that the number 50621 is not prime.

Next, let’s see whether 294409 might be prime.

>>> n = 294409
>>> 2**(n – 1) % n

This tells us 294409 might be prime. It has passed a test that filters out a lot of composite numbers. What now? We could try other values of b: 3, 5, 7, 11, …. This will not resolve the question of whether 294409 is prime unless we keep going until we try 37. And in fact 37 is the smallest factor of 294409. Our number 294409 is a Carmichael number, a composite number n that passes Fermat’s primality test for all bases b relatively prime to n.

Note that it would be more efficient to use pow(b, n − 1, n) rather than 2**(n − 1) % n because the former takes advantage of the fact that we don’t need to compute 2n−1 per se and can reduce all intermediate calculations mod n.

Factoring pseudoprimes

Now suppose we have a number n that has passed Fermat’s primality test for some base b and we suspect that n is a pseudoprime. If we want to (try to) factor n, knowing that it is a pseudoprime to the base b gives us a head start. We can exploit the fact that we know b to factor n in polynomial time, unless n is a strong pseudoprime.

Suppose we have a number n that we suspect is a pseudoprime to the base b, and we’re smart enough to at least check that n is an odd number, then we begin by pulling out all the factors of 2 that we can from n − 1:

n − 1 = 2e f.

Next consider the set of numbers


for k = 1, 2, 4, …, 2e. Let x be the smallest of these numbers which is not congruent to 1 mod n. The existence of such an x is essentially the definition of strong pseudoprime [1].

Then gcd(x − 1, n) and gcd(x + 1, n) are factors of n. This is theorem 10.4 of [2].

Python example

Let n = 873181. This is a pseudoprime to the base b = 3, which we can confirm by seeing that pow(3, n−1, n) returns 1.

Now 873180 is divisible by 4 but not by 8, so e = 2. So the theorem above says we should compute

>>> b, e = 3, 2
>>> [pow(b, f*2**k, n) for k in range(e+1)]

This produces [2643, 1, 1]. So x = 2643,

>> x = 2643
>>> from sympy import gcd
>>> gcd(x−1, n)
>>> gcd(x+1, n)

shows that 1321 and 661 are both factors of 873181.

Related posts

[1] Definition of strong pseudoprime. A strong pseudo prime to base b is a composite odd integer m such that if m − 1 = 2ef  with f odd, then either bf = 1 (mod m) or bf2c ≡ −1 (mod m) for some 0 ≤ c < e.

[2] The Joy of Factoring by Samuel S. Wagstaff, Jr.

Connecting the FFT and quadratic reciprocity

Some readers will look at the title of this post and think “Ah yes, the FFT. I use it all the time. But what is this quadratic reciprocity?”

Others will look at the same title and think “Gauss called the quadratic reciprocity theorem the jewel in the crown of mathematics. But what is this FFT thing? I think I remember an engineer saying something about that.”

Gauss proved a theorem that relates quadratic reciprocity and the FFT. For distinct odd primes p and q, the following equation holds.

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = \frac{\text{Tr} {\cal F}_{pq}}{ \text{Tr} {\cal F}_p\, \text{Tr} {\cal F}_q}

I’ll spend the rest of this post unpacking this equation.

Legendre symbols

The expressions on the left are not fractions but rather Legendre symbols. The parentheses are not for grouping but are part of the symbol.

The Legendre symbol


is defined to be 0 if a is a multiple of r, 1 if a has a square root mod r, and −1 otherwise.

Fourier transforms

The Discrete Fourier Transform (DFT) of a vector of length n multiplies the vector by the n by n Fourier matrix Fp whose j, k entry is equal to exp(2πi jk / n). The Fast Fourier Transform (FFT) is a way to compute the DFT more quickly than directly multiplying by the Fourier matrix. Since the DFT is nearly always computed using the FFT algorithm, the DFT is commonly referred to as the FFT.

Matrix trace

The trace of a matrix is the sum of the elements along the main diagonal. So the trace of the Fourier matrix of size n is

\text{Tr} {\cal F}_n = \sum_{j=1}^n \exp(2\pi ij^2/n)

Numerical illustration

The quadratic reciprocity theorem, also due to Gauss, is usually stated as

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}

We can illustrate the theorem at the top of the page numerically with the following Python code, using the quadratic reciprocity theorem to evaluate the product of the Legendre symbols.

from numpy import exp, pi

tr = lambda p: sum(exp(2j*pi*k**2/p) for k in range(1, p+1))
p, q = 59, 17
print( tr(p*q)/(tr(p)*tr(q)) )
print( (-1)**((p-1)*(q-1)/4) ) 

The first print statement produces (0.9999999999998136-1.4048176871018313e-13j) due to some loss of precision due to floating point calculations, but this is essentially 1, which is what the second print statement produces.

If we change q to 19, both statements print −1 (after rounding the first result).

Quadratic Gauss sum

We can quickly prove

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = \frac{\text{Tr} {\cal F}_{pq}}{ \text{Tr} {\cal F}_p\, \text{Tr} {\cal F}_q}

if we assume the quadratic reciprocity theorem and the following equation for the trace of the Fourier matrix.

\text{Tr} {\cal F}_n = \sum_{j=1}^n \exp(2\pi ij^2/n) =
		\sqrt{n}  & \mbox{if } n \equiv 1 \bmod{4} \\
		0 & \mbox{if } n \equiv 2 \bmod{4} \\
                i\sqrt{n} & \mbox{if } n \equiv 3 \bmod{4} \\
                (1+i)\sqrt{n} & \mbox{if } n \equiv 0 \bmod{4} \\

This proof is historically backward. It assumes quadratic reciprocity, but Gauss proved quadratic reciprocity by first proving the equation we’re trying to prove. He then obtained the expression on the right hand side of the quadratic reciprocity theorem using the equation above for the trace of the Fourier matrix.

The trace of the Fourier matrix is now called a quadratic Gauss sum. It’s a special case of more general sums that Gauss studied, motivated by his exploration of quadratic reciprocity.

Incidentally, Gauss gave many proofs of the quadratic reciprocity theorem. I don’t know where the proof outlined hear fits into the sequence of proofs he developed.

Related posts

Factored random numbers

A couple days ago Michael Nielsen posted an image of a one-page paper that gives an algorithm for generating factored random numbers, uniformly distributed from 1 to some designated N.

The algorithm does not generate random numbers then factor them. It’s more efficient than that, generating the factorization along with the final result. It does require testing for whether a number is prime, but this is more efficient than factorization.

I thought about trying to code up the algorithm in Python, but then I see that @iconjack beat me to it.

from sympy import isprime
from random import random, randint

def randfacts(N):
    while True:
        n, r, s = N, 1, []
        while n > 1:
            if r > N: break
            if isprime(n := randint(1,n)):
                r *= n
            if random() < r/N:
                return r, s

When there is only one group of a given size

Today’s date, US style, is 9/26/2023, and there is only one group, up to isomorphism, of size 9262023. You could verify this in Mathematica with the command


which returns 1.

For a given n, when is there only one group of size n?

There are two requirements. First, n has to be the product of distinct primes, i.e. no prime appears in the factorization with a power greater than 1. Second, no prime divides one less than another prime.


9262023 = 3 × 41 × 257 ×293

and you can check that 3 does not divide 40, 256, or 292, nor does 41 divide 2, 252, or 292, etc.

A more compact way to state the criteria above is to say

gcd(n, φ(n)) = 1

where φ(n) is Euler’s totient function, the number of positive numbers less than n and relatively prime to n.

Why are these criteria equivalent? If

n = pqr


φ(n) = (p − 1)(q − 1)(r − 1)…

If n and φ(n) have a nontrivial common factor, it has to be one of the prime factors of n, and none of these divide any term of φ(n).

Source: Dieter Jungnickel. On the Uniqueness of the Cyclic Group of Order n. The American Mathematical Monthly, Vol. 99, No. 6. (Jun. – Jul., 1992), pp. 545–547.

Mersenne primes are unsafe

In the previous post I mentioned that a particular Mersenne prime would be unsuitable for cryptography. In fact, all Mersenne primes are unsuitable for cryptography.

A prime number p is called “safe” if

p = 2q + 1

where q is also a prime. Safe primes are called safe because p − 1 does not have small factors (other than 2). The factors of p − 1 correspond to subgroups of the group used for encryption, and small groups can be exploited to attack encryption.

Mersenne numbers are numbers of the form

Mn = 2n − 1.

Mersenne primes are Mersenne numbers that are also prime. A necessary condition for Mn to be prime is for n to be prime. This condition is not sufficient. For example,

211 − 1 = 23 × 89.

But is necessary, for reasons we’ll get into shortly.

If Mn = 2q + 1, then q = Mn−1. But if n is a prime, then n − 1 is not a prime, with one exception: n = 3. So the only Mersenne prime that is a safe prime is M3 = 7, which is not a particularly large prime. Public key cryptography uses numbers in the thousands of digits, not single digits.

Why does n have to be prime before Mn stands a chance of being prime?

If a > 1, then xa − 1 can be factored:

xa − 1 = (x − 1)(xa−1 + xa−2 + … + 1)

If n can be factored into ab, then set x = 2b. This shows that 2ab − 1 has a factor, namely 2b − 1.

In the previous post we said that M127 − 1 has a lot of small factors. We can find some of those factors easily:

M127 − 1 = 2 M126 = 2 (2126 − 1)

and (2126 − 1) is divisible by 2k – 1 for every k that divides 126.

The nontrivial factors of 126 are 2, 3, 6, 7, 9, 14, 18, 21, 42, 63, and so 2k – 1 is a factor of M126 for k equal to each of these numbers. This is enough to fully factor 2126 − 1 into

3³ × 7² × 19 × 43 × 73 × 127 × 337 × 5419 × 92737 × 649657 × 77158673929

given in the footnote from the previous post. You could easily come up with this factorization using pencil and paper, though it would not be easy to determine by hand that the last factor is a prime number.