Finite differences and Pascal’s triangle

The key to solving a lot of elementary what-number-comes-next puzzles is to take first or second differences. For example, if asked what the next item in the series

14, 29, 50, 77, 110, …

the answer (or at lest the answer the person posing the question is most likely looking for) is 149. You might discover this by first looking at the differences of the items:

15, 21, 27, 33, …

The differences all differ by 6, i.e. the second difference of the series is constant. From there you can infer that the next item in the original series will be 39 more than the previous, i.e. it will be 149.

We can apply the same technique for exploring series that are not artificial puzzles. For example, a one-page article by Harlan Brothers [1] asks what would happen if you looked at the products of elements in each row of Pascal’s triangle.

The products grow very quickly, which suggests we work on a log scale. Define

s(n) = \log \prod{k=0}^n \binom{n}{k}  = \sum_{k=0}^n \log \binom{n}{k}

Let’s use a little Python script to look at the first 10 elements in the series.

    from scipy.special import binom
    from numpy import vectorize, log

    def s(n):
        return sum([log(binom(n, k)) for k in range(n+1)])
    s = vectorize(s)

    n = range(1, 11)
    x = s(n)
    print(x)

This prints

0.
0.69314718
2.19722458
4.56434819
7.82404601
11.9953516
17.0915613
23.1224907
30.0956844
38.0171228

Following the strategy at the top of the post, let’s look at the first differences of the sequence with [2]

    y = x[1:] - x[:-1]
    print(y)

This prints

0.69314718
1.50407740
2.36712361
3.25969782
4.17130560
5.09620968
6.03092943
6.97319372
7.92143836

The first differences are increasing by about 0.9, i.e. the second differences are roughly constant. And if we look at the third differences, we find that they’re small and getting smaller the further out you go.

We can easily look further out in the sequence by changing range(1, 11) to range(1, 101). When we do, we find that the second difference are

…, 0.99488052, 0.9949324. 0.99498325

If we look even further out, looking at a thousand terms, the last of the second differences is

…, 0.99949883, 0.99949933, 0.99949983

We might speculate that the second differences are approaching 1 as n → ∞. And this is exactly what is proved in [1], though the author does not work on the log scale. The paper shows that the ratio of the ratio of consecutive lines converges to e. This is equivalent on a log scale to saying the second differences converge to 1.

[1] Harlan J. Brothers. Math Bite: Finding e in Pascal’s Triangle. Mathematics Magazine , Vol. 85, No. 1 (February 2012), p. 51

[2] In Python, array elements are numbered starting at 0, and x[1:] represents all but the first elements of x. The index −1 is a shorthand for the last element, so x{:-1] means all the elements of x up to (but not including) the last.

Additive functions

A function f from positive integers to real numbers is defined to be additive if for relatively prime numbers m and n,

f(mn) = f(m) + f(n).

The function f is called completely addititive if the above holds for all positive integers m and n, i.e. we drop the requirement that m and n are relatively prime.

Example: total prime factors

One example of an additive function is the function Ω(n) defined to be the number of prime factors of n, counted with multiplicity. For example, Ω(12) = 3 because 12 = 2 × 2 × 3. The numbers 10 and 63 are relatively prime, and

Ω(630) = 5 = Ω(10) + Ω(63).

Example: distinct prime factors

Another example of an additive function is ω(n) defined to be the number of distinct prime factors of n, i.e. not counting with multiplicity. So, for example, ω(12) = 2.

This function is additive but not completely additive because, for example,

ω(20) = 2 ≠ ω(2) + ω(10)  = 3

A theorem of Erdős

Here is a remarkable theorem due to Paul Erdős [1]. Suppose f is an additive function such that

f(n + 1) − f(n)

converges to zero as n goes to infinity. Then

f(n) = c log(n)

for some constant c. And since a multiple of a logarithm is a logarithm to a different base, we can restate the conclusion by simply saying f is a logarithm.

Logarithms are completely additive functions, so even though we only assumed f was additive, this combined with the limit condition proves that in fact f is completely additive.

Related posts

[1] Paul Erdős, “On the distribution function of additive functions,” Ann. of Math., Vol. 47 (1946), pp. 1–20.

Advanced questions about a basic diagram

Unit circle trig diagram

I saw a hand-drawn version of the diagram above yesterday and noticed that the points were too evenly distributed. That got me to thinking: is there any objective way to say that this famous diagram is in some sense complete? If you were to make a diagram with more points, what would they be?

Simple numbers

The numbers on the diagram are all simple. Once we’re more precise about what it means for these numbers to be “simple,” we can answer the questions above.

The angles in the diagram are all rational parts of a circle, that is, rational multiples of 2π. For the rest of the post, I’ll say “rational angle” to mean a rational multiple of 2π.

The sines and cosines all involve only one square root, i.e. no nested roots. A more useful way to express this is that all the values are the roots of a quadratic polynomial with integer coefficients.

Completeness

Could we add more rational angles whose sines and cosines are roots of quadratics? Maybe the chart would be too cluttered to put in a textbook, but would it be possible in principle? Could there be some chart analogous to the one above that has, for example, (1 + √7)/5 as one of the labels?

The angles in the common unit circle diagram, integer multiples of π/4 and π/6, are the only rational angles with sines and cosines that are roots of a quadratic polynomial with integer coefficients. That is, these are the only rational angles that have sines and cosines that are algebraic of degree 2. In that sense the diagram is complete.

The number (1 + √7)/5 is algebraic of degree 2 [1] but isn’t on our exhaustive list of possible algebraic values of degree 2. So even if you were to try numbers of the form pπ/q for very large integers p and q, you’ll never get a sine or cosine equal to (1 + √7)/5.

In 1933 Lehmer [2] showed how to classify all rational angles whose sines or cosines are algebraic of given degree. His theorem proves that the only rational angles whose sine is algebraic of degree 2 are integer multiples of π/4 and π/6.

Interestingly, there is another rational angle whose cosine is algebraic of degree 2:

cos(π/5) = (1 + √5)/4

So we could extend the unit circle diagram to include multiples of π/5, but only the cosine would be algebraic of degree 2. The sines are more complicated. For example,

sin(π/5) = √(5/8 + √(5)/8)

which is algebraic of degree 4.

Higher degrees

There are no rational angles whose sine is algebraic of degree 3, so going up to degree 3 wouldn’t help.

If we go up to degree 4 then we could add multiples of π/5, π/8, and π/12. These all have sines and cosines that are algebraic of degree 4.

Related posts

[1] (1 + √7)/5  is a root of 25x² − 10x = 6 = 0.

[2] D. H. Lehmer. A Note on Trigonometric Algebraic Numbers. The American Mathematical Monthly , March 1933, Vol. 40, No. 3, pp. 165–166

Factoring pseudoprimes

Fermat’s little theorem says that if p is a prime number, then for any positive integer b < p we hve

bp-1 = 1 (mod p).

This theorem gives a necessary but not sufficient condition for a number to be prime.

Fermat’s primality test

The converse of Fermat’s little theorem is not always true, but it’s often true. That is, if there exists some base 1 < b < n such that

bn-1 = 1 (mod n)

then n is likely to be prime. There are examples where the equation above holds for a pair (b, n) even though n is not prime, and in that case n is called a pseudoprime to the base b.

If you’re searching for large primes, say for use in encryption, then you’d begin by applying Fermat’s little theorem with a few small values of b. This is because although Fermat’s test can’t prove that a number is prime, it can prove that a number is not prime.

For a small example, suppose you wanted to test whether 50621 is prime. You could start by applying Fermat’s test with b = 2 as in the following Python code.

>>> n = 50621
>>> 2**(n−1) % n
9605

Since the result is not 1, we know 50621 is not prime. This doesn’t tell us what the factors of 50621 are, but we know that it has nontrivial factors. We say 2 is a witness that the number 50621 is not prime.

Next, let’s see whether 294409 might be prime.

>>> n = 294409
>>> 2**(n−1) % n
1

This tells us 294409 might be prime. It has passed a test that filters out a lot of composite numbers. What now? We could try other values of b: 3, 5, 7, 11, …. This will not resolve the question of whether 294409 is prime unless we keep going until we try 37. And in fact 37 is the smallest factor of 294409. Our number 294409 is a Carmichael number, a composite number n that passes Fermat’s primality test for all bases b relatively prime to n.

Note that it would be more efficient to use pow(b, n−1, n) rather than 2**(n−1) % n because the former takes advantage of the fact that we don’t need to compute 2n−1 per se and can reduce all intermediate calculations mod n.

Factoring pseudoprimes

Now suppose we have a number n that has passed Fermat’s primality test for some base b and we suspect that n is a pseudoprime. If we want to (try to) factor n, knowing that it is a pseudoprime to the base b gives us a head start. We can exploit the fact that we know b to factor n in polynomial time, unless n is a strong pseudoprime.

Suppose we have a number n that we suspect is a pseudoprime to the base b, and we’re smart enough to at least check that n is an odd number, then we begin by pulling out all the factors of 2 that we can from n − 1:

n − 1 = 2e f.

Next consider the set of numbers

bkf

for k = 1, 2, 4, …, 2e. Let x be the smallest of these numbers which is not congruent to 1 mod n. The existence of such an x is essentially the definition of strong pseudoprime [1].

Then gcd(x − 1, n) and gcd(x + 1, n) are factors of n. This is theorem 10.4 of [2].

Python example

Let n = 873181. This is a pseudoprime to the base b = 3, which we can confirm by seeing that pow(3, n−1, n) returns 1.

Now 873180 is divisible by 4 but not by 8, so e = 2. So the theorem above says we should compute

>>> b, e = 3, 2
>>> [pow(b, f*2**k, n) for k in range(e+1)]

This produces [2643, 1, 1]. So x = 2643,

>> x = 2643
>>> from sympy import gcd
>>> gcd(x−1, n)
1321
>>> gcd(x+1, n)
661

shows that 1321 and 661 are both factors of 873181.

Related posts

[1] Definition of strong pseudoprime. A strong pseudo prime to base b is a composite odd integer m such that if m − 1 = 2ef with f odd, then either bf = 1 (mod m) or bf2c ≡ −1 (mod m) for some 0 ≤ c < e.

[2] The Joy of Factoring by Samuel S. Wagstaff, Jr.

Connecting the FFT and quadratic reciprocity

Some readers will look at the title of this post and think “Ah yes, the FFT. I use it all the time. But what is this quadratic reciprocity?”

Others will look at the same title and think “Gauss called the quadratic reciprocity theorem the jewel in the crown of mathematics. But what is this FFT thing? I think I remember an engineer saying something about that.”

Gauss proved a theorem that relates quadratic reciprocity and the FFT. For distinct odd primes p and q, the following equation holds.

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = \frac{\text{Tr} {\cal F}_{pq}}{ \text{Tr} {\cal F}_p\, \text{Tr} {\cal F}_q}

I’ll spend the rest of this post unpacking this equation.

Legendre symbols

The expressions on the left are not fractions but rather Legendre symbols. The parentheses are not for grouping but are part of the symbol.

The Legendre symbol

\left(\frac{a}{r}\right)

is defined to be 0 if a is a multiple of r, 1 if a has a square root mod r, and −1 otherwise.

Fourier transforms

The Discrete Fourier Transform (DFT) of a vector of length n multiplies the vector by the n by n Fourier matrix Fp whose j, k entry is equal to exp(2πi jk / n). The Fast Fourier Transform (FFT) is a way to compute the DFT more quickly than directly multiplying by the Fourier matrix. Since the DFT is nearly always computed using the FFT algorithm, the DFT is commonly referred to as the FFT.

Matrix trace

The trace of a matrix is the sum of the elements along the main diagonal. So the trace of the Fourier matrix of size n is

\text{Tr} {\cal F}_n = \sum_{j=1}^n \exp(2\pi ij^2/n)

Numerical illustration

The quadratic reciprocity theorem, also due to Gauss, is usually stated as

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}

We can illustrate the theorem at the top of the page numerically with the following Python code, using the quadratic reciprocity theorem to evaluate the product of the Legendre symbols.

from numpy import exp, pi

tr = lambda p: sum(exp(2j*pi*k**2/p) for k in range(1, p+1))
p, q = 59, 17
print( tr(p*q)/(tr(p)*tr(q)) )
print( (-1)**((p-1)*(q-1)/4) ) 

The first print statement produces (0.9999999999998136-1.4048176871018313e-13j) due to some loss of precision due to floating point calculations, but this is essentially 1, which is what the second print statement produces.

If we change q to 19, both statements print −1 (after rounding the first result).

Quadratic Gauss sum

We can quickly prove

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = \frac{\text{Tr} {\cal F}_{pq}}{ \text{Tr} {\cal F}_p\, \text{Tr} {\cal F}_q}

if we assume the quadratic reciprocity theorem and the following equation for the trace of the Fourier matrix.

\text{Tr} {\cal F}_n = \sum_{j=1}^n \exp(2\pi ij^2/n) =
\left\{
	\begin{array}{ll}
		\sqrt{n}  & \mbox{if } n \equiv 1 \bmod{4} \\
		0 & \mbox{if } n \equiv 2 \bmod{4} \\
                i\sqrt{n} & \mbox{if } n \equiv 3 \bmod{4} \\
                (1+i)\sqrt{n} & \mbox{if } n \equiv 0 \bmod{4} \\
	\end{array}
\right.

This proof is historically backward. It assumes quadratic reciprocity, but Gauss proved quadratic reciprocity by first proving the equation we’re trying to prove. He then obtained the expression on the right hand side of the quadratic reciprocity theorem using the equation above for the trace of the Fourier matrix.

The trace of the Fourier matrix is now called a quadratic Gauss sum. It’s a special case of more general sums that Gauss studied, motivated by his exploration of quadratic reciprocity.

Incidentally, Gauss gave many proofs of the quadratic reciprocity theorem. I don’t know where the proof outlined hear fits into the sequence of proofs he developed.

Related posts

Factored random numbers

A couple days ago Michael Nielsen posted an image of a one-page paper that gives an algorithm for generating factored random numbers, uniformly distributed from 1 to some designated N.

The algorithm does not generate random numbers then factor them. It’s more efficient than that, generating the factorization along with the final result. It does require testing for whether a number is prime, but this is more efficient than factorization.

I thought about trying to code up the algorithm in Python, but then I see that @iconjack beat me to it.

from sympy import isprime
from random import random, randint

def randfacts(N):
    while True:
        n, r, s = N, 1, []
        while n > 1:
            if r > N: break
            if isprime(n := randint(1,n)):
                r *= n
                s.append(n)
        else:
            if random() < r/N:
                return r, s

When there is only one group of a given size

Today’s date, US style, is 9/26/2023, and there is only one group, up to isomorphism, of size 9262023. You could verify this in Mathematica with the command

    FiniteGroupCount[9262023]

which returns 1.

For a given n, when is there only one group of size n?

There are two requirements. First, n has to be the product of distinct primes, i.e. no prime appears in the factorization with a power greater than 1. Second, no prime divides one less than another prime.

Now

9262023 = 3 × 41 × 257 ×293

and you can check that 3 does not divide 40, 256, or 292, nor does 41 divide 2, 252, or 292, etc.

A more compact way to state the criteria above is to say

gcd(n, φ(n)) = 1

where φ(n) is Euler’s totient function, the number of positive numbers less than n and relatively prime to n.

Why are these criteria equivalent? If

n = pqr

then

φ(n) = (p − 1)(q − 1)(r − 1)…

If n and φ(n) have a nontrivial common factor, it has to be one of the prime factors of n, and none of these divide any term of φ(n).

Source: Dieter Jungnickel. On the Uniqueness of the Cyclic Group of Order n. The American Mathematical Monthly, Vol. 99, No. 6. (Jun. – Jul., 1992), pp. 545–547.

Mersenne primes are unsafe

In the previous post I mentioned that a particular Mersenne prime would be unsuitable for cryptography. In fact, all Mersenne primes are unsuitable for cryptography.

A prime number p is called “safe” if

p = 2q + 1

where q is also a prime. Safe primes are called safe because p − 1 does not have small factors (other than 2). The factors of p − 1 correspond to subgroups of the group used for encryption, and small groups can be exploited to attack encryption.

Mersenne numbers are numbers of the form

Mn = 2n − 1.

Mersenne primes are Mersenne numbers that are also prime. A necessary condition for Mn to be prime is for n to be prime. This condition is not sufficient. For example,

211 − 1 = 23 × 89.

But is necessary, for reasons we’ll get into shortly.

If Mn = 2q + 1, then q = Mn−1. But if n is a prime, then n − 1 is not a prime, with one exception: n = 3. So the only Mersenne prime that is a safe prime is M3 = 7, which is not a particularly large prime. Public key cryptography uses numbers in the thousands of digits, not single digits.

Why does n have to be prime before Mn stands a chance of being prime?

If a > 1, then xa − 1 can be factored:

xa − 1 = (x − 1)(xa−1 + xa−2 + … + 1)

If n can be factored into ab, then set x = 2b. This shows that 2ab − 1 has a factor, namely 2b − 1.

In the previous post we said that M127 − 1 has a lot of small factors. We can find some of those factors easily:

M127 − 1 = 2 M126 = 2 (2126 − 1)

and (2126 − 1) is divisible by 2k – 1 for every k that divides 126.

The nontrivial factors of 126 are 2, 3, 6, 7, 9, 14, 18, 21, 42, 63, and so 2k – 1 is a factor of M126 for k equal to each of these numbers. This is enough to fully factor 2126 − 1 into

3³ × 7² × 19 × 43 × 73 × 127 × 337 × 5419 × 92737 × 649657 × 77158673929

given in the footnote from the previous post. You could easily come up with this factorization using pencil and paper, though it would not be easy to determine by hand that the last factor is a prime number.

Primes, weeds, and military precision

Here’s a quote from Don Zagier that I found in Larry Rolen’s lecture notes on modular forms.

There are two facts about the distribution of prime numbers of which I hope to convince you so overwhelmingly that they will be permanently engraved in your hearts. The first is that, despite their simple definition and role as the building blocks of the natural numbers, the prime numbers grow like weeds among the natural numbers, seeming to obey no other law than that of chance, and nobody can predict where the next one will sprout. The second fact is even more astonishing, for it states just the opposite: that the prime numbers exhibit stunning regularity, that there are laws governing their behavior, and that they obey these laws with almost military precision.

Emphasis added.

Gauss map, Euclidean algorithm, and continued fractions

The Gauss map [1] is the function

f(x) = \frac{1}{x} - \left\lfloor \frac{1}{x}\right\rfloor

where ⌊y⌋ is the floor of y, the greatest integer no larger than y.

I’ve written about this map a couple times before. First, I wrote about how this map is measure-preserving. Second, I wrote about the image at the top of the post, based on Michael Trott’s idea of extending the floor function to the complex plane and plotting it.

This post is a third take on the Gauss map, expanding on a comment by Giovanni Panti. His paper opens by saying

The fact that the euclidean algorithm eventually terminates is pervasive in mathematics. In the language of continued fractions, it can be stated by saying that the orbits of rational points under the Gauss map xx−1 −⌊x−1⌋ eventually reach zero.

What does the Gauss map have to do with continued fractions or the Euclidean algorithm? We’ll show this by working through an example.

A continued fraction has the form

a_0 + \cfrac{1}{a_1+\cfrac{1}{a_2+\cfrac{1}{a_3+\ddots}}}

Let’s start with 162/47 and see how we would write it as a continued fraction. An obvious place to start would be to write this as a proper fraction.

\frac{162}{47} = 3 + \frac{21}{47}

Next we turn 21/47 into 1 over something.

\frac{162}{47} = 3 + \frac{21}{47} = 3 + \cfrac{1}{\cfrac{47}{21}}

Now let’s do the same thing with 47/21: turn it into a proper fraction 2 + 5/21, then rewrite the fraction part 5/21 as the reciprocal of its reciprocal:

 \frac{162}{47} = 3 + \cfrac{1}{\cfrac{47}{21}} = 3 + \cfrac{1}{2 + \cfrac{5}{21}} = 3 + \cfrac{1}{2 + \cfrac{1}{\cfrac{21}{5}}

Finally, we write 21/5 as 4 + 1/5, and we’re done:

 \frac{162}{47} = 3 + \cfrac{1}{2 + \cfrac{1}{4 + \cfrac{1}{5}}}

Now go back and look at what happens to the fraction in the bottom left corner at each step:

 \frac{162}{47} = 3 + \frac{21}{47} = 3 + \cfrac{1}{2 + \cfrac{5}{21}} = 3 + \cfrac{1}{2 + \cfrac{1}{4 + \cfrac{1}{5}}}

The sequence of bottom left fractions is 21/47, 5/21, 1/5. Each fraction is replaced by its Gauss map: f(21/47) = 5/21, and f(5/21) = 1/5. We applied the Gauss map above naturally in the process of creating a continued fraction.

Now suppose we wanted to find the greatest common divisor of 162 and 47 using the Euclidean algorithm.

Notice that these are the same numbers, produced by the same calculations as above.

[1] There are other things called the Gauss map, such as the map that takes a point on a surface to the unit normal at that point. That’s not the Gauss map we’re talking about here.