# Pell is to silver as Fibonacci is to gold

As mentioned in the previous post, the ratio of consecutive Fibonacci numbers converges to the golden ratio. Is there a sequence whose ratios converge to the silver ratio the way ratios of Fibonacci numbers converge to the golden ratio?

(If you’re not familiar with the silver ratio, you can read more about it here.)

The Pell numbers Pn start out just like the Fibonacci numbers:

P0 = 0
P1 = 1.

But the recurrence relationship is slightly different:

Pn+2 = 2Pn+1 + Pn.

So the Pell numbers are 0, 1, 2, 5, 12, 29, ….

The ratios of consecutive Pell numbers converge to the silver ratio.

## Metallic ratios

There are more analogs of the golden ratio, such as the bronze ratio and more that do not have names. In general the kth metallic ratio is the larger root of

x² − kx − 1 = 0.

The cases n = 1, 2, and 3 correspond to the gold, silver, and bronze ratios respectively.

The quadratic equation above is the characteristic equation of the recurrence relation

Pn+2 = kPn+1 + Pn.

which suggests how we construct a sequence of integers such that consecutive ratios converge to the nth metallic constant.

So if we use k = 3 in the recurrence relation, we should get a sequence whose ratios converge to the bronze ratio. The results are

0, 1, 3, 10, 33, 109, 360, 1189, 3927, 12970, …

The following code will print the ratios.

    def bronze(n):
if n == 0: return 0
if n == 1: return 1
return 3*bronze(n-1) + bronze(n-2)

for n in range(2, 12):
print( bronze(n)/bronze(n-1) )


Here’s the output.

    3.0
3.3333333333333335
3.3
3.303030303030303
3.302752293577982
3.3027777777777776
3.302775441547519
3.3027756557168324
3.302775636083269
3.3027756378831383


The results are converging to the bronze ratio

(3 + √13)/2 = 3.302775637731995.

## Plastic ratio

The plastic ratio is the real root of x³ − x − 1 = 0. Following the approach above, we can construct a sequence of integers whose consecutive ratios converge to the plastic ratio with the recurrence relation

Sn+3 = Sn+1 + Sn

Let’s try this out with a little code.

    def plastic(n):
if n < 3: return n
return plastic(n-2) + plastic(n-3)

for n in range(10, 20):
print( plastic(n)/plastic(n-1) )


This prints

    1.3
1.3076923076923077
1.3529411764705883
1.3043478260869565
1.3333333333333333
1.325
1.320754716981132
1.3285714285714285
1.3225806451612903


which shows the ratios are approaching the plastic constant 1.324717957244746.

# Miles to kilometers

The number of kilometers in a mile is k = 1.609344 which is close to the golden ratio φ = 1.6180334.

The ratio of consecutive Fibonacci numbers converges to φ, and so you can approximately convert miles to kilometers by multiplying by a Fibonacci number and dividing by the previous Fibonacci number. For example, you could multiply by 8 and divide by 5, or you could multiply by 13 and divide by 8.

As you start going down the Fibonacci sequence, consecutive ratios get closer to k and closer to φ. But since the ratios converge to φ, at some point the ratios get closer to φ and further from k. That means there’s an optimal Fibonacci ratio for converting miles to kilometers.

I was curious what this optimal ratio is, and it turns out to be 21/13. There we have

|k − 21/13| = 0.0060406

and so the error in the approximation is 0.375%. The error is about a third smaller than using φ as the conversion factor.

The Lucas numbers satisfy the same recurrence relation as the Fibonacci numbers, but start with L0 = 2 and L1 = 1. The ratio of consecutive Lucas numbers also converges to φ, and so you could also use Lucas numbers to convert miles to kilometers.

There is an optimal Lucas ratio for converting miles to kilometers for the same reasons there is an optimal Fibonacci ratio. That ratio turns out to be 29/18, and

|k − 29/18| = 0.001767

which is about 4 times more accurate than the best Fibonacci ratio.

# Iterated Mersenne primes

A Mersenne number is a number of the form 2k − 1. A Mersenne prime is a Mersenne number which is also a prime.

It turns out that if 2k − 1 is prime then k must be prime, so Mersenne numbers have the form 2p − 1 is prime. What about the converse? If p is prime, is 2k − 1 also prime? No, because, for example, 211 −  1 = 2047 = 23 × 89.

If p is not just a prime but a Mersenne prime, then is 2p − 1 a prime? Sometimes, but not always. The first counterexample is p = 8191.

There is an interesting chain of iterated Mersenne primes:

This raises the question of whether m = 2M12 − 1 is prime. Direct testing using available methods is completely out of the question. The only way we’ll ever know is if there is some theoretical result that settles the question.

Here’s an easier question. Suppose m is prime. Where would it fall on the list of Mersenne primes if conjectures about the distribution of Mersenne primes are true?

This post reports

It has been conjectured that as x increases, the number of primes px such that 2p – 1 is also prime is asymptotically

eγ log x / log 2

where γ is the Euler-Mascheroni constant.

If that conjecture is true, the number of primes less than M12 that are the exponents of Mersenne primes would be approximately

eγ log M12 / log 2 = 226.2.

So if m is a Mersenne prime, it may be the 226th Mersenne prime, or Mn for some n around 226, if the conjectured distribution of Mersenne primes is correct.

We’ve discovered a dozen Mersenne primes since the turn of the century and we’re up to 51 discovered so far. We’re probably not going to get up to the 226th Mersenne prime, if there even is a 226th Mersenne prime, any time soon.

# Double super factorial

I saw someone point out recently that

10! = 7! × 5! × 3! × 1!

Are there more examples like this?

What would you call the pattern on the right? I don’t think there’s a standard name, but here’s why I think it should be called double super factorial or super double factorial.

## Super factorial

The factorial of a positive number n is the product of the positive numbers up to and including n. The super factorial of n is the product of the factorials of the positive numbers up to and including n. So, for example, 7 super factorial would be

7! × 6! × 5! × 4! × 3! × 2! × 1!

## Double factorial

The double factorial of a positive number n is the product of all the positive numbers up to n with the same parity of n. So, for example, the double factorial of 7 would be

7!! = 7 × 5 × 3 × 1.

## Double superfactorial

The pattern at the top of the post is like super factorial, but it only includes odd terms, so it’s like a cross between super factorial and double factorial, hence double super factorial.

Denote the double super factorial of n as dsf(n), the product of the factorials of all numbers up to n with the same parity as n. That is,

dsf(n) = n! × (n − 2)! × (n − 4)! × … × 1

where the 1 at the end is 1! if n is odd and 0! if n is even. In this notation, the observation at the top of the post is

10! = dsf(7).

## Super double factorial

We can see by re-arranging terms that a double super factorial is also a super double factorial. For example, look at

dsf(7) = 7! × 5! × 3! × 1!

If we separate out the first term in each factorial we have

(7 × 5 × 3 × 1)(6! × 4! × 2!) = 7!! dsf(6)

We can keep going and show in general that

dsf(n) = n!! × (n − 1)!! × (n − 2)!! … × 1

We could call the right hand side super double factorial, sdf(n). Just as a super factorial is a product of factorials, a super double factorials is a product of double factorials. Therefore

dsf(n) = sdf(n).

## Factorials that equal double super factorials

Are there more solutions to

n! = dsf(m).

besides n = 10 and m = 7? Yes, here are some.

0! = dsf(0)
1! = dsf(1)
2! = dsf(2)
3! = dsf(3)
6! = dsf(5)

There are no solutions to

n! = dsf(m)

if n > 10. Here’s a sketch of a proof.

Bertrand’s postulate says that for n > 1 there is always a prime p between n and 2n. Now p divides (2n)! but p cannot divide dsf(n) because dsf(n) only has factors less than or equal to n.

If we can show that for some N, n > N implies (2n)! < dsf(n) then there are no solutions to

n! = dsf(m)

for n > 2N because there is a prime p between N and 2N that divides the left side but not the right. In fact N = 12. We can show empirically there are no solutions for n = 11 up to 24, and the proof shows there are no solutions for n > 24.

# Finite differences and Pascal’s triangle

The key to solving a lot of elementary what-number-comes-next puzzles is to take first or second differences. For example, if asked what the next item in the series

14, 29, 50, 77, 110, …

the answer (or at lest the answer the person posing the question is most likely looking for) is 149. You might discover this by first looking at the differences of the items:

15, 21, 27, 33, …

The differences all differ by 6, i.e. the second difference of the series is constant. From there you can infer that the next item in the original series will be 39 more than the previous, i.e. it will be 149.

We can apply the same technique for exploring series that are not artificial puzzles. For example, a one-page article by Harlan Brothers [1] asks what would happen if you looked at the products of elements in each row of Pascal’s triangle.

The products grow very quickly, which suggests we work on a log scale. Define

Let’s use a little Python script to look at the first 10 elements in the series.

    from scipy.special import binom
from numpy import vectorize, log

def s(n):
return sum([log(binom(n, k)) for k in range(n+1)])
s = vectorize(s)

n = range(1, 11)
x = s(n)
print(x)


This prints

0.
0.69314718
2.19722458
4.56434819
7.82404601
11.9953516
17.0915613
23.1224907
30.0956844
38.0171228

Following the strategy at the top of the post, let’s look at the first differences of the sequence with [2]

    y = x[1:] - x[:-1]
print(y)


This prints

0.69314718
1.50407740
2.36712361
3.25969782
4.17130560
5.09620968
6.03092943
6.97319372
7.92143836

The first differences are increasing by about 0.9, i.e. the second differences are roughly constant. And if we look at the third differences, we find that they’re small and getting smaller the further out you go.

We can easily look further out in the sequence by changing range(1, 11) to range(1, 101). When we do, we find that the second difference are

…, 0.99488052, 0.9949324. 0.99498325

If we look even further out, looking at a thousand terms, the last of the second differences is

…, 0.99949883, 0.99949933, 0.99949983

We might speculate that the second differences are approaching 1 as n → ∞. And this is exactly what is proved in [1], though the author does not work on the log scale. The paper shows that the ratio of the ratio of consecutive lines converges to e. This is equivalent on a log scale to saying the second differences converge to 1.

[1] Harlan J. Brothers. Math Bite: Finding e in Pascal’s Triangle. Mathematics Magazine , Vol. 85, No. 1 (February 2012), p. 51

[2] In Python, array elements are numbered starting at 0, and x[1:] represents all but the first elements of x. The index −1 is a shorthand for the last element, so x{:-1] means all the elements of x up to (but not including) the last.

A function f from positive integers to real numbers is defined to be additive if for relatively prime numbers m and n,

f(mn) = f(m) + f(n).

The function f is called completely addititive if the above holds for all positive integers m and n, i.e. we drop the requirement that m and n are relatively prime.

## Example: total prime factors

One example of an additive function is the function Ω(n) defined to be the number of prime factors of n, counted with multiplicity. For example, Ω(12) = 3 because 12 = 2 × 2 × 3. The numbers 10 and 63 are relatively prime, and

Ω(630) = 5 = Ω(10) + Ω(63).

## Example: distinct prime factors

Another example of an additive function is ω(n) defined to be the number of distinct prime factors of n, i.e. not counting with multiplicity. So, for example, ω(12) = 2.

ω(20) = 2 ≠ ω(2) + ω(10)  = 3

## A theorem of Erdős

Here is a remarkable theorem due to Paul Erdős [1]. Suppose f is an additive function such that

f(n + 1) − f(n)

converges to zero as n goes to infinity. Then

f(n) = c log(n)

for some constant c. And since a multiple of a logarithm is a logarithm to a different base, we can restate the conclusion by simply saying f is a logarithm.

Logarithms are completely additive functions, so even though we only assumed f was additive, this combined with the limit condition proves that in fact f is completely additive.

## Related posts

[1] Paul Erdős, “On the distribution function of additive functions,” Ann. of Math., Vol. 47 (1946), pp. 1–20.

I saw a hand-drawn version of the diagram above yesterday and noticed that the points were too evenly distributed. That got me to thinking: is there any objective way to say that this famous diagram is in some sense complete? If you were to make a diagram with more points, what would they be?

## Simple numbers

The numbers on the diagram are all simple. Once we’re more precise about what it means for these numbers to be “simple,” we can answer the questions above.

The angles in the diagram are all rational parts of a circle, that is, rational multiples of 2π. For the rest of the post, I’ll say “rational angle” to mean a rational multiple of 2π.

The sines and cosines all involve only one square root, i.e. no nested roots. A more useful way to express this is that all the values are the roots of a quadratic polynomial with integer coefficients.

## Completeness

Could we add more rational angles whose sines and cosines are roots of quadratics? Maybe the chart would be too cluttered to put in a textbook, but would it be possible in principle? Could there be some chart analogous to the one above that has, for example, (1 + √7)/5 as one of the labels?

The angles in the common unit circle diagram, integer multiples of π/4 and π/6, are the only rational angles with sines and cosines that are roots of a quadratic polynomial with integer coefficients. That is, these are the only rational angles that have sines and cosines that are algebraic of degree 2. In that sense the diagram is complete.

The number (1 + √7)/5 is algebraic of degree 2 [1] but isn’t on our exhaustive list of possible algebraic values of degree 2. So even if you were to try numbers of the form pπ/q for very large integers p and q, you’ll never get a sine or cosine equal to (1 + √7)/5.

In 1933 Lehmer [2] showed how to classify all rational angles whose sines or cosines are algebraic of given degree. His theorem proves that the only rational angles whose sine is algebraic of degree 2 are integer multiples of π/4 and π/6.

Interestingly, there is another rational angle whose cosine is algebraic of degree 2:

cos(π/5) = (1 + √5)/4

So we could extend the unit circle diagram to include multiples of π/5, but only the cosine would be algebraic of degree 2. The sines are more complicated. For example,

sin(π/5) = √(5/8 + √(5)/8)

which is algebraic of degree 4.

## Higher degrees

There are no rational angles whose sine is algebraic of degree 3, so going up to degree 3 wouldn’t help.

If we go up to degree 4 then we could add multiples of π/5, π/8, and π/12. These all have sines and cosines that are algebraic of degree 4.

## Related posts

[1] (1 + √7)/5  is a root of 25x² − 10x = 6 = 0.

[2] D. H. Lehmer. A Note on Trigonometric Algebraic Numbers. The American Mathematical Monthly , March 1933, Vol. 40, No. 3, pp. 165–166

# Factoring pseudoprimes

Fermat’s little theorem says that if p is a prime number, then for any positive integer b < p we have

bp−1 = 1 (mod p).

This theorem gives a necessary but not sufficient condition for a number to be prime.

## Fermat’s primality test

The converse of Fermat’s little theorem is not always true, but it’s often true. That is, if there exists some base 1 < b < n such that

bn−1 = 1 (mod n)

then n is likely to be prime. There are examples where the equation above holds for a pair (b, n) even though n is not prime, and in that case n is called a pseudoprime to the base b.

If you’re searching for large primes, say for use in encryption, then you’d begin by applying Fermat’s little theorem with a few small values of b. This is because although Fermat’s test can’t prove that a number is prime, it can prove that a number is not prime.

For a small example, suppose you wanted to test whether 50621 is prime. You could start by applying Fermat’s test with b = 2 as in the following Python code.

>>> n = 50621
>>> 2**(n−1) % n
9605

Since the result is not 1, we know 50621 is not prime. This doesn’t tell us what the factors of 50621 are, but we know that it has nontrivial factors. We say 2 is a witness that the number 50621 is not prime.

Next, let’s see whether 294409 might be prime.

>>> n = 294409
>>> 2**(n – 1) % n
1

This tells us 294409 might be prime. It has passed a test that filters out a lot of composite numbers. What now? We could try other values of b: 3, 5, 7, 11, …. This will not resolve the question of whether 294409 is prime unless we keep going until we try 37. And in fact 37 is the smallest factor of 294409. Our number 294409 is a Carmichael number, a composite number n that passes Fermat’s primality test for all bases b relatively prime to n.

Note that it would be more efficient to use pow(b, n − 1, n) rather than 2**(n − 1) % n because the former takes advantage of the fact that we don’t need to compute 2n−1 per se and can reduce all intermediate calculations mod n.

## Factoring pseudoprimes

Now suppose we have a number n that has passed Fermat’s primality test for some base b and we suspect that n is a pseudoprime. If we want to (try to) factor n, knowing that it is a pseudoprime to the base b gives us a head start. We can exploit the fact that we know b to factor n in polynomial time, unless n is a strong pseudoprime.

Suppose we have a number n that we suspect is a pseudoprime to the base b, and we’re smart enough to at least check that n is an odd number, then we begin by pulling out all the factors of 2 that we can from n − 1:

n − 1 = 2e f.

Next consider the set of numbers

bkf

for k = 1, 2, 4, …, 2e. Let x be the smallest of these numbers which is not congruent to 1 mod n. The existence of such an x is essentially the definition of strong pseudoprime [1].

Then gcd(x − 1, n) and gcd(x + 1, n) are factors of n. This is theorem 10.4 of [2].

## Python example

Let n = 873181. This is a pseudoprime to the base b = 3, which we can confirm by seeing that pow(3, n−1, n) returns 1.

Now 873180 is divisible by 4 but not by 8, so e = 2. So the theorem above says we should compute

>>> b, e = 3, 2
>>> [pow(b, f*2**k, n) for k in range(e+1)]

This produces [2643, 1, 1]. So x = 2643,

>> x = 2643
>>> from sympy import gcd
>>> gcd(x−1, n)
1321
>>> gcd(x+1, n)
661

shows that 1321 and 661 are both factors of 873181.

## Related posts

[1] Definition of strong pseudoprime. A strong pseudo prime to base b is a composite odd integer m such that if m − 1 = 2ef  with f odd, then either bf = 1 (mod m) or bf2c ≡ −1 (mod m) for some 0 ≤ c < e.

[2] The Joy of Factoring by Samuel S. Wagstaff, Jr.

# Connecting the FFT and quadratic reciprocity

Some readers will look at the title of this post and think “Ah yes, the FFT. I use it all the time. But what is this quadratic reciprocity?”

Others will look at the same title and think “Gauss called the quadratic reciprocity theorem the jewel in the crown of mathematics. But what is this FFT thing? I think I remember an engineer saying something about that.”

Gauss proved a theorem that relates quadratic reciprocity and the FFT. For distinct odd primes p and q, the following equation holds.

I’ll spend the rest of this post unpacking this equation.

## Legendre symbols

The expressions on the left are not fractions but rather Legendre symbols. The parentheses are not for grouping but are part of the symbol.

The Legendre symbol

is defined to be 0 if a is a multiple of r, 1 if a has a square root mod r, and −1 otherwise.

## Fourier transforms

The Discrete Fourier Transform (DFT) of a vector of length n multiplies the vector by the n by n Fourier matrix Fp whose j, k entry is equal to exp(2πi jk / n). The Fast Fourier Transform (FFT) is a way to compute the DFT more quickly than directly multiplying by the Fourier matrix. Since the DFT is nearly always computed using the FFT algorithm, the DFT is commonly referred to as the FFT.

## Matrix trace

The trace of a matrix is the sum of the elements along the main diagonal. So the trace of the Fourier matrix of size n is

## Numerical illustration

The quadratic reciprocity theorem, also due to Gauss, is usually stated as

We can illustrate the theorem at the top of the page numerically with the following Python code, using the quadratic reciprocity theorem to evaluate the product of the Legendre symbols.

from numpy import exp, pi

tr = lambda p: sum(exp(2j*pi*k**2/p) for k in range(1, p+1))
p, q = 59, 17
print( tr(p*q)/(tr(p)*tr(q)) )
print( (-1)**((p-1)*(q-1)/4) )


The first print statement produces (0.9999999999998136-1.4048176871018313e-13j) due to some loss of precision due to floating point calculations, but this is essentially 1, which is what the second print statement produces.

If we change q to 19, both statements print −1 (after rounding the first result).

We can quickly prove

if we assume the quadratic reciprocity theorem and the following equation for the trace of the Fourier matrix.

This proof is historically backward. It assumes quadratic reciprocity, but Gauss proved quadratic reciprocity by first proving the equation we’re trying to prove. He then obtained the expression on the right hand side of the quadratic reciprocity theorem using the equation above for the trace of the Fourier matrix.

The trace of the Fourier matrix is now called a quadratic Gauss sum. It’s a special case of more general sums that Gauss studied, motivated by his exploration of quadratic reciprocity.

Incidentally, Gauss gave many proofs of the quadratic reciprocity theorem. I don’t know where the proof outlined hear fits into the sequence of proofs he developed.

# Factored random numbers

A couple days ago Michael Nielsen posted an image of a one-page paper that gives an algorithm for generating factored random numbers, uniformly distributed from 1 to some designated N.

The algorithm does not generate random numbers then factor them. It’s more efficient than that, generating the factorization along with the final result. It does require testing for whether a number is prime, but this is more efficient than factorization.

I thought about trying to code up the algorithm in Python, but then I see that @iconjack beat me to it.

from sympy import isprime
from random import random, randint

def randfacts(N):
while True:
n, r, s = N, 1, []
while n > 1:
if r > N: break
if isprime(n := randint(1,n)):
r *= n
s.append(n)
else:
if random() < r/N:
return r, s