Rational height functions

Mathematicians often speak informally about the relative simplicity of rational numbers. For example, musical intervals that correspond to simple fractions have less tension than intervals that correspond to more complicated fractions.

Such informal statements can be made more precise using height functions. There are a variety of height functions designed for different applications, but the most common height function defines the height of a fraction p/q in lowest terms to be the sum of the numerator and denominator:

height(p/q) = |p| + |q|.

This post will look at how this applies to musical intervals, to approximations for π, and the number of days in a year.

Musical intervals

Here are musical intervals, ranked by the height of their just tuning frequency ratios.

  1. octave (2:1)
  2. fifth (3:2)
  3. forth (4:3)
  4. major sixth (5:3)
  5. major third (5:4)
  6. minor third (6:5)
  7. minor sixth (8:5)
  8. minor seventh (9:5)
  9. major second (10:9)
  10. major seventh (15:8)
  11. minor second (16:15)
  12. augmented fourth (45:32)

The least tension is an interval of an octave. The next six intervals are considered consonant. A minor seventh is considered mildly dissonant, and the rest are considered more dissonant. The most dissonant interval is the augmented fourth, also known as a tritone because it is the same interval as three whole steps.

Incidentally, a telephone busy signal consists of two pitches, 620 Hz and 480 Hz. This is a ratio of 24:31, which has a height of 54. This is consistent with the signal being moderately dissonant.

Approximations to π

The first four continued fraction approximations to π are 3, 22/7, 333/106, and 335/113.

Continued fraction convergents give the best rational approximation to an irrational for a given denominator. But for a height value that is not the height of a convergent, the best approximation might not be a convergent.

For example, the best approximation to π with height less than or equal to 333 + 106 is 333/106. But the best approximation with height less than or equal to 400 is 289/92, which is not a convergent of the continued fraction for π.

Days in a year

The number of days in a year is 365.2424177. Obviously that’s close to 365 1/4, and 1/4 is the best approximation to 0.2424177 for its height.

The Gregorian calendar has 97 leap days every 400 years, which approximates 0.2424177 with 97/400. This approximation has practical advantages for humans, but 8 leap days every 33 years would be a more accurate approximation with a much smaller height.

Related posts

Finite field Diffie Hellman primes

Diffie-Hellman key exchange is conceptually simple. Alice and Bob want to generate a shared cryptographic key. They want to use asymmetric (public) cryptography to share a symmetric (private) key.

The starting point is a large prime p and a generator 1 < g < p.

Alice generates a large random number x, her private key, and sends Bob gx mod p.

Similarly, Bob generates a large random number y, his private key, and sends Alice gy mod p.

Alice takes gy and raises it to her exponent x, and Bob takes gx and raises it to the exponent y. They arrive at a common key k because

k = (gy)x = (gx)y mod p.

The security of the system rests on the assumption that the discrete logarithm problem is hard, i.e. given g and gz it is computationally impractical to solve for z. This assumption appears to be true in general, but can fail when the group generated by g has exploitable structure.

You can read more about Diffie-Hellman here.

Recommended primes

The choice of prime p and generator g can matter is subtle ways and so there are lists of standard choices that are believed to be secure.

IETF RFC 7919 recommends five standard primes. These have the form

p = 2^b - 2^{b-64} + 2^{64} \left( \lfloor 2^{b-130} e\rfloor + X \right) - 1

where b is the size of p in bits, e is the base of natural logarithms, and X is the smallest such that p is a safe prime. In every case the generator is g = 2.

The values of b are 2048, 3072, 4096, 6144, and 8192. The values of X and p are given in RFC 7919, but they’re both determined by b.

I don’t imagine there’s anything special about the constant e above. I suspect it’s there to shake things up a bit in a way that doesn’t appear to be creating a back door. Another irrational number like π or φ would probably do as well, but I don’t know this for sure.


The recommended primes have names of the form “ffdhe” followed by b. For b = 2048, the corresponding value is X is 560316.

I wrote a little Python code to verify that this value of X does produce a safe prime and that smaller values of X do not.

    #!/usr/bin/env python3
    from sympy import isprime, E, N, floor
    b = 2048
    e = N(E, 1000)
    c = floor(2**(b-130) * e)
    d = 2**b - 2**(b-64) + 2**64*c - 1
    def candidate(b, x):
        p = d + 2**64*x
        return p
    for x in range(560316, 0, -1):
        p = candidate(b, x)
        if isprime(p) and isprime((p-1)//2):

This took about an hour to run. It only printed 560316, verifying the claim in RFC 7919.

Other groups

Finite field Diffie-Hellman is so called because the integers modulo a prime form a finite field. We don’t need a field per se; we’re working in the group formed by the orbit of g within that field. Such groups need to be very large in order to provide security.

It’s possible to use Diffie-Hellman over any group for which the discrete logarithm problem is intractable, and the discrete logarithm problem is harder over elliptic curves than over finite fields. The elliptic curve groups can be smaller and provide the same level of security. Smaller groups mean smaller keys to exchange. For this reason, elliptic curve Diffie-Hellman is more commonly used than finite field Diffie-Hellman.

Chinese Remainder Theorem synthesis algorithm

Suppose m = pq where p and q are large, distinct primes. In the previous post we said that calculations mod m can often be carried out more efficiently by working mod p and mod q, then combining the results to get back to a result mod m.

The Chinese Remainder Theorem assures us that the system of congruences

\begin{align*} x &\equiv a\pmod p \\ x &\equiv b\pmod q \end{align*}

has a unique solution mod m, but the theorem doesn’t say how to compute x efficiently.

H. L. Garner developed an algorithm to directly compute x [1]:

x = p \biggl(\big(\,(a-b)(q^{-1}\bmod p\,)\big)\bmod p\biggr) + b

You compute the inverse of q mod p once and save it, then solving the system above for multiple values of a and b is very efficient.

Garner’s algorithm extends to more than two factors. We will present the general case of his algorithm below, but first we do a concrete example with RSA keys.

RSA key example

This is a continuation of the example at the bottom of this post.

This shows that the numbers in the key file besides those that are strictly necessary for the RSA algorithm are numbers needed for Garner’s algorithm.

What the key file calls “coefficient” is the inverse of q modulo p.

What the key file calls “exponent1” is the the decryption exponent d reduced mod p-1. Similarly, “exponent2” is d reduced mod q-1 as explained here.

    from sympy import lcm

    prime1 = 0xf33514...d9
    prime2 = 0xfee496...51

    publicExponent = 65537
    privateExponent = 0x03896d...91

    coefficient = 0x4d5a4c...b7 # q^-1 mod p
    assert(coefficient*prime2 % prime1 == 1)

    exponent1 = 0x37cc69...a1 # e % phi(p)
    exponent2 = 0x2aa06f...01 # e % phi(q)
    assert(privateExponent % (prime1 - 1) == exponent1)
    assert(privateExponent % (prime2 - 1) == exponent2)

More factors

Garner’s algorithm can be used more generally when m is the product of more than two primes [2]. Suppose

m = \prod_{i=1}^n m_i

where the mi are pairwise relatively prime (not necessarily prime). Then the system of congruences

x \equiv x_i \pmod {m_i}

for i = 1, 2, 3, …, n can be solved by looking for a solution of the form

x = v_1 + v_2 m_1 + v_3 m_1m_2 + \cdots + v_n \prod_{i=1}^{n-1} m_i


v_k \equiv \bigg( x_k - \big(v_1 + v_2m_1 + \cdots + v_{k-1} \prod_{i=1}^{k-2}m_i \big) \bigg) \bigg( \prod_{i=1}^{k-1} m_i\bigg)^{-1} \pmod {m_k}

Again, in practice the modular inverses of the products of the ms would be precomputed and cached.

Related posts

[1] Ferguson, Schneier, Kohno. Cryptography Engineering. Wiley. 2010.

[2] Geddes, Czapor, and Labahn. Algorithms for Computer Algebra. Kluwer Academic Publishers. 1992.

Gaining efficiency by working modulo factors

Suppose m is a large integer that you are able to factor. To keep things simple, suppose m = pq where p and q are distinct primes; everything in this post generalizes easily to the case of m having more than two factors.

You can carry out calculations mod m more efficiently by carrying out the same calculations mod p and mod q, then combining the results. We analyze m into its remainders by p and q, carry out our calculations, then synthesize the results to get back to a result mod m.

The Chinese Remainder Theorem (CRT) says that this synthesis is possible; Garner’s algorithm, the subject of the next post, shows how to compute the result promised by the CRT.

For example, if we want to multiply xy mod m, we can analyze x and y as follows.

\begin{align*} x &\equiv x_p \pmod p \\ x &\equiv x_q \pmod q \\ y &\equiv y_p \pmod p \\ y &\equiv y_q \pmod q \\ \end{align*}


\begin{align*} xy &\equiv x_py_p \pmod p \\ xy &\equiv x_qy_q \pmod q \\ \end{align*}

and by repeatedly multiplying x by itself we have

\begin{align*} x^e &\equiv x_p^e \pmod p \\ x^e &\equiv x_q^e \pmod q \\ \end{align*}

Now suppose p and q are 1024-bit primes, as they might be in an implementation of RSA encryption. We can carry out exponentiation mod p and mod q, using 1024-bit numbers, rather than working mod n with 2048-bit numbers.

Furthermore, we can apply Euler’s theorem (or the special case Fermat’s little theorem) to reduce the size of the exponents.

\begin{align*} x^e &\equiv x_p^e \equiv x_p^{e \bmod p-1}\pmod p \\ x^e &\equiv x_q^e \equiv x_q^{e \bmod q-1}\pmod q \end{align*}

Assuming again that p and q are 1024-bit numbers, and assuming e is a 2048-bit number, by working mod p and mod q we can use exponents that are 1024-bit numbers.

We still have to put our pieces back together to get the value of xe mod n, but that’s the subject of the next post.

The trick of working modulo factors is used to speed up RSA decryption. It cannot be used for encryption since p and q are secret.

The next post shows that is in fact used in implementing RSA, and that a key file contains the decryption exponent reduced mod p-1 and mod q-1.

Can the chi squared test detect fake primes?

Jeweler examining gemstones

This morning I wrote about Dan Piponi’s fake prime function. This evening I thought about it again and wondered whether the chi-squared test could tell the difference between the distribution of digits in real primes and fake primes.

If the distributions were obviously different, this would be apparent from looking at histograms. When distributions are more similar, visual inspection is not as reliable as a test like chi-squared. For small samples, we can be overly critical of plausible variations. For large samples, statistical tests can detect differences our eyes cannot.

When data fall into a number of buckets, with a moderate number of items expected to fall in each bucket, the chi squared test attempts to measure how the actual counts in each bucket compare to the expected counts.

This is a two-sided test. For example, suppose you expect 12% of items to fall in bucket A and 88% to fall in bucket B. Now suppose you test 1,000 items. It would be suspicious if only 50 items landed in bucket A since we’d expect around 120. On the other hand, getting exactly 120 items would be suspicious too. Getting exactly the expected number is unexpected!

Let’s look a the primes, genuine and fake, less than N = 200. We’ll take the distribution of digits in the list of primes as the expected values and compare to the distribution of the digits in the fake primes.

When I ran this experiment, I got a chi-squared value of 7.77. This is an unremarkable value for a sample from a chi-squared distribution with 9 degrees of freedom. (There are ten digits, but only nine degrees of freedom because if you rule out nine possibilities then digit is determined with certainty.)

The p-value in this case, the probability of seeing a value as large as the one we saw or larger, is 0.557.

Next I increased N to 1,000 and ran the experiment again. Now I got a chi-squared value of 19.08, with a corresponding p-value of 0.024. When I set N to 10,000 I got a chi-squared value of 18.19, with a corresponding p-value of 0.033.

When I used N = 100,000 I got a chi-squared value of 130.26, corresponding to a p-value of 10-23. Finally, when I used N = 1,000,000 I got a chi-squared value of 984.7 and a p-value of 3.4 × 10-206.

In short, the chi-squared test needs a fair amount of data to tell that fake primes are fake. The distribution of digits for samples of fake primes less than a thousand or so is plausibly the same as that of actual primes, as far as the test can distinguish. But the chi-squared values get implausibly large for fake primes up to  100,000.

Fake primes

Someone asked on Math Overflow about the distribution of digits in primes. It seems 0 is the least common digit and 1 the most common digit.

Dan Piponi replies “this is probably just a combination of general properties of sets of numbers with a density similar to the primes and the fact that primes end in 1, 3, 7 or 9” and supports this by showing that “fake primes” have very similar digit distributions as actual primes. He generates the nth fake prime by starting with n log n and generating a nearby random integer ending in 1, 3, 7, or 9.

It seems like this fake prime function could be useful for studying more questions. Here is Dan Piponi’s Mathematica implementation:

    fakePrime[n_] :=
      {m = n Log[n]},
      10 RandomInteger[{Floor[0.09 m], Floor[0.11 m]}] + 
         RandomChoice[{1, 3, 7, 9}]

Twin stars and twin primes

Are there more twin stars or twin primes?

If the twin prime conjecture is true, there are an infinite number of twin primes, and that would settle the question.

We don’t know whether there are infinitely many twin primes, and it’s a little challenging to find any results on how many twin primes we’re sure exist.

According to OEIS, we know there are 808,675,888,577,436 pairs of twin primes less than 1018.

There are perhaps 1024 stars in the observable universe. If so, there are certainly less than 1024 pairs of binary stars in the observable universe. In our galaxy about 2/3 of stars are isolated and about 1/3 come in pairs (or larger clusters). If that holds in every galaxy, then the number of binary stars is within an order of magnitude of the total number of stars.

Do we know for certain there are at least 1024 twin primes? It doesn’t seem anybody is interested in that question. There is more interest in finding larger and larger twin primes. The largest pair currently know have 388,342 digits.

The Hardy-Littlewood conjecture speculates that π2(x), the number of twin prime pairs less than x, is asymptotically equal to the following

\pi_2(x) \sim C_2 \int_2^x \frac{dt}{(\log t)^2}

where C2 is a constant, the product of (1 − 1/(p − 1)²) over all odd primes. Numerically C2 = 0.6601618….

When x = 1018 the right hand side of the Hardy Littlewood conjecture agrees with the actual number to at least six decimal places. If the integral gives an estimate of π2(x) within an order of magntude of being correct for x up to 1028, then there are more twin primes than twin stars.

It’s interesting that our knowledge of both twin stars and twin primes is empirical, though in different ways. We haven’t counted the number of stars in the universe or, as far as I know, the number of twin primes less than 1028, but we have evidence that gives us reason to believe estimates of both.

Floors and roots

I recently stumbled across the identity

\left\lfloor \sqrt{n} + \sqrt{n+1} + \sqrt{n+2}\right\rfloor = \left\lfloor \sqrt{9x+8}\right\rfloor

while reading [1]. Here ⌊x⌋ is the floor of x, the largest integer not larger than x.

My first thought was that this was hard to believe. Although the floor function is very simple, its interactions with other functions tends to be complicated. I was surprised that such a simple equation was true.

My second thought was that the equation makes sense. For large n the three terms on the left are roughly equal, and so

\sqrt{n} + \sqrt{n+1} + \sqrt{n+2} \approx 3 \sqrt{n+1} = \sqrt{9n + 9}

Not only that, the approximation gets better as n gets larger. So the theorem is at least plausible.

My third thought was that there is something subtle going on here. Why the 8 on the right hand side? It turns out the theorem is false if you replace the 8 with a 9. Equality fails to hold for n = 0, 3, 8, 15, …

There is a difference between saying xy and saying ⌊x⌋ = ⌊y⌋. The former makes the latter plausible, but it’s not the same. If x and y are very close, but on opposite sides of an integer, then ⌊x⌋ ≠ ⌊y⌋.

In fact, the approximation

\sqrt{n} + \sqrt{n+1} + \sqrt{n+2} \approx \sqrt{9n+a}

is better when a = 9 than when a = 8. Here’s a plot of the approximation errors.

So there’s something clever going on with the selection a = 8.

According to [1], the equation at the top was proposed as a problem in 1988 and a solution was published in 2005. The time span between the proposed theorem and its proof suggests that the proof isn’t trivial, though I have not yet read it.

This is an obscure problem, and so we can’t assume that hundreds of mathematicians were feverishly trying to find a solution for 17 years. On the other hand, if there were a trivial proof it seems someone would have posted it quickly.

[1] Prapanpong Pongsriiam. Analytic Number Theory for Beginners, 2nd edition.. American Mathematical Society 2023

Numbers don’t typically have many prime factors

Suppose you select a 100-digit number at random. How many distinct prime factors do you think it would have?

The answer is smaller than you might think: most likely between 5 and 6.

The function ω(n) returns the number of distinct prime factors of n [1]. A theorem of Hardy and Ramanujan says that as N goes to infinity, the average value of ω(n) for positive integers up to N is log log N.

Since log log 10100 = 5.43…, we’d expect 100-digit numbers to have between 5 and 6 distinct factors.

The above calculation gives the average number of distinct prime factors for all numbers with up to 100 digits. But if we redid the calculation looking at numbers between 1099 and 10100 it would only make a difference in the third decimal place.

Let’s look at a much smaller example where we can tally the values of ω(n), numbers from 2 to 1,000,000.

Most numbers with up to six digits have two or three distinct prime factors, which is consistent with log log 106 ≈ 2.6.

There are 2,285 six-digit numbers that have six distinct prime factors. Because this is out of a million numbers, it corresponds to a bar in the graph above that is barely visible.

There are 8 six-digit number with 7 distinct prime factors and none with more factors.

Hardy’s theorem with Ramanujan established the mean of ω(n) for large numbers. The Erdős–Kac theorem goes further and says roughly that ω(n) is normally distributed with mean and variance log log n. More on this here.

So returning to our example of 100-digit numbers, the Hardy-Ramanujan theorem implies these numbers would have between 5 and 6 prime factors on average, and the Erdős–Kac theorem implies that about 95% of such numbers have 10 or fewer distinct prime factors. The maximum number of distinct prime factors is 54.

Related posts

[1] SymPy has a function primeomega, but it does not compute ω(n). Instead, it computes Ω(n), the number of prime factors of n counted with multiplicity. So, for example, ω(12) = 2 but Ω(12) = 3. The SymPy function to compute ω(n) is called primenu.

Leading digits of primes

How are the first digits of primes distributed? Do some digits appear as first digits of primes more often that others? How should we even frame the problem?

There are an infinite number of primes that begin with each digit, so the cardinalities of the sets of primes beginning with each digit are the same. But cardinality isn’t the right tool for the job.

The customary way to (try to) approach problems like this is to pose a question for integers up to N and then let N got to infinity. This is called natural density. But this limit doesn’t always converge, and I don’t believe it converges in our case.

Logarithmic density

An alternative is to compute logarithmic density. This measure exists in cases where natural density does not. And when natural density does exist, logarithmic density may give the same result [1].

The logarithmic density of a sequence A is defined by

\lim_{N\to\infty}\frac{1}{\log N}\sum_{x \in A, \, x \leq N} \frac{1}{x}

We want to know about the relative logarithmic density of primes that start with 1, for example. The relative density of a sequence A relative to a sequence B is defined by

\lim_{N\to\infty} \frac{\sum_{x \in A, \, x \leq N} \frac{1}{x}} {\sum_{x \in B, \, x \leq N} \frac{1}{x}}

We’d like to know about the logarithmic density of primes that start with 1, 2, 3, …, 9 relative to all primes. The exact value is known [2], and we will get to that shortly, but first let’s try to anticipate the result by empirical calculation.

We want to look at the first million primes, and we could do that by calling the function prime with the integers 1 through 1,000,000. But it is much more efficient to as SymPy to find the millionth prime and then find all primes up to that prime.

    from sympy import prime, primerange
    import numpy as np

    first_digit = lambda n: int(str(n)[0])

    density = np.zeros(10)
    N = 1_000_000
    for x in primerange(prime(N+1)):
        density[0] += 1/x
        density[first_digit(x)] += 1/x
    density /= density[0]

Here’s a histogram of the results.

Clearly the distribution is uneven, and in general more primes begin with small digits than large digits. But the distribution is somewhat irregular. That’s how things work with primes: they act something like random numbers, except when they don’t. This sort of amphibious nature, between regularity and irregularity, makes primes interesting.

The limits defining the logarithmic relative density do exist, though the fact that even a million terms doesn’t produce a smooth histogram suggests the convergence is not too rapid.

In the limit we get that the density of primes with first digit d is

\log_{10}\left(1 + \frac{1}{d}\right)

The densities we’d see in the limit are plotted as follows.

This is exactly the density given by Benford’s law. However, this is not exactly Benford’s law because we are using a different density, logarithmic relative density rather than natural density [3].

Related posts

[1] The Davenport–Erdős theorem says that for some kinds of sequences, if the natural density exists, logarithmic density also exists and will give the same result.

[2] R. E. Whitney. Initial Digits for the Sequence of Primes. The American Mathematical Monthly, Vol. 79, No. 2 (Feb., 1972), pp. 150–152

[3] R. L. Duncan showed that the leading digits of integers satisfy the logarithmic density version of Benford’s law.