When Benford’s law is exact

Eight years ago I wrote a blog post on the Leading digits of powers of 2. In that post I said that Benford’s law gives remarkably good predictions for the distribution of leading digits of 2^n. In this post I’ll make that statement stronger.

A sequence obeys Benford’s law, in base 10, if the number of elements among the first N elements of the sequence that begin with the digit d is asymptotically

N log10(1 + 1/d).

Benford’s law applies to powers of 2, but also to factorials, for example. In general it applies only in the limit as N goes to infinity. For a large but finite value of N it gives a good approximation, but there’s no guarantee on how good the approximation is.  Except sometimes there is.

You could use Benford’s law to estimate how many times n! begins with a 1, but it can tell you exactly how many times 2^n begins with a 1. The exact count is the estimate given by Benford’s law with d = 1, rounded down to the nearest integer, i.e. the floor of the expression above.

⌊N log10(2)⌋

See [1] for a proof. The following Python code demonstrates this.

    import decimal
    from math import log10, floor

    def exact(N):
        # count how many of 2^1, ..., 2^N begin with the digit 1
        c = 0
        a = decimal.Decimal(1)
        for i in range(1, N+1):
            a = a*2
            if str(a)[0] == '1':
                c += 1
        return c

    def estimated(N):
        # Benford's law estimate for d = 1: N log10(1 + 1/1) = N log10(2)
        return N*log10(2)

    for i in range(1, 10000):
        assert exact(i) == floor(estimated(i))

The code uses the decimal class for efficiency. See Andrew Dalke’s comment on the original post. [2]

Benford’s law predicts exactly how often the sequence 2n begins with a 1. It also predicts exactly how often it begins with 4, but it is not exact for other digits.


[1] Zhaodong Cai, Matthew Faust, A. J. Hildebrand, Junxian Li, Yuan Zhang. The Surprising Accuracy of Benford’s Law in Mathematics. The American Mathematical Monthly, Vol. 127, No. 3 (March 2020), pp. 217–237

[2] The function exact is efficient for individual values of N, but there is a lot of redundant calculation when you use it to test a range of N’s as we do in the loop with the assert statement. It wouldn’t be hard to rewrite the code to check a range of values more efficiently.
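For example, the whole range can be checked in a single pass by updating the count incrementally. This sketch is my rewrite, not code from the original post; it uses Python’s arbitrary-precision integers rather than decimal, which is fine at this scale.

```python
from math import floor, log10

def exact_counts(N):
    # counts[n-1] = how many of 2^1, ..., 2^n begin with the digit 1
    counts = []
    c = 0
    a = 1
    for _ in range(N):
        a *= 2
        if str(a)[0] == '1':
            c += 1
        counts.append(c)
    return counts

# The running count always equals the floor of the Benford estimate.
for n, c in enumerate(exact_counts(1000), start=1):
    assert c == floor(n * log10(2))
```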

The Buenos Aires constant

The Buenos Aires constant is 2.92005097731613…

What’s so special about this number? Let’s see what happens when we use it to initialize the following Python script.

    s = 2.920050977316134

    for _ in range(10):
        i = int(s)
        print(i)
        s = i*(1 + s - i)

What does this print?

    2, 3, 5, 7, 11, 13, 17, 19, 23, 29

If we started with the exact Buenos Aires constant and carried out our operations exactly, the procedure above would generate all prime numbers. In actual IEEE 754 arithmetic, it breaks down around Douglas Adams’ favorite number.

The Buenos Aires constant is defined by the infinite sum

\lambda = \frac{2-1}{1} + \frac{3-1}{2} + \frac{5-1}{3\cdot 2} + \frac{7-1}{5\cdot 3 \cdot 2} + \cdots

As you can tell, the primes are baked into the definition of λ, so the series can’t generate new primes.

I used mpmath to calculate λ to 100 decimal places:

2.920050977316134712092562917112019468002727899321426719772682533107733772127766124190178112317583742

This required carrying out the series defining λ for the first 56 primes. When I carried out the iteration above also to 100 decimal places, it failed on the 55th prime. So I got about as many primes out of the computation as I put into it.
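This give-one-get-one behavior is easy to reproduce exactly with the standard library’s fractions module. The following is a sketch of my own; the choice of 15 primes in and 10 iterations out is arbitrary, not the original computation.

```python
from fractions import Fraction

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

# Partial sum of the series defining lambda, computed exactly:
# (2-1)/1 + (3-1)/2 + (5-1)/(3*2) + (7-1)/(5*3*2) + ...
lam = Fraction(0)
denom = 1
for p in primes:
    lam += Fraction(p - 1, denom)
    denom *= p

# Run the iteration with exact rational arithmetic.
s = lam
recovered = []
for _ in range(10):
    i = int(s)
    recovered.append(i)
    s = i*(1 + s - i)

print(recovered)   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

With 15 primes baked into the truncated series, the first 10 come back out comfortably; push the iteration much further and the truncation error takes over, just as with the 100-digit computation described above.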


Reference: Beyond Pi and e: a Collection of Constants. James Grime, Kevin Knudson, Pamela Pierce, Ellen Veomett, Glen Whitney. Math Horizons, Vol. 29, No. 1 (September 2021), pp. 8–12

Sparse binary Pythagorean triples

I recently ran across an interesting family of Pythagorean triples [1].

\begin{align*} a &= 2^{4n} + 2^{2n+1} \\ b &= 2^{4n} - 2^{4n-2} - 2^{2n} - 1 \\ c &= 2^{4n} + 2^{4n-2} + 2^{2n} + 1 \\ \end{align*}

You can verify that a² + b² = c² for all n.
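A quick sketch of that verification, using exact integer arithmetic:

```python
# Verify a^2 + b^2 == c^2 for the first several values of n.
for n in range(1, 25):
    a = 2**(4*n) + 2**(2*n + 1)
    b = 2**(4*n) - 2**(4*n - 2) - 2**(2*n) - 1
    c = 2**(4*n) + 2**(4*n - 2) + 2**(2*n) + 1
    assert a*a + b*b == c*c
```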

Sparseness

When written in binary, a has only two bits set, and c has only four bits set.

It’s not as immediately obvious, but b has only two bits that are not set.

For example, here’s what we get writing the Pythagorean triple (a, b, c) in binary when n = 5.

(100000000100000000000,  10111111101111111111, 101000000010000000001)

In linear algebra, we say a matrix is sparse if most of its entries are zeros. The word “sparse” is used similarly in other areas, generally meaning something contains a lot of zeros. So in that sense a and c are sparse.

I suppose you could call b dense since its binary representation is a string of almost all 1s. Or you could say that it is sparse in that it has little variation in its symbols.

Aside from sparseness, these triples touch on a couple other ideas from the overlap of math and computer science.

One’s complement

The number b is nearly the one’s complement of c: if you flip every bit in c you obtain b − 1 (with a leading zero).

More precisely, one’s complement is defined relative to a number of bits N. The N-bit one’s complement of x equals 2^N − 1 − x.

In our case b − 1 is the (4n + 1)-bit one’s complement of c, which is to say b + c = 2^(4n+1). And because one’s complement is its own inverse, i.e. it is an involution, c is likewise the (4n + 1)-bit one’s complement of b − 1.

Run-length encoding

The binary representations of a, b, and c are highly compressible strings when n is large. Run-length encoding (RLE) represents each string compactly.

RLE simply describes a string by stating symbols and how many times each is repeated. So to compute the run-length encoding of

100000000100000000000,10111111101111111111,101000000010000000001

from the example above, you’d observe one 1, eight 0s, one 1, eleven zeros, etc.

There’s ambiguity in writing the RLE of a sequence of digits unless you somehow put the symbols and counts in different namespaces. For example, if we write 1180110 we intend this to be read as above, but someone could read it as 180 1s followed by 10 1s.

Let’s replace 0s with z (for zero) and 1s with u (for unit) so our string will not contain any digits.

uzzzzzzzzuzzzzzzzzzzz,uzuuuuuuuzuuuuuuuuuu,uzuzzzzzzzuzzzzzzzzzu

Then the RLE of the string is

uz8uz11,uzu7zu10,uzuz7uz9u

Here a missing count is implicitly 1. So uz8… is read as u, followed by z repeated 8 times, etc.

As n increases, the length of the binary string grows much faster than the length of the corresponding RLE.
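Here’s a small encoder for the u/z convention above, as a sketch. The run finding is done by itertools.groupby, and leaving a count of 1 implicit is my reading of the convention.

```python
from itertools import groupby

def rle(bits):
    # Run-length encode a binary string, writing u for 1 and z for 0,
    # with a count of 1 left implicit.
    pieces = []
    for symbol, run in groupby(bits):
        n = len(list(run))
        pieces.append(('u' if symbol == '1' else 'z') + (str(n) if n > 1 else ''))
    return ''.join(pieces)

print(rle('100000000100000000000'))   # uz8uz11
```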

Exercise for the reader: What is the RLE of the triple for general n?


[1] H. S. Uhler. A Colossal Primitive Pythagorean Triangle. The American Mathematical Monthly, Vol. 57, No. 5 (May, 1950), pp. 331–332.

RSA security in light of news

A recent article reported on the successful factoring of a 512-bit RSA key. The process took $8 worth of rented computing power. What does that say about the security of RSA encryption?

The following Python function estimates the number of computation steps required to factor a number b bits long using the best known algorithm, the general number field sieve. We’re going to take a ratio of two such estimates, so proportionality constants will cancel.

    from math import exp, log

    def f(b):
        logn = b*log(2)   # natural log of the modulus, a b-bit number
        return exp((8/3)**(2/3) * logn**(1/3) * log(logn)**(2/3))

The minimum recommended RSA key size is 2048 bits. The cost ratio for factoring a 2048-bit key to a 512-bit key is f(2048) / f(512), which is on the order of 10^16. So by this estimate, factoring a 2048-bit key would cost 80 quadrillion dollars.
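Plugging in the numbers (the function restated with its imports so it runs standalone):

```python
from math import exp, log

def f(b):
    # steps (up to a constant) to factor a b-bit number with GNFS
    logn = b*log(2)
    return exp((8/3)**(2/3) * logn**(1/3) * log(logn)**(2/3))

ratio = f(2048) / f(512)
print(f"{ratio:.2e}")   # on the order of 1e16
```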

This is sort of a back-of-the-envelope estimate. There are things it doesn’t take into account. If a sophisticated and highly determined entity wanted to factor a 2048-bit number, they could probably do so for less than 80 quadrillion dollars. But it’s safe to say that the people who factored the 512-bit key are unlikely to factor a 2048-bit key by the same approach.

Details of generating primes for cryptography

RSA public key cryptography begins by finding a couple large primes. You essentially do this by testing random numbers until you find primes, but not quite.

Filippo Valsorda just posted a good article on this.

Suppose you’re looking for a 1024-bit prime number. You generate random 1024-bit numbers and then test until you find one that’s prime. You can immediately make this process twice as fast by setting the last bit to 1: it would be wasteful to generate a new number every time you happened to draw an even number.

A little less obvious is that it’s common to set the top bit as well. When you’re generating a number between 0 and 2^1024 − 1, it’s possible that you could generate a small number. It’s extremely unlikely that you’d generate a very small number, like 59, but it’s not so unlikely that you’d generate a number on the order of 2^1020, for example. By setting the top bit, you know you’re generating a number between 2^1023 and 2^1024.

Most composite numbers have small factors, so you check for divisibility by 3, 5, 7 etc. before running more time-consuming tests. Probabilistic tests are far more efficient than deterministic tests, so in practice everyone uses probable primes in RSA. For details of how you apply these tests, and how many tests to run, see Filippo Valsorda’s article.
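To make the pipeline concrete — candidate with top and bottom bits set, trial division by small primes, then probabilistic testing — here is an illustrative sketch of my own, not Valsorda’s code; the round count and the small-prime list are arbitrary choices.

```python
import random

SMALL_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]

def miller_rabin(n, rounds=40):
    # Probabilistic primality test for odd n > 3.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False        # definitely composite
    return True                 # probably prime

def random_prime(bits):
    while True:
        n = random.getrandbits(bits)
        n |= (1 << (bits - 1)) | 1          # set top and bottom bits
        if any(n % p == 0 for p in SMALL_PRIMES):
            continue                        # cheap rejection first
        if miller_rabin(n):
            return n
```

Calling random_prime(1024) returns a probable prime with exactly 1024 bits.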

Should you be concerned about setting the top bit of prime candidates? There are naive and sophisticated reasons not to worry, and an intermediate reason to at least think about it.

The naive response is that you’re just losing one bit of randomness. How much could that hurt? But in other contexts, such as losing one bit of randomness in an AES key, people do worry about such losses.

The bits in the prime factors of an RSA modulus do not correspond directly to bits of security. A 2048-bit modulus, the product of two 1024-bit primes, has about 112 bits of security. (See NIST SP 800-57.) You could set several bits in your prime factors before losing a bit of security. If this bothers you, move up to a 3072-bit modulus rather than worrying that your 2048-bit modulus is in a sense a 2046-bit modulus.


Perfect numbers

A perfect number is a positive integer equal to the sum of its proper divisors, all divisors less than itself. The first three examples are as follows.

6 = 1 + 2 + 3
28 = 1 + 2 + 4 + 7 + 14
496 = 1 + 2 + 4 + 8 + 16 + 31 + 62 + 124 + 248

Even and odd perfect numbers

Every known perfect number is even. No one has proved that odd perfect numbers don’t exist, but people keep proving properties that an odd perfect number must have, and maybe some day this list of properties will contain a contradiction, proving that such numbers don’t exist. For example, an odd perfect number, if it exists, must have more than 1,500 digits.

Mersenne primes

A Mersenne prime is a prime number of the form 2^p − 1. Euclid proved that if M is a Mersenne prime, then M(M + 1)/2 is an even perfect number [1]. Leonhard Euler proved the converse of Euclid’s theorem two millennia later: all even perfect numbers have the form M(M + 1)/2 where M is a Mersenne prime.

There are currently 52 known Mersenne primes. The number of Mersenne primes is conjectured to be infinite, and the Mersenne primes discovered so far have roughly fit the projected distribution of such primes, so there is reason to suspect that there are infinitely many perfect numbers. There are at least 52.

Triangular numbers

By Euler’s theorem, all even perfect numbers have the form M(M + 1)/2, and so all even perfect numbers are triangular numbers:

P = 1 + 2 + 3 + … + M

Binary numbers

Even perfect numbers have the form 2^(p−1)(2^p − 1), and so all even perfect numbers, when written in binary, consist of p ones followed by p − 1 zeros.

For example,

496 = 31 × 32 / 2 = 2^4(2^5 − 1)

and 496 = 111110000 in binary: five ones followed by four zeros.
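Both the Euclid–Euler correspondence and the binary pattern are easy to check for the first few Mersenne exponents. A brute-force sketch, with a deliberately naive divisor sum:

```python
def sum_proper_divisors(n):
    return sum(i for i in range(1, n) if n % i == 0)

for p in [2, 3, 5, 7]:
    M = 2**p - 1                 # a Mersenne prime for these p
    P = M*(M + 1)//2             # the corresponding even perfect number
    assert sum_proper_divisors(P) == P
    assert bin(P)[2:] == '1'*p + '0'*(p - 1)   # p ones, p-1 zeros
```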


[1] Euclid didn’t use the term “Mersenne prime” because he lived nearly two millennia before Marin Mersenne, but he did prove that if 2^p − 1 is prime, then 2^(p−1)(2^p − 1) is perfect.

Sawtooth waves

I woke up around 3:00 this morning to some sort of alarm outside. It did not sound like a car alarm; it sounded like a sawtooth wave. The pattern was like a few Morse code O’s. Not SOS, or I would have gotten up to see if anyone needed help. Just O’s.

A sawtooth wave takes its name from the shape of its waveform: it looks like the edge of a saw. It also sounds a little jagged.

Sawtooth waves have come up several times here. For one thing, they have rich harmonics. Because the wave form is discontinuous, the Fourier coefficients decay to zero slowly. I wrote about that here. The post is about square waves and triangular waves, but sawtooth waves are very similar.
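You can see the slow decay numerically. For the sawtooth f(t) = t on (−π, π), the Fourier sine coefficients are b_k = 2(−1)^(k+1)/k, decaying like 1/k. Here’s a sketch that recovers them by numerical integration; the midpoint rule and the grid size are arbitrary choices.

```python
from math import pi, sin

def sine_coeff(k, n=20000):
    # b_k = (1/pi) * integral of t*sin(k*t) over (-pi, pi),
    # approximated by the midpoint rule on n subintervals
    h = 2*pi/n
    total = 0.0
    for j in range(n):
        t = -pi + (j + 0.5)*h
        total += t*sin(k*t)*h
    return total/pi

print(sine_coeff(1), sine_coeff(5))   # close to 2 and 0.4
```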

Here’s a post on oscillators with a sawtooth forcing function.

I took sawtooth functions in a different direction in this post that started with an exercise from Knuth’s TAOCP. This led me down a rabbit hole on replicative functions and multiplication theorems in different contexts.

If I remember correctly, the sound used for red alerts in Star Trek TOS started with a sawtooth wave. Early synthesizers had sawtooth generators because, as mentioned above, these waves are rich in overtones and can be manipulated to create interesting sounds such as the red alert sound.

New Mersenne prime found

Mersenne numbers have the form 2^p − 1. A Mersenne prime is a Mersenne number that is also a prime.

A new Mersenne prime discovery was announced today: 2^p − 1 is prime for p = 136279841. The size of the new Mersenne prime is consistent with what was predicted.

For many years now, the largest known prime has been a Mersenne prime. That is because there is a special algorithm for testing whether a Mersenne number is prime, the Lucas–Lehmer test. Because of this algorithm, Mersenne numbers can be tested for primality far more efficiently than arbitrary numbers of comparable size.

There are now 52 known Mersenne primes, but the number just announced may not be the 52nd Mersenne prime. It has been confirmed that 2^136279841 − 1 is prime, but it has not been confirmed that there are no Mersenne primes between the 51st known Mersenne prime and the number just announced. There could be gaps.

If you were to write the latest Mersenne prime in hexadecimal, it would be a 1 followed by 34,069,960 F’s.
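That hex pattern is easy to verify for small exponents: when p ≡ 1 (mod 4), the number 2^p − 1 in hexadecimal is a 1 followed by (p − 1)/4 F’s, and for p = 136279841 that count is 34,069,960. A sketch:

```python
# For p = 1 mod 4, the hex form of 2^p - 1 is 1 followed by (p-1)/4 F's.
for p in [5, 13, 17, 29, 61]:
    h = hex(2**p - 1)[2:].upper()
    assert h == '1' + 'F' * ((p - 1)//4)

assert (136279841 - 1) % 4 == 0
print((136279841 - 1)//4)   # 34069960
```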


Squares, triangles, and octal

I ran across the following theorem in Ross Honsberger’s book Mathematical Morsels:

Every odd square ends in 1 in base 8, and if you cut off the 1 you have a triangular number.

A number is an odd square if and only if it is the square of an odd number, so odd squares have the form (2n + 1)².

Both parts of the theorem above follow from the calculation

( (2n + 1)² − 1 ) / 8 = n(n + 1) / 2.

In fact, we can strengthen the theorem. Not only does writing the nth odd square in base 8 and chopping off the final digit give some triangular number, it gives the nth triangular number.
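A quick check of the strengthened statement, as a sketch using Python’s built-in oct:

```python
for n in range(1, 500):
    o = oct((2*n + 1)**2)[2:]                # octal digits of the n-th odd square
    assert o[-1] == '1'                      # ends in 1 in base 8
    assert int(o[:-1], 8) == n*(n + 1)//2    # chop the 1: n-th triangular number
```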

Average number of divisors

Let d(n) be the number of divisors of an integer n. For example, d(12) = 6 because 12 is divisible by 1, 2, 3, 4, 6, and 12.

The function d varies erratically as the following plot shows.

But if you take the running average of d

f(n) = (d(1) + d(2) + d(3) + … + d(n)) / n

then the result is remarkably smooth.

Not only that, the function f(n) is asymptotically equal to log(n).

Computing

Incidentally, directly computing f(n) for n = 1, 2, 3, …, N would be inefficient because most of the work in computing f(m) would be duplicated in computing f(m + 1). The inefficiency isn’t noticeable for small N but matters more as N increases. It would be more efficient to iteratively compute f(n) by

f(n + 1) = (n f(n) + d(n + 1)) / (n + 1).
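Here’s a sketch that computes the running average with a divisor-count sieve and compares it to log(n); the cutoff N = 1000 is an arbitrary choice.

```python
from math import log

N = 1000
d = [0]*(N + 1)
for i in range(1, N + 1):      # each i divides exactly its multiples
    for m in range(i, N + 1, i):
        d[m] += 1

f_N = sum(d[1:])/N             # running average of d at n = N
print(f_N, log(N))             # the two agree to within a small constant
```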

Asymptotics and citations

Several people have asked for a reference for the results above. I didn’t know of one when I wrote the post, but thanks to reader input I now know that the result is due to Dirichlet. He proved that

f(n) = log(n) + 2γ − 1 + o(1).

Here γ is the Euler-Mascheroni constant.

You can find Dirichlet’s 1849 paper (in German) here. You can also find the result in Tom Apostol’s book Introduction to Analytic Number Theory.
