Fortunes and Geometric Means

I saw a post on X recently that said

Bill Gates is closer to you in wealth than he is to Elon Musk. Mind blown.

For round numbers, let’s say Elon Musk’s net worth is 800 billion and Bill Gates’ net worth is 100 billion. So if your net worth is less than 450 billion, the statement in the post is true.

The reason the statement above is mind blowing is that in this context you naturally think on a logarithmic scale, even if you don’t know what a logarithm is.

Or to put it another way, we think in terms of orders of magnitude. Musk’s net worth is an order of magnitude greater than Gates’, and Gates’ net worth would be an order of magnitude greater than that of someone worth 10 billion. Musk is a notch above Gates, and Gates is a notch above someone with a net worth around 10 billion, where a “notch” is an order of magnitude.

To put it another way, we think in terms of geometric mean √(ab) rather than arithmetic mean (a + b)/2 in this context. 100 billion is the geometric mean between 12.5 billion and 800 billion. Geometric mean corresponds to the arithmetic mean on a log scale. And on this scale, Gates is closer to Musk than you are, unless you’re worth more than 12.5 billion.
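
Here’s a quick check of that geometric mean in Python:

>>> from math import sqrt
>>> sqrt(12.5e9 * 800e9)
100000000000.0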

Here are three more examples of geometric means.

The size of Jupiter is about midway between that of earth and the sun; it’s the geometric mean. On a linear scale Jupiter is much closer to the size of the earth than the sun, but on a logarithmic scale it’s about in the middle. More on that here.

The tritone (augmented fourth) is half an octave. So, for example, an F# is in the middle between a C and the C an octave higher. Its frequency is the geometric mean of the frequencies of the two C’s. More here.

Finally, the human body is a middle-sized object in the universe. From Kevin Kelly:

Our body size is, weirdly, almost exactly in the middle of the size of the universe. The smallest things we know about are approximately 30 orders of magnitude smaller than we are, and the largest structures in the universe are about 30 orders of magnitude bigger.

Proving you know a product

There is a way to prove that you know two numbers a and b, and their product c = ab, without revealing a, b, or c. This isn’t very exciting without more context — maybe you know that 7 × 3 = 21 — but it’s a building block of more interesting zero knowledge proofs, such as proving that a cryptocurrency transaction is valid without revealing the amount of the transaction.

The proof mechanism requires an elliptic curve G and a pairing of G with itself. (More on pairings shortly.) It also requires a generator g of the group structure on G.

The prover takes the three secret numbers and multiplies the generator g by each, encrypting the numbers as ag, bg, and cg. When G is a large elliptic curve, say one with on the order of 2^256 points, then computing products like ag can be done quickly, but recovering a from g and ag is impractical. In a nutshell, multiplication is easy but division [1] is practically impossible [2].

The verifier receives ag, bg, and cg. How can he verify that ab = c without knowing a, b, or c? Here’s where pairings come in.

I go more into pairings here, but essentially a pairing is a mapping from two groups to a third group

e: G_1 \times G_2 \to G_T

such that

e(aP, bQ) = e(P, Q)^{ab}.

In our case G1 and G2 are both equal to the group G above, and the target group GT doesn’t matter for our discussion here. Also, P and Q will both be our generator g.

By the defining property of a pairing,

e(ag, bg) = e(g, g)^{ab}

and

e(cg, g) = e(g, g)^c.

So if ab = c, then e(ag, bg) and e(cg, g) will be equal, and the verifier can compute both sides from the values he received.
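
Here’s a rough sketch of the check in Python using the py_ecc library (my choice of library; the post doesn’t depend on any particular implementation). The discussion above uses a symmetric pairing with G1 = G2 = G, while py_ecc’s BN128 pairing is asymmetric, so in the sketch a multiplies the G1 generator while b and c multiply the G2 generator; the identity e(aP, bQ) = e(P, Q)^ab still makes the comparison work.

from py_ecc.bn128 import G1, G2, multiply, pairing

a, b = 7, 3
c = a * b

# what the prover publishes: scalar multiples of the generators
aP = multiply(G1, a)
bQ = multiply(G2, b)
cQ = multiply(G2, c)

# the verifier compares e(bQ, aP) with e(cQ, G1),
# i.e. e(G2, G1)^(ab) with e(G2, G1)^c, never seeing a, b, or c
print(pairing(bQ, aP) == pairing(cQ, G1))   # True when ab = c

The pairing computation in py_ecc is pure Python and slow, but it’s enough to see the identity at work.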

Related posts

[1] The literature will usually speak of discrete logarithms rather than division. The group structure on an elliptic curve is Abelian, and so it is usually written as addition. If you write the group operation as multiplication, then you’re taking logs rather than dividing. The multiplicative notation highlights the similarity to working in the multiplicative group modulo a large prime.

[2] The computation is theoretically possible but not possible in practice without spending enormous resources, or inventing a large scale quantum computer. This is the discrete logarithm assumption.

Mills ratio and tail thickness

The Mills ratio [1] is the ratio of the CCDF to the PDF. That is, for a random variable X, the Mills ratio at x is the complementary cumulative distribution function divided by the density function. If the density function of X is f, then

m(x) = \frac{\int_x^\infty f(t)\, dt}{f(x)}

The Mills ratio highlights an important difference between the Student t distribution and the normal distribution.

Introductory statistics classes will say things like “you can approximate a t distribution with a normal if it has more than 30 degrees of freedom.” That may be true, depending on the application. A t(30) distribution and a normal distribution behave similarly in the middle but not in the tails.

Mills ratio plot for t(30) vs normal

The Mills ratio for a t distribution with ν degrees of freedom is asymptotically x/ν, while the Mills ratio for a standard normal distribution is asymptotically 1/x. Note that increasing ν does make the Mills ratio smaller, but it still eventually grows linearly, whereas the Mills ratio of a normal distribution decays to zero like 1/x.

In general, the Mills ratio is a decreasing function for thin-tailed distributions and an increasing function for fat-tailed distributions. The exponential distribution is in the middle, with a constant Mills ratio.
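
Here’s a quick numerical illustration using SciPy. Out in the tail the Mills ratio of a t(30) distribution increases with x, while the Mills ratio of a standard normal decreases.

from scipy.stats import norm, t

nu = 30
for x in [5, 10, 20, 30]:
    mills_t = t.sf(x, nu) / t.pdf(x, nu)    # CCDF / PDF for t(30)
    mills_n = norm.sf(x) / norm.pdf(x)      # CCDF / PDF for standard normal
    print(x, mills_t, mills_n)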

Related posts

[1] Named after John P. Mills, so there’s no apostrophe before the s.

Prime gaps and Gapcoin

The previous post looked at tightly clustered primes. This post looks at the opposite, large gaps between primes.

Riecoin is a cryptocurrency that uses finding prime clusters as its proof of work task. Gapcoin uses finding prime gaps as its proof of work task.

There’s some nuance to defining prime gaps. It’s trivial to produce a gap of any size. For example, [n! + 2, n! + n] is an interval of n − 1 consecutive composite numbers, so it contains no primes. It is more interesting to find gaps that are large relative to the size of the endpoints. The merit of a gap is the ratio of the gap length to the log of the initial number in the interval.

To be specific, suppose p and q are consecutive primes. The length of the gap between them is defined to be q − p and the merit of that gap is (q − p) / log p. For large p, the average gap between primes near p is log p and so the merit function measures how large the gap is relative to what you would expect for the size of p.

The following code will compute the merit function.

>>> from sympy import nextprime, log, N
>>> merit = lambda p: (nextprime(p) - p)/log(p)

Gapcoin adjusts its mining difficulty by adjusting the minimum merit value the miner must search for. Gapcoin miners must find a prime p of the form

p = h × 2^a + b

where h is the SHA256 hash of the previous block in the blockchain and b < 2^a.

The prime gap with the largest known merit is [p, p + 8350] where

p = 293703234068022590158723766104419463425709075574811762098588798217895728858676728143227

The code

>>> N(merit(p))

shows that the merit is 41.94.

This record was found by the Gapcoin network. I don’t know the backstory, but I presume the mining task wasn’t to find a world record gap. Instead, the miner got lucky and found a much larger gap than necessary.

Related posts

Efficiently testing multiple primes at once

The previous post looked at a technique for inverting multiple integers mod m at the same time, using fewer compute cycles than inverting each integer individually. This post will do something analogous for prime chains, revisiting a post from a few days ago about testing prime chains.

A prime chain is a sequence of primes in which each is twice its predecessor, plus or minus 1. In a Cunningham chain of the first kind, it’s always plus, and in a Cunningham chain of the second kind, it’s always minus.

Primecoin is a cryptocurrency that uses finding prime chains as its proof-of-work (PoW) task. The miner has a choice of finding one of three kinds of prime chain: a Cunningham chain of the first or second kind, or a bi-twin chain. The length of the necessary chain varies over time to keep the difficulty relatively constant. Other PoW blockchains do something similar.

Some people say that Primecoin has miners search for primes for PoW. That’s not quite right. Miners have to find a chain of medium-sized primes rather than finding one big prime. This leads to more predictable compute times.

There is a way to test a candidate Cunningham chain of the second kind all at once. Henri Lifchitz gives his algorithm here. Given a sequence of numbers

n_0, n_1, n_2, …, n_k

where n_i = 2n_{i−1} − 1 for each i and n_0 = 1 mod 4, all the numbers in the sequence are probably prime if

2^{n_{k-1} - 1} = 1 \bmod n_0 n_1 n_2 \cdots n_k

For example, consider the chain

31029721, 62059441, 124118881

Note that 31029721 mod 4 = 1 and 31029721 = 2*15514861 − 1. The following code demonstrates that the numbers in the chain are probable primes because it prints 1.

n0 = 15514861
n1 = 2*n0 - 1
n2 = 2*n1 - 1
n3 = 2*n2 - 1
prod = n0*n1*n2*n3
print( pow(2, n2 - 1, prod) )

Next I wanted to try the algorithm on much larger numbers where its efficiency would be more apparent, as in the previous post. But when I did, the test returned a result other than 1 on a known Cunningham chain of the second kind. For example, when I change the first two lines of code above to

n1 = 49325406476*primorial(9811, False) + 1   # primorial imported from sympy
n0 = (n1 + 1) // 2

the code returns a large result. I verified that each of the numbers in the chain is prime using Sympy’s isprime function.

Usually a probable prime test can have false positives but never a false negative. I haven’t looked at Lifchitz’s method closely enough to tell whether it can have false negatives, but the code above suggests it can.

Tighter bounds in the prime number theorem

The most elementary form of the prime number theorem says that π(x), the number of prime numbers less than x, is asymptotically equal to x / log(x). That’s true, but a more accurate result says π(x) is asymptotically equal to li(x) where

\text{li}(x) = \int_0^x \frac{dt}{\log t}

Five years ago I wrote about a result that was new at the time, giving a bound on |π(x) − li(x)| for x > exp(2000). This morning I saw a result in a blog post by Terence Tao that says

\left| \pi(x) - \text{li}(x) \right| \leq 9.2211\, x\sqrt{\log(x)} \exp\left( -0.8476 \sqrt{\log(x)} \right)

for all x ≥ 2. The result comes from this paper.

The new bound has the same form as the bound from five years ago but with smaller constants.
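
Here’s a sketch of how you might compare the two sides numerically, using mpmath for the logarithmic integral and SymPy for π(x). For x as small as a million the bound is far from tight; it only becomes informative for astronomically large x. But the code shows how the pieces fit together.

from mpmath import li, log, sqrt, exp
from sympy import primepi

x = 10**6
error = abs(int(primepi(x)) - li(x))    # |pi(x) - li(x)|, about 130 at x = 10^6
bound = 9.2211 * x * sqrt(log(x)) * exp(-0.8476 * sqrt(log(x)))
print(error, bound)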

The middle binomial coefficient

The previous post contained an interesting observation:

\sqrt{52!} \approx 26! \, 2^{26}

Is it true more generally that

\sqrt{(2n)!} \approx n! \, 2^n

for large n? Sorta, but the approximation gets better if we add a correction factor.

If we square both sides of the approximation and move the factorials to one side, the question becomes whether

\frac{(2n)!}{(n!)^2} = \binom{2n}{n} \approx 4^n

Now the task becomes estimating the middle coefficient when we apply the binomial theorem to (x + y)^{2n}.

A better approximation for the middle binomial coefficient is

\binom{2n}{n} \approx \frac{4^n}{\sqrt{\pi n}}

Now the right hand side is the first term of an asymptotic series for the left. The ratio of the two sides goes to 1 as n → ∞.

We could prove the asymptotic result using Stirling’s approximation, but it’s more fun to use a probability argument.

Let X be a binomial random variable with distribution B(2n, 1/2). As n grows, X converges in distribution to a normal random variable with the same mean and variance, i.e. with μ = n and σ² = n/2. This says for large n,

\text{Prob}(X = n) = \binom{2n}{n} 4^{-n} \approx \frac{1}{\sqrt{2\pi\sigma^2}} = \frac{1}{\sqrt{\pi n}}
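
Here’s a quick numerical check with SciPy for n = 26: the exact probability that X equals its mean, the normal density at the mean, and the simplified expression 1/√(πn) all come out to roughly 0.11.

from scipy.stats import binom, norm
from numpy import sqrt, pi

n = 26
print(binom.pmf(n, 2*n, 0.5))               # exact Prob(X = n)
print(norm.pdf(n, loc=n, scale=sqrt(n/2)))  # normal density at the mean
print(1/sqrt(pi*n))                         # the simplified expression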

The argument above only gives the first term in the asymptotic series for the middle coefficient. If you want more terms in the series, you’ll need to use more terms in Stirling’s series. If we add a couple more terms we get

\binom{2n}{n} = \frac{4^n}{\sqrt{\pi n}} \left(1 - \frac{1}{8n} + \frac{1}{128n^2} + {\cal O}\left(\frac{1}{n^3}\right) \right)

Let’s see how much accuracy we get in estimating 52 choose 26.

from scipy.special import binom
from numpy import pi, sqrt

n = 26
exact = binom(2*n, n)
approx1 = 4**n/sqrt(pi*n)
approx2 = approx1*(1 - 1/(8*n))
approx3 = approx1*(1 - 1/(8*n) + 1/(128*n**2))

for a in [approx1, approx2, approx3]:
    print(exact/a)

This prints

0.9952041409266293
1.0000118903997048
1.0000002776131290

and so we see substantial improvement from each additional term. This isn’t always the case with asymptotic series. We’re guaranteed that for a fixed number of terms, the relative error goes to zero as n increases. For a fixed n, we do not necessarily get more accuracy by including more terms.

Related posts

Combining in-shuffles and out-shuffles

A few days ago I wrote two posts about perfect shuffles. Once you’ve cut a deck of cards in half, an in-shuffle lets a card from the top half fall first, and an out-shuffle lets a card from the bottom half fall first.

Suppose we have a deck of 52 cards. We said in the earlier posts that the order of an in-shuffle I is 52. That is, after 52 in-shuffles, a deck returns to its initial order. And the order of an out-shuffle O is 8.

We can think of I and O as generators of subgroups of order 52 and 8 respectively in the group S of all permutations of 52 cards. I was curious when I wrote the earlier posts how large the group generated by I and O together would be. Is it possible to reach all 52! permutations of the deck by some combination of applying I and O? If not, how many permutations can be generated?

I’ve since found the answer in [1] in a theorem by Diaconis, Graham, and Kantor. I don’t know who Kantor is, but it’s no surprise that a theorem on card shuffles would come from Persi Diaconis and Ron Graham. The theorem covers the case for decks of size N = 2n, which branches into different results depending on the size of n and the value of n mod 4.

For N = 52, the group generated by I and O has

26! × 2^26

elements.

On the one hand, that’s a big number, approximately 2.7 × 10^34. On the other hand, it’s quite small compared to 52! ≈ 8 × 10^67. So while there are a lot of permutations reachable by a combination of in-shuffles and out-shuffles, your chance of selecting such a permutation at random from the set of all permutations is vanishingly small.

To put it yet another way, the number of arrangements is on the order of the square root of 52!, a big number, but not big relative to 52!. (Does this pattern

√(52!) ≈ 26! × 2^26

generalize? See the next post.)

Not only does the theorem of Diaconis et al give the order of the group, it gives the group itself: the group of permutations generated by I and O is isomorphic to the group of symmetries of a 26-dimensional octahedron.
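
We can check the orders of the two shuffles, and of the group they generate, with SymPy’s permutation groups. This is my own sketch; the arrays below encode an out-shuffle and an in-shuffle of 52 cards as maps on positions 0 through 51, with position 0 the top of the deck.

from math import factorial
from sympy.combinatorics import Permutation, PermutationGroup

N = 52
# out-shuffle: position i goes to 2i mod 51, with the top and bottom cards fixed
out = Permutation([2*i % (N - 1) if i < N - 1 else i for i in range(N)])
# in-shuffle: position i goes to (2i + 1) mod 53
ins = Permutation([(2*i + 1) % (N + 1) for i in range(N)])

print(out.order())            # 8
print(ins.order())            # 52

G = PermutationGroup([out, ins])
print(G.order())              # order of the group generated by the two shuffles
print(factorial(26) * 2**26)  # the count given by the theorem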

[1] S. Brent Morris. Magic Tricks, Card Shuffling and Dynamic Computer Memories. MAA 1998.

Primecoin primality test

When I wrote about how Primecoin uses prime chains for proof of work, I left out a detail.

To mine a new Primecoin block, you have to find a prime chain of specified length that starts with a number that is a multiple of the block header hash. According to the Primecoin whitepaper

Another important property of proof-of-work for cryptocurrency is non-reusability. … To achieve this, the prime chain is linked to the block header hash by requiring that its origin be divisible by the block header hash.

So given a hash h, you have to find k such that kh is the origin of a prime chain, where “origin” is defined in footnote [2] here.

Strictly speaking, the primes in a Primecoin prime chain are probable primes. Someone verifying a Primecoin blockchain will be satisfied that the block is authentic if the necessary numbers are prime according to the test used by the Primecoin software. Whether the numbers are actually prime is irrelevant.

Probabilistic primality tests are much more efficient than deterministic tests, and most applications requiring primes, such as RSA encryption, actually use probable primes. If a number is rejected as a probable prime, it’s certainly not a prime. If it is accepted as a prime, it very likely is prime. More on probable primes here.

If you write your own Primecoin mining software using a different primality test, that’s fine as long as you actually find primes. But if at some point you find pseudoprimes, composite numbers that pass your primality test, then your block will be rejected unless your pseudoprimes also pass the Primecoin software’s primality test.

Primecoin’s primality test

So what is Primecoin’s (probable) primality test? We quote the Primecoin whitepaper:

The classical Fermat test of base 2 is used together with Euler-Lagrange-Lifchitz test to verify probable primality for the prime chains. Note we do not require strict primality proof during verification, as it would unnecessarily burden the efficiency of verification. Composite number that passes Fermat test is commonly known as pseudoprime. Since it is known by the works of Erdös and Pomerance that pseudoprimes of base 2 are much more rare than primes, it suffices to only verify probable primality.

Fermat’s test

The Fermat test with base b says to test whether a candidate number n satisfies

b^(n−1) = 1 mod n.

If n is prime, the equation above will hold. The converse is usually true: if the equation above holds, n is likely to be prime. There are exceptions, such as b = 2 and n = 561 = 3 × 11 × 17.
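
For example, 561 is composite, but it passes the Fermat test with base 2:

>>> from sympy import factorint
>>> pow(2, 560, 561)
1
>>> factorint(561)
{3: 1, 11: 1, 17: 1}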

Euler-Lagrange-Lifchitz test

But what is the Euler-Lagrange-Lifchitz test? The whitepaper links here, a page by Henri Lifchitz that gives results generalizing those of Leonard Euler and Joseph-Louis Lagrange.

Testing 2p + 1

Suppose p is an odd prime. Then the Euler-Lagrange test says that if p = 3 mod 4, then q = 2p + 1 is also prime if and only if

2^p = 1 mod q.

Lifchitz gives three variations on this result. First, if p = 1 mod 4, then q = 2p + 1 is also prime if and only if

2^p = −1 mod q.

Together these two theorems give a way to test with certainty whether 2p + 1, i.e. the next term in a Cunningham chain of the first kind, is prime. However, the test is still probabilistic if we don’t know for certain that p is prime.

Testing 2p − 1

The last two results of Lifchitz are as follows. If p and q = 2p − 1 are prime, then

2^(p−1) = 1 mod q

if p = 1 mod 4 and

2^(p−1) = −1 mod q

if p = 3 mod 4.

The converses of these two statements give probable prime tests for q if we know p is prime.
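
Here’s a sketch of what these tests look like in code. The function names are mine, and each function assumes its argument p is already known (or believed) to be prime.

from sympy import isprime

def next_in_first_kind(p):
    # Is q = 2p + 1 prime? (Euler-Lagrange-Lifchitz, assuming p is prime)
    q = 2*p + 1
    target = 1 if p % 4 == 3 else q - 1   # q - 1 represents -1 mod q
    return pow(2, p, q) == target

def next_in_second_kind(p):
    # Is q = 2p - 1 prime? (Lifchitz, assuming p is prime)
    q = 2*p - 1
    target = 1 if p % 4 == 1 else q - 1
    return pow(2, p - 1, q) == target

# Walk up the Cunningham chain of the first kind starting at 89:
# 89, 179, 359, 719, 1439, 2879
p = 89
while next_in_first_kind(p):
    p = 2*p + 1
print(p)                  # 2879, the last prime in the chain
print(isprime(2*p + 1))   # False: 5759 = 13 * 443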

So we can verify a (probable) prime chain by using Fermat’s test to verify that the first element is (probably) prime and using the Euler-Lagrange-Lifchitz tests to verify that the rest of the numbers in the chain are (probably) prime.

Related posts

Bi-twin prime chains

I mentioned bi-twin prime chains in the previous post, but didn’t say much about them so as not to interrupt the flow of the article.

A pair of prime numbers are called twins if they differ by 2. For example, 17 and 19 are twin primes.

A bi-twin chain is a sequence of twin primes in which the average of each twin pair is double that of the preceding pair. For example,

5, 7, 11, 13

is a bi-twin chain because (5, 7) and (11, 13) are twin pairs of primes, and the average of the latter pair is twice the average of the former pair.

There is some variation in how to describe how long a bi-twin chain is. We will define the length of a bi-twin chain to be the number of prime pairs it contains, and so the chain above has length 2. The BiTwin Record page describes a bi-twin chain of length k as a chain with k − 1 links.

It is widely believed that there are infinitely many twin primes, but this has not been proven. So we don’t know for certain whether there are infinitely many bi-twin prime chains of length 1, much less chains of longer length.

The bi-twin chain of length three with smallest beginning number is

211049, 211051, 422099, 422101, 844199, 844201

Bi-twin records are expressed in terms of primorial numbers. I first mentioned primorial numbers a few days ago and now they come up again. Just as n factorial is the product of the positive integers up to n, n primorial is the product of the primes up to n. The primorial of n is written n#. [1]

The longest known bi-twin chain starts with

112511682470782472978099, 112511682470782472978101

and has length 9.

We can verify the chains mentioned above with the following Python code.

from sympy import isprime

def bitwin_length(average):
    n = average
    c = 0
    while isprime(n-1) and isprime(n+1):
        c += 1
        n *= 2
    return c

for n in [6, 211050, 112511682470782472978100]:
    print(bitwin_length(n))

[1] There are two conventions for defining n#. The definition used here, and on the BiTwin Record page, is the product of primes up to n. Another convention is to define n# to be the product of the first n primes.

The primorial function in Sympy takes two arguments and will compute either definition, depending on whether the second argument is True or False. The default second argument is True, in which case primorial(n) returns the product of the first n primes.
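
Here’s a quick illustration of the two conventions:

>>> from sympy import primorial
>>> primorial(5)         # product of the first 5 primes: 2*3*5*7*11
2310
>>> primorial(5, False)  # product of the primes up to 5: 2*3*5
30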