Suppose you take factorials of a lot of numbers and look at the leading digit of each result. You could argue that there’s no apparent reason that any digit would be more common than any other, so you’d expect each of the digits 1 through 9 would come up 1/9 of the time. Sounds plausible, but it’s wrong.

The leading digits of factorials follow Benford’s law as described in the previous post. In fact, factorials follow Benford’s law even better than physical constants do. Here’s a graph of the leading digits of the factorials of 1 through 500.

In the remainder of this post, I’ll explain why Benford’s law should apply to factorials, make an aside on statistics, and point out an interesting feature of the Python code used to generate the chart above.

## Why Benford’s law applies

Here’s a hand-waving explanation. One way to justify Benford’s law is to say that physical constants are uniformly distributed, but on a logarithmic scale. The same is true for factorials, and it’s easier to see why.

The leading digits of the logarithms depend on their logarithms in base 10. The gamma function extends the factorial function and it is log-convex. The logarithm of the gamma function is fairly flat (see plot here), and so the leading digits of the log-gamma function applied to integers are uniformly distributed on a logarithmic scale. (I’ve mixed logs base 10 and natural logs here, but that doesn’t matter. All logarithms are the same up to a multiplicative constant. So if a plot is nearly linear on a log10 scale, it’s nearly linear on a natural log scale.)

Update: Graham gives a link in the comments below to a paper proving that factorials satisfy Benford’s law exactly in the limit.

## Uniform on what scale?

This example brings up an important principle in statistics. Some say that if you don’t have a reason to assume anything else, use a uniform distribution. For example, some say that a uniform prior is the ideal uninformative prior for Bayesian statistics. But you have to ask “Uniform on what scale?” It turns out that the leading digits of physical constants and factorials are indeed uniformly distributed, but on a logarithmic scale.

## Python integers and floating point

I used nearly the same code to produce the chart above as I used in its counterpart in the previous post. However, one thing had to change: I couldn’t compute the leading digits of the factorials the same way. Python has extended precision integers, so I can compute 500! factorial without overflowing. Using floating point numbers, I could only go up to 170!. But when I used my previous code to find the leading digit, it first tried to apply `log10`

to an integer larger than the largest representable floating point number and failed. Converting numbers such as 500! to floating point numbers will overflow. (See Floating point numbers are a leaky abstraction.)

The solution was to find the leading digit using only integer operations.

def leading_digit_int(n): while n > 9: n = n/10 return n

This code works fine for numbers like 500! or even larger.

Kang: That’s deeper than I care to go into in a blog post. It’s pretty subtle. See Knuth’s TOACP linked to above.

Can you explain a little bit further on that physical constants are uniformly distributed on a logarithmic scale?

You can also find the leading digit by cheating:

`int(str(n)[0])`

This question was asked in a graduate Probability course I took; we ended up investigating it computationally, but had to resort to some trickery to handle large numbers efficiently—for instance, as the trailing zeros of n! don’t effect the leading digit of (n+1)!, you can drop them.

As it turns out, Persi Diaconis proved that the factorial sequence does indeed follow Benford’s Law in this paper.

Graham: Thanks for the link. Just in case that link goes away, here’s the reference.

“The distribution of leading digits and uniform distribution mod 1” by Persi Diaconis, Annals of Probability, 1977, Vol 5, No. 1, 72-81.

how about using int(str(n)[0]) for leading integer?

int(str(n)[0]) is about 5 times faster than the while loop method.

How about the final digits of factorials?

I’ve wondered the same about final digits of primes. Several years ago I drew a “ribbon” four cells wide, with cells representing final 1, final 3, final 7, final 9, and shaded in the cell if the decade’s 1-final, 3-final, 7-final, 9-final member is prime.

(This starts looking less cool as the numbers grow larger because they appear logarithmically often. But there’s probably a way to improve upon the visualisation that fixes, or even makes use of, this property.)

I didn’t know how to program at the time, but what you wrote would probably be a good starting template to investigate the pattern of final digits in primes.

The final digits of factorials are all zero for n >= 5.

I would expect that final digits of primes are evenly distributed between 1, 3, 7, and 9.

Oh, right. I should have said “The final digit before you get to all the zeroes”.

The final digits of primes are indeed uniform on 1,3, 7, 9. It is a consequence of Dirichlet’s theorem (that any arithmetic progression a + m x with a relatively prime to m has infinitely many primes and that all progressions have the same natural density with respect to prime numbers).

In the limit, it is uniformly distributed, but there is a bias towards the remainders that are not squares (so in this case, 3 and 7). It is called Chebyshev’s bias and there is a very nice paper by Sarnak and Rubinstein exploring this.

It’s not very economic to compute 500! exactly if you’re only interested in the first or last (nonzero) digits. For the first digits, an efficient approach could be to use floats but to divide out powers of ten from time to time (and/or multiply 1…9 and 1.0, 1.1, …, 9.9 and 1.00 , 1.01, …, 4.99, 5.00 ; i.e., whenever you switch to the next larger power of ten also multiply the “normalizer factor” by ten, and/or simply divide by 10^10 whenever the product gets larger than that). For the last n (nonzero) digits, make all computations mod 10^n and whenever one of x=1..500 is divisible by 5^e then remove these factors 5 and also a factor 2 (which are much more frequent) for each of them.