In 1881, astronomer Simon Newcomb noticed something curious. The edges of the first pages in books of logarithms were dirty, while the later pages were progressively cleaner. He inferred from this that people looked up the logarithms of numbers with small leading digits more often than those with large leading digits. Why might this […]

This page is an index to articles on the site about Benford’s law.

Benford’s law and probability distributions: Pareto distribution, Weibull distribution, Cauchy distribution

Benford’s law and number theory: Leading digits of powers of two, Gelfand’s question, Leading digits of factorials

Benford’s law and computer science: Collatz 3n+1 conjecture

Benford’s law and statistics: Benford’s law […]

The Pareto probability distribution has density f(x) = a / x^(a+1) for x ≥ 1, where a > 0 is a shape parameter. The Pareto distribution and the Pareto principle (i.e. the “80-20” rule) are named after the same person, the Italian economist Vilfredo Pareto. Samples from a Pareto distribution obey Benford’s law in the limit as the parameter a goes to […]

This is the third, and last, of a series of posts on Benford’s law, this time looking at a famous open problem in computer science, the 3n + 1 problem, also known as the Collatz conjecture. Start with a positive integer n. Compute 3n + 1 and divide by 2 repeatedly until you get an odd […]
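As a rough illustration of the step described in this excerpt, here is a minimal Python sketch. The function name `next_odd` is hypothetical, and since the excerpt is cut off, the code covers only the single step that is actually stated: compute 3n + 1, then divide by 2 until the result is odd.

```python
def next_odd(n: int) -> int:
    """One step of the process in the excerpt: 3n + 1, then strip factors of 2."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    m = 3 * n + 1
    # Divide by 2 repeatedly until the result is odd.
    while m % 2 == 0:
        m //= 2
    return m
```

For example, starting from 7 gives 22, which halves once to 11.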

Introduction to Benford’s law

In 1881, Simon Newcomb noticed that the edges of the first pages in a book of logarithms were dirty while the edges of the later pages were clean. From this he concluded that people were far more likely to look up the logarithms of numbers with leading digit 1 than of […]

A while back I wrote about how the leading digits of factorials follow Benford’s law. That is, if you look at just the first digit of a sequence of factorials, they are not evenly distributed. Instead, 1’s are most popular, then 2’s, etc. Specifically, the proportion of factorials starting with n is roughly log_10(1 + 1/n). […]
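The claim about factorials can be checked empirically. The following is a minimal Python sketch, not the code from the original post; the function names and the choice of the first 500 factorials are assumptions made for illustration.

```python
import math
from collections import Counter

def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])

def benford(d: int) -> float:
    """Proportion predicted by Benford's law for leading digit d (1 to 9)."""
    return math.log10(1 + 1 / d)

def factorial_leading_digit_freqs(count: int) -> dict:
    """Observed leading-digit proportions among 1!, 2!, ..., count!."""
    counts = Counter()
    f = 1
    for k in range(1, count + 1):
        f *= k  # f is now k!
        counts[leading_digit(f)] += 1
    return {d: counts[d] / count for d in range(1, 10)}

if __name__ == "__main__":
    freqs = factorial_leading_digit_freqs(500)
    for d in range(1, 10):
        print(d, round(freqs[d], 3), round(benford(d), 3))
```

Running it prints the observed proportion next to the predicted one for each leading digit, so the two columns can be compared by eye.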

Imagine you picked up a dictionary and found that the pages with A’s were dirty and the Z’s were clean. In between there was a gradual transition with the pages becoming cleaner as you progressed through the alphabet. You might conclude that people have been looking up a lot of words that begin with letters […]

According to this paper [1], the empirical distribution of real passwords follows a power law [2]. In the authors’ terms, a Zipf-like distribution. The frequency of the rth most common password is proportional to something like 1/r. More precisely, f_r = C r^(−s) where s is on the order of 1. The value of s that […]

I needed the inverse factorial function for my previous post. I was sure I’d written a post on computing the inverse factorial, and intended to reuse the code from that earlier post. But when I searched I couldn’t find anything, so I’m posting the code here for my future reference and for anyone else who […]

The latest episode of Erik Seligman’s podcast is entitled The Grim State of Modern Pizza. Although you might not realize it from the title, the post is about fraud detection. GRIM stands for Granularity-Related Inconsistency of Means. In a nutshell, the test looks for means (averages) that are not possible on number theoretic grounds. If […]
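To make the idea concrete, here is a minimal Python sketch of a GRIM-style check. It assumes the underlying data are integers and that the reported mean was rounded half up to the number of decimals shown; the function name `grim_consistent` and the rounding rule are assumptions for illustration, not details taken from the podcast or the post.

```python
from decimal import Decimal, ROUND_HALF_UP

def grim_consistent(reported_mean: str, n: int) -> bool:
    """Return True if some integer total over n items rounds to reported_mean."""
    # Number of decimal places in the mean as reported, e.g. "5.19" -> 2.
    decimals = len(reported_mean.split(".")[1]) if "." in reported_mean else 0
    step = Decimal(10) ** -decimals
    target = Decimal(reported_mean)
    approx_total = int(target * n)
    # Only integer totals near target * n can round to the reported mean.
    for total in range(approx_total - 2, approx_total + 3):
        mean = (Decimal(total) / n).quantize(step, rounding=ROUND_HALF_UP)
        if mean == target:
            return True
    return False
```

For instance, with 28 integer responses a mean of 5.19 is not attainable: a total of 145 gives 5.18 and a total of 146 gives 5.21, so `grim_consistent("5.19", 28)` returns `False`.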