In 1881, astronomer Simon Newcomb noticed something curious. The edges of the first pages in books of logarithms were dirty, and the pages became progressively cleaner toward the back. He inferred from this that people more often looked up the logarithms of numbers with small leading digits than with large leading digits. Why might this […]

This page is an index to articles on the site about Benford’s law. Benford’s law and probability distributions: Pareto distribution, Weibull distribution, Cauchy distribution. Benford’s law and number theory: Leading digits of powers of two, Gelfand’s question, Leading digits of factorials. Benford’s law and computer science: Collatz 3n+1 conjecture. Benford’s law and statistics: Benford’s law […]

The Pareto probability distribution has density f(x) = a / x^{a+1} for x ≥ 1, where a > 0 is a shape parameter. The Pareto distribution and the Pareto principle (i.e. the “80-20” rule) are named after the same person, the Italian economist Vilfredo Pareto. Samples from a Pareto distribution obey Benford’s law in the limit as the parameter a goes to […]
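One way to explore the leading digits of Pareto samples is by simulation. The sketch below assumes the standard density a / x^{a+1} on x ≥ 1 and draws samples by inverting the CDF, so that X = U^{-1/a} for U uniform on (0, 1]. The function name `pareto_leading_digit_freqs` and its parameters are hypothetical, chosen for illustration only. It works with log10(X) directly so that small values of a do not overflow.

```python
import math
import random


def pareto_leading_digit_freqs(a: float, n: int, seed: int = 0) -> dict[int, float]:
    """Estimate leading-digit frequencies of n Pareto(a) samples (hypothetical helper)."""
    rng = random.Random(seed)
    counts = {d: 0 for d in range(1, 10)}
    for _ in range(n):
        u = 1.0 - rng.random()        # uniform on (0, 1]
        log10_x = -math.log10(u) / a  # log10 of X = u**(-1/a), so X >= 1
        frac = log10_x % 1.0          # fractional part determines the leading digit
        digit = min(9, int(10.0 ** frac))
        counts[digit] += 1
    return {d: c / n for d, c in counts.items()}


if __name__ == "__main__":
    # Compare simulated frequencies with the Benford proportions log10(1 + 1/d).
    freqs = pareto_leading_digit_freqs(a=1.0, n=100_000)
    for d in range(1, 10):
        print(d, round(freqs[d], 4), round(math.log10(1 + 1 / d), 4))
```

Rerunning with different values of `a` shows how the agreement with the Benford proportions changes with the shape parameter.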

This is the third, and last, of a series of posts on Benford’s law, this time looking at a famous open problem in computer science, the 3n + 1 problem, also known as the Collatz conjecture. Start with a positive integer n. Compute 3n + 1 and divide by 2 repeatedly until you get an odd […]
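The step described above can be sketched in a few lines. The excerpt is cut off, so the loop in `collatz_odd_sequence` (repeat the step until reaching 1, with a step cap) is an assumption based on the usual statement of the Collatz conjecture, and both function names are hypothetical.

```python
def collatz_odd_step(n: int) -> int:
    """Compute 3n + 1, then divide by 2 until the result is odd."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    m = 3 * n + 1
    while m % 2 == 0:
        m //= 2
    return m


def collatz_odd_sequence(n: int, max_steps: int = 1000) -> list[int]:
    """Iterate the step until reaching 1 or hitting max_steps (assumed stopping rule)."""
    seq = [n]
    while seq[-1] != 1 and len(seq) <= max_steps:
        seq.append(collatz_odd_step(seq[-1]))
    return seq


if __name__ == "__main__":
    print(collatz_odd_sequence(7))
```

The `max_steps` cap is there because the conjecture is unproven, so termination is not guaranteed for every starting value.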

Introduction to Benford’s law
In 1881, Simon Newcomb noticed that the edges of the first pages in a book of logarithms were dirty while the edges of the later pages were clean. From this he concluded that people were far more likely to look up the logarithms of numbers with leading digit 1 than of […]

A while back I wrote about how the leading digits of factorials follow Benford’s law. That is, if you look at just the first digit of a sequence of factorials, they are not evenly distributed. Instead, 1’s are most popular, then 2’s, etc. Specifically, the proportion of factorials starting with n is roughly log_{10}(1 + 1/n). […]
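The claim is easy to check numerically. The following is a minimal sketch, not the code from the original post: it tallies the first digits of 1!, 2!, ..., N! and prints them next to the proportions log10(1 + 1/d). The function names and the choice N = 500 are assumptions made for illustration.

```python
import math
from collections import Counter


def leading_digit(m: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(m)[0])


def factorial_leading_digit_counts(count: int) -> Counter:
    """Tally the leading digits of 1!, 2!, ..., count!."""
    counts = Counter()
    f = 1
    for k in range(1, count + 1):
        f *= k
        counts[leading_digit(f)] += 1
    return counts


def benford(d: int) -> float:
    """Proportion of leading digit d predicted by Benford's law."""
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    N = 500  # illustrative sample size
    counts = factorial_leading_digit_counts(N)
    for d in range(1, 10):
        print(d, round(counts[d] / N, 4), round(benford(d), 4))
```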

Imagine you picked up a dictionary and found that the pages with A’s were dirty and the Z’s were clean. In between there was a gradual transition with the pages becoming cleaner as you progressed through the alphabet. You might conclude that people have been looking up a lot of words that begin with letters […]

According to this paper [1], the empirical distribution of real passwords follows a power law [2]. In the authors’ terms, a Zipf-like distribution. The frequency of the rth most common password is proportional to something like 1/r. More precisely, f_r = C r^{-s} where s is on the order of 1. The value of s that […]

Here’s a strange pseudorandom number generator I ran across recently in [1]. Starting with a positive integer n, create a sequence of bits as follows. Compute n! as a base 10 number. Cut off the trailing zeros. Replace digits 0 through 4 with 0, and the rest with 1. You’d want to use a fairly […]
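The three steps above translate almost directly into code. This is a minimal sketch of the procedure as described in the excerpt, with a hypothetical function name; it makes no claim about the quality of the resulting bits.

```python
import math


def factorial_bits(n: int) -> str:
    """Return the bit string derived from the decimal digits of n!."""
    # Write n! in base 10 and cut off the trailing zeros.
    digits = str(math.factorial(n)).rstrip("0")
    # Digits 0 through 4 become 0; digits 5 through 9 become 1.
    return "".join("0" if d < "5" else "1" for d in digits)


if __name__ == "__main__":
    print(factorial_bits(10))  # 10! = 3628800 -> "36288" -> "01011"
```

Recent Python versions cap integer-to-string conversion at 4300 digits by default, so large values of n may require raising that limit with `sys.set_int_max_str_digits`.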

Revealed preferences are the preferences we demonstrate by our actions. These may be different from our stated preferences. Even if we’re being candid, we may not be self-aware. One of the secrets to the success of Google’s PageRank algorithm is that it ranks based on revealed preferences: If someone links to a site, they’re implicitly […]