Pick random numbers uniformly between 0 and 1, adding them as you go, and stop when you get a result bigger than 1. How many numbers would you expect to add together on average?

You need at least two samples, and often two are enough, but you might get any number, and those larger numbers will pull the expected value up.

Here’s a simulation program in Python.

from random import random from collections import Counter N = 1000000 c = Counter() for _ in range(N): x = 0 steps = 0 while x < 1: x += random() steps += 1 c[steps] += 1 print( sum([ k*c[k] for k in c.keys() ])/N )

When I ran it I got 2.718921. There’s a theoretical result first proved by W. Weissblum that the expected value is *e* = 2.71828… Our error was on the order of 1/√N, which is what we’d expect from the central limit theorem.

Now we can explore further in a couple directions. We could take a look at a the *distribution* of the number steps, not just its expected value. Printing `c`

shows us the raw data.

Counter({ 2: 499786, 3: 333175, 4: 125300, 5: 33466, 6: 6856, 7: 1213, 8: 172, 9: 29, 10: 3 })

And here’s a plot.

We could also generalize the problem by taking powers of the random numbers. Here’s what we get when we use exponents 1 through 20.

There’s a theoretical result that the expected number of steps is asymptotically equal to *cn* where

I computed *c* = 1.2494. The plot above shows that the dependence on the exponent *n* does look linear. The simulation results appear to be higher than the asymptotic prediction by a constant, but that’s consistent with the asymptotic prediction since relative to *n*, a constant goes away as *n* increases.

Reference for theoretical results: D. J. Newman and M. S. Klamkin. Expectations for Sums of Powers. The American Mathematical Monthly, Vol. 66, No. 1 (Jan., 1959), pp. 50-51

Nice!

To see why e is the answer, after n rolls, the probability that the sum is less than 1 is the hyper volume of the convex hull of 0 and n standard basis vectors, a Euclidean n-simplex.

This is also an easy to write down iterated integral of 1 — the bounds will be 0 to 1-x1-x2…-xi.

Calculating the cases n=2 and n=3 makes it easy to see the pattern. we get 1/2, 1/6. In general the Euclidean hyper volume is 1/n! .

Now we are almost done — a typical term in the expected value calculation is n x ( 1/(n-1)! – 1/(n!)) and this simplifies to 1/(n-2)!

Now we sum from n=2 to infinity to get e.

Another, more conceptual way to see why e is the answer:

Let x_1, x_2, … be the random numbers you generated, let y_k = x_1 + x_2 + … + x_k, and let z_k = y_k mod 1. That is, each y_k is the k-th partial sum of the x_n, and each z_k is the fractional part of y_k. Then the z_k are also independent, Uniform [0, 1] random variables. (To see this, just note that, for any given values of z_1, …, z_(k-1), the marginal distribution of z_k is still Uniform [0, 1].)

Furthermore, since the y_k are increasing in increments of at most 1, the fractional part of y_k decreases precisely when the integer part increases; that is, z_k floor(y_(k-1)). In particular, y_k < 1 iff z_1 <= z_2 <= … <= z_k.

But z_1, …, z_k are independent, identically distributed, continuous random variables, so they are almost surely distinct, and all orderings are equally likely. Hence, y_k < 1 with probability 1/k!, and taking the sum gives e.