Too clever Monte Carlo

One way to find the volume of a sphere would be to imagine the sphere in a box, randomly select points in the box, and count how many of these points fall inside the sphere. In principle this would work in any dimension.
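Here is a minimal sketch of that idea in Python, using only the standard library. The function name `mc_ball_volume` and its parameters are made up for illustration; the box is taken to be [−1, 1]^n, so its volume is 2^n.

```python
import random

def mc_ball_volume(n, num_samples, seed=0):
    """Estimate the volume of the unit ball in n dimensions by sampling [-1, 1]^n."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = 0
    for _ in range(num_samples):
        # A point is inside the unit ball when its squared coordinates sum to at most 1.
        if sum(rng.uniform(-1.0, 1.0) ** 2 for _ in range(n)) <= 1.0:
            hits += 1
    # Fraction of hits times the volume of the enclosing box.
    return (2.0 ** n) * hits / num_samples

if __name__ == "__main__":
    # In two dimensions the "volume" is the area of the unit disk, pi.
    print(mc_ball_volume(2, 100_000))
```

In low dimensions this works well: with n = 2 the estimate lands close to π.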

The problem with naive Monte Carlo

We could write a program to estimate the volume of a high-dimensional sphere this way. But there’s a problem: very few random samples will fall in the sphere. The ratio of the volume of a sphere to the volume of the smallest box that contains it goes to zero as the dimension increases. We might take a large number of samples and find that none of them fall inside the sphere. In this case we’d estimate the volume as zero. This estimate would have small absolute error, but 100% relative error.
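To get a feel for how quickly the ratio collapses, here is a small sketch that uses the standard closed form for the volume of the unit n-ball, π^(n/2) / Γ(n/2 + 1), divided by the box volume 2^n. The helper names are my own, and `math.gamma` overflows for large arguments, so this version is only meant for moderate n.

```python
import math

def ball_to_box_ratio(n):
    """Exact volume of the unit n-ball divided by the volume of [-1, 1]^n."""
    ball = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return ball / 2.0 ** n

def prob_no_hits(n, num_samples):
    """Probability that none of num_samples independent uniform points land in the ball."""
    return (1.0 - ball_to_box_ratio(n)) ** num_samples

if __name__ == "__main__":
    for n in (2, 5, 10, 20, 30):
        print(n, ball_to_box_ratio(n), prob_no_hits(n, 10**6))
```

By n = 30 a million samples will almost certainly produce no hits at all.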

A more clever approach

So instead of actually writing a program to randomly sample a high-dimensional cube, let’s imagine that we did. Instead of doing a big Monte Carlo study, we could be clever and use theory.

Let n be our dimension. We want to draw uniform random samples from [−1, 1]^n and see whether they land inside the unit sphere. So we’d draw n random samples from [−1, 1] and see whether the sum of their squares is less than or equal to 1.

Let the Xi be independent uniform random variables on [−1, 1]. We want to know the probability that

X1² + X2² + X3² + … + Xn² ≤ 1.

This would be an ugly calculation, but since we’re primarily interested in the case of large n, we can approximate the sum using the central limit theorem (CLT). We can show, using the transformation theorem, that each Xi² has mean 1/3 and variance 4/45. The CLT says that the sum has approximately the distribution of a normal random variable with mean n/3 and variance 4n/45.
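For readers who want to check the moments, one way to get them is to integrate directly against the uniform density 1/2 on [−1, 1]. The notation S_n for the sum and Φ for the standard normal CDF is introduced here for convenience, and the variance of the sum relies on the samples being independent.

```latex
\begin{align*}
E[X_i^2] &= \int_{-1}^{1} x^2 \cdot \tfrac{1}{2}\, dx = \frac{1}{3}, \\
E[X_i^4] &= \int_{-1}^{1} x^4 \cdot \tfrac{1}{2}\, dx = \frac{1}{5}, \\
\operatorname{Var}(X_i^2) &= E[X_i^4] - \left(E[X_i^2]\right)^2
  = \frac{1}{5} - \frac{1}{9} = \frac{4}{45}, \\
S_n = \sum_{i=1}^{n} X_i^2 &: \quad E[S_n] = \frac{n}{3}, \qquad
  \operatorname{Var}(S_n) = \frac{4n}{45}, \\
P(S_n \le 1) &\approx \Phi\!\left( \frac{1 - n/3}{\sqrt{4n/45}} \right).
\end{align*}
```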

Too clever by half

The approach above turns out to be a bad idea, though it’s not obvious why.

The CLT does provide a good approximation of the sum above, near the mean. But we have a sum with mean n/3, with n large, and we’re asking for the probability that the sum is less than 1. In other words, we’re asking for a probability far out in the tail, where the CLT approximation has poor relative accuracy. More on this here.
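A quick numerical sketch makes the failure concrete. It compares the normal approximation to the exact fraction of the cube occupied by the ball, computed from the standard formula π^(n/2) / Γ(n/2 + 1) for the unit ball's volume. The function names are hypothetical, and logarithms are used only to avoid overflow in the gamma function.

```python
import math

def exact_fraction(n):
    """Exact probability that a uniform point in [-1, 1]^n lies in the unit ball."""
    log_frac = (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1) - n * math.log(2)
    return math.exp(log_frac)

def clt_fraction(n):
    """Normal approximation to P(X1^2 + ... + Xn^2 <= 1), mean n/3, variance 4n/45."""
    mean = n / 3
    sd = math.sqrt(4 * n / 45)
    z = (1 - mean) / sd
    # Standard normal CDF written in terms of the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2))

if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        exact = exact_fraction(n)
        approx = clt_fraction(n)
        print(f"n={n:3d}  exact={exact:.3e}  clt={approx:.3e}  ratio={approx / exact:.3e}")
```

For small n the two numbers are in the same ballpark, but the ratio grows rapidly: by n = 20 the CLT figure is already hundreds of times too large, even though both numbers are tiny in absolute terms.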

This post turned out to not be about what I thought it would be about. I thought this post would lead to an asymptotic approximation for the volume of an n-dimensional sphere. I would compare the approximation to the exact value and see how well it did. Except it did terribly. So instead, this post is a cautionary tale about remembering how convergence works in the CLT.