Error in the normal approximation to the Poisson distribution

If X ~ Poisson(λ) with λ “large” then X is well approximated by a normal distribution. This page looks at this approximation in detail. For example, how large does λ have to be? For a fixed value of λ, how does the error vary? Where is it best and worst?

Central Limit Theorem

First we show why there is a normal approximation to the Poisson. If X1 and X2 are independent Poisson random variables with means λ1 and λ2 then X1 + X2 has a Poisson distribution with mean λ1 + λ2. This means that a Poisson random variable X with mean λ has the same distribution as the sum of N independent Poisson random variables Xi with mean λ/N. We can apply the Central Limit Theorem to this sum to show that X has approximately the same distribution as a normal random variable Y with mean λ and variance λ, the same mean and variance as X.
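
The additivity property can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names poisson_pmf and convolved_pmf and the means 3 and 7 are illustrative choices, not part of the original notes. It convolves two Poisson PMFs and compares the result with the PMF of a single Poisson variable with the summed mean.

```python
import math

def poisson_pmf(n, lam):
    """P(X = n) for X ~ Poisson(lam), computed in log space to avoid overflow."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def convolved_pmf(n, lam1, lam2):
    """P(X1 + X2 = n) for independent X1 ~ Poisson(lam1), X2 ~ Poisson(lam2)."""
    return sum(poisson_pmf(k, lam1) * poisson_pmf(n - k, lam2)
               for k in range(n + 1))

lam1, lam2 = 3.0, 7.0  # illustrative means
max_gap = max(abs(convolved_pmf(n, lam1, lam2) - poisson_pmf(n, lam1 + lam2))
              for n in range(40))
print("largest difference between the two PMFs:", max_gap)
```

The difference should be at the level of floating point rounding, consistent with X1 + X2 being Poisson with mean λ1 + λ2.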

General bound on error

To measure the error in the normal approximation, let FX be the cumulative distribution function (CDF) of X and FY be the CDF of Y, the normal approximation to X. We can apply the Berry-Esséen theorem to ∑Xi to show that |FX(x) − FY(x)| is bounded by C/√λ for all x where C is some constant less than 0.7164. (This summary leaves out some subtle details.)

The bound from the Berry-Esséen theorem is pessimistic, but it shows right away that the error decreases as λ increases. The bound over-estimates the error for two reasons. First, it’s based on a general theorem that doesn’t use any special knowledge of the Poisson distribution other than its second and third moments. Second, it doesn’t take into account a continuity correction that improves the approximation.
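
As a rough worked example of how conservative the bound is, one can ask how large λ must be for the Berry-Esséen bound alone to guarantee a CDF error below some tolerance ε. The tolerance ε is introduced here only for illustration, and 0.7164 is used as the upper limit on C quoted above.

\frac{0.7164}{\sqrt{\lambda}} \le \varepsilon \iff \lambda \ge \left(\frac{0.7164}{\varepsilon}\right)^2, \qquad \varepsilon = 0.01 \;\Rightarrow\; \lambda \ge 71.64^2 \approx 5132

Because the bound over-estimates the error, the actual error falls below a given tolerance at smaller values of λ than this calculation suggests.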


The following plot shows the probability mass function (PMF) for a Poisson distribution with λ = 10.

Poisson(10) PMF

This shows there’s good reason to believe the Poisson and normal distributions are connected.

Error in approximating the CDF

Next we want to look at the error in the normal approximation to the Poisson distribution. Let X be a Poisson random variable with mean λ and let Y be a normal random variable with mean and variance λ. Denote the PMF of X by fX and the probability density function (PDF) of Y by fY, and denote the CDFs of X and Y by FX and FY.

First we look at FX − FY and show how much the continuity correction improves the approximation. Then we look at the error in approximating the probability of individual points by comparing fX with an integral of fY.

The following graph shows FX(n) − FY(n) for n = 0, 1, 2, …, 20.

CDF error in normal approximation to Poisson(10)

Next we look at FX(n) − FY(n + ½). The reason for adding this extra ½ is the continuity correction, which we discuss below.

CDF error in normal approximation to Poisson(10) with continuity correction

The Berry–Esséen theorem gives an upper bound of 0.7164/√10 = 0.2265 for the approximation error. The maximum error without the continuity correction is 0.083. With the continuity correction, the maximum error goes down to 0.021. So in this case the actual error is about three times smaller than the theoretical bound, and the continuity correction reduces the error by another factor of 4. These ratios are typical as λ varies.

In the previous example, the maximum error occurred at n = λ without the continuity correction and at n = λ − 1 with the continuity correction. This appears to be true in general; I’ve numerically verified that it is true for integer values of λ ≤ 100.
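
The figures quoted for λ = 10 can be reproduced with a short calculation. This is a minimal sketch using only the Python standard library; the function names, the shift parameter, and the cutoff n_max are assumptions made for this example rather than anything from the original analysis. It evaluates |FX(n) − FY(n + shift)| over a range of integers n and reports the largest value and where it occurs.

```python
import math

def poisson_cdf_table(lam, n_max):
    """Return [F_X(0), ..., F_X(n_max)] for X ~ Poisson(lam)."""
    cdf, total = [], 0.0
    for n in range(n_max + 1):
        total += math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))
        cdf.append(total)
    return cdf

def normal_cdf(x, lam):
    """CDF of Y ~ Normal(mean=lam, variance=lam)."""
    return 0.5 * math.erfc(-(x - lam) / math.sqrt(2.0 * lam))

def max_cdf_error(lam, shift, n_max=None):
    """Largest |F_X(n) - F_Y(n + shift)| over n = 0..n_max, and the n attaining it."""
    if n_max is None:
        # Assumed cutoff: far enough into the right tail for both CDFs to be ~1.
        n_max = int(lam + 10 * math.sqrt(lam)) + 10
    fx = poisson_cdf_table(lam, n_max)
    errors = [abs(fx[n] - normal_cdf(n + shift, lam)) for n in range(n_max + 1)]
    n_star = max(range(n_max + 1), key=errors.__getitem__)
    return errors[n_star], n_star

lam = 10
bound = 0.7164 / math.sqrt(lam)                     # Berry-Esseen bound
err_plain, n_plain = max_cdf_error(lam, shift=0.0)  # no continuity correction
err_cc, n_cc = max_cdf_error(lam, shift=0.5)        # with continuity correction

print(f"bound            = {bound:.4f}")
print(f"max error        = {err_plain:.4f} at n = {n_plain}")
print(f"max error (corr) = {err_cc:.4f} at n = {n_cc}")
```

Calling max_cdf_error in a loop over integer values of λ is one way to repeat the check on where the maximum occurs.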

Error in approximating the PMF

Now we turn our attention to approximating fX(n) by the integral

f_X(n) \approx \int_{n - \frac{1}{2}}^{n + \frac{1}{2}} f_Y(y)\, dy

which approximates P(X = n) by P(n − ½ < Y < n + ½). Adding up the approximations for several values of n explains the extra ½ in the continuity correction in the CDF approximation.
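
To spell out the summation step (a short derivation added for illustration, using only the definitions above): adjacent intervals share endpoints, so the integrals combine into one.

\sum_{k=0}^{n} f_X(k) \approx \sum_{k=0}^{n} \int_{k - \frac{1}{2}}^{k + \frac{1}{2}} f_Y(y)\, dy = \int_{-\frac{1}{2}}^{n + \frac{1}{2}} f_Y(y)\, dy = F_Y(n + \tfrac{1}{2}) - F_Y(-\tfrac{1}{2}) \approx F_Y(n + \tfrac{1}{2})

The left side is FX(n), and FY(−½) is negligible when λ is large, which gives FX(n) ≈ FY(n + ½).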

While the error in approximating the CDF has a maximum value near λ, the error in approximating the PMF has a local minimum near λ. However, the error grows quickly on either side of λ. The following graph plots fX(n) minus its approximation for λ = 20.

error in PMF approx

The absolute value of the error is smallest in the tails. However, the relative error in the approximation is smallest near the mean and is larger in the tails. The following graph gives the signed relative error.

Relative error in approximating Poisson(20) probability mass function

The graph is truncated because the relative errors on the far left are enormous. The relative error at 0 is about −2000 and the relative error at 1 is about −250.
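
The absolute and relative errors for λ = 20 can be tabulated directly. The following is a minimal sketch using only the Python standard library; the helper names and the sample values of n are assumptions for this example. The relative error is taken here to be the signed error fX(n) minus its approximation, divided by fX(n), which is an assumption about the convention behind the graph. The interval probability is computed in whichever tail keeps the subtraction accurate, since the left tail values are tiny.

```python
import math

def poisson_pmf(n, lam):
    """P(X = n) for X ~ Poisson(lam)."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def normal_interval(n, lam):
    """P(n - 1/2 < Y < n + 1/2) for Y ~ Normal(mean=lam, variance=lam)."""
    a = (n - 0.5 - lam) / math.sqrt(lam)
    b = (n + 0.5 - lam) / math.sqrt(lam)
    if a >= 0:
        return upper_tail(a) - upper_tail(b)   # both endpoints right of the mean
    return upper_tail(-b) - upper_tail(-a)     # P(Z < b) - P(Z < a)

def pmf_errors(n, lam):
    """Signed error f_X(n) - approximation, and the same error relative to f_X(n)."""
    exact = poisson_pmf(n, lam)
    error = exact - normal_interval(n, lam)
    return error, error / exact

lam = 20
for n in (0, 1, 10, 20, 30, 40):
    err, rel = pmf_errors(n, lam)
    print(f"n={n:2d}  error={err: .3e}  relative error={rel: .4g}")
```

Under these assumptions the output should show tiny absolute errors but very large negative relative errors at n = 0 and n = 1, and a small relative error at n = 20.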

Other normal approximations

There are other less direct ways to form normal approximations to the Poisson distribution function. See notes on the Wilson-Hilferty normal approximation. This method is often far more accurate than the approximation above.
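
As a rough illustration of the accuracy claim, here is a minimal sketch in standard-library Python. It is not taken from the notes referred to above. It assumes the usual route: P(X ≤ n) equals P(G > λ) for a gamma random variable G with shape n + 1, and the Wilson-Hilferty cube root transformation treats the cube root of G/(n + 1) as approximately normal with mean 1 − 1/(9(n + 1)) and variance 1/(9(n + 1)). The function names and the range of n are illustrative.

```python
import math

def poisson_cdf_table(lam, n_max):
    """Return [F_X(0), ..., F_X(n_max)] for X ~ Poisson(lam)."""
    cdf, total = [], 0.0
    for n in range(n_max + 1):
        total += math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))
        cdf.append(total)
    return cdf

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def wilson_hilferty_cdf(n, lam):
    """Approximate P(X <= n) via the cube-root normal approximation to Gamma(n + 1)."""
    k = n + 1
    z = 3.0 * math.sqrt(k) * ((lam / k) ** (1.0 / 3.0) - 1.0 + 1.0 / (9.0 * k))
    return std_normal_cdf(-z)

def direct_cdf(n, lam):
    """Direct normal approximation with continuity correction, F_Y(n + 1/2)."""
    return std_normal_cdf((n + 0.5 - lam) / math.sqrt(lam))

lam, n_max = 10, 50  # illustrative choices
exact = poisson_cdf_table(lam, n_max)
wh_max = max(abs(exact[n] - wilson_hilferty_cdf(n, lam)) for n in range(n_max + 1))
cc_max = max(abs(exact[n] - direct_cdf(n, lam)) for n in range(n_max + 1))
print(f"max CDF error, Wilson-Hilferty:          {wh_max:.5f}")
print(f"max CDF error, direct normal (corrected): {cc_max:.5f}")
```

For λ = 10 the cube-root version should show a noticeably smaller maximum CDF error than the continuity-corrected direct approximation.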

See also notes on the normal approximation to the beta, binomial, gamma, and student-t distributions.