It is well-known that you can approximate a binomial distribution with a normal distribution. Of course there are a few provisos …

It is also well-known that you can approximate a beta distribution with a normal distribution as well.

This means you could directly approximate a binomial distribution with a beta distribution. This is a trivial consequence of the two other approximation results, but I don’t recall seeing this mentioned anywhere.

Why would you want to approximate a binomial distribution with a beta? The normal distribution is better known than the beta, and the normal approximation is motivated by the central limit theorem. However, approximating a binomial distribution by a beta makes the connection to Bayesian statistics clearer.

Let’s look back at a post I wrote yesterday. There I argued that the common interpretation of a confidence interval, while unjustified by the theory that produced it, could be justified by appealing to Bayesian statistics because a frequentist confidence interval, in practice, is approximately a Bayesian credible interval.

In that post I give an example of estimating a proportion *p* based on a survey with 127 positive responses out of 400 persons surveyed. The confidence interval given in that post implicitly used a normal approximation to a binomial. This is done so often that it typically goes unnoticed, and it is justified for large samples when *p* is not too close to 0 or 1.

Binomial distributions with large *n* are difficult to work with and it is more convenient to work with continuous distributions. Instead of the normal approximation, we could have used a beta approximation. This has nothing to do with Bayesian statistics yet. We could introduce the beta distribution simply as an alternative to the normal distribution.

The distribution on the estimated rate is binomial with *p* = 127/400 and variance *p*(1-*p*)/*n* with *n* = 400.

We could compare this to a beta distribution with the same mean and variance. I worked out here how to solve beta distribution parameters that lead to a specified mean and variance. If we do that with the mean and variance above we get *a* = 126.7 and *b* = 272.3. We could then find a 95% confidence interval by finding the 2.5 and 97.5 percentiles for a beta(126.7, 272.3) distribution. When we do, we get a confidence interval of (0.2728, 0.3639), very nearly what we got in the earlier post using a normal approximation.

At this point we have been doing frequentist statistics, using a beta distribution as a computational convenience. Now let’s put on our Bayesian hats. Starting with a uniform, i.e. beta(1, 1), prior, we get a posterior distribution on the proportion we’re estimating which has a beta(128, 274).

If you plot the density functions of a beta(126.7, 272.3) and a beta(128, 274) you’ll see that they agree to within the thickness of a line.