Sometimes you can approximate a binomial distribution with a normal distribution. Under the right conditions, a Binomial(n, p) has approximately the distribution of a normal with the same mean and variance, i.e. mean np and variance np(1-p). The approximation works best when n is large and p is near 1/2.
This afternoon I was reading a paper that used a normal approximation to a binomial when n was around 10 and p around 0.001. The relative error was enormous. The paper used the approximation to find an analytical expression for something else and the error propagated.
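To get a feel for how bad the approximation is at this scale, here is a minimal sketch using n = 10 and p = 0.001. The event P(X >= 1) and the continuity correction are my choices for illustration; the paper's actual calculation is not reproduced here.

```python
import math

def binom_pmf(k, n, p):
    """Exact Binomial(n, p) probability of exactly k successes."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_upper_tail(x, mean, sd):
    """P(Z > x) for a normal with the given mean and standard deviation."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))

n, p = 10, 0.001
mean = n * p                       # np
sd = math.sqrt(n * p * (1 - p))    # sqrt(np(1-p))

# P(X >= 1): exact value vs. normal approximation with continuity correction
exact = 1 - binom_pmf(0, n, p)
approx = normal_upper_tail(0.5, mean, sd)
rel_error = abs(approx - exact) / exact

print(f"exact  = {exact:.6g}")
print(f"approx = {approx:.6g}")
print(f"relative error = {rel_error:.4f}")
```

The exact probability is about 0.01, while the normal approximation is smaller by several orders of magnitude, so the relative error is essentially 100%.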
A common rule of thumb is that the normal approximation works well when np > 5 and n(1-p) > 5. This says that the closer p is to 0 or 1, the larger n needs to be. In this case p was very small, but n was not large enough to compensate since np was on the order of 0.01, far less than 5.
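The rule of thumb is easy to encode. The sketch below is only that rule, not a guarantee of accuracy; the function names and the `threshold` parameter are hypothetical names introduced for illustration.

```python
import math

def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb: both np and n(1-p) must exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold

def min_n(p, threshold=5):
    """Smallest n that satisfies the rule of thumb for a given p (0 < p < 1)."""
    q = min(p, 1 - p)
    n = math.floor(threshold / q) + 1
    # Guard against floating-point rounding in the division above.
    while not normal_approx_ok(n, p, threshold):
        n += 1
    return n

print(normal_approx_ok(10, 0.001))   # the case from the paper
print(normal_approx_ok(100, 0.5))
print(min_n(0.001))                  # n needed before the rule is satisfied
```

For p = 0.001 the rule asks for n in the thousands, which shows how far n = 10 falls short.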
Another rule of thumb is that normal approximations in general hold well near the center of the distribution but not in the tails. In particular the relative error in the tails can be unbounded. This paper was looking out toward the tails, and relative error mattered.
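The tail behavior shows up even when the rule of thumb is comfortably satisfied. As a rough illustration, the sketch below uses n = 100 and p = 0.5 (values I picked, not taken from the paper) and compares exact upper-tail probabilities with the continuity-corrected normal approximation at increasingly extreme cutoffs.

```python
import math

def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

def normal_upper_tail(x, mean, sd):
    """P(Z > x) for a normal with the given mean and standard deviation."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))

n, p = 100, 0.5
mean = n * p
sd = math.sqrt(n * p * (1 - p))

rel_errors = []
for k in (50, 60, 70, 80, 90):
    exact = binom_upper_tail(k, n, p)
    approx = normal_upper_tail(k - 0.5, mean, sd)   # continuity correction
    rel = abs(approx - exact) / exact
    rel_errors.append(rel)
    print(f"k={k}: exact={exact:.3e} approx={approx:.3e} rel_error={rel:.3g}")
```

Near the center the two agree closely. Further out, both probabilities are tiny, so the absolute error is negligible, but the relative error keeps growing with each step into the tail.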
For more details, see these notes on the normal approximation to the binomial.
I think this is why many A/B testing tools will tell you that you have significant results very early on, then the significance will disappear, and often return later. Obviously part of this is because these significance tests aren’t intended for repeated measurements. But I think it’s also because a z-test is so convenient that you don’t want to have to switch to a t-test some of the time.
Your post reminded me of the “proxy integration” approximation used for CDO pricing, which did exactly that: used the normal distribution to approximate a quasi-binomial [1] one in the tails.
[1] Quasi-binomial, because it counts the number of successes in n trials with a different (but small) success probability for each trial.
Nice article. I’m currently trying to get a feeling for how data behaves in various distributions and what estimations/estimators are useful. Do you have any recommendations regarding literature about these kinds of things?
Good comments, but it leaves out what to do when the rule-of-thumb limits are broken!
This is where the conjugate distribution for the binomial (the beta) should be used to determine error, either p-values in hypothesis testing or confidence intervals. Simple algorithms are available. (I can provide a simple Excel spreadsheet for anyone interested.)
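One way to read that suggestion is the exact (Clopper-Pearson) confidence interval, whose endpoints are quantiles of beta distributions. The sketch below is not the commenter's spreadsheet. To stay dependency-free it finds the same endpoints by bisection on the binomial CDF instead of calling a beta quantile function; `clopper_pearson` and `wald_interval` are hypothetical names for illustration.

```python
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _solve_increasing(f, target, iters=100):
    """Bisection for f(p) = target, where f is increasing in p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided interval for p after observing k successes in n trials."""
    # Lower endpoint: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper endpoint: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

def wald_interval(k, n, z=1.96):
    """Normal-approximation interval, shown only for comparison."""
    p_hat = k / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

print(clopper_pearson(0, 10))   # still informative with zero successes
print(wald_interval(0, 10))     # collapses to a zero-width interval
```

With zero successes in 10 trials the normal-approximation interval has zero width, while the exact interval still gives a usable upper bound on p.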