These notes explain how to compute probabilities for common statistical distributions using Microsoft Excel 2007. See also notes on working with distributions in Mathematica, R and S-PLUS and in Python with SciPy.
Probability function support in Excel is incomplete and inconsistent. All distributions have a PDF function. Some have a CDF. Some have an inverse CDF. There’s a general pattern to function names, but there are exceptions.
Probability function names in Excel typically consist of a base name and a suffix. The base name is an abbreviation of the distribution name. The suffix is either
INV. For example, the functions for the normal (Gaussian) distribution are
NORMINV. The corresponding functions for the Gamma distribution are
GAMMAINV. Several distributions follow this pattern.
The “DIST” function evaluates the PDF and possibly the CDF. If the function has a
CUMULATIVE argument, setting this argument to
TRUE causes the
DIST function to compute the CDF. If the argument is
FALSE, the function returns the PDF. If there is no
CUMULATIVE argument, the
DIST function can only compute the PDF. The “INV” function evaluates the inverse CDF (quantile) function.
(By convention, Excel functions are written in all capital letters. That’s the way they are documented. But Excel is case-insensitive.)
The command to evaluate the PDF of a normal distribution with mean 1 and standard deviation 2 at the point x = 3 is
NORMDIST(3, 1, 2, FALSE).
FALSE argument says that the cumulative flag is not set. To compute the CDF of the same distribution at the point x = 3, the command is
NORMDIST(3, 1, 2, TRUE).
To evaluate the inverse CDF (quantile function) of the distribution at p = 0.2 the command is
NORMINV(0.2, 1, 2).
Not all distributions have an “INV” function, as indicated in the table below. Also, the lognormal distribution is exceptional in that its CDF inverse function is named
LOGINV rather than
For the Poisson and Weibull functions, the “DIST” function name does not end in “DIST.” Instead, these functions are
The exponential distribution is parameterized in terms of its rate, the reciprocal of the mean. Other distributions that are sometimes parameterized differently are the hypergeometric and the lognormal.
For more details, see the Excel online documentation or statistical functions.
For daily posts on probability, follow @ProbFact on Twitter.