These notes explain how to compute probabilities for common statistical distributions using Mathematica. See also notes on working with distributions in R and S-PLUS, Excel, and in Python with SciPy.
Statistical distributions are built into Mathematica as of version 6. Prior to that version, you had to load either the DiscreteDistributions or the ContinuousDistributions package from the Statistics directory. For example, to load the latter you would enter the following.
<<Statistics`ContinuousDistributions`
As with everything else in Mathematica, names use Pascal case (concatenated capitalized words). The name of every distribution object ends with Distribution. For example, the Mathematica object representing the normal (Gaussian) distribution is
NormalDistribution. The arguments to a distribution object constructor are the distribution parameters. (See notes below about possible problems with parameterization conventions.)
Probability density function (PDF)
To calculate the PDF (probability density function) of a distribution, pass the distribution as the first argument to
PDF and the PDF argument as the second argument. For example,
PDF[ GammaDistribution[2, 3], 17.2 ]
gives the value of f_X(17.2) where f_X is the PDF of a random variable X with a gamma distribution with shape parameter 2 and scale parameter 3. For another example,
f[x_] := PDF[ NormalDistribution[0, 1], x ]
defines a function f as the PDF of a standard normal random variable.
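As a quick sanity check on the gamma example above, the following sketch compares Mathematica's value with a hand-written density. The helper name gammaDensity is hypothetical, introduced only for this illustration, and it assumes the shape and scale convention described in these notes.

```mathematica
(* Evaluate the gamma PDF numerically, as in the text. *)
PDF[GammaDistribution[2, 3], 17.2]

(* Leave the argument symbolic to see the formula Mathematica uses. *)
PDF[GammaDistribution[2, 3], x]

(* Hypothetical helper: density for shape k and scale theta, valid for x > 0. *)
gammaDensity[x_, k_, theta_] := x^(k - 1) Exp[-x/theta]/(Gamma[k] theta^k)

(* This should agree with the first result if the parameters are shape and scale. *)
gammaDensity[17.2, 2, 3]
```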
Note that Mathematica uses the term “PDF” for both continuous and discrete random variables. Technically, discrete distributions have probability mass functions (PMFs) rather than density functions, but Mathematica ignores this pedantic detail.
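For instance, the sketch below evaluates the probability mass function of a binomial distribution through PDF. The particular distribution and argument are chosen only for illustration.

```mathematica
(* Probability of exactly 3 successes in 10 trials with success probability 1/2. *)
(* With exact inputs the result is the exact rational Binomial[10, 3]/2^10. *)
PDF[BinomialDistribution[10, 1/2], 3]   (* 15/128 *)
```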
Cumulative distribution function (CDF)
Mathematica computes the CDF (cumulative distribution function) of a distribution analogously to the way it computes the PDF. For example,
g[x_] := CDF[ NormalDistribution[0, 1], x ]
defines g to be the CDF of a standard normal random variable.
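A short usage sketch for the function g defined above. The evaluation points are chosen only for illustration.

```mathematica
g[x_] := CDF[NormalDistribution[0, 1], x]

g[0]                 (* 1/2, exact because the input is exact *)
g[1.96]              (* approximately 0.975 *)
g[1.96] - g[-1.96]   (* approximately 0.95, the probability of falling within 1.96 of the mean *)
```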
Quantiles (inverse CDF)
To compute the quantile function, i.e. the inverse of the CDF, use the Mathematica function Quantile. It is called the same way as the functions PDF and CDF described above: the distribution is the first argument and the probability is the second.
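The following sketch shows Quantile undoing CDF. The variable name dist and the probabilities used are illustrative choices, not part of the original notes.

```mathematica
(* The 97.5th percentile of a standard normal. *)
Quantile[NormalDistribution[0, 1], 0.975]   (* approximately 1.96 *)

(* Round trip: applying CDF to a quantile should return the original probability. *)
dist = GammaDistribution[2, 3];
CDF[dist, Quantile[dist, 0.9]]              (* approximately 0.9 *)
```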
Other associated functions
You can find the mean or variance of a distribution by passing a distribution object to Mean or Variance respectively. To get a random sample, pass a distribution object to Random. To get an array of random samples in version 8 or later, call RandomVariate with the distribution object and the number of samples.
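A minimal sketch of these functions, assuming Mathematica 8 or later, where RandomVariate is the built-in sampling function. The variable name dist is an illustrative choice.

```mathematica
dist = GammaDistribution[2, 3];   (* shape 2, scale 3 *)

Mean[dist]       (* 6, i.e. shape times scale *)
Variance[dist]   (* 18, i.e. shape times scale squared *)

(* RandomVariate is available in version 8 and later. *)
RandomVariate[dist]      (* one random sample *)
RandomVariate[dist, 5]   (* a list of five random samples *)
```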
The following gives Mathematica names and parameterizations for common distributions.
|Hypergeometric|HypergeometricDistribution|n, s, total|
ChiSquareDistribution contains the word “Square” but not “Squared.” Also, the object for Student’s t distribution is named StudentTDistribution, with “Student” as part of the name.
The Laplace distribution is also known as the double exponential distribution.
Notes on parameterizations
You always need to verify parameterizations in statistical software to avoid unexpected results. One way to do this is to pass a distribution object to the Mean and Variance functions to see whether you get what you expect.
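For example, the sketch below checks whether the second parameter of the normal distribution is a standard deviation or a variance. The parameter values are chosen only so that the two possibilities give different answers.

```mathematica
(* If the second parameter were a variance, this would return 2. *)
Variance[NormalDistribution[0, 2]]   (* 4, so the second parameter is the standard deviation *)

Mean[NormalDistribution[0, 2]]       (* 0, so the first parameter is the mean *)
```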
The exponential distribution is sometimes parameterized in terms of its mean, but Mathematica uses the rate, the reciprocal of the mean or scale.
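The rate convention can be confirmed with a symbolic parameter, as in this sketch. The symbol lambda is assumed to have no value assigned.

```mathematica
(* With a symbolic rate, the mean comes back as the reciprocal. *)
Mean[ExponentialDistribution[lambda]]   (* 1/lambda *)

(* An exponential distribution with mean 5 therefore needs rate 1/5. *)
Mean[ExponentialDistribution[1/5]]      (* 5 *)
```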
Mathematica parameterizes the gamma distribution in terms of its shape and scale. Some other packages use the shape and the rate (reciprocal of the scale).
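For the gamma distribution, a symbolic check along the following lines shows the shape and scale convention. The symbols a and b are assumed to have no values assigned, and the shape and rate in the last line are made-up numbers for illustration.

```mathematica
Mean[GammaDistribution[a, b]]       (* a b: shape times scale *)
Variance[GammaDistribution[a, b]]   (* a b^2 *)

(* Shape 2 and rate 4 in a shape-rate package corresponds to shape 2 and scale 1/4 here. *)
Mean[GammaDistribution[2, 1/4]]     (* 1/2 *)
```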
There are two common parameterizations for a hypergeometric distribution. Suppose an urn has M red balls and N blue balls. You draw n balls at once and want to know the probability of various numbers of red balls in your sample. Some software packages parameterize the hypergeometric distribution in terms of n, M, and N, but Mathematica uses n, M, and the total number of balls, M+N.
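The urn description above can be checked directly against binomial coefficients. In this sketch the variable names red, blue, and draws and the specific counts are hypothetical, chosen only for illustration.

```mathematica
red = 5; blue = 7; draws = 4;   (* M = 5, N = 7, n = 4 in the notation above *)

(* Mathematica's convention: sample size, number of red balls, total number of balls. *)
dist = HypergeometricDistribution[draws, red, red + blue];

(* Probability of exactly 2 red balls in the sample. *)
PDF[dist, 2]                                                              (* 14/33 *)

(* The same probability computed from counting arguments. *)
Binomial[red, 2] Binomial[blue, draws - 2]/Binomial[red + blue, draws]   (* 14/33 *)

Mean[dist]   (* 5/3, i.e. draws * red/(red + blue) *)
```

Passing blue instead of red + blue as the third argument would describe a different urn, which is why the convention is worth checking.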
If X has a log-normal distribution, then log(X) has a normal distribution. Note that the mean and standard deviation parameters are the mean and standard deviation of log(X), not of X itself. Said another way, X has the same distribution as exp(Y) where Y is a normal random variable with mean and standard deviation given by the parameters.
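The relationship between the log-normal and normal distributions can be checked numerically, as in this sketch. The values of mu, sigma, and x are arbitrary illustrative choices.

```mathematica
mu = 0; sigma = 1; x = 2.0;

(* P(X <= x) for log-normal X equals P(Y <= Log[x]) for normal Y. *)
CDF[LogNormalDistribution[mu, sigma], x]
CDF[NormalDistribution[mu, sigma], Log[x]]   (* should agree with the previous line *)

(* The mean of X is Exp[mu + sigma^2/2], not mu. *)
Mean[LogNormalDistribution[mu, sigma]]       (* Sqrt[E], about 1.6487 *)
```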