Probability distributions in R

This page summarizes how to work with univariate probability distributions in R and S-PLUS. See also notes on working with distributions in Mathematica, Excel, and in Python with SciPy.

R and S-PLUS use prefixes and bases to name the functions related to a distribution. The prefixes are d, p, q, and r. Each base is an abbreviated name of a distribution family, such as norm for the normal distribution.

The prefix d is for density, i.e. PDF.

The prefix p is for the CDF (cumulative distribution function), unless the argument lower.tail = FALSE is supplied, in which case it becomes the CCDF (complementary CDF).

The prefix q is for the CDF inverse, unless the argument lower.tail = FALSE is supplied, in which case it turns into the CCDF inverse.

The prefix r is for random sample.
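The effect of lower.tail on the p and q prefixes can be checked with a short sketch. It uses the standard normal, and the variable names are illustrative only.

```r
# CDF and complementary CDF of a standard normal at x = 1.5
x <- 1.5
cdf  <- pnorm(x)                      # P(X <= x)
ccdf <- pnorm(x, lower.tail = FALSE)  # P(X > x)

# The two tail probabilities sum to 1
total <- cdf + ccdf

# The q functions invert the corresponding p functions
x_from_cdf  <- qnorm(cdf)
x_from_ccdf <- qnorm(ccdf, lower.tail = FALSE)

# Far in the tail, subtracting from 1 loses all precision,
# while asking for the upper tail directly does not
naive_tail  <- 1 - pnorm(10)
direct_tail <- pnorm(10, lower.tail = FALSE)
```

Here naive_tail comes out as exactly 0 in double precision while direct_tail is a tiny positive number, which is one practical reason to pass lower.tail = FALSE rather than subtract from 1.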

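The first argument to a distribution-related function is its main argument: the point at which to evaluate the density for d, the quantile for p, the probability for q, and the number of samples for r. Next come the distribution parameters, followed by other options.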
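As a rough illustration of the argument order, here are the four prefixes applied to the norm base. The parameter values and the seed are arbitrary choices made for this sketch.

```r
# Density: the first argument is the point at which to evaluate the PDF
d <- dnorm(0.5, mean = 0, sd = 2)

# CDF: the first argument is the quantile q, giving P(X <= q)
p <- pnorm(0.5, mean = 0, sd = 2)

# Quantile: the first argument is a probability between 0 and 1
q <- qnorm(0.975, mean = 0, sd = 2)

# Random sample: the first argument is the number of draws
set.seed(1)  # arbitrary seed, only for reproducibility
r <- rnorm(5, mean = 0, sd = 2)
```

The parameters are named here for readability. They can also be passed by position, as in pnorm(0.5, 0, 2).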


pnorm(0.77, 0, 2.1) computes F_X(0.77) where X is a normal random variable with mean 0 and standard deviation 2.1 and F_X is its CDF.

dbeta(0.7, 2.1, 3.4) computes f_X(0.7) where X is a beta random variable with parameters 2.1 and 3.4 and f_X is its PDF.

qgamma(0.1, 3.1, 1.0, lower.tail = FALSE) finds a value y so that P(Y > y) = 0.1 where Y has a gamma distribution with shape 3.1 and rate 1. (The third positional argument is the rate; with a rate of 1 the scale is also 1.)
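The three calls above can be sanity-checked against closed forms or against their inverse functions. The following is only a sketch, and the variable names are made up for illustration.

```r
# CDF of a normal with mean 0 and sd 2.1, evaluated at 0.77
p <- pnorm(0.77, 0, 2.1)
# The same probability obtained by standardizing first
p_std <- pnorm(0.77 / 2.1)

# PDF of a beta(2.1, 3.4) at 0.7, compared with the density formula
f <- dbeta(0.7, 2.1, 3.4)
f_formula <- 0.7^(2.1 - 1) * (1 - 0.7)^(3.4 - 1) / beta(2.1, 3.4)

# Upper-tail quantile of a gamma with shape 3.1 and rate 1
y <- qgamma(0.1, 3.1, 1.0, lower.tail = FALSE)
# Feeding y back into the CCDF should recover 0.1
tail_prob <- pgamma(y, 3.1, 1.0, lower.tail = FALSE)
```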

Distributions and parameterizations

Distribution         Base name   Parameters
beta                 beta        shape1, shape2
binomial             binom       size, prob
Cauchy               cauchy      location, scale
F                    f           df1, df2
gamma                gamma       shape, rate
hypergeometric       hyper       m, n, k
log-normal           lnorm       meanlog, sdlog
logistic             logis       location, scale
negative binomial    nbinom      size, prob
normal               norm        mean, sd
Student t            t           df
uniform              unif        min, max
Weibull              weibull     shape, scale


Note that the exponential distribution (base name exp) is parameterized in terms of the rate, the reciprocal of the mean.
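A small check of the rate parameterization, assuming an illustrative rate of 0.25 (so the mean is 4). The seed and sample size are arbitrary.

```r
rate <- 0.25  # illustrative value; the mean is 1 / rate = 4

# For any exponential, P(X <= mean) equals 1 - exp(-1)
p_at_mean <- pexp(1 / rate, rate = rate)

# The average of a large sample should be close to 1 / rate
set.seed(42)  # arbitrary seed, only for reproducibility
sample_mean <- mean(rexp(1e5, rate = rate))
```

If you have the mean rather than the rate, pass its reciprocal, as in rexp(n, rate = 1 / mean_value), where mean_value is a placeholder name.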

The gamma can be parameterized by its shape and either the rate or the scale. The rate is the default argument by position, but you can specify the scale by name.
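The difference between the two gamma parameterizations is easy to get wrong, so here is a minimal sketch with illustrative values for the shape and scale.

```r
shape <- 3.1
scale <- 2.0  # illustrative; the equivalent rate is 1 / scale

# The third positional argument is the rate
by_rate_pos  <- pgamma(1.5, shape, 1 / scale)
by_rate_name <- pgamma(1.5, shape, rate = 1 / scale)

# The scale must be given by name
by_scale <- pgamma(1.5, shape, scale = scale)

# Passing the scale value by position silently treats it as a rate
wrong <- pgamma(1.5, shape, scale)
```

The first three results agree. The last one is a different probability, and R gives no warning about it.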

The hypergeometric distribution gives the probability of various numbers of red balls when k balls are taken from an urn containing m red balls and n blue balls. Note that another popular convention uses the number of red balls and the total number of balls m+n.
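To make the urn convention concrete, the sketch below uses small made-up counts. The helper dhyper_total is a hypothetical wrapper written for this example, not part of R, showing one way to translate from the other convention.

```r
m <- 7  # red balls in the urn (illustrative numbers)
n <- 5  # blue balls in the urn
k <- 4  # balls drawn without replacement

# Probability of exactly 2 red balls among the k drawn
p2 <- dhyper(2, m, n, k)

# The same value from the counting formula
p2_formula <- choose(m, 2) * choose(n, k - 2) / choose(m + n, k)

# Hypothetical wrapper for the convention that takes the number of
# red balls and the total number of balls
dhyper_total <- function(x, red, total, draws) {
  dhyper(x, red, total - red, draws)
}
p2_total <- dhyper_total(2, 7, 12, 4)

# Probabilities over all possible red counts sum to 1
total_prob <- sum(dhyper(0:k, m, n, k))
```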

Note that the parameters for the log-normal are the mean and standard deviation of the logarithm of the random variable, not the mean and standard deviation of the random variable itself.
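The distinction shows up clearly in a quick simulation. The parameter values, seed, and sample size below are arbitrary choices for this sketch.

```r
meanlog <- 0.5  # illustrative values
sdlog   <- 0.8

# The mean of the log-normal variable itself is
# exp(meanlog + sdlog^2 / 2), not meanlog
true_mean <- exp(meanlog + sdlog^2 / 2)

# The median is exp(meanlog)
med <- qlnorm(0.5, meanlog, sdlog)

# The log of a log-normal sample is normal with mean meanlog and sd sdlog
set.seed(7)  # arbitrary seed, only for reproducibility
x <- rlnorm(1e5, meanlog, sdlog)
log_mean <- mean(log(x))
log_sd   <- sd(log(x))

# The log-normal CDF is the normal CDF applied to the log of the argument
cdf_lnorm <- plnorm(2, meanlog, sdlog)
cdf_norm  <- pnorm(log(2), meanlog, sdlog)
```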