Probability distributions in R

This page summarizes how to work with univariate probability distributions in R and S-PLUS. See also notes on working with distributions in Mathematica, Excel, and in Python with SciPy.

R and S-PLUS use prefixes and bases to denote functions related to a distribution. The prefixes are d, p, q, and r. The bases are the name of the distribution family such as norm for the normal distribution.

The prefix d is for density, i.e. the PDF (or the probability mass function for a discrete distribution).

The prefix p is for the CDF (cumulative distribution function), unless the argument lower.tail = FALSE is supplied, in which case it returns the CCDF (complementary CDF).

The prefix q is for the inverse CDF (the quantile function), unless the argument lower.tail = FALSE is supplied, in which case it returns the inverse CCDF.

The prefix r is for generating random samples.

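The first argument to a distribution-related function is the natural one for that function: the point at which to evaluate for d and p, a probability for q, and the number of samples for r. Next come the distribution parameters, followed by other options.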
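As a minimal sketch of the naming scheme, the four prefixes can be applied to the base name norm. The variable names and the value 1.96 are chosen here purely for illustration.

```r
# The four prefixes applied to the normal family (base name: norm).
x <- 1.96

density_at_x <- dnorm(x, mean = 0, sd = 1)                      # PDF at x
cdf_at_x     <- pnorm(x, mean = 0, sd = 1)                      # P(X <= x)
ccdf_at_x    <- pnorm(x, mean = 0, sd = 1, lower.tail = FALSE)  # P(X > x)

# The quantile function inverts the CDF, so this recovers x.
x_recovered <- qnorm(cdf_at_x, mean = 0, sd = 1)

# For the r prefix, the first argument is the number of samples.
set.seed(42)  # arbitrary seed, only for reproducibility
draws <- rnorm(5, mean = 0, sd = 1)
```

Here cdf_at_x and ccdf_at_x sum to 1, and x_recovered agrees with x up to floating point error.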

Examples

pnorm(0.77, 0, 2.1) computes F_X(0.77) where X is a normal random variable with mean 0 and standard deviation 2.1 and F_X is its CDF.

dbeta(0.7, 2.1, 3.4) computes f_X(0.7) where X is a beta random variable with parameters 2.1 and 3.4 and f_X is its PDF.

qgamma(0.1, 3.1, 1.0, lower.tail = FALSE) finds a value y so that P(Y > y) = 0.1 where Y has a gamma distribution with shape 3.1 and rate 1 (equivalently scale 1, since the scale is the reciprocal of the rate).
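The three examples above can be cross-checked against their definitions. The following is a sketch with illustrative variable names, not part of the original examples.

```r
# pnorm with mean 0 and sd 2.1 equals the standard normal CDF at 0.77 / 2.1.
p1 <- pnorm(0.77, 0, 2.1)
p1_standardized <- pnorm(0.77 / 2.1)

# dbeta agrees with the beta density x^(a-1) (1-x)^(b-1) / B(a, b).
a <- 2.1
b <- 3.4
d1 <- dbeta(0.7, a, b)
d1_formula <- 0.7^(a - 1) * (1 - 0.7)^(b - 1) / beta(a, b)

# qgamma with lower.tail = FALSE inverts the upper tail probability.
y <- qgamma(0.1, 3.1, 1.0, lower.tail = FALSE)
upper_tail <- pgamma(y, 3.1, 1.0, lower.tail = FALSE)  # back to 0.1

# The same y is the 0.9 quantile of the ordinary (lower tail) CDF.
y_lower <- qgamma(0.9, 3.1, 1.0)
```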

Distributions and parameterizations

Distribution	Base name	Parameters
beta	`beta`	`shape1`, `shape2`
binomial	`binom`	`size`, `prob`
Cauchy	`cauchy`	`location`, `scale`
chi-squared	`chisq`	`df`
exponential	`exp`	`rate`
F	`f`	`df1`, `df2`
gamma	`gamma`	`shape`, `rate`
geometric	`geom`	`prob`
hypergeometric	`hyper`	`m`, `n`, `k`
log-normal	`lnorm`	`meanlog`, `sdlog`
logistic	`logis`	`location`, `scale`
negative binomial	`nbinom`	`size`, `prob`
normal	`norm`	`mean`, `sd`
Poisson	`pois`	`lambda`
Student t	`t`	`df`
uniform	`unif`	`min`, `max`
Weibull	`weibull`	`shape`, `scale`

Note that the exponential is parameterized in terms of the rate, the reciprocal of the mean.
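To illustrate the rate parameterization, here is a small sketch for an exponential distribution with a mean of 4. The value and the variable names are assumptions made for the example.

```r
# An exponential distribution with mean 4 has rate 1/4.
mean_wait <- 4
rate <- 1 / mean_wait

# P(X <= mean) is 1 - exp(-1) for any exponential distribution.
p_below_mean <- pexp(mean_wait, rate = rate)

# The median is mean * log(2).
median_wait <- qexp(0.5, rate = rate)
```

Passing the mean where the rate is expected, as in pexp(4, 4), would silently describe a distribution with mean 1/4 instead.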

The gamma can be parameterized by its shape and either the rate or the scale. The rate is the default argument by position, but you can specify the scale by name.
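A brief sketch of the two gamma parameterizations, using an illustrative shape of 3.1 and rate of 2 (so scale 0.5):

```r
# Third positional argument is the rate.
q_by_position <- qgamma(0.5, 3.1, 2)

# The same distribution, naming the rate explicitly.
q_by_rate <- qgamma(0.5, shape = 3.1, rate = 2)

# The same distribution again, specified by scale = 1 / rate.
q_by_scale <- qgamma(0.5, shape = 3.1, scale = 0.5)
```

All three calls return the same median, because scale is the reciprocal of rate.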

The hypergeometric distribution gives the probability of various numbers of red balls when k balls are taken from an urn containing m red balls and n blue balls. Note that another popular convention uses the number of red balls and the total number of balls m+n.
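The urn description can be checked against the counting formula. The sketch below uses made-up counts, and it also shows how to translate from the other convention (red balls and total balls) to R's arguments.

```r
# Urn with m red and n blue balls; draw k without replacement.
m <- 5   # red balls
n <- 10  # blue balls
k <- 4   # balls drawn
x <- 2   # number of red balls among those drawn

p_r <- dhyper(x, m, n, k)

# Direct count: choose x of the m red and k - x of the n blue.
p_count <- choose(m, x) * choose(n, k - x) / choose(m + n, k)

# Probabilities over all possible red counts sum to 1.
total <- sum(dhyper(0:k, m, n, k))

# Other convention: red balls and total balls. Convert by subtracting.
red <- 5
total_balls <- 15
p_converted <- dhyper(x, red, total_balls - red, k)
```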

Note that the parameters for the log-normal are the mean and standard deviation of the log of the distribution, not the mean and standard deviation of the distribution itself.
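The following sketch shows what the log-normal parameters do and do not mean, using illustrative values meanlog = 1 and sdlog = 0.5.

```r
meanlog <- 1
sdlog <- 0.5
x <- 3

# The log-normal CDF at x is the normal CDF at log(x).
p_lnorm <- plnorm(x, meanlog, sdlog)
p_norm  <- pnorm(log(x), meanlog, sdlog)

# exp(meanlog) is the median of the distribution, not its mean.
median_value <- qlnorm(0.5, meanlog, sdlog)

# The mean of the distribution itself is exp(meanlog + sdlog^2 / 2).
mean_value <- exp(meanlog + sdlog^2 / 2)
```

Since mean_value exceeds exp(meanlog), treating meanlog as the mean of the distribution would be wrong even after exponentiating.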