For a random variable X and a particular value x, one often needs to compute the probabilities Pr(X ≤ x) and Pr(X > x). It’s surprising how many different approaches software packages take to naming these two functions. I’ll give a few examples here.
It may seem unnecessary to provide software for computing both probabilities since they must sum to 1. However, sometimes you have to compute Pr(X > x) directly because computing Pr(X ≤ x) first and subtracting the result from 1 will not be accurate. See the discussion of erf(x) and erfc(x) here as an example.
I’m accustomed to calling Pr(X ≤ x) the CDF (cumulative distribution function) and Pr(X > x) the CCDF (complementary cumulative distribution function). In numerical libraries I’ve written, I use the function names
CCDF. This seems natural to me, but hardly any software does this.
In Python (SciPy), distribution classes have a method
cdf to compute the CDF, and a method
sf for the CCDF. (The rationale is that “sf” stands for “survival function.”) Mathematica takes a similar approach with
R takes a different approach. Instead of distribution objects with standard methods, each function has a name formed by concatenating a prefix for the function type and an abbreviation for the distribution family. For example,
pnorm is the CDF of a normal distribution,
dnorm is the PDF of a normal distribution, etc. (I find R’s prefixes hard to remember.) Also, R uses the same function for both the CDF and CCDF. By default,
pfoo computes the CDF of a distribution abbreviated
foo, but if the function has the optional argument
lower.tail = FALSE it computes the CCDF.
calc module takes an interesting approach, similar to R but more memorable in my opinion. CDF function names begin with
ltp (“lower tail probability”) and CCDF function names begin with
utp (“upper tail probability”). The final letter of the function name specifies the distribution family:
b for binomial,
c for chi-square,
n for normal, etc. So, for example, the CDF and CCDF of a normal distribution are computed by
I generally prefer APIs with long, self-evident names, but I like the Emacs calculator scheme. Brevity is more important in a calculator than in production code, and the prefixes
utp are easy to remember if you know what they stand for. They’re more symmetric than, for example, Python’s