Chebyshev’s inequality says that the probability of a random variable being at least *k* standard deviations away from its mean is at most 1/*k*^{2}. In symbols,

$$P(|X - \mathrm{E}(X)| \geq k\sigma) \leq \frac{1}{k^2}.$$

This inequality is very general, but also very weak. It assumes very little about the random variable *X*, but it also gives a loose bound. If we assume slightly more, namely that *X* has a **unimodal** distribution, then we have a tighter bound, the Vysochanskiĭ-Petunin inequality:

$$P(|X - \mathrm{E}(X)| \geq k\sigma) \leq \frac{4}{9k^2}.$$

However, the Vysochanskiĭ-Petunin inequality does require that *k* be larger than √(8/3). In exchange for the assumption of unimodality and the restriction on *k* we get to reduce our upper bound by more than half.
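To make the comparison concrete, here is a minimal sketch that evaluates both bounds. It assumes the standard form of the Vysochanskiĭ-Petunin bound, 4/(9*k*^{2}), which agrees with the values 1/9 and 4/81 quoted in this post. The function names are made up for illustration.

```python
import math

# Smallest k (exclusive) for which the V-P bound is stated in this post.
VP_MIN_K = math.sqrt(8.0 / 3.0)


def chebyshev_bound(k: float) -> float:
    """Chebyshev upper bound on P(|X - E(X)| >= k*sigma)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A probability bound above 1 carries no information, so cap it.
    return min(1.0, 1.0 / k**2)


def vp_bound(k: float) -> float:
    """Vysochanskii-Petunin upper bound for unimodal X (assumed form 4/(9 k^2))."""
    if k <= VP_MIN_K:
        raise ValueError("V-P bound requires k > sqrt(8/3)")
    return 4.0 / (9.0 * k**2)


if __name__ == "__main__":
    for k in (2, 3):
        print(k, round(chebyshev_bound(k), 4), round(vp_bound(k), 4))
```

Wherever both bounds apply, the ratio of the two is 4/9, which is where "more than half" comes from.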

While tighter than Chebyshev’s inequality, the stronger inequality is still very general. We can usually do much better if we can say more about the distribution family. For example, suppose *X* has a uniform distribution. What is the probability that *X* is more than two standard deviations from its mean? Zero, because a uniform distribution extends only √3 ≈ 1.73 standard deviations on either side of its mean, so two standard deviations puts you outside the interval the uniform is defined on!

Among familiar distributions, when is the Vysochanskiĭ-Petunin inequality most accurate? That depends, of course, on what distributions you consider familiar, and what value of *k* you use. Let’s look at normal, exponential, and Pareto. These were chosen because they have thin, medium, and thick tails. We’ll also throw in the double exponential, because it has the same tail thickness as exponential but is symmetric. We’ll let *k* be 2 and 3.

Distribution family | P(\|X – E(X)\| > 2σ) | V-P bound | P(\|X – E(X)\| > 3σ) | V-P bound |
---|---|---|---|---|
Uniform | 0.0000 | 0.1111 | 0.0000 | 0.0494 |
Normal | 0.0455 | 0.1111 | 0.0027 | 0.0494 |
Exponential | 0.0498 | 0.1111 | 0.0183 | 0.0494 |
Pareto (α = 3) | 0.0296 | 0.1111 | 0.0145 | 0.0494 |
Double exponential | 0.0591 | 0.1111 | 0.0144 | 0.0494 |

A normal random variable is more than 2 standard deviations away from its mean with probability 0.0455, compared to the Vysochanskiĭ-Petunin bound of 1/9 = 0.1111. A normal random variable is more than 3 standard deviations away from its mean with probability 0.0027, compared to the bound of 4/81 = 0.0494.

An exponential random variable with mean μ also has standard deviation μ, so the only way it could be more than 2μ from its mean is to be more than 3μ from 0. So an exponential is more than 2 standard deviations from its mean with probability exp(-3) = 0.0498, and more than 3 standard deviations with probability exp(-4) = 0.0183.

We’ll set the minimum value of our Pareto random variable to be 1. As with the exponential, the Pareto cannot be 2 standard deviations less than its mean, so we look at the probability of it being more than 2 standard deviations greater than its mean. The shape parameter α must be bigger than 2 for the variance to exist. The mean is α/(α-1) and the standard deviation is √(α/(α-2))/(α-1), so the probability of our random variable being more than *k* standard deviations away from its mean works out to ((α-1)/(α + *k*√(α/(α-2))))^{α}. This depends on α: it goes to zero as α decreases toward 2, because the standard deviation blows up, and it approaches the exponential value exp(-(*k*+1)) as α grows. With α = 3 the mean is 3/2 and the standard deviation is √3/2, and the values for *k* equal to 2 and 3 are 8/(3 + 2√3)^{3} = 0.0296 and 8/(3 + 3√3)^{3} = 0.0145 respectively.

The double exponential, also known as Laplace, with scale parameter *b* has standard deviation √2 *b*, so it is more than *k* standard deviations from its mean with probability exp(-√2 *k*). It has the highest probability of any of our examples of being two standard deviations from its mean, but this probability is still only a little over half of the Vysochanskiĭ-Petunin bound. The exponential has the highest probability of being three standard deviations from its mean, but still less than 40% of the Vysochanskiĭ-Petunin bound. None of our examples comes that close to the bounds.
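The uniform, normal, exponential, and double exponential rows of the table can be reproduced from closed forms. The following is a rough sketch using only the standard library. The function names are hypothetical, and each formula relies on the probability depending only on *k*, not on the location or scale of the distribution.

```python
import math


def uniform_tail(k: float) -> float:
    """P(|X - mean| > k*sigma) for a uniform distribution.

    The support reaches sqrt(3) standard deviations on each side of the mean.
    """
    return max(0.0, 1.0 - k / math.sqrt(3.0))


def normal_tail(k: float) -> float:
    """Two-sided normal tail, 2*(1 - Phi(k)), written with erfc."""
    return math.erfc(k / math.sqrt(2.0))


def exponential_tail(k: float) -> float:
    """Exponential with mean mu has sigma = mu, so thresholds are mu*(1 +/- k)."""
    upper = math.exp(-(1.0 + k))
    # The lower tail is empty once mu*(1 - k) drops to 0 or below.
    lower = 1.0 - math.exp(-(1.0 - k)) if k < 1.0 else 0.0
    return upper + lower


def laplace_tail(k: float) -> float:
    """Laplace with scale b has sigma = sqrt(2)*b, so P(|X| > k*sigma) = exp(-sqrt(2)*k)."""
    return math.exp(-math.sqrt(2.0) * k)


if __name__ == "__main__":
    families = {
        "Uniform": uniform_tail,
        "Normal": normal_tail,
        "Exponential": exponential_tail,
        "Double exponential": laplace_tail,
    }
    for name, tail in families.items():
        print(name, [round(tail(k), 4) for k in (2, 3)])
```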

Generic bounds are useful, especially in theoretical calculations, but it’s usually possible to do much better with specific distributions.
