Yule-Simon distribution

The Yule-Simon distribution, named after Udny Yule and Herbert Simon, is a discrete probability distribution with pmf

$f(k; \rho) = \rho B(k, \rho + 1)$

The semicolon in f(k; ρ) suggests that we think of f as a function of k, with a fixed parameter ρ. Writing the distribution this way shows the connection to the beta function, but for our purposes it will be helpful to expand this function using

$B(x, y) = \frac{\Gamma(x) \, \Gamma(y)}{\Gamma(x+y)}$

and so

$\begin{align*} f(k; \rho) &= \rho B(k, \rho + 1) \\ &= \rho \Gamma(\rho + 1) \,\,\frac{\Gamma(k)}{\Gamma(k + \rho + 1)} \\ &= \rho \Gamma(\rho + 1) \,\, \frac{1}{(k + \rho)^{\underline{\rho + 1}}} \end{align*}$

Ignore the first part of the last line, ρ Γ(ρ + 1), because it doesn’t involve k. It helps to ignore proportionality constants in probability densities when they’re not necessary. What’s left is the (ρ + 1) falling power of k + ρ.
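As a quick sanity check on the expansion above, here is a minimal sketch that evaluates the pmf two ways: directly from the beta function, via log-gamma, and from the falling-power form. The function names are made up for illustration, and the falling-power version assumes ρ is a positive integer so that the falling power is an ordinary finite product.

```python
import math

def yule_simon_pmf(k, rho):
    """pmf rho * B(k, rho + 1), computed with log-gamma to avoid overflow."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)

def falling_power(x, n):
    """x (x - 1) ... (x - n + 1) for a nonnegative integer n."""
    result = 1.0
    for i in range(n):
        result *= x - i
    return result

def yule_simon_pmf_falling(k, rho):
    """Same pmf via the falling-power form; assumes rho is a positive integer."""
    return rho * math.gamma(rho + 1) / falling_power(k + rho, rho + 1)

rho = 3
for k in (1, 2, 10, 100):
    # The two forms should agree up to floating point error.
    assert math.isclose(yule_simon_pmf(k, rho),
                        yule_simon_pmf_falling(k, rho), rel_tol=1e-9)

# The probabilities should sum to (very nearly) 1.
total = sum(yule_simon_pmf(k, rho) for k in range(1, 10001))
print(total)
```

The log-gamma version also works for non-integer ρ, which is why it is the one used for the sum.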

For large values of k, the falling power term is asymptotically equal to k^(ρ+1). To see this, let k = 1000 and ρ = 3. Then we’re saying that the ratio of

1003 × 1002 × 1001 × 1000

to

1000 × 1000 × 1000 × 1000

is approximately 1, and the ratio converges to 1 as k increases.
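The convergence is easy to watch numerically. The following sketch, with a hypothetical helper name and again assuming an integer ρ, computes the ratio of the falling power to k^(ρ+1) for increasing k.

```python
def falling_power_ratio(k, rho):
    """Ratio of (k + rho)(k + rho - 1)...(k) to k**(rho + 1), integer rho."""
    ratio = 1.0
    for i in range(rho + 1):
        # Multiply one factor at a time to keep intermediate values near 1.
        ratio *= (k + rho - i) / k
    return ratio

for k in (10, 100, 1000, 10**6):
    print(k, falling_power_ratio(k, 3))
```

For k = 1000 and ρ = 3 the ratio is 1.003 × 1.002 × 1.001, about 1.006, and it moves closer to 1 as k grows.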

This says that the Yule-Simon distribution is a power law in the tails, just like the Zipf distribution and the zeta distribution. Details of the comparison between these three distributions are given here.
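One way to see the power law in the tail is to estimate the slope of log f against log k, which should approach −(ρ + 1). The sketch below is only an illustration with made-up function names; it measures the slope between k and 2k using log-gamma, so it does not require ρ to be an integer.

```python
import math

def log_pmf(k, rho):
    """Log of the Yule-Simon pmf rho * B(k, rho + 1)."""
    return (math.log(rho) + math.lgamma(k) + math.lgamma(rho + 1)
            - math.lgamma(k + rho + 1))

def local_slope(k, rho):
    """Slope of log f versus log k, measured between k and 2k."""
    return (log_pmf(2 * k, rho) - log_pmf(k, rho)) / math.log(2)

rho = 3
for k in (10, 100, 1000, 10000):
    # The slope should approach -(rho + 1) = -4 as k grows.
    print(k, local_slope(k, rho))
```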
