# Central Limit Theorems

These notes summarize several extensions of the Central Limit Theorem (CLT) and related results.

**Outline**:

- Classical CLT
  - Rate of convergence
  - Directions for generalization
- Non-identically distributed random variables
  - Liapounov's theorem
  - Lindeberg-Feller theorem
- Generalized CLT for infinite variance

## Classical Central Limit Theorem

Let *X*_{n} be a sequence of independent, identically distributed (i.i.d.) random variables.
Assume each *X* has finite mean, E(*X*) = μ, and finite variance, Var(*X*) = σ^{2}.
Let *Z*_{n} be the normalized average of the first n random variables, i.e.

*Z*_{n} = (*X*_{1} + *X*_{2} + ... + *X*_{n} - *n*μ) / (σ √*n*).

The **classical Central Limit Theorem** says that *Z*_{n} converges in distribution to
a standard normal distribution. This means that the CDF of *Z*_{n}
converges pointwise to
Φ, the CDF of a standard normal (Gaussian) random variable.
(See notes on
modes of convergence.)
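The convergence can be observed numerically. The sketch below is an illustration, not part of the theorem: it samples *Z*_{n} for exponential(1) variables (an arbitrary choice, with μ = σ = 1) and compares an empirical probability to Φ.

```python
import math
import random

random.seed(42)

def phi(x):
    """CDF of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def sample_z(n, trials=20000):
    """Draw samples of Z_n = (X_1 + ... + X_n - n*mu) / (sigma * sqrt(n))
    for i.i.d. exponential(1) variables, where mu = sigma = 1."""
    return [(sum(random.expovariate(1.0) for _ in range(n)) - n)
            / math.sqrt(n) for _ in range(trials)]

samples = sample_z(50)
# By the CLT, the empirical P(Z_n <= 0) should be close to Phi(0) = 0.5.
p_hat = sum(z <= 0.0 for z in samples) / len(samples)
print(p_hat, phi(0.0))
```

Repeating this with larger *n* moves the empirical probability closer to Φ(0), which is the pointwise CDF convergence the theorem asserts.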

A special case of the CLT in which the *X*_{n} are assumed to be
binomial goes back to Abraham de Moivre in 1733.

### Rate of convergence

It is natural to ask about the rate of convergence in the CLT. If *F*_{n} is the CDF of
*Z*_{n},
once we know that *F*_{n}(x) converges to Φ(*x*) as
*n* → ∞, we might want to know how quickly this convergence
takes place. Said another way, for a given *n*, we might want to know how well Φ approximates
*F*_{n}.
This question is settled by the **Berry-Esséen theorem**. See
Quantifying the error in the central limit theorem. For examples of
normal approximations for specific distributions, see the following links:
binomial,
beta,
gamma,
Poisson,
Student-t.
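In the binomial case the CDF of *Z*_{n} is available in closed form, so the approximation error can be computed exactly rather than simulated. The sketch below (a hypothetical example using fair-coin Bernoulli variables) measures the Kolmogorov distance sup_x |*F*_{n}(x) - Φ(x)|, the quantity the Berry-Esséen theorem bounds by a constant times 1/√*n*.

```python
import math

def phi(x):
    """CDF of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def kolmogorov_distance(n):
    """Exact sup-distance between the CDF of the normalized sum of n fair
    Bernoulli variables (a Binomial(n, 1/2)) and the standard normal CDF.
    Since F_n is a step function, the supremum occurs at its jumps."""
    mu, sigma = n * 0.5, math.sqrt(n * 0.25)
    cdf, d = 0.0, 0.0
    for k in range(n + 1):
        z = (k - mu) / sigma
        d = max(d, abs(cdf - phi(z)))        # just below the atom at k
        cdf += math.comb(n, k) * 0.5 ** n
        d = max(d, abs(cdf - phi(z)))        # at the atom
    return d

distances = {n: kolmogorov_distance(n) for n in (10, 40, 160)}
print(distances)
```

Quadrupling *n* roughly halves the distance, consistent with the 1/√*n* rate.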

### Directions for generalization

The classical CLT has three requirements:

- independence,
- identical distribution, and
- finite variance.

Each of these conditions can be weakened to create variations on the central limit theorem. We will keep the assumption of independence in these notes. For CLT results for dependent random variables, see Chow and Teicher. Below we consider non-identically distributed random variables and random variables with infinite variance.

## Non-identically distributed random variables

In this section we allow the possibility that the *X*_{n} variables
are not identically distributed. The main results in this area are the
Lindeberg-Feller theorem and its corollary
Liapounov's theorem.

First we introduce notation and assumptions common to both theorems. Let
*X*_{n} be a sequence of independent random variables, at least one of
which has a non-degenerate distribution. Assume each *X*_{n} has mean
0 and variance σ_{n}^{2}. Define the partial sum

*S*_{n} = *X*_{1} + *X*_{2} + ... +
*X*_{n}

and its variance

*s*_{n}^{2} = σ_{1}^{2} + σ_{2}^{2} + ... + σ_{n}^{2}.

Both theorems concern under what circumstances the normalized partial
sums *S*_{n} / *s*_{n} converge in distribution to a standard normal random variable. We start
with Liapounov's theorem because it is simpler.

### Liapounov's theorem

**Liapounov's theorem** weakens the requirement of identical distribution
but strengthens the requirement of finite variance. Where the classical CLT
requires finite moments of order 2, Liapounov's CLT requires finite moments
of order 2 + δ for some δ > 0.

Assume E(|*X*_{n}|^{2+δ}) is finite for some δ > 0 and
for all *n*. If

*s*_{n}^{-(2 + δ)} ∑_{1 ≤ k ≤ n} E(|*X*_{k}|^{2 + δ}) → 0

as *n* → ∞, then *S*_{n} / *s*_{n} converges in distribution to a standard normal random variable.
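To see the condition in action, here is a sketch using a made-up example: *X*_{k} uniform on [-*k*, *k*] with δ = 1. For these variables σ_{k}^{2} = k²/3 and E|X_k|³ = k³/4, so the Liapounov ratio can be computed in closed form.

```python
# Liapounov's condition with delta = 1 for the toy sequence X_k ~ Uniform[-k, k]:
# sigma_k^2 = k^2/3 and E|X_k|^3 = k^3/4, both available in closed form.
def lyapunov_ratio(n):
    """Return (s_n^2)^{-3/2} * sum_{k<=n} E|X_k|^3; the condition requires
    this ratio to tend to 0 as n grows."""
    s2 = sum(k * k / 3.0 for k in range(1, n + 1))          # s_n^2
    third_moments = sum(k ** 3 / 4.0 for k in range(1, n + 1))
    return third_moments / s2 ** 1.5

for n in (10, 100, 1000):
    print(n, lyapunov_ratio(n))
```

The ratio decays like 1/√*n* here, so Liapounov's theorem applies and the normalized sums are asymptotically normal even though the *X*_{k} are not identically distributed.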

### Lindeberg-Feller theorem

The Lindeberg-Feller theorem is more general than Liapounov's theorem. It
gives necessary and sufficient conditions for *S*_{n} / *s*_{n}
to converge to a standard normal.

**Lindeberg**: Under the assumptions above (each *X*_{n} has zero mean and finite variance,
and at least one *X*_{n} has a non-degenerate distribution), if the Lindeberg
condition holds, then *S*_{n} / *s*_{n} converges in distribution to a standard normal random variable.

**Feller**: Conversely, if *S*_{n} / *s*_{n} converges in distribution to
a standard normal and σ_{n}/s_{n} → 0 and s_{n} → ∞ then the
Lindeberg condition holds.

So what is this **Lindeberg condition**? Let *F*_{n} be the CDF of
*X*_{n}, i.e.
*F*_{n}(*x*) = P(*X*_{n} < *x*). The Lindeberg condition requires

*s*_{n}^{-2} ∑_{1 ≤ k ≤ n} ∫_{|*x*| ≥ ε *s*_{n}} *x*^{2} d*F*_{k}(*x*) → 0

as *n* → ∞, for all ε > 0. Informally, no tail may contribute a non-negligible share of the total variance *s*_{n}^{2}.
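The Lindeberg sum can be evaluated in closed form for a toy sequence, again taking *X*_{k} uniform on [-*k*, *k*] as an illustrative assumption. Because these variables have bounded support while *s*_{n} grows like n^{3/2}, the truncation threshold ε *s*_{n} eventually exceeds every support bound and the sum becomes exactly zero.

```python
def lindeberg_sum(n, eps):
    """s_n^{-2} * sum_k E[X_k^2; |X_k| > eps*s_n] for the toy sequence
    X_k ~ Uniform[-k, k].  For a threshold 0 <= t <= k the truncated
    second moment is (k^3 - t^3) / (3k), and it is zero once t >= k."""
    s2 = sum(k * k / 3.0 for k in range(1, n + 1))   # s_n^2
    t = eps * s2 ** 0.5                              # truncation level eps*s_n
    total = sum((k ** 3 - t ** 3) / (3.0 * k)
                for k in range(1, n + 1) if t < k)
    return total / s2

for n in (5, 500, 5000):
    print(n, lindeberg_sum(n, 0.1))
```

For small *n* the sum is near 1 (almost all variance sits beyond the tiny threshold); by *n* = 5000 it is exactly 0, so the Lindeberg condition holds for this sequence.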

## Generalized CLT for random variables with infinite variance

For this section, we require the random variables *X*_{n} to be
independent and identically distributed. However, we do not require that they
have finite variance.

First we look at some restrictions for what a generalized CLT would look
like for random variables *X*_{n} without finite variance. We would
need sequences of constants *a*_{n} and *b*_{n} such that
(*X*_{1} + *X*_{2} + ... + *X*_{n} -
*b*_{n})/*a*_{n}
converges in distribution to something. It turns out that the something
that the sequence converges to must have a **stable distribution**.

Let *X*_{0}, *X*_{1}, and *X*_{2} be independent, identically distributed (iid) random variables. The distribution of these random variables is called **stable** if for every pair of positive real numbers *a* and *b*, there exists a positive *c* and a real *d* such that *cX*_{0} + *d* has the same distribution as *aX*_{1} + *bX*_{2}.

Stable distributions can be specified by four parameters. One of the four
parameters is the **exponent parameter** 0 < α ≤ 2. This
parameter controls the thickness of the distribution tails. The
distributions with α = 2 are the normal (Gaussian) distributions. For α < 2,
the PDF is asymptotically proportional to |*x*|^{-α-1} and the CDF is asymptotically
proportional to |*x*|^{-α} as *x* → ±∞. And so except
for the normal distribution, all stable distributions have thick tails; the
variance does not exist.

The characteristic functions for stable distributions can be written in closed form in terms of the four parameters mentioned above. In general, however, the density functions for stable distributions cannot be written down in closed form. There are three exceptions: the normal distributions, the Cauchy distributions, and the Lévy distributions.
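The Cauchy case makes a convenient numerical illustration of why the classical CLT fails here: the average of *n* iid standard Cauchy variables is again standard Cauchy (take *a*_{n} = *n*, *b*_{n} = 0), so averaging does not concentrate at all. The Monte Carlo sketch below summarizes the sample means by their quartiles, since moments do not exist.

```python
import math
import random

random.seed(0)

def cauchy():
    """Standard Cauchy sample via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (random.random() - 0.5))

# The average of n iid standard Cauchy variables is again standard Cauchy,
# so the quartiles of the sample means stay near -1 and +1 instead of
# shrinking like 1/sqrt(n) as they would under the classical CLT.
n, trials = 100, 20000
means = sorted(sum(cauchy() for _ in range(n)) / n for _ in range(trials))
q1, q3 = means[trials // 4], means[3 * trials // 4]
print(q1, q3)
```

The quartiles of a standard normal scaled by 1/√100 would sit near ±0.07; the Cauchy averages show no such shrinkage.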

Let *F*(*x*) be the CDF for the random variables *X*_{i}. The following conditions on *F* are necessary and sufficient for the aggregation of the *X*’s to converge to a stable distribution with exponent α < 2:

- *F*(*x*) = (*c*_{1} + o(1)) |*x*|^{-α} *h*(|*x*|) as *x* → -∞, and
- 1 - *F*(*x*) = (*c*_{2} + o(1)) *x*^{-α} *h*(*x*) as *x* → ∞,

where *h*(*x*) is a slowly varying function. Here o(1)
denotes a function tending to 0. (See notes on
asymptotic
notation.) A **slowly varying function** *h*(*x*) is one such
that the ratio *h*(*cx*) / *h*(*x*) → 1 as *x* → ∞
for all *c* > 0. Roughly speaking, this means *F*(*x*)
has to look something like |*x*|^{-α} in both the left and
right tails, and so the *X*’s must be distributed something like the
limiting distribution. For more information, see Petrov's book below.
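For instance, *h*(*x*) = log *x* is slowly varying even though it is unbounded. A quick numerical check of the defining ratio:

```python
import math

# h(x) = log(x) is slowly varying: h(c*x) / h(x) -> 1 as x -> infinity
# for every fixed c > 0, even though h itself grows without bound.
def ratio(c, x):
    return math.log(c * x) / math.log(x)

for x in (1e3, 1e6, 1e12):
    print(x, ratio(100.0, x))
```

The ratio approaches 1 slowly (it equals (log x + log c)/log x), which is exactly the point: slowly varying factors are negligible compared to the power |*x*|^{-α} in the tail conditions above.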

## References

Limit Theorems of Probability Theory: Sequences of Independent Random Variables by Valentin Petrov.

Probability Theory: Independence, Interchangeability, Martingales by Yuan Shih Chow and Henry Teicher.

Power laws and the generalized CLT blog post

An introduction to stable distributions by John P. Nolan

The Life and Times of the Central Limit Theorem by William Adams