This post shows a connection between three families of orthogonal polynomials—Legendre, Chebyshev, and Jacobi—and the beta distribution.

## Legendre, Chebyshev, and Jacobi polynomials

A family of polynomials *P*_{k} is orthogonal over the interval [-1, 1] with respect to a weight *w*(*x*) if

$$\int_{-1}^{1} P_m(x)\, P_n(x)\, w(x)\, dx = 0$$

whenever *m* ≠ *n*.

If *w*(*x*) = 1, we get the Legendre polynomials.

If *w*(*x*) = (1 – *x*²)^{-1/2} we get the Chebyshev polynomials.

These are both special cases of the Jacobi polynomials, which have weight *w*(*x*) = (1 – *x*)^{α} (1 + *x*)^{β} with α, β > -1. Legendre polynomials correspond to α = β = 0, and Chebyshev polynomials correspond to α = β = -1/2.
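The two special cases can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`legendre`, `chebyshev`, `legendre_inner`, `chebyshev_inner`) are made up for illustration, and the choice of Simpson's rule for the Legendre weight and Gauss-Chebyshev quadrature for the Chebyshev weight is an assumption of this sketch, not something the post prescribes.

```python
import math

def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) by Bonnet's recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def chebyshev(n, x):
    """Evaluate the Chebyshev polynomial T_n(x) by its recurrence."""
    t_prev, t = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(1, n):
        # T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)
        t_prev, t = t, 2 * x * t - t_prev
    return t

def simpson(f, a, b, intervals=2000):
    """Composite Simpson's rule on [a, b]; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def legendre_inner(m, n):
    """Integral of P_m(x) P_n(x) over [-1, 1] with weight w(x) = 1."""
    return simpson(lambda x: legendre(m, x) * legendre(n, x), -1.0, 1.0)

def chebyshev_inner(m, n, nodes=64):
    """Integral of T_m(x) T_n(x) (1 - x^2)^(-1/2) over [-1, 1].

    Gauss-Chebyshev quadrature: the singular weight is absorbed into
    equal weights pi / nodes at the points cos((2i - 1) pi / (2 nodes)).
    """
    total = 0.0
    for i in range(1, nodes + 1):
        x = math.cos((2 * i - 1) * math.pi / (2 * nodes))
        total += chebyshev(m, x) * chebyshev(n, x)
    return total * math.pi / nodes

if __name__ == "__main__":
    print(legendre_inner(2, 3), legendre_inner(3, 3))
    print(chebyshev_inner(2, 5), chebyshev_inner(3, 3))
```

For *m* ≠ *n* both inner products should come out at roundoff level. For *m* = *n* = 3 the Legendre value should be close to 2/7 and the Chebyshev value close to π/2, which are the standard normalization constants for these families.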

## Connection to beta distribution

The weight function for Jacobi polynomials is a rescaling of the density function of a beta distribution. Under the change of variables *x* = 1 – 2*u* we have 1 – *x* = 2*u*, 1 + *x* = 2(1 – *u*), and *dx* = –2 *du*, so

$$\int_{-1}^{1} f(x)\,(1-x)^{\alpha}(1+x)^{\beta}\,dx = 2^{\alpha+\beta+1}\int_{0}^{1} f(1-2u)\,u^{\alpha}(1-u)^{\beta}\,du.$$

The right side is proportional to the expected value of *f*(1 – 2*X*) where *X* is a random variable with a beta(α + 1, β + 1) distribution. So for fixed α and β, if *m* ≠ *n* and *X* has a beta(α + 1, β + 1) distribution, then the expected value of *P*_{m}(1 – 2*X*) *P*_{n}(1 – 2*X*) is zero.
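The zero-expectation claim can be checked directly. This is a rough sketch, again using only the standard library: `jacobi` evaluates the polynomial from its standard explicit sum, and `beta_expectation` integrates against the beta density with Simpson's rule. All names are hypothetical, and the quadrature assumes α ≥ 0 and β ≥ 0 so that the density is bounded on [0, 1]; it is not suitable as written for the Chebyshev case α = β = -1/2.

```python
import math

def gen_binom(z, k):
    """Generalized binomial coefficient C(z, k) for real z and integer k >= 0."""
    result = 1.0
    for i in range(1, k + 1):
        result *= (z - k + i) / i
    return result

def jacobi(n, alpha, beta, x):
    """Evaluate the Jacobi polynomial P_n^(alpha, beta)(x) from its explicit sum."""
    return sum(
        gen_binom(n + alpha, n - s) * gen_binom(n + beta, s)
        * ((x - 1) / 2) ** s * ((x + 1) / 2) ** (n - s)
        for s in range(n + 1)
    )

def beta_expectation(g, a, b, intervals=4000):
    """E[g(X)] for X ~ beta(a, b) by composite Simpson's rule.

    Assumes a >= 1 and b >= 1 so the density is bounded on [0, 1].
    """
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))

    def integrand(u):
        return g(u) * norm * u ** (a - 1) * (1 - u) ** (b - 1)

    h = 1.0 / intervals
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

def jacobi_beta_moment(m, n, alpha, beta):
    """E[P_m(1 - 2X) P_n(1 - 2X)] for X ~ beta(alpha + 1, beta + 1)."""
    def product(u):
        x = 1 - 2 * u
        return jacobi(m, alpha, beta, x) * jacobi(n, alpha, beta, x)
    return beta_expectation(product, alpha + 1, beta + 1)

if __name__ == "__main__":
    # alpha = 2, beta = 1 corresponds to X ~ beta(3, 2)
    print(jacobi_beta_moment(2, 3, 2, 1))
    print(jacobi_beta_moment(1, 1, 2, 1))
```

With α = 2 and β = 1 the first printed value should be zero up to quadrature error. The second can be worked out by hand: *P*_{1}(1 – 2*u*) = 3 – 5*u*, and for *X* with a beta(3, 2) distribution the expected value of (3 – 5*X*)² is 1.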

While we’re at it, we’ll briefly mention two other connections between orthogonal polynomials and probability: Laguerre polynomials and Hermite polynomials.

## Laguerre polynomials

The Laguerre polynomials are orthogonal over the interval [0, ∞) with weight *w*(*x*) = *x*^{α} exp(-*x*), which is proportional to the density of a gamma random variable with shape α+1 and scale 1.
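The same kind of check works for the gamma distribution. The sketch below evaluates the (generalized) Laguerre polynomials by their three-term recurrence and integrates against the gamma density with Simpson's rule on a truncated interval. The function names, the truncation point of 80, and the step count are all illustrative assumptions, and the quadrature assumes α ≥ 0.

```python
import math

def gen_laguerre(n, alpha, x):
    """Evaluate the generalized Laguerre polynomial L_n^(alpha)(x) by recurrence."""
    l_prev, l = 1.0, 1.0 + alpha - x
    if n == 0:
        return l_prev
    for k in range(1, n):
        # (k + 1) L_{k+1} = (2k + 1 + alpha - x) L_k - (k + alpha) L_{k-1}
        l_prev, l = l, ((2 * k + 1 + alpha - x) * l - (k + alpha) * l_prev) / (k + 1)
    return l

def gamma_expectation(g, shape, upper=80.0, intervals=16000):
    """E[g(X)] for X ~ gamma(shape, scale 1) by Simpson's rule on [0, upper].

    Assumes shape >= 1 (bounded density) and that the integrand is
    negligible beyond `upper`, which holds for low-degree polynomials g.
    """
    norm = 1.0 / math.gamma(shape)

    def integrand(x):
        return g(x) * norm * x ** (shape - 1) * math.exp(-x)

    h = upper / intervals
    total = integrand(0.0) + integrand(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

def laguerre_gamma_moment(m, n, alpha):
    """E[L_m(X) L_n(X)] for X ~ gamma(shape alpha + 1, scale 1)."""
    return gamma_expectation(
        lambda x: gen_laguerre(m, alpha, x) * gen_laguerre(n, alpha, x),
        alpha + 1,
    )

if __name__ == "__main__":
    print(laguerre_gamma_moment(1, 2, 2))
    print(laguerre_gamma_moment(2, 2, 2))
```

With α = 2 the first value should be near zero. The second should be near 6, since the expected square of the degree-*n* polynomial is Γ(*n* + α + 1) / (*n*! Γ(α + 1)), which is 24 / (2 · 2) here.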

## Hermite polynomials

There are two minor variations on the Hermite polynomials, depending on whether you take the weight to be exp(-*x*²) or exp(-*x*²/2). These are sometimes known as the physicist’s Hermite polynomials and the probabilist’s Hermite polynomials, respectively. Naturally we’re interested in the latter. The probabilist’s Hermite polynomials are orthogonal over (-∞, ∞) with the standard normal (Gaussian) density as the weight.
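A small sketch of the probabilist's case, assuming the usual recurrence He_{k+1}(*x*) = *x* He_{k}(*x*) – *k* He_{k-1}(*x*). The names `hermite_prob`, `normal_expectation`, and `hermite_normal_moment` are invented for this example, and the trapezoid rule on [-12, 12] is just one convenient way to approximate the Gaussian integral.

```python
import math

def hermite_prob(n, x):
    """Evaluate the probabilist's Hermite polynomial He_n(x) by recurrence."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        # He_{k+1}(x) = x He_k(x) - k He_{k-1}(x)
        h_prev, h = h, x * h - k * h_prev
    return h

def normal_expectation(g, half_width=12.0, step=0.01):
    """E[g(Z)] for standard normal Z by the trapezoid rule.

    The integral is truncated to [-half_width, half_width]; the Gaussian
    tail beyond that range is negligible for low-degree polynomials g.
    """
    n_steps = round(2 * half_width / step)
    total = 0.0
    for i in range(n_steps + 1):
        z = -half_width + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0  # half weight at the endpoints
        total += weight * g(z) * math.exp(-z * z / 2)
    return total * step / math.sqrt(2 * math.pi)

def hermite_normal_moment(m, n):
    """E[He_m(Z) He_n(Z)] for a standard normal Z."""
    return normal_expectation(lambda z: hermite_prob(m, z) * hermite_prob(n, z))

if __name__ == "__main__":
    print(hermite_normal_moment(2, 4))
    print(hermite_normal_moment(3, 3))
```

The first value should be zero to within roundoff, and the second should be close to 3! = 6, since the expected value of He_{n}(*Z*)² is *n*!.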

Are we going to talk about the connection to Gaussian quadrature next?

Probably not, but I did post notes on that here.

I have often used orthogonal polynomials to determine integrals in a Bayesian context. I used Harper polynomials, orthogonal to t-distributions, in my thesis.

But the first time I used orthogonal polynomials, it was Jacobi polynomials, to approximate the sampling distribution of the Greenhouse-Geisser correction factor in repeated measures designs.