This post shows a connection between three families of orthogonal polynomials—Legendre, Chebyshev, and Jacobi—and the beta distribution.
Legendre, Chebyshev, and Jacobi polynomials
A family of polynomials P_k is orthogonal over the interval [−1, 1] with respect to a weight w(x) if

\[ \int_{-1}^1 P_m(x)\, P_n(x)\, w(x)\, dx = 0 \]

whenever m ≠ n.
If w(x) = 1, we get the Legendre polynomials.
If w(x) = (1 − x²)^(−1/2), we get the Chebyshev polynomials.
These are both special cases of the Jacobi polynomials, which have weight w(x) = (1 − x)^α (1 + x)^β. Legendre polynomials correspond to α = β = 0, and Chebyshev polynomials correspond to α = β = −1/2.
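As a quick sanity check, here is a small numerical verification of both orthogonality relations. It's only a sketch, assuming SciPy's quad, eval_legendre, and eval_chebyt; the degrees m and n are arbitrary choices.

```python
# Numerically verify orthogonality of Legendre and Chebyshev polynomials for m != n.
from scipy.integrate import quad
from scipy.special import eval_legendre, eval_chebyt

m, n = 2, 5  # any pair with m != n

# Legendre: weight w(x) = 1
legendre_ip, _ = quad(lambda x: eval_legendre(m, x) * eval_legendre(n, x), -1, 1)

# Chebyshev: weight w(x) = (1 - x^2)^(-1/2), handled by quad's 'alg' weight option
chebyshev_ip, _ = quad(lambda x: eval_chebyt(m, x) * eval_chebyt(n, x),
                       -1, 1, weight='alg', wvar=(-0.5, -0.5))

print(legendre_ip, chebyshev_ip)  # both effectively zero
```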
Connection to beta distribution
The weight function for Jacobi polynomials is a rescaling of the density function of a beta distribution. The change of variables x = 1 − 2u shows

\[ \int_{-1}^1 f(x)\,(1-x)^\alpha (1+x)^\beta \, dx = 2^{\alpha+\beta+1} \int_0^1 f(1-2u)\, u^\alpha (1-u)^\beta \, du. \]
The right side is proportional to the expected value of f(1 − 2X) where X is a random variable with a beta(α + 1, β + 1) distribution. So for fixed α and β, if m ≠ n and X has a beta(α + 1, β + 1) distribution, then the expected value of P_m(1 − 2X) P_n(1 − 2X) is zero.
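Here is a sketch of that claim in code, assuming SciPy's eval_jacobi and the beta distribution from scipy.stats. The Jacobi parameters and the degrees m, n below are arbitrary choices for illustration.

```python
# Check that E[P_m(1 - 2X) P_n(1 - 2X)] = 0 when X ~ beta(alpha + 1, beta + 1) and m != n.
from scipy.integrate import quad
from scipy.special import eval_jacobi
from scipy.stats import beta as beta_dist

a, b = 1.5, 0.7  # Jacobi parameters alpha and beta (arbitrary choices)
m, n = 3, 4      # any pair with m != n

def integrand(u):
    x = 1 - 2*u
    return (eval_jacobi(m, a, b, x) * eval_jacobi(n, a, b, x)
            * beta_dist.pdf(u, a + 1, b + 1))

expectation, _ = quad(integrand, 0, 1)
print(expectation)  # effectively zero
```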
While we’re at it, we’ll briefly mention two other connections between orthogonal polynomials and probability: Laguerre polynomials and Hermite polynomials.
Laguerre polynomials
The (generalized) Laguerre polynomials are orthogonal over the interval [0, ∞) with weight w(x) = x^α exp(−x), which is proportional to the density of a gamma random variable with shape α + 1 and scale 1.
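The same kind of check works here, again assuming SciPy; eval_genlaguerre evaluates the generalized Laguerre polynomial of order α, and the parameter values are arbitrary.

```python
# Check that E[L_m(X) L_n(X)] = 0 when X ~ gamma(alpha + 1, scale=1) and m != n.
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_genlaguerre
from scipy.stats import gamma

alpha = 2.0
m, n = 1, 4  # any pair with m != n

expectation, _ = quad(
    lambda x: (eval_genlaguerre(m, alpha, x) * eval_genlaguerre(n, alpha, x)
               * gamma.pdf(x, alpha + 1)),  # shape alpha + 1, scale 1 (the default)
    0, np.inf)
print(expectation)  # effectively zero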
Hermite polynomials
There are two minor variations on the Hermite polynomials, depending on whether you take the weight to be exp(−x²) or exp(−x²/2). These are sometimes known as the physicist’s Hermite polynomials and the probabilist’s Hermite polynomials. Naturally we’re interested in the latter. The probabilist’s Hermite polynomials are orthogonal over (−∞, ∞) with the standard normal (Gaussian) density as the weight.
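A Monte Carlo sketch of the same statement, assuming SciPy's eval_hermitenorm for the probabilist's Hermite polynomials:

```python
# E[He_m(Z) He_n(Z)] should be zero for Z standard normal and m != n.
import numpy as np
from scipy.special import eval_hermitenorm

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)

m, n = 2, 3  # any pair with m != n
print(np.mean(eval_hermitenorm(m, z) * eval_hermitenorm(n, z)))  # near zero, up to sampling error
```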
Are we going to talk about the connection to Gaussian quadrature next?
Probably not, but I did post notes on that here.
I have often used orthogonal polynomials to evaluate integrals in a Bayesian context. In my thesis I used Harper polynomials, which are orthogonal with respect to t-distributions.
But the first time I used them was with Jacobi polynomials, to approximate the sampling distribution of the Greenhouse-Geisser correction factor in repeated measures designs.
Chebyshev polys have beta equal to zero?