# Approximating gamma ratios

Ratios of gamma functions come up often in applications. If the two gamma function arguments differ by an integer, then it’s easy to calculate their ratio exactly by using (repeatedly if necessary) the fact that Γ(x + 1) = x Γ(x).
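As a small illustration of the integer case, here is a sketch that applies Γ(x + 1) = x Γ(x) repeatedly and compares the result with a direct evaluation. The helper name `gamma_ratio_int` and the sample values are made up for this example.

```python
from math import gamma

def gamma_ratio_int(x, n):
    """Return Γ(x + n) / Γ(x) for a nonnegative integer n by applying Γ(x + 1) = x Γ(x) n times."""
    ratio = 1.0
    for i in range(n):
        ratio *= x + i
    return ratio

x, n = 2.5, 4
direct = gamma(x + n) / gamma(x)    # evaluates both gamma functions
recursive = gamma_ratio_int(x, n)   # uses only multiplication
print(direct, recursive)
```

The product form never evaluates a gamma function, so it also works when Γ(x + n) itself would overflow.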

If the arguments differ by 1/2, there is no closed formula, but there are useful approximations. I’ve needed something like this a few times lately.

The simplest approximation is

$$\frac{\Gamma(x + 1/2)}{\Gamma(x)} \approx \sqrt{x}.$$

You could motivate or interpret this as saying Γ(x + 1/2) is approximately the geometric mean between Γ(x + 1) and Γ(x). As we’ll see in the plot below, this approximation is good to a couple significant figures for moderate values of x.
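Here is a rough numerical check of the geometric mean interpretation. It is only a sketch: it works with logarithms of gamma via `scipy.special.gammaln` to avoid overflow, and the function name and sample points are my own choices.

```python
import numpy as np
from scipy.special import gammaln

def rel_error_geometric_mean(x):
    """Relative error of sqrt(Γ(x) Γ(x + 1)) as an estimate of Γ(x + 1/2)."""
    log_approx = 0.5 * (gammaln(x) + gammaln(x + 1))
    log_exact = gammaln(x + 0.5)
    # expm1(d) = exp(d) - 1 = approx/exact - 1
    return np.expm1(log_approx - log_exact)

for x in [1.0, 10.0, 100.0]:
    print(x, rel_error_geometric_mean(x))
```

The relative error shrinks as x grows, and around x = 10 it is on the order of one percent, consistent with "a couple significant figures."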

There is another approximation that is a little more complicated but much more accurate.

The following plot shows the relative error in both approximations.

By the way, the first approximation above is a special case of the more general approximation

Source: J. S. Frame. An Approximation to the Quotient of Gamma Function. The American Mathematical Monthly, Vol. 56, No. 8 (Oct., 1949), pp. 529–535.

# Hypergeometric functions are key

From Orthogonal Polynomials and Special Functions by Richard Askey:

> At first the results we needed were in the literature but after a while we ran out of known results and had to learn something about special functions. This was a very unsettling experience for there were very few places to go to really learn about special functions. At least that is what we thought. Actually there were many, but the typical American graduate education which we had did not include anything about hypergeometric functions. And hypergeometric functions are the key to this subject, as I have found out after many years of fighting them.

Askey’s book was written in 1975, and he was describing his experience from ten years before that. Special functions, and in particular hypergeometric functions, went from being common knowledge among mathematicians at the beginning of the 20th century to being arcane by mid-century.

I learned little about special functions and nothing about hypergeometric functions as a graduate student. I first ran into hypergeometric functions reading in Concrete Mathematics how they are used in combinatorics and in calculating sums in closed form. Then when I started working in statistics I found that they are everywhere.

Hypergeometric functions are very useful, but not often taught anymore. Like a lot of useful mathematics, they fall between two stools. They’re considered too advanced or arcane for the undergraduate curriculum, and not a hot enough area of research to be part of the graduate curriculum.

# Orthogonal polynomials and the beta distribution

This post shows a connection between three families of orthogonal polynomials—Legendre, Chebyshev, and Jacobi—and the beta distribution.

## Legendre, Chebyshev, and Jacobi polynomials

A family of polynomials P_k is orthogonal over the interval [-1, 1] with respect to a weight w(x) if

$$\int_{-1}^{1} P_m(x)\, P_n(x)\, w(x)\, dx = 0$$

whenever m ≠ n.

If w(x) = 1, we get the Legendre polynomials.

If w(x) = (1 – x²)^(-1/2) we get the Chebyshev polynomials.

These are both special cases of the Jacobi polynomials, which have weight w(x) = (1 – x)^α (1 + x)^β. Legendre polynomials correspond to α = β = 0, and Chebyshev polynomials correspond to α = β = -1/2.

## Connection to beta distribution

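The weight function for Jacobi polynomials is a rescaling of the density function of a beta distribution. The change of variables x = 1 – 2u shows

$$\int_{-1}^{1} f(x)\,(1-x)^\alpha (1+x)^\beta\, dx = 2^{\alpha+\beta+1} \int_0^1 f(1-2u)\, u^\alpha (1-u)^\beta\, du.$$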

The right side is proportional to the expected value of f(1 – 2X) where X is a random variable with a beta(α + 1, β + 1) distribution. So for fixed α and β, if m ≠ n and X has a beta(α + 1, β + 1) distribution, then the expected value of P_m(1 – 2X) P_n(1 – 2X) is zero.
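The claim about expected values can be checked numerically. The following is a sketch rather than a proof: it uses `scipy.special.eval_jacobi` for the Jacobi polynomials and integrates against the beta density with `scipy.integrate.quad`. The parameter values and the helper name `expected_product` are arbitrary choices for illustration.

```python
from scipy.integrate import quad
from scipy.special import eval_jacobi
from scipy.stats import beta as beta_dist

a, b = 0.5, 1.5  # Jacobi parameters α and β, chosen arbitrarily

def expected_product(m, n):
    """E[P_m(1 - 2X) P_n(1 - 2X)] for X ~ beta(a + 1, b + 1), by numerical integration."""
    X = beta_dist(a + 1, b + 1)
    integrand = lambda u: (eval_jacobi(m, a, b, 1 - 2*u)
                           * eval_jacobi(n, a, b, 1 - 2*u) * X.pdf(u))
    value, _ = quad(integrand, 0, 1)
    return value

print(expected_product(2, 3))  # should be zero up to integration error
print(expected_product(3, 3))  # should be positive
```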

While we’re at it, we’ll briefly mention two other connections between orthogonal polynomials and probability: Laguerre polynomials and Hermite polynomials.

## Laguerre polynomials

The Laguerre polynomials are orthogonal over the interval [0, ∞) with weight w(x) = x^α exp(-x), which is proportional to the density of a gamma random variable with shape α + 1 and scale 1.
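The same kind of check works here. This sketch assumes the generalized Laguerre polynomials as implemented by `scipy.special.eval_genlaguerre`; the value of α and the helper name are illustrative only.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_genlaguerre
from scipy.stats import gamma as gamma_dist

alpha = 1.5  # Laguerre parameter α, chosen arbitrarily

def expected_product(m, n):
    """E[L_m(X) L_n(X)] for X ~ gamma(shape α + 1, scale 1), by numerical integration."""
    X = gamma_dist(alpha + 1)
    integrand = lambda x: (eval_genlaguerre(m, alpha, x)
                           * eval_genlaguerre(n, alpha, x) * X.pdf(x))
    value, _ = quad(integrand, 0, np.inf)
    return value

print(expected_product(2, 3))  # should be zero up to integration error
print(expected_product(3, 3))  # should be positive
```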

## Hermite polynomials

There are two minor variations on the Hermite polynomials, depending on whether you take the weight to be exp(-x²) or exp(-x²/2). These are sometimes known as the physicist’s Hermite polynomials and the probabilist’s Hermite polynomials. Naturally we’re interested in the latter. The probabilist’s Hermite polynomials are orthogonal over (-∞, ∞) with the standard normal (Gaussian) density as the weight.
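For the probabilist’s version, NumPy’s `numpy.polynomial.hermite_e` module is convenient. The sketch below uses Gauss quadrature for the weight exp(-x²/2) instead of adaptive integration, so polynomial integrands are handled exactly up to rounding. The helper names `He` and `expected_product` are my own.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

nodes, weights = hermegauss(20)  # Gauss quadrature rule for the weight exp(-x²/2)

def He(n, x):
    """Probabilist's Hermite polynomial He_n evaluated at x."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermeval(x, coeffs)

def expected_product(m, n):
    """E[He_m(Z) He_n(Z)] for a standard normal Z (valid while m + n < 40)."""
    # Dividing by sqrt(2π) turns the weight exp(-x²/2) into the normal density.
    return np.sum(weights * He(m, nodes) * He(n, nodes)) / np.sqrt(2 * np.pi)

print(expected_product(2, 4))  # orthogonal, so essentially zero
print(expected_product(3, 3))  # equals 3! = 6 for this normalization
```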

# Length of a rose

The polar graph of r = cos(kθ) is called a rose. If k is even, the curve will trace out 2k petals as θ runs between 0 and 2π. If k is odd, it will trace out k petals, tracing each one twice. For example, here’s a rose with k = 5.

(I rotated the graph 36° so it would be symmetric about the vertical axis rather than the horizontal axis.)

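The arc length of a curve in polar coordinates is given by

$$\int_a^b \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\, d\theta$$

and so we can use this to find the length. The integral doesn’t have a closed form in terms of elementary functions. Instead, the result turns out to use a special function E(x), the “complete elliptic integral of the second kind,” defined by

$$E(x) = \int_0^{\pi/2} \sqrt{1 - x \sin^2\theta}\, d\theta.$$

Here’s the calculation for the length of a rose:

$$\int_0^{2\pi} \sqrt{\cos^2 k\theta + k^2 \sin^2 k\theta}\, d\theta = \int_0^{2\pi} \sqrt{1 + (k^2 - 1)\sin^2 k\theta}\, d\theta = 4\int_0^{\pi/2} \sqrt{1 - (1 - k^2)\sin^2 u}\, du = 4E(1 - k^2).$$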

So the arc length of the rose r = cos(kθ) with θ running from 0 to 2π is 4 E(-k² + 1). You can calculate E in SciPy with scipy.special.ellipe.

If we compute the length of the rose at the top of the post, we get 4 E(-24) = 21.01. Does that pass the sniff test? Each petal goes from r = 0 out to r = 1 and back. If the petal were a straight line, this would have length 2. Since the petals are curved, the length of each is a little more than 2. There are five petals, so the result should be a little more than 10. But we got a little more than 20. How can that be? Since 5 is odd, the rose with k = 5 traces each petal twice, so we should expect a value of a little more than 20, which is what we got.
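Here is a sketch that computes the length both ways: from the elliptic integral with `scipy.special.ellipe`, and by integrating the polar arc length formula directly with `scipy.integrate.quad`. The function names are made up for this example.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ellipe

def rose_length_elliptic(k):
    """Length of r = cos(kθ), 0 ≤ θ ≤ 2π, via the elliptic integral formula."""
    return 4 * ellipe(1 - k**2)

def rose_length_numeric(k):
    """Same length by numerically integrating sqrt(r² + (dr/dθ)²)."""
    integrand = lambda t: np.sqrt(np.cos(k*t)**2 + (k*np.sin(k*t))**2)
    value, _ = quad(integrand, 0, 2*np.pi, limit=200)
    return value

print(rose_length_elliptic(5))
print(rose_length_numeric(5))
```

The two numbers should agree to many digits, and both round to the 21.01 quoted above.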

As k gets larger, the petals come closer to being straight lines. So we should expect that 4E(-k² + 1) approaches 4k as k gets large. The following plot of E(-k² + 1) – k provides empirical support for this conjecture by showing that the difference approaches 0, and gives an idea of the rate of convergence. It should be possible to prove, say, that E(-k²) asymptotically approaches k, but I haven’t done this.
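The same empirical check can be done without a plot. This sketch just tabulates the difference E(1 – k²) – k for a few values of k; the helper name `gap` and the list of k values are arbitrary.

```python
from scipy.special import ellipe

def gap(k):
    """Return E(1 - k²) - k, which should shrink toward 0 if 4E(1 - k²) approaches 4k."""
    return ellipe(1.0 - k**2) - k

ks = [2, 5, 10, 100, 1000]
gaps = [gap(k) for k in ks]
for k, g in zip(ks, gaps):
    print(k, g)
```

The differences decrease steadily, though slowly, which matches the behavior described above.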

# Denver airport, Weierstrass, and A&S

Last night I was driving toward the Denver airport and the airport reminded me of the cover of Abramowitz and Stegun’s Handbook of Mathematical Functions.

Here’s the airport:

And here’s the book cover:

I’ve written about the image on the book cover before. Someone asked me what function it graphed and I decided it was probably the Weierstrass ℘ function.

For more on Weierstrass’ elliptic function and why I think that’s what’s on the cover of A&S, see this post.

Photo of Denver airport via Wikipedia.

# Function on cover of Abramowitz & Stegun

Someone mailed me this afternoon asking if I knew what function was graphed on the cover of Abramowitz and Stegun’s famous Handbook of Mathematical Functions.

Here’s a close-up of the graph from a photo of my copy of A&S.

It looks like a complex function of a complex variable. I assume the height is the magnitude and the markings on the graph are the phase. That would make it an elliptic function because it’s periodic in two directions.

It has one pole and one zero in each period. I think elliptic functions are determined, up to a constant, by their periods, zeros, and poles, so it should be possible to identify the function.

In fact, I expect it’s the Weierstrass p function. More properly, the Weierstrass ℘ function, sometimes called Weierstrass’ elliptic function. (Some readers will have a font installed that will properly render ℘ and some not. More on the symbol ℘ here.)

# Bessel series for a constant

Fourier series express functions as a sum of sines and cosines of different frequencies. Bessel series are analogous, expressing functions as a sum of Bessel functions of different orders.

Fourier series arise naturally when working in rectangular coordinates. Bessel series arise naturally when working in polar coordinates.

The Fourier series for a constant is trivial. You can think of a constant as a cosine with frequency zero.

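The Bessel series for a constant is not as simple, but more interesting. Here we have

$$1 = J_0(x) + 2J_2(x) + 2J_4(x) + 2J_6(x) + \cdots$$

Since

$$J_{-n}(x) = (-1)^n J_n(x)$$

we can write the series above more symmetrically as

$$1 = \sum_{n=-\infty}^{\infty} J_{2n}(x).$$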
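Here is a quick numerical check, assuming the series in question is the standard expansion 1 = J_0(x) + 2 J_2(x) + 2 J_4(x) + ⋯ and its symmetric form, the sum of J_2n(x) over all integers n. It uses `scipy.special.jv`; the function names and the truncation at 40 terms are arbitrary choices.

```python
from scipy.special import jv

def one_sided_sum(x, terms=40):
    """Partial sum J_0(x) + 2 Σ_{n=1}^{terms} J_{2n}(x)."""
    return jv(0, x) + 2 * sum(jv(2*n, x) for n in range(1, terms + 1))

def symmetric_sum(x, terms=40):
    """Partial sum Σ_{n=-terms}^{terms} J_{2n}(x)."""
    return sum(jv(2*n, x) for n in range(-terms, terms + 1))

for x in [0.5, 3.0, 10.0]:
    print(x, one_sided_sum(x), symmetric_sum(x))
```

Both sums should print values extremely close to 1 regardless of x, since the high-order terms decay very quickly.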

# Approximate inverse of the gamma function

The other day I ran across a blog post by Brian Hayes that linked to an article by David Cantrell on how to compute the inverse of the gamma function. Cantrell gives an approximation in terms of the Lambert W function.

In this post we’ll write a little Python code to kick the tires on Cantrell’s approximation. The post also illustrates how to do some common tasks using SciPy and matplotlib.

Here are the imports we’ll need.

    import matplotlib.pyplot as plt
    from numpy import pi, e, sqrt, log, linspace
    from scipy.special import lambertw, gamma, psi
    from scipy.optimize import root


First of all, the gamma function has a local minimum k somewhere between 1 and 2, and so it only makes sense to speak of its inverse to the left or right of this point. Gamma is strictly increasing for real values larger than k.

To find k we look for where the derivative of gamma is zero. It’s more common to work with the derivative of the logarithm of the gamma function than the derivative of the gamma function itself. That works just as well because gamma has a minimum where its log has a minimum. The derivative of the log of the gamma function is called ψ and is implemented in SciPy as scipy.special.psi. We use the function scipy.optimize.root to find where ψ is zero.

The root function returns more information than just the root we’re after. The root(s) are returned in the array x, and in our case there’s only one root, so we take the first element of the array:

      k = root(psi, 1.46).x[0]


Now here is Cantrell’s algorithm:

    c = sqrt(2*pi)/e - gamma(k)

    def L(x):
        return log((x+c)/sqrt(2*pi))

    def W(x):
        # lambertw returns complex numbers; keep the real part of the principal branch
        return lambertw(x).real

    def AIG(x):
        return L(x) / W(L(x) / e) + 0.5


Cantrell uses AIG for Approximate Inverse Gamma.

How well does this algorithm work? For starters, we’ll see how well it does when we do a round trip, following the exact gamma with the approximate inverse.

    x = linspace(5, 30, 100)
    plt.plot(x, AIG(gamma(x)))
    plt.show()


This produces the following plot:

We get a straight line, as we should, so next we do a more demanding test. We’ll look at the absolute error in the approximate inverse. We’ll use a log scale on the x-axis since gamma values get large quickly.

    y = gamma(x)
    plt.plot(y, x - AIG(y))
    plt.xscale("log")
    plt.show()


This shows the approximation error is small, and gets smaller as its argument increases.

Cantrell’s algorithm is based on an asymptotic approximation, so it’s not surprising that it improves for large arguments.
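To put a number on the round-trip error without plotting, here is a self-contained sketch of the same calculation. It imports constants and array functions from NumPy, folds the helper functions into one function named `approx_inverse_gamma` (my name, not Cantrell’s), and takes the real part of the Lambert W value.

```python
import numpy as np
from scipy.optimize import root
from scipy.special import gamma, lambertw, psi

k = root(psi, 1.46).x[0]  # location of gamma's minimum on the positive axis
c = np.sqrt(2*np.pi)/np.e - gamma(k)

def approx_inverse_gamma(y):
    """Approximate inverse of gamma to the right of its minimum, per the formula above."""
    L = np.log((y + c) / np.sqrt(2*np.pi))
    return L / lambertw(L / np.e).real + 0.5

x = np.linspace(5, 30, 100)
abs_error = np.abs(x - approx_inverse_gamma(gamma(x)))
print(abs_error.max(), abs_error[0], abs_error[-1])
```

The largest error over this range occurs at the left end, in line with the approximation being asymptotic.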

# Mittag-Leffler function and probability distribution

The Mittag-Leffler function is a generalization of the exponential function. Since k! = Γ(k + 1), we can write the exponential function’s power series as

$$\exp(x) = \sum_{k=0}^\infty \frac{x^k}{\Gamma(k+1)}$$

and we can generalize this to the Mittag-Leffler function

$$E_{\alpha, \beta}(x) = \sum_{k=0}^\infty \frac{x^k}{\Gamma(\alpha k+\beta)}$$

which reduces to the exponential function when α = β = 1. There are a few other values of α and β for which the Mittag-Leffler function reduces to more familiar functions. For example,

and

where erfc(x) is the complementary error function.

You can compute the Mittag-Leffler function in Mathematica with MittagLefflerE[α, β, x].
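As far as I know SciPy has no built-in Mittag-Leffler function, so here is a minimal sketch that simply truncates the power series Σ x^k / Γ(αk + β). This is an assumption-laden toy, not a robust implementation: it is only reasonable for modest |x|, and the function name and the default of 100 terms are my own choices. It checks the α = β = 1 case against exp and the α = 1/2, β = 1 case against the known identity with exp(x²) erfc(-x).

```python
import numpy as np
from scipy.special import erfc, rgamma

def mittag_leffler(alpha, beta, x, terms=100):
    """Truncated series Σ x^k / Γ(αk + β); a sketch suitable only for modest |x|."""
    k = np.arange(terms)
    # rgamma is 1/Γ, which avoids overflow for large arguments
    return np.sum(x**k * rgamma(alpha*k + beta))

print(mittag_leffler(1, 1, 1.5), np.exp(1.5))
print(mittag_leffler(0.5, 1, 0.7), np.exp(0.7**2) * erfc(-0.7))
```

For large arguments the terms grow before they shrink and cancellation becomes a problem, so a production implementation would need a different algorithm.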

## History

Mittag-Leffler was one person, not two. When I first saw the Mittag-Leffler theorem in complex analysis, I assumed it was named after two people, Mittag and Leffler. But the theorem and the function discussed here are named after one man, the Swedish mathematician Magnus Gustaf (Gösta) Mittag-Leffler (1846–1927).

The function that Mr. Mittag-Leffler originally introduced did not have a β parameter; that generalization came later. The function E_α is E_{α,1}.

## Mittag-Leffler probability distributions

Just as you can make a couple probability distributions out of the exponential function, you can make a couple probability distributions out of the Mittag-Leffler function.

### Continuous Mittag-Leffler distribution

The exponential function exp(-x) is positive over [0, ∞) and integrates to 1, so we can define a probability distribution whose density (PDF) function is f(x) = exp(-x) and whose distribution function (CDF) is F(x) = 1 – exp(-x). The Mittag-Leffler distribution has CDF 1 – E_α(-x^α) and so reduces to the exponential distribution when α = 1. For 0 < α < 1, the Mittag-Leffler distribution is a fat-tailed generalization of the exponential. [1]

### Discrete Mittag-Leffler distribution

The Poisson distribution comes from taking the power series for exp(λ), normalizing it to 1, and using the kth term as the probability mass for k. That is,

$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}.$$

The analogous discrete Mittag-Leffler distribution [2] has probability mass function

$$P(X = k) = \frac{\lambda^k}{\Gamma(\alpha k + \beta)\, E_{\alpha,\beta}(\lambda)}.$$
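The construction can be sketched in code. I am assuming the probability mass function is built exactly like the Poisson one: take the kth term of the Mittag-Leffler series at λ and divide by the sum of the series, giving λ^k / (Γ(αk + β) E_{α,β}(λ)). The function names, the truncation at 100 terms, and the parameter values are all illustrative.

```python
import numpy as np
from scipy.special import rgamma
from scipy.stats import poisson

def mittag_leffler(alpha, beta, x, terms=100):
    """Truncated series Σ x^k / Γ(αk + β); adequate for the modest λ used here."""
    k = np.arange(terms)
    return np.sum(x**k * rgamma(alpha*k + beta))

def discrete_ml_pmf(k, lam, alpha, beta):
    """Assumed pmf: kth series term at λ divided by the series sum."""
    return lam**k * rgamma(alpha*k + beta) / mittag_leffler(alpha, beta, lam)

k = np.arange(100)
print(discrete_ml_pmf(k, 2.0, 0.8, 1.3).sum())  # probabilities sum to 1
print(np.allclose(discrete_ml_pmf(k, 2.0, 1, 1), poisson.pmf(k, 2.0)))
```

With α = β = 1 the normalizing constant is exp(λ), so the sketch reproduces the Poisson probabilities.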

## Fractional differential equations

In addition to probability and statistics, the Mittag-Leffler function comes up in fractional calculus. It plays a role analogous to that of the exponential function in classical calculus. Just as the solution to the simple differential equation

is exp(ax), for 0 < μ < 1, the solution to the fractional differential equation

is a x^(μ-1) E_{μ,μ}(a x^μ). Note that, up to a constant factor, this reduces to exp(ax) when μ = 1. [3]

## References

[1] Gwo Dong Lin. On the Mittag–Leffler distributions. Journal of Statistical Planning and Inference 74 (1998) 1–9.

[2] Subrata Chakraborty, S. H. Ong. Mittag-Leffler function distribution: A new generalization of hyper-Poisson distribution. arXiv:1411.0980v1

[3] Keith Oldham, Jan Myland, Jerome Spanier. An Atlas of Functions. Springer.

# Cornu’s spiral

Cornu’s spiral is the curve parameterized by

$$x(t) = C(t), \qquad y(t) = S(t)$$

where C and S are the Fresnel functions, sometimes called the Fresnel cosine integral and Fresnel sine integral. Here’s a plot of the spiral.

Both Fresnel functions approach ½ as t → ∞ and so the curve slowly spirals toward (½, ½) in the first quadrant. And by symmetry, because both functions are odd, the curve spirals toward (-½, -½) in the third quadrant.

Here’s the Python code used to make the plot.

    from scipy.special import fresnel
    from numpy import linspace
    import matplotlib.pyplot as plt

    t = linspace(-7, 7, 1000)
    y, x = fresnel(t)  # fresnel returns (S, C)

    plt.plot(x, y)
    plt.gca().set_aspect("equal")
    plt.show()


The SciPy function fresnel returns both Fresnel functions at the same time. It returns them in the order (S, C) so the code reverses the order of these to match the Cornu curve.

One interesting feature of Cornu’s spiral is that its curvature increases linearly with time. This is easy to verify: because of the fundamental theorem of calculus, the Fresnel functions reduce to sines and cosines when you take derivatives, and you can show that the curvature at time t equals πt.
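The curvature claim can also be checked numerically. This sketch estimates the derivatives of (C(t), S(t)) with central finite differences and plugs them into the usual formula for the curvature of a plane curve, (x′y″ – y′x″) / (x′² + y′²)^(3/2). The step size h and the function name are arbitrary choices, and the comparison is only as accurate as the finite differences.

```python
import numpy as np
from scipy.special import fresnel

def curvature(t, h=1e-3):
    """Estimate the curvature of (C(t), S(t)) with central finite differences."""
    S_m, C_m = fresnel(t - h)
    S_0, C_0 = fresnel(t)
    S_p, C_p = fresnel(t + h)
    dx, dy = (C_p - C_m) / (2*h), (S_p - S_m) / (2*h)
    ddx, ddy = (C_p - 2*C_0 + C_m) / h**2, (S_p - 2*S_0 + S_m) / h**2
    return (dx*ddy - dy*ddx) / (dx**2 + dy**2)**1.5

for t in [0.5, 1.0, 2.0]:
    print(t, curvature(t), np.pi * t)
```

The two columns should agree to several digits, which is what the analytic argument predicts.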

How fast does the curve spiral toward (½, ½)? Since the curvature at time t is πt, the curve at time t is instantaneously bending like a circle of radius 1/(πt). So the radius of the spiral is decreasing like 1/(πt).

Cornu’s spiral was actually discovered by Euler. Cornu was an engineer who independently discovered the curve much later. Perhaps because Cornu used the curve in applications, his name is more commonly associated with the curve. At least I’ve more often seen it named after Cornu. This is an example of Stigler’s law that things are usually not named after the first person to discover them.

* * *