Escaping the unit disk

Hypergeometric functions are defined in terms of their power series representation. This power series converges inside the unit circle and diverges outside. The functions extend to most [1] of the complex plane, and do so in a computationally nice way.

Analytic continuation is often sort of existential: we prove that it can be done, but we may not be able to say much explicitly about the result. But for hypergeometric functions, there is an explicit formula.

This formula is so complicated that it’s possible to miss its basic structure. Here’s an outline version:

F(\ldots z) = \ldots F(\ldots z^{-1}) + \ldots F(\ldots z^{-1})

This says that F evaluated at a point z outside of the unit circle is equal to the sum of two terms involving F evaluated at 1/z, a point inside the unit circle. This means you can evaluate F everywhere [1] if you can evaluate it inside the unit circle.

The parameters in each of the F’s above are different, so the formula doesn’t relate a single function’s value at z to the same function’s values at 1/z. Instead, it relates the value of one instance of a class of functions at z to values of other instances of the class at 1/z.

And now without further ado, here is the analytic continuation formula in all its glory:

\begin{align*}  \frac{\Gamma(a)\Gamma(b)}{\Gamma(c)}F(a, b;c; z) &=  \frac{\Gamma(a)\Gamma(b-a)}{\Gamma(c-a)} \,\,(-z)^{-a}\,\,F(a, 1-c+a; 1-b+a; z^{-1}) \\ &+ \,\frac{\Gamma(b)\Gamma(a-b)}{\Gamma(c-b)} \,\,(-z)^{-b}\,\,F(b,\, 1-c+b; 1-a+b; z^{-1}) \end{align*}
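As a quick sanity check, we can compare both sides numerically. SciPy’s hyp2f1 handles the analytic continuation internally, so it can evaluate both sides directly; the parameter values below are arbitrary.

    from scipy.special import gamma, hyp2f1

    a, b, c = 0.3, 0.7, 1.9  # arbitrary parameters
    z = -3.5                 # outside the unit circle, off the branch cut

    lhs = gamma(a)*gamma(b)/gamma(c) * hyp2f1(a, b, c, z)
    rhs = (gamma(a)*gamma(b-a)/gamma(c-a) * (-z)**-a * hyp2f1(a, 1-c+a, 1-b+a, 1/z)
         + gamma(b)*gamma(a-b)/gamma(c-b) * (-z)**-b * hyp2f1(b, 1-c+b, 1-a+b, 1/z))
    print(lhs, rhs)  # the two should agree to near machine precision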

[1] There may be a singularity at 1, and there may be a branch cut along the real axis from 1 to infinity.

Connections

One of my favorite things in math is connecting together things that do not seem related at first. This post will elaborate on a connection between two recent posts.

Gauss’s constant

A couple weeks ago I wrote about Gauss’s constant

g \equiv \frac{1}{\text{AGM}(1, \sqrt{2})} = 0.834626\ldots

and several things it was related to. We can calculate g with n iterations of the process defining the AGM (arithmetic-geometric mean) as follows.

    def f(n):
        # n iterations of the AGM of 1 and sqrt(2)
        a, b = 1, 2**0.5
        for _ in range(n):
            a, b = 0.5*(a+b), (a*b)**0.5  # arithmetic mean, geometric mean
        return 1/a  # Gauss's constant is the reciprocal of the limit

This process converges very quickly, calculating g to full machine precision when n = 4.

Elliptic integrals

Gauss’s constant, and the AGM more generally, can be computed in terms of an elliptic integral, and that in turn can be computed in terms of a hypergeometric function.

Elliptic integrals are so named because historically they are related to the problem of finding the perimeter of an ellipse. The particular elliptic integral we’re interested in here is Legendre’s K function

K(k) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - k^2 \sin^2\theta}}

(There are several conventions for parameterizing elliptic integrals, so this definition may not match what you see elsewhere. See last section.)

With this definition of K,

\frac{1}{\text{AGM}(x,y)} = \frac{2}{\pi x} K(\sqrt{1 - y^2/x^2})

and by the symmetry of the AGM, you can reverse the order of x and y. It follows that

g = \frac{\sqrt{2}}{\pi}K\left(\frac{1}{\sqrt{2}}\right) = \frac{2}{\pi}K(i)
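Here is a numerical check of these identities (a sketch; note that SciPy’s ellipk takes the parameter m = k², a convention discussed in the last section):

    from math import pi
    from scipy.special import ellipk

    def agm(x, y, n=10):
        for _ in range(n):
            x, y = 0.5*(x + y), (x*y)**0.5
        return x

    x, y = 3.0, 1.0
    print(1/agm(x, y), 2/(pi*x)*ellipk(1 - y**2/x**2))  # the AGM identity
    print(agm(1, 2**0.5)**-1, 2**0.5/pi*ellipk(0.5))    # both give g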

Hypergeometric functions

The previous post briefly introduced hypergeometric functions. The function K is related to Gauss’s hypergeometric function by

K(k) = \frac{\pi}{2} F\left(\frac{1}{2}, \frac{1}{2}; 1; k^2 \right)

Expanding F as a series shows the connection to double factorials.

\begin{align*} F\left(\frac{1}{2}, \frac{1}{2}; 1; x \right) &= \sum_{n=0}^\infty \frac{(1/2)_n (1/2)_n}{(1)_n n!} x^n \\ &= \sum_{n=0}^\infty \left( \frac{(2n-1)!!}{(2n)!!} \right)^2 x^n \end{align*}
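With SciPy this identity is easy to spot check numerically, remembering that ellipk takes m = k² (see the Conventions section below):

    from math import pi
    from scipy.special import ellipk, hyp2f1

    m = 0.3  # m = k^2
    print(ellipk(m), pi/2*hyp2f1(0.5, 0.5, 1, m))  # should agree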

This means you could numerically evaluate a power series to compute Gauss’s constant, though that would be inefficient. Because the AGM converges so quickly, you’d want to evaluate elliptic integrals in terms of the AGM, not evaluate the AGM in terms of elliptic integrals.

Conventions

There is some confusing terminology and variation in conventions when defining elliptic integrals. We’ve defined K here in terms of the “elliptic modulus” k. Some software, such as Mathematica and SciPy, define K in terms of “the parameter” m = k². So to evaluate K above using Mathematica, you’d need to square the argument first.

    g = EllipticK[0.5] Sqrt[2]/Pi

or

    g = 2 EllipticK[-1]/Pi

Similarly, in Python we’d write

    from scipy.special import ellipk
    from math import pi
    g = ellipk(0.5)*2**0.5/pi

or

    g = 2*ellipk(-1)/pi

Foreshadowing hypergeometric functions

I’ve said in several blog posts that multi-factorials come up often in practice, and given examples. This post will give a glimpse of why this is.

Rising powers

The kth rising power of a is

(a)_k = a (a+1) (a + 2) \cdots (a + k - 1)

So, for example, (5)_3 = 5*6*7 and (1)_k = k!.

Double factorials

Now let’s look at rising powers of fractions.

(1/2)_k = \frac{1}{2}\cdot\frac{3}{2}\cdot\frac{5}{2}\cdots\frac{2k-1}{2} = \frac{(2k-1)!!}{2^k}

This means double factorials of odd numbers can be written as rising powers.

(2k-1)!! = 2^k \, (1/2)_k

Now for even numbers,

(2k)!! = 2^k \, k! = 2^k \, (1)_k

and so even double factorials can be written in terms of rising powers too. This means that the term

\frac{(2k-1)!!}{(2k)!!}

from the series in the previous post could be written as

\frac{(1/2)_k}{(1)_k}

Hmm. Ratios of rising powers sound useful.
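Here’s a minimal sketch verifying the identities above, with small helpers for rising powers and double factorials:

    from math import prod

    def rising(a, k):
        # (a)_k = a (a+1) ... (a+k-1)
        return prod(a + i for i in range(k))

    def double_factorial(n):
        return prod(range(n, 0, -2))

    k = 5
    print(2**k * rising(0.5, k), double_factorial(2*k - 1))  # (2k-1)!! = 2^k (1/2)_k
    print(2**k * rising(1, k),   double_factorial(2*k))      # (2k)!!   = 2^k (1)_k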

Generalizing

From the discussion above you can imagine how to create triple factorials out of rising powers of 1/3 and 2/3. In fact it’s easy to see that all multifactorials can be made out of rising powers of rational numbers.

Ratios of multifactorials come up often in coefficients of power series, and these can be rewritten as ratios of rising powers. Maybe we should have a name for functions whose coefficients are ratios of rising powers. And we do: hypergeometric functions. You can find the exact definition here.

At first the definition of a hypergeometric function seems arbitrary. I hope my previous posts, and especially this post, have been a sort of warm up to motivate why you might be interested in such functions.

Applications

In an old post I quoted W. W. Sawyer saying

There must be many universities today where 95 percent, if not 100 percent, of the functions studied by physics, engineering, and even mathematics students, are covered by the single symbol F(a, b; c; x).

The notation F(a, b; c; x) here denotes the hypergeometric function whose kth power series coefficient is

\frac{(a)_k \, (b)_k}{(c)_k \, k!}

More general hypergeometric functions have two lists of parameters, separated by a semicolon. The first set corresponds to rising powers in the numerator, and the second set corresponds to rising powers in the denominator.

The exponential function is the simplest hypergeometric function: both parameter lists are empty, leaving only the k! term in the denominator.
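You can see this with mpmath, whose hyper function takes the two parameter lists explicitly; with both lists empty it reduces to the exponential series:

    from mpmath import hyper, exp

    print(hyper([], [], 1.5), exp(1.5))  # same value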

The arcsine function is an example of a function that is not hypergeometric per se, but can be written simply in terms of a hypergeometric function:

\sin^{-1}(x) = x \, F\left(\frac{1}{2}, \frac{1}{2}; \frac{3}{2}; x^2\right)

This is more what Sawyer had in mind. Elementary functions like sine, cosine, and log are not pure hypergeometric functions but are simply related to hypergeometric functions, similar to arcsine. The same is true of more advanced functions, like elliptic integrals and the CDFs of common probability distributions.


Quadruple factorials and Legendre functions

Last week I claimed that double, triple, and quadruple factorials came up in applications. The previous post showed how triple factorials come up in solutions to Airy’s equation. This post will show how quadruple factorials come up in solutions to Legendre’s equation.

Legendre’s differential equation is

(1-x^2) y'' - 2x y' + \nu(\nu+1)y = 0

The Legendre functions Pν and Qν are independent solutions to the equation for given values of ν.

When ν is of the form n + ½ for an integer n, the values of Pν and Qν at 0 involve quadruple factorials and also Gauss’s constant.

For example, if ν = 1/2, 5/2, 9/2, …, then Pν(0) is given by

(-1)^{(2\nu - 1)/4} \frac{\sqrt{2}(2\nu - 2)!!!!}{(2\nu)!!!!\pi g}

Source:  An Atlas of Functions
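Here’s a numerical check of the ν = 5/2 case (a sketch: qf is a small quadruple factorial helper, and the value of g is the one computed earlier):

    from math import pi, prod, sqrt
    from scipy.special import lpmv

    g = 0.834626841674073  # Gauss's constant, from the earlier post

    def qf(n):
        # quadruple factorial n!!!!
        return prod(range(n, 0, -4)) if n > 0 else 1

    nu = 2.5
    sign = (-1)**round((2*nu - 1)/4)
    pred = sign * sqrt(2) * qf(round(2*nu - 2)) / (qf(round(2*nu)) * pi * g)
    print(lpmv(0, nu, 0), pred)  # both are about -0.3236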

Triple factorials and Airy functions

Last week I wrote a post on multifactorials in which I said that

Double factorials come up fairly often, and sometimes triple, quadruple, or higher multifactorials do too.

This post gives a couple examples of triple factorials in practice.

One example I wrote about last week: triple factorials come up when evaluating the gamma function at an integer plus 1/3. As shown here,

\Gamma\left(n + \frac{1}{3}\right) = \frac{(3n-2)!!!}{3^n} \Gamma\left(\frac{1}{3}\right)
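A quick numerical spot check of this identity, with a small helper for the triple factorial:

    from math import gamma, prod

    def tf(n):
        # triple factorial n!!!
        return prod(range(n, 0, -3)) if n > 0 else 1

    n = 5
    print(gamma(n + 1/3), tf(3*n - 2)/3**n * gamma(1/3))  # should agree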

Another example involves solutions to Airy’s differential equation

y'' - xy = 0

One pair of independent solutions consists of the Airy functions Ai(x) and Bi(x). Another pair consists of the functions

\begin{align*} f(x) &= 1 + \frac{1}{3!}x^3 + \frac{1\cdot 4}{6!}x^6 + \frac{1\cdot 4 \cdot 7}{9!}x^9 + \cdots \\ g(x) &= x + \frac{2}{4!}x^4 + \frac{2\cdot 5}{7!}x^7 + \frac{2\cdot 5 \cdot 8}{10!}x^{10} + \cdots \end{align*}

given in A&S 10.4.3. Because both pairs of functions solve the same linear ODE, each is a linear combination of the other.

Notice that the numerators are triple factorials, and so the series above can be rewritten as

\begin{align*} f(x) &= \sum_{n=0}^\infty \frac{(3n+1)!!!}{(3n+1)!} x^{3n}\\ g(x) &= \sum_{n=0}^\infty \frac{(3n+2)!!!}{(3n+2)!} x^{3n+1} \end{align*}
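Since f(0) = 1, f′(0) = 0, g(0) = 0, and g′(0) = 1, any solution y of Airy’s equation equals y(0) f + y′(0) g. Here’s a sketch checking that numerically for y = Ai, using truncated versions of the series above:

    from math import factorial, prod
    from scipy.special import airy

    def tf(n):
        # triple factorial n!!!
        return prod(range(n, 0, -3)) if n > 0 else 1

    def f(x, N=25):
        return sum(tf(3*n + 1)/factorial(3*n + 1) * x**(3*n) for n in range(N))

    def g(x, N=25):
        return sum(tf(3*n + 2)/factorial(3*n + 2) * x**(3*n + 1) for n in range(N))

    x = 1.2
    Ai0, Aip0, _, _ = airy(0)  # Ai(0) and Ai'(0)
    print(airy(x)[0], Ai0*f(x) + Aip0*g(x))  # should agree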

The next post gives an example of quadruple factorials in action.

Gauss’s constant

I hadn’t heard of Gauss’s constant until recently. I imagine I’d seen it before and paid no attention. But once I paid attention, I started seeing it more often. There’s a psychology term for this—reticular activation?—like when you buy a green Toyota and suddenly you see green Toyotas everywhere.

Our starting point is the arithmetic-geometric mean, or AGM. Given two numbers, take their arithmetic (ordinary) mean and their geometric mean, then repeat the process on the two means, over and over. The limit of this process is the AGM of the two numbers.

Gauss’s constant is the reciprocal of the AGM of 1 and √2.

g \equiv \frac{1}{\text{AGM}(1, \sqrt{2})} = 0.834626\ldots

Gauss’s constant can be expressed in terms of the Gamma function:

g = (2\pi)^{-3/2} \, \Gamma\left(\frac{1}{4}\right)^2
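The two expressions for g are easy to compare numerically:

    from math import gamma, pi

    a, b = 1, 2**0.5
    for _ in range(5):
        a, b = 0.5*(a + b), (a*b)**0.5   # AGM iteration
    print(1/a, (2*pi)**-1.5 * gamma(0.25)**2)  # both 0.834626...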

Exponential sums

Last week I wrote about the sum

\sum_{n=-\infty}^\infty e^{-\pi n^2} = \frac{\sqrt[4] \pi}{\Gamma\left(\frac 3 4\right)}

which we will see shortly is related to Gauss’s constant.

We can use the definition of g in terms of Γ(1/4) and the reflection identity for the gamma function to show that the sum above can be written in terms of Gauss’s constant:

\sum_{n=-\infty}^\infty e^{-\pi n^2} = 2^{1/4} \sqrt{g}

The alternating version of the sum is also related to g:

\sum_{n=-\infty}^\infty (-1)^n e^{-\pi n^2} = \sqrt{g}
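Both sums converge extremely quickly, so a handful of terms suffices for a numerical check (a sketch, with the value of g from above):

    from math import exp, pi

    g = 0.834626841674073  # Gauss's constant

    s  = sum(exp(-pi*n*n) for n in range(-10, 11))
    sa = sum((-1)**n * exp(-pi*n*n) for n in range(-10, 11))
    print(s,  2**0.25 * g**0.5)  # both about 1.086435
    print(sa, g**0.5)            # both about 0.913579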

Integrals

Another place where g comes up is in integrals of hyperbolic functions. For example

\begin{align*} \int_0^\infty \sqrt{\text{sech}(x)} \, dx &= \pi g \\ \int_0^\infty \sqrt{\text{csch}(x)} \, dx &= \sqrt{2} \pi g \end{align*}

Here

\begin{align*} \text{sech}(x) &= \frac{1}{\cosh(x)} = \frac{2}{e^x + e^{-x}} \\ \text{csch}(x) &= \frac{1}{\sinh(x)} = \frac{2}{e^x - e^{-x}} \\ \end{align*}

Beta probabilities

Another place Gauss’s constant comes up is in special values of beta distribution probabilities.

Define the incomplete beta function by

B(a, b, x) = \int_0^x t^{a-1} (1-t)^{b-1}\, dt

If X is a random variable with a beta(a, b) distribution, then the CDF of X is the normalized incomplete beta function, i.e.

\text{Prob}(X \leq x) = \frac{B(a, b, x)}{B(a, b, 1)}

A couple special values of B involve Gauss’s constant, namely

B\left(\frac{1}{4}, \frac{1}{4}, \frac{1}{2}\right) = \sqrt{2} \pi g

and

B\left(\frac{3}{4}, \frac{3}{4}, \frac{1}{2}\right) = \frac{1}{\sqrt{2} g}

Source:  An Atlas of Functions
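SciPy implements the regularized incomplete beta function, so to check these values numerically we multiply by the complete beta function:

    from math import pi, sqrt
    from scipy.special import beta, betainc

    g = 0.834626841674073  # Gauss's constant

    def B(a, b, x):
        # unnormalized incomplete beta via SciPy's regularized version
        return betainc(a, b, x) * beta(a, b)

    print(B(0.25, 0.25, 0.5), sqrt(2)*pi*g)   # both about 3.708149
    print(B(0.75, 0.75, 0.5), 1/(sqrt(2)*g))  # both about 0.847213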

Ubiquitous constant

The constant M defined by 1/M = √2 g is the so-called “ubiquitous constant,” though this seems like an unjustified name for such an arcane constant. Perhaps there is some context in which the constant is ubiquitous in the colloquial sense.

Multiple angle identity for cotangent

The previous post discusses generalized replicative functions, functions that satisfy

a_n f(nx) + b_n = \sum_{i=0}^{n-1} f\left(x + \frac{i}{n} \right )

Donald Knuth says in Exercise 40 of Section 1.2.4 of TAOCP that the functions cot πx and csc² πx fall into this class of functions.

If this is true of cot πx then it follows by differentiation that it is true of csc² πx as well, so we will focus on the former.

I believe that for f(x) = cot(πx) we have a_n = n and b_n = 0, but I don’t have a proof. If this is correct then

n \cot (n \pi x) = \sum_{i=0}^{n-1} \cot\left(\pi\left(x + \frac{i}{n} \right ) \right )

Here’s a Python script that draws values of n and x at random and verifies that the left and right sides above are equal to at least 10 decimal places. It’s not a proof, but it’s good evidence.

    import numpy as np

    def cot(x): return 1/np.tan(x)

    # left side of the conjectured identity
    def f1(n, x): return n*cot(n*np.pi*x)

    # right side of the conjectured identity
    def f2(n, x): return sum(cot(np.pi*(x + k/n)) for k in range(n))

    np.random.seed(20211011)
    for _ in range(1000):
        n = np.random.randint(1, 50)
        x = np.random.random()
        a, b = f1(n, x), f2(n, x)
        assert( abs(a - b) <= abs(a)*1e-10 )  # relative agreement to 10 decimals

Note that NumPy doesn’t have a cotangent function. See this post for quirks of how trig functions are supported in various environments.

An actual proof is left as an exercise for the reader. Knuth left it as an exercise too, but he didn’t say what the a’s are, so I’m being nicer than he was. :)

If you come up with a proof, please share it in the comments.

Update

Here’s a proof combining the first two comments below. Start with the identity

\sin(nx) = 2^{n-1}\prod_{k=0}^{n-1}\sin\left(x+\frac{k\pi}{n}\right)

and substitute πx for x. Take the logarithm and then the derivative of both sides, divide by π, and you get the identity we’re trying to prove.

Multiplication theorem rabbit hole

When I started blogging I was reluctant to allow comments. It seems this would open the door to a flood of spam. Indeed it does, but nearly all of it can be filtered out automatically.

The comments that aren’t spam have generally been high quality. A comment on my post about the sawtooth function a few days ago sent me down a fascinating rabbit hole.

The sawtooth function belongs to a general class of functions called replicative functions, functions that satisfy the equation

f(nx) = \sum_{i=0}^{n-1} f\left(x + \frac{i}{n} \right )

for positive integers n.

I ran across this by reading Knuth’s TAOCP. Knuth gives an understated comment in section 1.2.4 exercise 40 that “it may be interesting to study also the more general class of functions” for which the left-hand side above includes constants that depend on n, i.e.

a_n f(nx) + b_n = \sum_{i=0}^{n-1} f\left(x + \frac{i}{n} \right )

where the constants a_n and b_n depend on n but not on x.

Bernoulli polynomials

The first Bernoulli polynomial B_1(x) = x − 1/2 is a replicative function, and the only continuous replicative functions are multiples of B_1 (proved in exercise 40 referenced above).

The rest of the Bernoulli polynomials B_m(x) are generalized replicative functions, with a_n = n^(1−m) and b_n = 0. This was discovered by Joseph Raabe in the nineteenth century.

Gamma and digamma functions

Gauss’s multiplication theorem for the gamma function says that for positive integers n,

\Gamma(nz) = (2\pi)^{(1-n)/2} \, n^{nz - 1/2} \prod_{k=0}^{n-1} \Gamma\left(z + \frac{k}{n} \right )

This is equation 6.1.20 in A&S. If we take logs of both sides, we find that log Γ is almost a generalized replicative function.

\log \Gamma(nz) = \frac{1-n}{2}\log(2\pi) + \left(nz - \frac{1}{2}\right)\log n + \sum_{k=0}^{n-1} \log \Gamma\left(z + \frac{k}{n} \right )

The term n log(n) z prevents the definition from holding, because b_n must not depend on z. But if we take derivatives, this term becomes the constant n log n, which is allowed.

This means that the derivative of log Γ, a.k.a. the ψ function, a.k.a. the digamma function, is a generalized replicative function with a_n = n and b_n = −n log n.
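This one we can verify numerically, since SciPy implements the digamma function:

    import numpy as np
    from scipy.special import psi  # digamma

    n, x = 7, 0.3
    lhs = n*psi(n*x) - n*np.log(n)   # a_n f(nx) + b_n
    rhs = sum(psi(x + k/n) for k in range(n))
    print(lhs, rhs)  # should agree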

The derivative of a generalized replicative function is another function in the same class, so the higher derivatives of log Γ, known as the polygamma functions, are also generalized replicative functions.

Hurwitz zeta function

The Hurwitz zeta function ζ(s, z) is defined by

\zeta(s, z) = \sum_{k=0}^\infty \frac{1}{(z+k)^s}

The Riemann zeta function is the special case of the Hurwitz zeta function with z = 1.

The Hurwitz zeta function is related to the polygamma functions by

\psi^{(m)}(z)= (-1)^{m+1} m! \zeta (m+1,z)

It follows from this that for fixed s, ζ(s, z) is a generalized replicative function of z. You can show that a_n = n^s and b_n = 0.
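SciPy implements the Hurwitz zeta function, so this is also easy to test numerically:

    from scipy.special import zeta

    s, z, n = 3, 0.7, 5
    lhs = n**s * zeta(s, n*z)   # a_n = n^s, b_n = 0
    rhs = sum(zeta(s, z + k/n) for k in range(n))
    print(lhs, rhs)  # should agree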

Trig functions

See the next post for a couple examples involving trig functions: cot πx and csc² πx .


Gamma and the Pi function

The gamma function satisfies

Γ(n+1) = n!

for all non-negative integers n, and extends to an analytic function in the complex plane with the exception of singularities at the non-positive integers [1]. Incidentally, going back to the previous post, this is an example of a theorem that would have to be amended if 0! were not defined to be 1.

Wouldn’t it be simpler if the gamma function were defined so that its value at n, not at n+1, extended factorial? Well, some things would be simpler, but other things would not.

The Pi function defined by

Π(z) = Γ(z + 1)

extends factorial with no shift of 1 to keep up with, and some theorems are simpler to state in terms of the Pi function than the gamma function. For example, it’s simpler to say that the volume of a unit n-sphere is

π^(n/2) / Π(n/2)

than to say it’s

π^(n/2) / Γ(n/2 + 1).

The former has an easy-to-remember form, with lower case π on top and capital π on bottom.

The reflection identity is also a little simpler in the form

Π(z) Π(-z) = 1/sinc(z)

than in the form

Γ(z) Γ(1-z) = π / sin(πz)

The drawback to the former is that you have to remember to use the convention

sinc(z) = sin(πz) / πz

because some sources define sinc with the factor of π and some without. Neither convention makes a large majority of theorems simpler, so there’s no clear winner. [2]
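NumPy happens to use this convention, which makes the Pi form of the reflection identity a one-liner to check:

    import numpy as np
    from math import gamma

    def Pi(z):
        return gamma(z + 1)

    z = 0.3
    print(Pi(z)*Pi(-z), 1/np.sinc(z))  # np.sinc(z) = sin(pi z)/(pi z)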

Fortunately the Pi function has a different name and isn’t an alternative convention for defining the gamma function. That would be terribly confusing. [3]


[1] The Pi function is the unique way to extend factorial to an analytic function, given some regularity assumptions. See Wielandt’s theorem and the Bohr–Mollerup theorem.

[2] Fourier transforms may be the worst example in mathematics of no clear convention winner. Multiple conventions thrive because each makes some things easier to state. (See this page for a sort of Rosetta stone translating between the various Fourier transform conventions.) In fact, the lack of standardization for the sinc function is related to the lack of standardization for the Fourier transform: you’d like to define the sinc function so that it has the simplest Fourier transform under your convention.

[3] I’m fuzzy on the history of all this, but I think what we now call the Pi function was an alternative definition of the gamma function briefly, but fortunately that got ironed out quickly.

Computing ζ(3)

I’ve started reading Paul Nahin’s new book “In Pursuit of ζ(3).” The actual title is “In Pursuit of Zeta-3.” I can understand why a publisher would go with such a title, but I assume most people who read this blog are not afraid of Greek letters.

I’ve enjoyed reading several of Nahin’s books, so I was pleased to see he had a new one.

Nahin’s latest book is about attempts to find a closed-form expression for the sum

\sum_{n=1}^\infty \frac{1}{n^3}

If you replace the “3” with “s” in the sum above, you have an expression for the Riemann zeta function ζ(s), and so ζ(3) is a convenient way of describing the sum and putting it in context [1]. There are closed-form expressions for ζ(s) when s is an even integer [2], but so far nobody has found one for s = 3.

The value of ζ(3) is sometimes called Apéry’s constant because in 1978 Roger Apéry was the first to prove that it is an irrational number.

The names “ζ(3)” and “Apéry’s constant” are anachronisms because Euler studied the sum in the 18th century, but Riemann came along in the 19th century and Apéry in the 20th.

Naive calculation

The most obvious way to compute ζ(3) numerically is to replace the upper limit of the infinite sum with a large number N. This works, but it’s very inefficient.

Let’s calculate the sum with upper limit 100 with a little Python code.

    >>> sum(n**-3 for n in range(1, 101))
    1.2020074006596781

We can judge the accuracy of this approximation using the implementation of the zeta function in SciPy.

    >>> from scipy.special import zeta
    >>> zeta(3)
    1.2020569031595942

We can predict how accurate our sum will be by estimating the tail of the sum using the integral test as follows.

\int_{N}^\infty x^{-3} \, dx >\sum_{n = N+1}^\infty n^{-3} > \int_{N+1}^\infty x^{-3} \, dx

The integral on the left evaluates to N⁻²/2 and the integral on the right evaluates to (N+1)⁻²/2.

This says that if we sum up to N, our error will be between (N+1)⁻²/2 and N⁻²/2.

We can estimate from this that if we want d decimal places, we’ll need to sum about 10^(d/2) terms. So to get 16 decimal places, we’d have to sum 100,000,000 terms. This is certainly not what the call to zeta(3) above is doing.

A better approach

Nahin’s book derives an equation for ζ(3) by Ernst Kummer (1810–1893):

\zeta(3) = \frac{5}{4} - \sum_{n=2}^\infty \frac{1}{n^3(n^2-1)}

Because the denominator in the sum is a 5th degree polynomial, we get faster convergence than we do from directly evaluating the definition of ζ(3). We could bracket the error using the integral test above, which shows that the error is O(N⁻⁴).

Let’s try this out in Python.

    >>> f = lambda n: (n**3 * (n**2-1))**-1
    >>> 1.25 - sum(f(n) for n in range(2, 101))
    1.2020569056101726
    >>> _ - zeta(3)
    2.4505784068651337e-09

This says we get 8 correct decimal places from summing 100 terms, which is just what we’d expect from our error estimate.

Doing even better

Kummer’s approach is better than naively computing ζ(3) from its definition, but there are much more efficient methods.

With naive summation up to N, we get

2 log₁₀ N

correct decimals. With Kummer’s sum we get twice as many,

4 log₁₀ N

correct decimals.

But there are methods for which the number of correct decimals is linear in N rather than logarithmic. Wikipedia mentions a method that gives 5.04 N decimal places.

Update: See this post for a sequence of more efficient algorithms.


[1] That’s how the Riemann zeta function is defined for s with real part greater than 1. For the rest of the complex plane, except s = 1, the Riemann zeta function is defined by analytic continuation. The sum is not valid for s with real part less than 1, and so, for example, the sum makes no sense at s = -1. But the analytic continuation of the sum is valid at -1, and equals -1/12 there. This leads to the rigorous justification of the otherwise nonsensical expression

1 + 2 + 3 + … = -1/12.

[2] For s = 2n where n is a positive integer,

\zeta(2n) = \frac{(-1)^{n+1}B_{2n}(2\pi)^{2n}}{2(2n)!}

For s = 0 and negative integers, we have

\zeta(-n)= (-1)^n\frac{B_{n+1}}{n+1}

Here the B’s are the Bernoulli numbers, which have closed-form expressions.