Product of Chebyshev polynomials

Chebyshev polynomials satisfy a lot of identities, much like trig functions do. This post will look briefly at just one such identity.

Chebyshev polynomials T_n are defined for n = 0 and 1 by

T_0(x) = 1
T_1(x) = x

and for larger n using the recurrence relation

T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)

This implies

T_2(x) = 2x T_1(x) - T_0(x) = 2x^2 - 1
T_3(x) = 2x T_2(x) - T_1(x) = 4x^3 - 3x
T_4(x) = 2x T_3(x) - T_2(x) = 8x^4 - 8x^2 + 1

and so forth.

Now for the identity for this post. If m ≥ n, then

2 T_m(x) T_n(x) = T_{m+n}(x) + T_{m-n}(x).

In other words, the product of the mth and nth Chebyshev polynomials is the average of the (m + n)th and (m - n)th Chebyshev polynomials. For example,

2 T_3(x) T_1(x) = 2(4x^3 - 3x)x = 8x^4 - 6x^2 = T_4(x) + T_2(x)

The identity above is not at all apparent from the recursive definition of Chebyshev polynomials, but it follows quickly from the fact that

T_n(cos θ) = cos nθ.

Proof: Let θ = arccos x. Then

2 T_m(x) T_n(x)
= 2 T_m(cos θ) T_n(cos θ)
= 2 cos mθ cos nθ
= cos (m+n)θ + cos (m-n)θ
= T_{m+n}(cos θ) + T_{m-n}(cos θ)
= T_{m+n}(x) + T_{m-n}(x)

You might object that this only shows that the first and last line are equal for values of x that are cosines of some angle, i.e. values of x in [-1, 1]. But if two polynomials agree on an interval, they agree everywhere. In fact, you don’t need an entire interval. For polynomials of degree m+n, as above, it is enough that they agree on m + n + 1 points. (Along those lines, see Binomial coefficient trick.)
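
Here is a quick numerical sanity check of the identity, a minimal sketch using NumPy's Chebyshev class. (The particular m and n below are arbitrary.)

    import numpy as np
    from numpy.polynomial.chebyshev import Chebyshev

    m, n = 5, 3
    lhs = 2 * Chebyshev.basis(m) * Chebyshev.basis(n)      # 2 T_m T_n
    rhs = Chebyshev.basis(m + n) + Chebyshev.basis(m - n)  # T_{m+n} + T_{m-n}

    x = np.linspace(-1, 1, 9)
    print(np.allclose(lhs(x), rhs(x)))  # True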

The close association between Chebyshev polynomials and cosines means you can often prove Chebyshev identities via trig identities as we did above.

Along those lines, we could have taken

T_n(cos θ) = cos nθ

as the definition of Chebyshev polynomials and then proved the recurrence relation above as a theorem, using trig identities in the proof.

Forman Acton suggested in his book Numerical Methods that Work that you should think of Chebyshev polynomials as “cosine curves with a somewhat disturbed horizontal scale.”

Generalization of power polynomials

A while back I wrote about the Mittag-Leffler function, which is a sort of generalization of the exponential function. There are also Mittag-Leffler polynomials that are a sort of generalization of the powers of x; more on that shortly.

Recursive definition

The Mittag-Leffler polynomials can be defined recursively by M_0(x) = 1
and

M_{n+1}(x) = \frac{x}{2}\left(M_n(x+1) + 2M_n(x) + M_n(x-1) \right )

for n ≥ 0.
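
Here is a small sketch that generates the first few Mittag-Leffler polynomials from this recurrence using SymPy. (It assumes the recurrence is applied starting at n = 0; the helper name is just for illustration.)

    import sympy as sp

    x = sp.symbols('x')

    def mittag_leffler(n):
        # start from M_0 = 1 and apply the recurrence n times
        M = sp.Integer(1)
        for _ in range(n):
            M = sp.expand(x/2 * (M.subs(x, x + 1) + 2*M + M.subs(x, x - 1)))
        return M

    for k in range(4):
        print(k, mittag_leffler(k))

The first few polynomials come out as 1, 2x, 4x^2, and 8x^3 + 4x.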

Contrast with orthogonal polynomials

This is an unusual recurrence if you’re used to orthogonal polynomials, which come up more often in applications. For example, Chebyshev polynomials satisfy

T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)

and Hermite polynomials satisfy

 H_{n+1}(x) = x H_n(x) - n H_{n-1}(x)

as I used as an example here.

All orthogonal polynomials satisfy a three-term recurrence like this, in which the value of each polynomial can be found from the values of the previous two polynomials.

Notice that with orthogonal polynomial recurrences the argument x doesn’t change, but the degrees of polynomials do. But with Mittag-Leffler polynomials the opposite is true: there’s only one polynomial on the right side, evaluated at three different points: x+1, x, and x-1.

Generalized binomial theorem

Here’s the sense in which the Mittag-Leffler polynomials generalize the power function. If we let p_n(x) = x^n be the power function, then the binomial theorem says

p_n(x+y) = \sum_{k=0}^n {n\choose k}\, p_{k}(x)\, p_{n-k}(y).

Something like the binomial theorem holds if we replace p_n with M_n:

M_n(x+y) = \sum_{k=0}^n {n\choose k}\, M_{k}(x)\, M_{n-k}(y).
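
As a sanity check, here is a short SymPy sketch that verifies this binomial-type identity for one value of n, re-deriving the polynomials from the recurrence above.

    import sympy as sp

    x, y = sp.symbols('x y')

    def mittag_leffler(n, var):
        # build M_n in the variable var via the recurrence, starting from M_0 = 1
        M = sp.Integer(1)
        for _ in range(n):
            M = sp.expand(var/2 * (M.subs(var, var + 1) + 2*M + M.subs(var, var - 1)))
        return M

    n = 4
    lhs = mittag_leffler(n, x).subs(x, x + y)
    rhs = sum(sp.binomial(n, k) * mittag_leffler(k, x) * mittag_leffler(n - k, y)
              for k in range(n + 1))
    print(sp.expand(lhs - rhs))  # prints 0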

Related posts

Stable and unstable recurrence relations

The previous post looked at computing recurrence relations. That post ends with a warning that recursive evaluations may or may not be numerically stable. This post will give examples that illustrate stability and instability.

There are two kinds of Bessel functions, denoted J and Y. These are called Bessel functions of the first and second kinds respectively. These functions carry a subscript n denoting their order. Both kinds of Bessel functions satisfy the same recurrence relation:

f_{n+1} - (2n/x) f_n + f_{n-1} = 0

where f is J or Y.

If you apply the recurrence relation in the increasing direction, it is unstable for J but stable for Y.

If you apply the recurrence relation in the opposite direction, it is stable for J and unstable for Y.

We will illustrate the above claims using the following Python code. Since both kinds of Bessel function satisfy the same recurrence, we pass the Bessel function in as a function argument. SciPy implements Bessel functions of the first kind as jv and Bessel functions of the second kind as yv. [1]

    from numpy import pi, zeros
    from scipy.special import jv, yv

    def report(k, computed, exact):
        print(k, computed, exact, abs(computed - exact)/exact)

    def bessel_up(x, n, f):
        # apply the recurrence in the direction of increasing order
        a, b = f(0, x), f(1, x)
        for k in range(2, n+1):
            a, b = b, 2*(k-1)*b/x - a
            report(k, b, f(k, x))

    def bessel_down(x, n, f):
        # apply the recurrence in the direction of decreasing order
        a, b = f(n, x), f(n-1, x)
        for k in range(n-2, -1, -1):
            a, b = b, 2*(k+1)*b/x - a
            report(k, b, f(k, x))

We try this out as follows:

    bessel_up(1, 20, jv)
    bessel_down(1, 20, jv)
    bessel_up(1, 20, yv)
    bessel_down(1, 20, yv)

When we compute J_n(1) using bessel_up, the relative error starts out small and grows to about 1% when n = 9. The relative error increases rapidly from there. When n = 10, the relative error is 356%.

For n = 20, the recurrence gives a value of 316894.36 while the true value is 3.87e-25, i.e. the computed value is 30 orders of magnitude larger than the correct value!

When we use bessel_down, the results are correct to full precision.

Next we compute Y_n(1) using bessel_up, and the results are correct to full precision.

When we compute Y_n(1) using bessel_down, the results are about as bad as computing J_n(1) using bessel_up. We compute Y_0(1) as 5.7e+27 while the correct value is roughly 0.088.

There are functions, such as Legendre polynomials, whose recurrence relations are stable in either direction, at least for some range of inputs. But it would be naive to assume that a recurrence is stable without some exploration.

Miller’s algorithm

There is a trick for using the downward recurrence for Bessel functions known as Miller’s algorithm. It sounds crazy at first: Assume J_N(x) = 1 and J_{N+1}(x) = 0 for some large N, and run the recurrence downward.

Since we don’t know J_N(x), our results will be off by some constant proportion. But there’s a way to find out what that proportionality constant is using the relation described here.

1 = J_0(x) + 2J_2(x) + 2J_4(x) + 2J_6(x) + \cdots

We add up our computed values for the terms on the right side, then divide by the sum to normalize our estimates. Miller’s recurrence algorithm applies more generally to other recurrences where the downward recurrence is stable and there exists a normalization identity analogous to the one for Bessel functions.

The following code lets us experiment with Miller’s algorithm.

    def miller(x, N):
        j = zeros(N) # array to store values

        a, b = 0, 1  # assume J_{N+1}(x) = 0 and J_N(x) = 1
        for k in range(N-1, -1, -1):
            a, b = b, 2*(k+1)*b/x - a
            j[k] = b

        # normalize using 1 = J_0(x) + 2 J_2(x) + 2 J_4(x) + ...
        norm = j[0] + sum(2*j[k] for k in range(2, N, 2))
        j /= norm

        for k in range(N-1, -1, -1):
            report(k, j[k], jv(k, x))

When we call miller(pi, 20) we see that Miller’s method computes J_n(π) accurately. The error starts out moderately small and decreases until the results are accurate to floating point precision.

    |----+------------|
    |  k | rel. error |
    |----+------------|
    | 19 |   3.91e-07 |
    | 17 |   2.35e-09 |
    | 16 |   2.17e-11 |
    | 15 |   2.23e-13 |
    | 14 |   3.51e-15 |
    |----+------------|

For smaller k the relative error is also around 10^-15, i.e. essentially full precision.

[1] Why do the SciPy names end in “v”? The order of a Bessel function does not have to be an integer. It could be any real number, and the customary mathematical notation is to use a Greek letter ν (nu) as a subscript rather than n as a reminder that the subscript might not represent an integer. Since a Greek ν looks similar to an English v, SciPy uses v as a sort of approximation of ν.

Analogies between Weierstrass functions and trig functions

If you look at the Wikipedia article on Weierstrass functions, you’ll find a line that says “the relation between the sigma, zeta, and ℘ functions is analogous to that between the sine, cotangent, and squared cosecant functions.” This post unpacks that sentence.

Weierstrass p function

First of all, what is ℘? It’s the Weierstrass elliptic function, which is the mother of all elliptic functions in some sense. All other elliptic functions can be constructed from this function and its derivatives. As for the symbol itself, ℘ is the traditional symbol for the Weierstrass function. It’s U+2118 in Unicode, &weierp; in HTML, and \wp in LaTeX.

The line above suggests that ℘(x) is analogous to csc²(x). Indeed, the plots of the two functions are nearly identical.

Here’s Weierstrass’ function:

Weierstrass p plot

And here’s csc²:

plot of csc^2

The two plots basically agree to within the thickness of a line.

The Weierstrass function ℘ has two parameters that I haven’t mentioned. Elliptic functions are periodic in two directions in the complex plane, and so their values everywhere are determined by their values over a parallelogram. The two parameters specify the fundamental parallelogram, or at least that’s one way of parametrizing the function. The WeierstrassP function in Mathematica takes two other parameters, called the invariants, and these invariants were chosen in the plot above to match the period of the cosecant function.

    Plot[Sin[x]^-2, {x, -10, 10}]
    Plot[WeierstrassP[
            x, 
            WeierstrassInvariants[{Pi/2, Pi I/2}]
         ], 
        {x, -10, 10}
    ]

The fundamental parallelogram is defined in terms of half periods, usually denoted with ω’s, and the invariants are denoted with g‘s. The function WeierstrassInvariants converts from half periods to invariants, from ω’s to g‘s.

Note that ℘ and cosecant squared are similar along the real axis, but they’re very different in the full complex plane. Trig functions are periodic along the real axis but grow exponentially along the complex axis. Elliptic functions are periodic along both axes.

Weierstrass zeta function

The Weierstrass zeta function is not an elliptic function. It is not periodic but rather quasiperiodic. The derivative of the Weierstrass zeta function is the negative of the Weierstrass elliptic function, i.e.

ζ′(x) = -℘(x)

which is analogous to the fact that the derivative of cot(x) is -csc²(x). So in that sense ζ is to ℘ as cotangent is to cosecant squared.

The plots of ζ(x) and cot(x) are similar as shown below.

Plot of Weierstrass zeta and cotangent

The Mathematica code to make the plot above was

    Plot[
        {WeierstrassZeta[x, WeierstrassInvariants[{Pi/2, Pi I/2}]], 
         Cot[x]}, 
        {x, -10, 10}, 
        PlotLabels -> {"zeta", "cot"}
    ]

Weierstrass sigma function

The Weierstrass sigma function is also not an elliptic function. It is analogous to the sine function as follows. The logarithmic derivative of the Weierstrass sigma function is the Weierstrass zeta function, just as the logarithmic derivative of sine is cotangent. That is,

(log σ(x))′ = ζ(x).

The logarithmic derivative of a function is the derivative of its log, and so the logarithmic derivative of a function f is f ‘ / f.

However, the plot of the sigma function, WeierstrassSigma in Mathematica, hardly looks like sine.

Plot of Weierstrass sigma and sine

So in summary, logarithmic derivative takes Weierstrass sigma to Weierstrass zeta just as it takes sine to cotangent. Negative derivative takes Weierstrass zeta to Weierstrass ℘ just as it takes cotangent to cosecant squared.

Related elliptic function posts

Area of sinc and jinc function lobes

Someone left a comment this morning on my blog post on sinc and jinc integrals regarding the area of the lobes.

It would be nice to have the values of integrals of each lobe, i.e. integrals between 0 and multiples of pi. Anyone knows of such a table?

This post will include Python code to address that question. (Update: added asymptotic approximation. See below.)

First, let me back up and explain the context. The sinc function is defined as [1]

sinc(x) = sin(x) / x

and the jinc function is defined analogously as

jinc(x) = J_1(x) / x,

substituting the Bessel function J1 for the sine function. You could think of Bessel functions as analogs of sines and cosines. Bessel functions often come up when vibrations are described in polar coordinates, just as sines and cosines come up when using rectangular coordinates.

Here’s a plot of the sinc and jinc functions:

The lobes are the regions between crossings of the x-axis. For the sinc function, the lobe in the middle runs from -π to π, and for n > 0 the nth lobe runs from nπ to (n+1)π. The zeros of Bessel functions are not uniformly spaced like the zeros of the sine function, but they come up in application frequently and so it’s easy to find software to compute their locations.

First of all we’ll need some imports.

    from numpy import sin, pi
    from scipy.special import jn, jn_zeros
    from scipy.integrate import quad

The sinc and jinc functions are continuous at zero, but the computer doesn’t know that [2]. To prevent division by zero, we return the limiting value of each function for very small arguments.

    def sinc(x):
        return 1 if abs(x) < 1e-8 else sin(x)/x

    def jinc(x):
        return 0.5 if abs(x) < 1e-8 else jn(1,x)/x

You can show via Taylor series that these functions are exact to the limits of floating point precision for |x| < 10^-8.

Here’s code to compute the area of the sinc lobes.

    def sinc_lobe_area(n):
        n = abs(n)
        integral, info = quad(sinc, n*pi, (n+1)*pi)
        return 2*integral if n == 0 else integral

The corresponding code for the jinc function is a little more complicated because we need to compute the zeros for the Bessel function J1. Our solution is a little clunky because we have an upper bound N on the lobe number. Ideally we’d work out an asymptotic value for the lobe area and compute zeros up to the point where the asymptotic approximation became sufficiently accurate, and switch over to the asymptotic formula for sufficiently large n.

    N = 200 # upper bound on lobe number, chosen arbitrarily
    jzeros = jn_zeros(1, N) # positive zeros of J_1

    def jinc_lobe_area(n):
        n = abs(n)
        assert(n < N)
        lower = 0 if n == 0 else jzeros[n-1]
        integral, info = quad(jinc, lower, jzeros[n])
        return 2*integral if n == 0 else integral

Note that the 0th element of the array returned by jn_zeros is the first positive zero of J1; it doesn’t include the zero at the origin.

For both sinc and jinc, the even numbered lobes have positive area and the odd numbered lobes have negative area. Here’s a plot of the absolute values of the lobe areas.

Asymptotic results

We can approximate the area of the nth lobe of the sinc function by using a midpoint approximation for 1/x. It works out that the area is asymptotically equal to

 (-1)^n \frac{4}{(2n+1)\pi}

We can do a similar calculation for the area of the nth jinc lobe, starting with the asymptotic approximation for jinc given here. We find that the area of the nth lobe of the jinc function is asymptotically equal to

\frac{(-1)^n}{\pi^2} \left( \frac{8}{4n+3} \right )^{3/2}

To get an idea of the accuracy of the asymptotic approximations, here are the results for n=100.

    sinc area:      0.00633455
    asymptotic:     0.00633452
    absolute error: 2.97e-8
    relative error: 4.69e-6

    jinc area:      0.000283391
    asymptotic:     0.000283385
    absolute error: 5.66e-9
    relative error: 2.00e-5
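
Here is a minimal sketch of how the sinc comparison above can be reproduced. It assumes the sinc and sinc_lobe_area functions defined earlier are in scope; the helper below is just part of the sketch.

    from numpy import pi

    def sinc_lobe_asymptotic(n):
        # asymptotic lobe area, ignoring the sign
        return 4/((2*n + 1)*pi)

    n = 100
    exact = abs(sinc_lobe_area(n))
    approx = sinc_lobe_asymptotic(n)
    print(exact, approx, abs(exact - approx), abs(exact - approx)/exact)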

More signal processing posts

[1] Some authors define sinc(x) as sin(πx)/πx. Both definitions are common.

[2] Scipy has a sinc function in scipy.special, defined as sin(πx)/πx, but it doesn’t have a jinc function.

Sum of all Spheres

I ran across a video this afternoon that explains that the sum of volumes of all even-dimensional unit spheres equals e^π.

Why is that? Define vol(n) to be the volume of the unit sphere in dimension n. Then

\mathrm{vol}(n) = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}

and so the sum of the volumes of all even dimensional spheres is

\sum_{k=0}^\infty \mathrm{vol}(2k) = \sum_{k=0}^\infty \frac{\pi^k}{k!} = \exp(\pi)

But what if you wanted to sum the volumes of all odd dimensional unit spheres? Or all dimensions, even and odd?

The answers to all these questions are easy in terms of the Mittag-Leffler function that I blogged about a while back. This function is defined as

E_{\alpha, \beta}(x) = \sum_{k=0}^\infty \frac{x^k}{\Gamma(\alpha k+\beta)}

and reduces to the exponential function when α = β = 1.

The sum of the volumes of all unit spheres of all dimensions is E_{1/2, 1}(√π). And from the post mentioned above,

E_{1/2, 1}(x) = \exp(x^2) \, \mbox{erfc}(-x)

where erfc is the complementary error function. So the sum of all volumes of spheres is exp(π) erfc(-√π).

Now erfc(-√π) ≈ 1.9878 and so this says the sum of the volumes of spheres of all dimensions is about twice the sum of the even dimensional spheres alone. And so the sum of the odd dimensional unit sphere volumes is almost the same as the sum of the even dimensional ones.

By the way, you could easily answer other questions about sums of sphere volumes in terms of the Mittag-Leffler function. For example, if you want to add up the volumes in all dimensions that are a multiple of 3, you get E_{3/2, 1}(π^{3/2}).
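
These sums are easy to check numerically. Here is a sketch using mpmath.

    from mpmath import mp, pi, gamma, exp, erfc, sqrt, nsum, inf

    mp.dps = 30
    vol = lambda n: pi**(n/2) / gamma(n/2 + 1)

    even_dims = nsum(lambda k: vol(2*k), [0, inf])  # should equal exp(pi)
    all_dims  = nsum(lambda n: vol(n), [0, inf])    # should equal exp(pi) erfc(-sqrt(pi))

    print(even_dims, exp(pi))
    print(all_dims, exp(pi)*erfc(-sqrt(pi)))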

Related posts

Bessel function crossings

The previous post looked at the angles that graphs make when they cross. For example, sin(x) and cos(x) always cross with the same angle. The same holds for sin(kx) and cos(kx) since the k simply rescales the x-axis.

The post ended with wondering about functions analogous to sine and cosine, such as Bessel functions. This post will look at that question in more detail. Specifically we’ll look at the functions Jν and Yν.

Because these two Bessel functions satisfy the same second order linear homogeneous differential equation, the Sturm separation theorem says that their zeros are interlaced: between each pair of consecutive zeros of Jν is exactly one zero of Yν, and between each pair of consecutive zeros of Yν there is exactly one zero of Jν.

Plotting Bessel functions J_3 and Y_3

In the following Python code, we find zeros of Jν, then look in between for places where Jν and Yν cross. Next we find the angle the two curves make at each intersection and plot the angles.

    from scipy.special import jn_zeros, jv, yv
    from scipy.optimize import bisect
    from numpy import empty, linspace, arccos
    import matplotlib.pyplot as plt
    
    n = 3 # bessel function order
    N = 100 # number of zeros
    
    z = jn_zeros(n, N) # Zeros of J_n
    crossings = empty(N-1)
    
    f = lambda x: jv(n,x) - yv(n,x)    
    for i in range(N-1):
        crossings[i] = bisect(f, z[i], z[i+1])
    
    def angle(n, x):
        # Derivatives of J_nu and Y_nu
        dj = 0.5*(jv(n-1,x) - jv(n+1,x))
        dy = 0.5*(yv(n-1,x) - yv(n+1,x))
        
        top = 1 + dj*dy
        bottom = ((1 + dj**2)*(1 + dy**2))**0.5
        return arccos(top/bottom)
        
    y = angle(n, crossings)
    plt.plot(y)
    plt.xlabel("Crossing number")
    plt.ylabel("Angle in radians")
    plt.show()

This shows that the angles steadily decrease, apparently quadratically.

Angles of crossing of J_3 and Y_3

This quadratic behavior is what we should expect from the asymptotics of Jν and Yν: For large arguments they act like shifted and rescaled versions of sin(x)/√x. So if we looked at √xJν and √xYν rather than Jν and Yν we’d expect the angles to reach some positive asymptote, and they do, as shown below.

Angles of crossing of √x J_3 and √xY_3

Related posts

New expansions of confluent hypergeometric function

Last week Bujanda et al published a paper [1] that gives new expansions for the confluent hypergeometric function. I’ll back up and explain what that means before saying more about the new paper.

Hypergeometric functions

Hypergeometric functions are something of a “grand unified theory” of special functions. Many functions that come up in applications are special cases of the hypergeometric function, as shown in the diagram below. (Larger version here.)

special function relationships

I give a brief introduction to hypergeometric functions in these notes (4-page pdf).

Confluent hypergeometric functions

The confluent hypergeometric function corresponds to Hypergeometric 1F1 in the chart above [2]. In Mathematica it goes by Hypergeometric1F1 and in Python goes by scipy.special.hyp1f1.

This function is important in probability because you can use it to compute the CDFs for the normal and gamma distributions.
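
For example, here is a sketch of the gamma distribution connection. It relies on the standard identity γ(s, x) = s^{-1} x^s e^{-x} M(1, s+1, x), so the regularized incomplete gamma function, i.e. the gamma CDF, can be computed from hyp1f1.

    from numpy import exp
    from scipy.special import hyp1f1, gammainc, gamma

    s, x = 2.5, 3.0
    cdf_via_1f1 = x**s * exp(-x) * hyp1f1(1, s + 1, x) / gamma(s + 1)
    print(cdf_via_1f1, gammainc(s, x))  # the two should agree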

New expansions

Bujanda et al give series expansions of the confluent hypergeometric functions in terms of incomplete gamma functions. That may not sound like progress because the incomplete gamma functions are still special functions, but the confluent hypergeometric function is a function of three variables M(a, b; z) whereas the incomplete gamma function γ(s, z) is a function of two variables.

They also give expansions in terms of elementary functions which may be of more interest. The authors give the series expansion below.

M_n(a, b; z) = \frac{\Gamma(b)}{\Gamma(a)\,\Gamma(b-a)} \sum_{k=0}^{n-1} A_k(a, b)\, F_k(z)

The A functions are given recursively by

and

A_n(a, b) = \frac{2}{n} \left( (2a - b)A_{n-1}(a, b) + 2(n-b)A_{n-2}(a,b) \right) 

for n > 1.

The F functions are given by

F_n(z) = \frac{n!}{(-z)^{n+1}} \Big( e_n(z/2) - \exp(z) e_n(-z/2) \Big)

where

e_n(z) = \sum_{k=0}^n \frac{z^k}{k!}

The authors give approximation error bounds in [1]. In the plot below we show that n = 3 makes a good approximation for M(3, 4.1, z). The error converges to zero uniformly as n increases.

Related posts

[1] Blanca Bujanda, José L. López, and Pedro J. Pagola. Convergence expansions of the confluent hypergeometric functions in terms of elementary functions. Mathematics of Computation. DOI: https://doi.org/10.1090/mcom/3389

[2] “Confluent” is a slightly archaic name, predating a more systematic naming for hypergeometric functions. The name mFn means the hypergeometric function has m upper parameters and n lower parameters. The confluent hypergeometric function has one of each, so it corresponds to 1F1. For more details, see these notes.

Gamma gamma gamma!

There are several things in math and statistics named gamma. Three examples are

  • the gamma function
  • the gamma constant
  • the gamma distribution

This post will show how these are related. We’ll also look at the incomplete gamma function which connects with all the above.

The gamma function

The gamma function is the most important function not usually found on a calculator. It’s the first “advanced” function you’re likely to learn about. You might see it in passing in a calculus class, in a homework problem on integration by parts, but usually there’s not much emphasis on it. But it comes up a lot in applications.

\Gamma(x) = \int_0^\infty t^{x-1 } e^{-t}\, dt

You can think of the gamma function as a way to extend factorial to non-integer values. For non-negative integers n, Γ(n + 1) = n!.

(Why is the argument n + 1 to Γ rather than n? There are a number of reasons, historical and practical. Short answer: some formulas turn out simpler if we define Γ that way.)

The gamma constant

The gamma constant, a.k.a. Euler’s constant or the Euler-Mascheroni constant, is defined as the asymptotic difference between harmonic numbers and logarithms. That is,

\gamma = \lim_{n\to\infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}- \log n\right)

The constant γ comes up fairly often in applications. But what does it have to do with the gamma function? There’s a reason the constant and the function are both named by the same Greek letter: the gamma constant appears in the product formula for the gamma function.

\frac{1}{\Gamma(x)} = x e^{\gamma x} \prod_{r=1}^\infty \left(1 + \frac{x}{r} \right) e^{-x/r}

If we take the logarithm of this formula and differentiate, we find that

\frac{\Gamma'(1)}{\Gamma(1)} = -\gamma
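
Here is a quick numerical check, a sketch using SciPy's digamma function, which computes Γ′/Γ.

    from numpy import euler_gamma
    from scipy.special import digamma

    print(digamma(1), -euler_gamma)  # both approximately -0.5772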

The gamma distribution

If you take the integrand defining the gamma function and turn it into a probability distribution by normalizing it to integrate to 1, you get the gamma distribution. That is, a gamma random variable with shape k has probability density function (PDF) given by

f(x) = \frac{1}{\Gamma(k)} x^{k-1} e^{-x}

More generally you could add a scaling parameter to the gamma distribution in the usual way. You could imagine the scaling parameter present here but set to 1 to make things simpler.

The incomplete gamma function

The incomplete gamma function relates to everything above. It’s like the (complete) gamma function, except the range of integration is finite. So it’s now a function of two variables, the extra variable being the limit of integration.

\gamma(s, x)= \int_0^x t^{s-1} e^{-t} \, dt

(Note that now x appears in the limit of integration, not the exponent of t. This notation is inconsistent with the definition of the (complete) gamma function but it’s conventional.)

It uses a lower case gamma for its notation, like the gamma constant, and is a generalization of the gamma function. It’s also essentially the cumulative distribution function of the gamma distribution. That is, the CDF of a gamma random variable with shape s is γ(s, x) / Γ(s).

The function γ(s, x) / Γ(s) is called the regularized incomplete gamma function. Sometimes the distinction between the regularized and unregularized versions is not explicit. For example, in Python, the function gammainc does not compute the incomplete gamma function per se but the regularized incomplete gamma function. This makes sense because the latter is often more convenient to work with numerically.
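
Here is a small sketch illustrating that point: SciPy's gammainc agrees directly with the CDF of the gamma distribution.

    from scipy.special import gammainc
    from scipy.stats import gamma as gamma_dist

    s, x = 3.0, 2.5
    print(gammainc(s, x), gamma_dist.cdf(x, s))  # the two should agree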

Related posts

Fourier series for Jacobi functions

I’ve mentioned a couple times that Jacobi elliptic functions are generalizations of sines and cosines. In an earlier post I showed how you can make sn and cn from sine and cosine by a nonlinear rescaling of the input. In this post I’ll look at a linear scaling of the input and a sum of sines or cosines.

This post described how the Jacobi functions depend on a parameter m. As this parameter goes to zero, the functions sn and cn converge to sine and cosine respectively. Here we’ll look more quantitatively at what happens as m varies between 0 and 1.

The Jacobi functions are elliptic functions, so they’re periodic in two directions in the complex plane. The quarter period along the real axis is called K and is given as a function of m by the following integral.

K(m) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{1-m\sin^2\theta}}

You can see that as m goes to zero, K goes to π/2, and so we have a function with period 2π.

Along the imaginary axis, the quarter period is given by K′ = K(1-m). So as m goes to zero, K′ goes to infinity. Said another way, the sn and cn functions lose periodicity in the imaginary direction as they converge to sine and cosine. Their fundamental rectangle gets stretched into an infinite vertical strip.

We said we’d get more quantitative, and to do that we introduce a function with the quaint name nome. The nome of a Jacobi elliptic function is defined as

q = \exp(-\pi K'/K)

Since K and K′ are both functions of m, q is a function of m, which we plot below.

nome plot

The nome is important because it quantifies the size of Fourier coefficients.

We mentioned a linear scaling at the top of the post. That scaling is v = πu/(2K). With that change of variables, we can write down Fourier series for sn, cn, and dn in terms of q.

\begin{align*} \mathrm{sn}(u) &=\frac{2\pi}{K\sqrt{m}} \sum_{n=0}^\infty \frac{q^{n+1/2}}{1-q^{2n+1}} \sin ((2n+1)v) \\ \mathrm{cn}(u)&=\frac{2\pi}{K\sqrt{m}} \sum_{n=0}^\infty \frac{q^{n+1/2}}{1+q^{2n+1}} \cos ((2n+1)v)\\ \mathrm{dn}(u)&=\frac{\pi}{2K} + \frac{2\pi}{K} \sum_{n=1}^\infty \frac{q^{n}}{1+q^{2n}} \cos (2nv). \end{align*}

These equations are complicated, but we can begin to understand them by simply looking at the size of the coefficients.

If q is small, the denominators of the coefficients are approximately 1, so we can concentrate on the numerators, which are a geometric sequence. The coefficients go to zero rapidly, and so a small number of terms in the Fourier series are enough to approximate the Jacobi functions well.

As m goes to 1, so does q(m), but the function q(m) waits until the last minute to catch up with m because it has such a steep slope near 1. This means the function q(m) can be small even when m is large.

In the plot below, m = 0.99, and yet sn is approximated fairly well with only two sine terms. This is because q(0.99) = 0.262. For smaller m, say 0.8, it’s hard to distinguish sn and its two-term sine series approximation visually.

Two-term Fourier approximation to sn

Note that the Fourier series for sn and cn only contain odd multiples of v. It’s not obvious why this is so. Because sn is an odd function, its Fourier series only contains sine terms, but that doesn’t explain why it should only contain odd frequencies. This gives our approximation extra accuracy: our two-term approximation is effectively a three-term approximation, with the middle term being zero.
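
Here is a minimal sketch of the two-term comparison described above, using SciPy's ellipk and ellipj functions.

    import numpy as np
    from scipy.special import ellipk, ellipj

    m = 0.99
    K, Kp = ellipk(m), ellipk(1 - m)
    q = np.exp(-np.pi*Kp/K)  # the nome, about 0.262 for m = 0.99

    def sn_series(u, terms=2):
        # partial sum of the Fourier series for sn
        v = np.pi*u/(2*K)
        return (2*np.pi/(K*np.sqrt(m))) * sum(
            q**(n + 0.5)/(1 - q**(2*n + 1)) * np.sin((2*n + 1)*v)
            for n in range(terms))

    u = np.linspace(0, 4*K, 200)
    sn, cn, dn, ph = ellipj(u, m)
    print(np.max(np.abs(sn - sn_series(u))))  # modest error even for m = 0.99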

Related posts