# Unit speed curve parameterization

A friend asked me a question the other day that came out of a graphics application. He needed to trace out an ellipse in such a way that the length of curve traced out each second was constant. For a circle, the problem is simple: (cos(t), sin(t)) will trace out a circle covering a constant amount of arc length per unit time. The analogous parameterization for an ellipse, (a cos(t), b sin(t)) will move faster near the longer semi-axis and slower near the shorter one.

There’s a general solution to the problem for any curve. Given a parameterization p(t), where p is vector-valued, the length covered from time 0 to time t is

If you change the time parameterization by inverting this function, solving for t as a function of s, then the total length of curve traversed by p(t(s)) up to time s is s. This is called either the unit speed parameterization or parameterization by arc length.

The hard part is inverting s(t). If you had to find a unit speed parameterization in a calculus class, the problem would be carefully designed so the function s(t) is easy to invert. Real applications don’t usually work out so easily.

## Digression on elliptic integrals and elliptic functions

My response to my friend’s question was that there probably is a closed-form unit speed parameterization, but it would probably involve elliptic functions. He didn’t need much resolution, and decided to do something ad hoc.

Starting with the parameterization  p(t) = (a cos(t), b sin(t)), the arc length s(t) is given by a special function known as an “incomplete elliptic integral of the second kind.” I remembered that the Jacobi elliptic functions were related to the inverses of elliptic integrals, so my hunch was that you could make a unit speed parameterization using Jacobi elliptic functions. Maybe you can, but it’s not as easy as I thought because the Jacobi functions are related to the inverses of elliptic integrals of the first kind.

Elliptic integrals are so named because they were first identified by computing arc length for a (portion of) an ellipse. Elliptic functions were discovered by inverting elliptic integrals, but not the same class of elliptic integrals that give the arc length of an ellipse. (There may well be a transformation that makes these more directly related, but I’m not aware of it.)

Incidentally, elliptic curves are related to elliptic functions, but they are not ellipses! There is a connection from ellipses to elliptic curves, but it’s historical and indirect.

What if we had a more general curve than an ellipse, say something parameterized by cubic splines? Cubic splines are piecewise cubic polynomials, patched together in such a way that the first and second derivatives are continuous across the patches. We can find length of a spline by finding the length of each polynomial patch.

If p(t) is the parameterization of a curve in 2 or 3 dimensions (or really any number of dimensions) and each component of p is a cubic polynomial, then each component of the derivative of p is a quadratic polynomial, and so the sum of the squares of the components is a fourth degree polynomial. So finding the arc length involves integrating the square root of a fourth degree polynomial. This makes it an elliptic integral!

Unfortunately, knowing that the arc length of a cubic spline corresponds to an elliptic integral is not so useful because it could be any type of elliptic integral, depending on its parameters. You’d have to do some work first to put it into a form where you could call on elliptic integrals to finish your problem.

## Numerically computing arc length and unit speed parameterization

The elliptic integral path is something of a dead end. It could still be useful if you needed high accuracy, or if you had some restriction on the class of curves you’re interested in. But in general, you’d need to use numerical integration to find the arc length.

You could also find unit-speed parameterizations numerically, using root-finding to invert s(t) at particular points. Since s is an increasing function of t, you could use a bisection method, which is not the most efficient approach but very robust.

It takes a fair amount of computation to carry root finding where each function evaluation requires computing a numerical integral. But this would work, and depending on your context it could be efficient enough.

If you needed more efficiency, say for a real-time embedded system, you could take a different approach. Your spline is probably an approximation to something else, and so your solution only needs to be as accurate as the spline approximation. This gives you the wiggle room to do something more efficient. You might change your parameterization slightly to make the arc length calculations easier, or come up with a way to approximately invert the arc length function, something that takes advantage of your particular problem and error tolerance.

# Surprising curves with sine and sn

In the previous post I said that the Jacobi functions are like trig functions. That’s true if you look along the real axis. If you look at the rest of the complex plane you’ll see how they can be very different.

The sine function is periodic along the real axis, but it grows exponentially along the imaginary axis. I mentioned parenthetically last time that the Jacobi functions are not just periodic but doubly periodic. That means that not only are they periodic as you move along the real axis, they’re periodic when you move along any line in the complex plane [1].We’ll illustrate this with some plots.

It will make our code more compact if we first define a function to split a complex number into its real and imaginary parts.

    pair[z_] := {Re[z], Im[z]}

Here’s what it looks like when you plot the real and imaginary parts of the sine function along the line (10 + 0.5i)t.


ParametricPlot[
pair[ Sin[(0.5 I + 10)t] ],
{t, 0, 10},
PlotRange -> All
]


By contrast, here’s a plot of the sn function along the line 1.25 + it.

    ParametricPlot[
pair[ JacobiSN[1.25 + I t, 0.5] ],
{t, 0, 10}
]


It’s a closed loop because sn is periodic in every direction. (By the way, the curve looks like an egg. More posts about egg-shaped curves starting here.)

When you plot sn along various lines you’ll always get closed curves, but you won’t always get such round curves. Here’s an example that doesn’t look like a closed loop because the curve turns around sharply at each end.


ParametricPlot[
pair[ JacobiCN[ (I + 0.5) t, 0.5] ],
{t, 0, 10}
]


To show that it really is periodic, we’ll add a vertical time dimension to visualize how the curve is traced out over time.

    triple[z_, t_] = {Re[z], Im[z], t}
ParametricPlot3D[
triple[JacobiSN[(I + 0.5) t, 0.5], 0.1 t],
{t, 0, 20}
]

Here’s another curve plotting sn along a different line.

    ParametricPlot[
pair[ JacobiSN[(0.9 + I) t, 0.5] ],
{t, 0, 55},
PlotRange -> All,
PlotPoints -> 100
]


As before, we can add a time dimension to imagine how the curve is traced out.

    ParametricPlot3D[
triple[JacobiSN[(0.9 + I) t, 0.5], 0.1 t],
{t, 0, 150},
PlotRange -> All,
PlotPoints -> 200
]


Here’s a variation on the plot above that stretches the vertical axis so you can better see what’s going on.


ParametricPlot3D[
triple[JacobiSN[(0.9 + I) t, 0.5], 0.2 t],
{t, 0, 150},
PlotRange -> All,
PlotPoints -> 200
]


[1] To be more precise, elliptic functions are periodic in two linearly independent directions, and thus in any direction that’s an integer linear combination of those two directions. So they’re exactly periodic in a countable number of directions almost periodic in every other direction.

# System of Jacobi elliptic functions

Jacobi’s elliptic functions are sorta like trig functions. His functions sn and cn have names that reminiscent of sine and cosine for good reason. These functions come up in applications such as the nonlinear pendulum (i.e. when θ is too
large to assume θ is a good enough approximation to sin θ) and in conformal mapping.

I ran across an article [1] yesterday that shows how Jacobi’s three elliptic functions—sn, cn, and dn—could be defined by one dynamical system

with initial conditions x(0) = 0, y(0) = 1, and z(0) = 1.

The parameter k is the modulus. (In Mathematica’s notation below, k² is the modulus.) As k decreases to 0, sn converges to sine, cn to cosine, and dn to 1. As k increases to 1, sn converges tanh, and cn and dn converge to sech. So you could think of k as a knob you turn to go from being more like circular functions (ordinary trig functions) to more like hyperbolic functions.

Since we have a dynamical system, let’s plot the solution, varying the modulus each time.The Jacobi functions are periodic (in fact they’re doubly periodic) and so the plots will be closed loops.

f[t_, m_] = {JacobiSN[t, m], JacobiCN[t, m], JacobiDN[t, m]}
ParametricPlot3D[f[t, 0.707], {t, 0, 10}, AspectRatio -> 1]


ParametricPlot3D[f[t, 0.99], {t, 0, 20}, AspectRatio -> 1]


ParametricPlot3D[f[t, 0.01], {t, 0, 20}, AspectRatio -> 1]


Note that this last plot is nearly flat because the modulus is small and so z is nearly constant. The small modulus also makes the phase portrait nearly circular because x is approximately sine and y is approximately cosine.

[1] Kenneth R. Meyer. Jacobi Elliptic Functions from a Dynamical Systems Point of View. The American Mathematical Monthly, Vol. 108, No. 8 (Oct., 2001), pp. 729-737

# Approximating gamma ratios

Ratios of gamma functions come up often in applications. If the two gamma function arguments differ by an integer, then it’s easy to calculate their ratio exactly by using (repeatedly if necessary) the fact at Γ(x + 1) = x Γ(x).

If the arguments differ by 1/2, there is no closed formula, but the there are useful approximations. I’ve needed something like this a few times lately.

The simplest approximation is

You could motivate or interpret this as saying Γ(x + 1/2) is approximately the geometric mean between Γ(x + 1) and Γ(x). As we’ll see in the plot below, this approximation is good to a couple significant figures for moderate values of x.

There is another approximation that is a little more complicated but much more accurate.

The following plot shows the relative error in both approximations.

By the way, the first approximation above is a special case of the more general approximation

Source:  J. S. Frame. An Approximation to the Quotient of Gamma Function. The American Mathematical Monthly, Vol. 56, No. 8 (Oct., 1949), pp. 529-535

# Hypergeometric functions are key

From Orthogonal Polynomials and Special Functions by Richard Askey:

At first the results we needed were in the literature but after a while we ran out of known results and had to learn something about special functions. This was a very unsettling experience for there were very few places to go to really learn about special functions. At least that is what we thought. Actually there were many, but the typical American graduate education which we had did not include anything about hypergeometric functions. And hypergeometric functions are the key to this subject, as I have found out after many years of fighting them.

Emphasis added.

Askey’s book was written in 1975, and he was describing his experience from ten years before that. Special functions, and in particular hypergeometric functions, went from being common knowledge among mathematicians at the beginning of the 20th century to being arcane by mid century.

I learned little about special functions and nothing about hypergeometric functions as a graduate student. I first ran into hypergeometric functions reading in Concrete Mathematics how they are used in combinatorics and in calculating sums in closed form. Then when I started working in statistics I found that they are everywhere.

Hypergeometric functions are very useful, but not often taught anymore. Like a lot of useful mathematics, they fall between two stools. They’re considered too advanced or arcane for the undergraduate curriculum, and not a hot enough area of research to be part of the graduate curriculum.

Related posts:

# Orthogonal polynomials and the beta distribution

This post shows a connection between three families of orthogonal polynomials—Legendre, Chebyshev, and Jacobi—and the beta distribution.

## Legendre, Chebyshev, and Jacobi polynomials

A family of polynomials Pk is orthogonal over the interval [-1, 1] with respect to a weight w(x) if

whenever mn.

If w(x) = 1, we get the Legendre polynomials.

If w(x) = (1 – x²)-1/2 we get the Chebyshev polynomials.

These are both special cases of the Jacobi polynomials which have weight w(x) = (1- x)α (1 + x)β. Legendre polynomials correspond to α = β = 0, and Chebyshev polynomials correspond to α = β = -1/2.

## Connection to beta distribution

The weight function for Jacobi polynomials is a rescaling of the density function of a beta distribution. The change of variables x = 1 – 2u shows

The right side is proportional to the expected value of f(1 – 2X) where X is a random variable with a beta(α + 1, β+1) distribution. So for fixed α and β, if mn and X has a beta(α + 1, β+1) distribution, then the expected value of Pm(1 – 2X) Pn(1 – 2X) is zero.

While we’re at it, we’ll briefly mention two other connections between orthogonal polynomials and probability: Laguerre polynomials and Hermite polynomials.

## Laguerre polynomials

The Laguerre polynomials are orthogonal over the interval [0, ∞) with weight w(x) = xα exp(-x), which is proportional to the density of a gamma random variable with shape α+1 and scale 1.

## Hermite polynomials

There are two minor variations on the Hermite polynomials, depending on whether you take the weight to be exp(-x²) or exp(-x²/2). These are sometimes known as the physicist’s Hermite polynomials and the probabilist’s Hermite polynomials. Naturally we’re interested in the latter. The probabilist’s Hermite polynomials are orthogonal over (-∞, ∞) with the standard normal (Gaussian) density as the weight.

# Length of a rose

The polar graph of r = cos(kθ) is called a rose. If k is even, the curve will trace out 2k petals as θ runs between 0 and 2π. If k is odd, it will trace out k petals, tracing each one twice. For example, here’s a rose with k = 5.

(I rotated the graph 36° so it would be symmetric about the vertical axis rather than the horizontal axis.)

The arc length of a curve in polar coordinates is given by

and so we can use this find the length. The integral doesn’t have a closed form in terms of elementary functions. Instead, the result turns out to use a special function E(x), the “complete elliptic integral of the second kind,” defined by

Here’s the calculation for the length of a rose:

So the arc length of the rose r = cos(kθ) with θ running from 0 to 2π is 4 E(-k² + 1). You can calculate E in SciPy with scipy.special.ellipe.

If we compute the length of the rose at the top of the post, we get 4 E(-24) = 21.01. Does that pass the sniff test? Each petal goes from r = 0 out to r = 1 and back. If the petal were a straight line, this would have length 2. Since the petals are curved, the length of each is a little more than 2. There are five petals, so the result should be a little more than 10. But we got a little more than 20. How can that be? Since 5 is odd, the rose with k = 5 traces each petal twice, so we should expect a value of a little more than 20, which is what we got.

As k gets larger, the petals come closer to being straight lines. So we should expect that 4E(-k² + 1) approaches 4k as k gets large. The following plot of E(-k² + 1) – k provides empirical support for this conjecture by showing that the difference approaches 0, and gives an idea of the rate of convergence. It should be possible to prove that, say, that E(-k²) asymptotically approaches k, but I haven’t done this.

Related posts:

# Denver airport, Weierstrass, and A&S

Last night I was driving toward the Denver airport and the airport reminded me of the cover of Abramowitz and Stegun’s Handbook of Mathematical Functions.

Here’s the airport:

And here’s the book cover:

I’ve written about the image on book cover before. Someone asked me what function it graphed and I decided it was probably the Weierstrass ℘ function.

For more on Weierstrass’ elliptic function and why I think that’s what’s on the cover of A&S, see this post.

Photo of Denver airport via Wikipedia.

# Function on cover of Abramowitz & Stegun

Someone mailed me this afternoon asking if I knew what function was graphed on the cover of Abramowitz and Stegun’s famous Handbook of Mathematical Functions.

Here’s a close-up of the graph from a photo of my copy of A&S.

This was my reply.

It looks like a complex function of a complex variable. I assume the height is the magnitude and the markings on the graph are the phase. That would make it an elliptic function because it’s periodic in two directions.

It has one pole and one zero in each period. I think elliptic functions are determined, up to a constant, by their periods, zeros, and poles, so it should be possible to identify the function.

In fact, I expect it’s the Weierstrass p function. More properly, the Weierstrass ℘ function, sometimes called Weierstass’ elliptic function. (Some readers will have a font installed that will properly render ℘ and some not. More on the symbol ℘ here.)

Related posts:

# Bessel series for a constant

Fourier series express functions as a sum of sines and cosines of different frequencies. Bessel series are analogous, expressing functions as a sum of Bessel functions of different orders.

Fourier series arise naturally when working in rectangular coordinates. Bessel series arise naturally when working in polar coordinates.

The Fourier series for a constant is trivial. You can think of a constant as a cosine with frequency zero.

The Bessel series for a constant is not as simple, but more interesting. Here we have

Since

we can write the series above more symmetrically as

Related posts:

# Approximate inverse of the gamma function

The other day I ran across a blog post by Brian Hayes that linked to an article by David Cantrell on how to compute the inverse of the gamma function. Cantrell gives an approximation in terms of the Lambert W function.

In this post we’ll write a little Python code to kick the tires on Cantrell’s approximation. The post also illustrates how to do some common tasks using SciPy and matplotlib.

Here are the imports we’ll need.

      import matplotlib.pyplot as plt
from scipy import pi, e, sqrt, log, linspace
from scipy.special import lambertw, gamma, psi
from scipy.optimize import root


First of all, the gamma function has a local minimum k somewhere between 1 and 2, and so it only makes sense to speak of its inverse to the left or right of this point. Gamma is strictly increasing for real values larger than k.

To find k we look for where the derivative of gamma is zero. It’s more common to work with the derivative of the logarithm of the gamma function than the derivative of the gamma function itself. That works just as well because gamma has a minimum where its log has a minimum. The derivative of the log of the gamma function is called ψ and is implemented in SciPy as scipy.special.psi. We use the function scipy.optimize.root to find where ψ is zero.

The root function returns more information than just the root we’re after. The root(s) are returned in the arrayx, and in our case there’s only one root, so we take the first element of the array:

      k = root(psi, 1.46).x[0]


Now here is Cantrell’s algorithm:

      c = sqrt(2*pi)/e - gamma(k)

def L(x):
return log((x+c)/sqrt(2*pi))

def W(x):
return lambertw(x)

def AIG(x):
return L(x) / W( L(x) / e) + 0.5


Cantrell uses AIG for Approximate Inverse Gamma.

How well goes this algorithm work? For starters, we’ll see how well it does when we do a round trip, following the exact gamma with the approximate inverse.

      x = linspace(5, 30, 100)
plt.plot(x, AIG(gamma(x)))
plt.show()


This produces the following plot:

We get a straight line, as we should, so next we do a more demanding test. We’ll look at the absolute error in the approximate inverse. We’ll use a log scale on the x-axis since gamma values get large quickly.

      y = gamma(x)
plt.plot(y, x- AIG(y))
plt.xscale("log")
plt.show()


This shows the approximation error is small, and gets smaller as its argument increases.

Cantrell’s algorithm is based on an asymptotic approximation, so it’s not surprising that it improves for large arguments.

Related posts:

# Mittag-Leffler function and probability distribution

The Mittag-Leffler function is a generalization of the exponential function. Since k!= Γ(k + 1), we can write the exponential function’s power series as

and we can generalize this to the Mittag-Leffler function

which reduces to the exponential function when α = β = 1. There are a few other values of α and β for which the Mittag-Leffler function reduces to more familiar functions. For example,

and

where erfc(x) is the complementary error function.

## History

Mittag-Leffler was one person, not two. When I first saw the Mittag-Leffler theorem in complex analysis, I assumed it was named after two people, Mittag and Leffler. But the theorem and the function discussed here are named after one man, the Swedish mathematician Magnus Gustaf (Gösta) Mittag-Leffler (1846–1927).

The function that Mr. Mittag-Leffler originally introduced did not have a β parameter; that generalization came later. The function Eα is Eα, 1.

## Mittag-Leffler probability distributions

Just as you can make a couple probability distributions out of the exponential function, you can make a couple probability distributions out of the Mittag-Leffler function.

### Continuous Mittag-Leffler distribution

The exponential function exp(-x) is positive over [0, ∞) and integrates to 1, so we can define a probability distribution whose density (PDF) function is f(x) = exp(-x) and whose distribution function (CDF) is F(x) = 1 – exp(-x). The Mittag-Leffler distribution has CDF is 1 – Eα(-xα) and so reduces to the exponential distribution when α = 1. For 0 < α < 1, the Mittag-Leffler distribution is a fat-tailed generalization of the exponential. [1]

### Discrete Mittag-Leffler distribution

The Poisson distribution comes from taking the power series for exp(λ), normalizing it to 1, and using the kth term as the probability mass for k. That is,

The analogous discrete Mittag-Leffler distribution [2] has probability mass function

## Fractional differential equations

In addition to probability and statistics, the the Mittag-Leffler function comes up in fractional calculus. It plays a role analogous to that of the exponential distribution in classical calculus. Just as the solution to the simple differential equation

is exp(ax), for 0 < μ < 1, the solution to the fractional differential equation

is axμ-1 Eμ, μ(a xμ). Note that this reduces to exp(ax) when μ = 1. [3]

## References

[1] Gwo Dong Lin. Journal of Statistical Planning and Inference 74 (1998) 1–9, On the Mittag–Leffler distributions

[2] Subrata Chakraborty, S. H. Ong. Mittag-Leffler function distribution: A new generalization of hyper-Poisson distribution. arXiv:1411.0980v1

[3] Keith Oldham, Jan Myland, Jerome Spanier. An Atlas of Functions. Springer.

# Cornu’s spiral

Cornu’s spiral is the curve parameterized by

where C and S are the Fresnel functions, sometimes called the Fresnel cosine integral and Fresnel sine integral. Here’s a plot of the spiral.

Both Fresnel functions approach ½ as t → ∞ and so the curve slowly spirals toward (½, ½) in the first quadrant. And by symmetry, because both functions are odd, the curve spirals toward (-½, -½) in the third quadrant.

Here’s the Python code used to make the plot.

    from scipy.special import fresnel
from scipy import linspace
import matplotlib.pyplot as plt

t = linspace(-7, 7, 1000)
y, x = fresnel(t)

plt.plot(x, y)
plt.axes().set_aspect("equal")
plt.show()


The SciPy function fresnel returns both Fresnel functions at the same time. It returns them in the order (S, C) so the code reverses the order of these to match the Cornu curve.

One interesting feature of Cornu’s spiral is that its curvature increases linearly with time. This is easy to verify: because of the fundamental theorem of calculus, the Fresnel functions reduce to sines and cosines when you take derivatives, and you can show that the curvature at time t equals πt.

How fast does the curve spiral toward (½, ½)? Since the curvature at time t is πt, that says that at time t the curve is instantaneously bending like a circle of radius 1/πt. So the radius of the spiral is decreasing like 1/πt.

Cornu’s spiral was actually discovered by Euler. Cornu was an engineer who independently discovered the curve much later. Perhaps because Cornu used the curve in applications, his name is more commonly associated with the curve. At least I’ve more often seen it named after Cornu. This is an example of Stigler’s law that things are usually not named after the first person to discover them.

* * *

For daily posts on analysis, follow @AnalysisFact on Twitter.

# Energy in frequency modulated signals

In an earlier post we proved that if you modulate a cosine carrier by a sine signal you get a signal whose sideband amplitudes are given by Bessel functions. Specifically:

When β = 0, we have the unmodulated carrier, cos(2π fct), on both sides. When β is positive but small, J0(β) is near 1, and so the frequency component corresponding to the carrier is only slightly diminished. Also, the sideband amplitudes, the values of Jn(β) for n ≠ 0, are small and decay rapidly as |n| increases. As β increases, the amplitude of the carrier component decreases, the sideband amplitudes increase, and the sidebands decay more slowly.

We can be much more precise: the energy in the modulated signal is the same as the energy in the unmodulated signal. As β increases, more enery transfers to the sidebands, but the total energy stays the same. This conservation of energy result applies to more complex signals than just pure sine waves, but it’s easier to demonstrate in the case of a simple signal.

## Proof

To prove the energy stays constant, we show that the sum of the squares of the coefficients of the cosine components is the same for the modulated and unmodulated signal.The unmodulated signal is just cos(2π fct), and so the only coefficient is 1. That means we have to prove

This is a well-known result. For example, it is equation 9.1.76 in Abramowitz and Stegun. We’ll show how to prove it from first principles. We’ll actually prove a more general result, the Newmann-Schläffi addition formula, then show our result follows easily from that.

### Newmann-Schläffi addition formula

Whittaker and Watson define the Bessel functions by their generating function:

This means that when you expand the expression on the left as a power series in t, whatever is multiplied by tn is Jn(z) by definition. (There are other ways of defining the Bessel functions, but this way leads quickly to what we want to prove.)

We begin by factoring the Bessel generating function applied to zw.

Next we expand both sides as power series.

and look at the terms involving tn on both sides. On the left this is Jn(zw). On the right, we multiply two power series. We will get a term containing tn whenever we multiply terms tj and tk where j and k sum to n.

The equation above is the Newmann-Schläffi addition formula.

### Sum of squared coefficients

To prove that the sum of the squared sideband coefficients is 1,  we apply the addition formula with n = 0, z = β, and w = -β.

This proves what we were after:

We used a couple facts in the last step that we haven’t discussed. The first was that J0(0) = 1. This follows from the generating function by setting z to 0 and taking the limit as t → 0. The second was that Jm(-β) = Jm(β). You can also see this from the generating function since negating z has the same effect as swapping t and 1/t.

# Analyzing an FM signal

Frequency modulation combines a signal with a carrier wave by changing (modulating) the carrier wave’s frequency.

Starting with a cosine carrier wave with frequency fc Hz and adding a signal with amplitude β and frequency fm Hz results in the combination

The factor β is known as the modulation index.

We’d like to understand this signal in terms of cosines without any frequency modulation. It turns out the result is a set of cosines weighted by Bessel functions of β.

## Component amplitudes

We will prove the equation above, but first we’ll discuss what it means for the amplitudes of the cosine components.

For small values of β, Bessel functions decay quickly, which means the first cosine component will be dominant. For larger values of β, the Bessel function values increase to a maximum then decay like one over the square root of the index. To see this we compare the coefficients for modulation index β = 0.5 and β = 5.0.

First, β = 0.5:

and now for β = 5.0:

For fixed β and large n we have

and so the sideband amplitudes eventually decay very quickly.

Update: See this post for what the equation above says about energy moving from the carrier to sidebands.

## Proof

To prove the equation above, we need three basic trig identities

and three Bessel function identities

The Bessel function identities above can be found in Abramowitz and Stegun as equations 9.1.42, 9.1.43, and 9.1.5.

And now the proof. We start with

and apply the sum identity for cosines to get

Now let’s take the first term

and apply one of our Bessel identities to expand it to

which can be simplified to

where the sum runs over all even integers, positive and negative.

Now we do the same with the second half of the cosine sum. We expand

to

which simplifies to

where again the sum is over all (odd this time) integers. Combining the two halves gives our result