Computing logs with the AGM

In the previous post I said that Jonathan and Peter Borwein figured out how to use the rapid convergence of the AGM to compute various functions, including logarithms. This post will show how to compute logarithms using the AGM.

First I need to define the Jacobi theta functions

\begin{align*} \theta_2(q) &= \sum_{n\in\mathbb{Z}} q^{(n + 1/2)^2} \\ \theta_3(q) &= \sum_{n\in\mathbb{Z}} q^{n^2} \end{align*}

Note that if q is very small, the series above converge very quickly. We will pick q so small that θ₂(q) and θ₃(q) are trivial to compute with sufficient accuracy.

Suppose we want to compute the natural log of 10 to a thousand digits. We can use the following equation from [1].

\log(1/q) = \frac{\pi/4}{\text{AGM}\Big(\,\theta^2_2(q^4),\, \theta^2_3(q^4)\,\Big)}

We want q to be small, so we want 1/q to be large. We set q = 10⁻ⁿ to compute log 10ⁿ and divide the result by n.

We can show from the definition of the theta functions that

\begin{align*} \theta_2^2(q^4) &= 4q^2 + {\cal O}(q^{10}) \\ \theta_3^2(q^4) &= 1 + 4q^4 + {\cal O}(q^8) \\ \end{align*}

Since we want 1000 digit accuracy, we can set q = 10⁻²⁵⁰. Then θ₂²(q⁴) = 4×10⁻⁵⁰⁰ to well over 2000 digits of accuracy, and θ₃²(q⁴) = 1 to about 1000 digits.

We can calculate the result we want with just 20 iterations of the AGM, as the following bc code shows. (Run bc with the -l option so the math library functions a for arctangent and l for natural log are available.)

    define agm(a, b, n) {
        auto i, c, d
        for (i = 1; i <= n; i++) {
            c = (a + b)/2
            d = sqrt(a*b)
            a = c
            b = d
        }
        return a
    }

    scale=2000
    # a(1) = arctan(1) = pi/4
    x = a(1)/(agm(4*10^-500, 1, 20)*250) - l(10) # error
    l(x)/l(10) # log_10(error)

This returns -999.03…, so we were basically able to get 1000 digits, though the last digit or two might be dodgy. Let’s try again with a smaller value of q, setting q = 10⁻³⁰⁰.

    x = a(1)/(agm(4*10^-600,1, 20)*300) - l(10); l(x)/l(10)

This returns -1199.03… and so we have about 1200 digits. We were using 2000 digit precision in our calculations, but our approximation θ₃²(q⁴) = 1 was only good to about 1000 digits when q = 10⁻²⁵⁰. Using the smaller value of q pushed that limit out to about 1200 digits.
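
For comparison, here is the same computation in Python using mpmath’s built-in agm function. This is a sketch added for comparison and is not part of the original post.

    from mpmath import mp, mpf, pi, agm, log

    mp.dps = 2000                        # 2000 digits of working precision
    q = mpf(10)**-300                    # so log(1/q) = 300 log 10
    # theta_2^2(q^4) ~ 4q^2 and theta_3^2(q^4) ~ 1 for q this small
    approx = pi/(4*agm(4*q**2, 1))/300
    error = approx - log(10)
    print(float(log(abs(error), 10)))    # about -1199, i.e. roughly 1200 correct digits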

[1] Richard P. Brent. The Borwein Brothers, Pi and the AGM. arXiv.

The magic AGM box

Suppose you are visited by aliens from halfway across the galaxy. After asking you a lot of questions, they give you a parting gift: little black boxes that can compute

x − x²/2 + x³/3 − …

with unbelievable speed and accuracy. You say thank you and your visitors vanish.

You get back home and wonder what you can do with your black boxes. The series the boxes compute looks vaguely familiar, but you can’t remember what it is. You call up a friend and he tells you that it’s the Taylor series for log(1 + x).

OK, so now what?

Your friend tells you he can predict what the boxes will output. He tells you, for example, that if you enter 0.5 it will output 0.4054651081081644. You try it, and your friend is right. At least partly. If you ask your box to give you only 16 digits, it will give you exactly what your friend said it would. But you could ask it for a thousand digits or a million digits or a billion digits, something your friend cannot do, at least not quickly.

Then you realize your friend has things backward. The way to exploit these boxes is not to compute logs on your laptop to predict their output, but to use the boxes instead of your laptop.

So you have a way to compute logs. You can bootstrap that to compute inverse logs, i.e. exp(x). And you can bootstrap that to compute sines and cosines. You try to compute anything you can starting from logs.

Enter the AGM

The preceding story was an introduction to the AGM, the arithmetic-geometric mean. It is the common limit you reach by repeatedly replacing a pair of numbers with their arithmetic and geometric means. More on that here. What I want to focus on here is that the AGM can be computed extremely quickly.

Each iteration in the process of computing the AGM doubles the number of correct figures in the answer. Suppose you want to compute its output to a billion decimal places, and you’ve calculated a million decimal places. You need to compute 999,000,000 more decimal places, but you’re nearly there! Ten more steps and you’ll have all billion decimal places.
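
Here is a small-scale illustration of that doubling, a sketch using Python’s mpmath library rather than anything from the original post. It computes the AGM of 1 and √2 and prints roughly how many digits the two running values share after each iteration.

    from mpmath import mp, mpf, sqrt, log

    mp.dps = 1000                  # plenty of working precision
    a, b = mpf(1), sqrt(mpf(2))
    for i in range(1, 9):
        a, b = (a + b)/2, sqrt(a*b)
        # the number of digits of agreement roughly doubles each time
        print(i, int(-log(abs(a - b), 10)))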

If you want to compute something to millions of digits, it would make sense to try to compute it in terms of the AGM. This was the research program of the brothers Jonathan and Peter Borwein. Much of this research was codified in their book Pi and the AGM. They used the AGM to compute π to crazy precision, but that wasn’t their goal per se.

Computing π was a demonstration project for a deeper agenda. While describing the work of the Borwein brothers, Richard Brent said

… theorems about π are often just the tips of “mathematical icebergs”—much of interest lies hidden beneath the surface.

The AGM of x and y equals

\frac{\pi(x+y)}{4\,K\left( \dfrac{x-y}{x+y}\right)}

where K is the “complete elliptic integral of the first kind.” You might reasonably think “Great. I’ll keep that in mind if I ever need to compute the complete elliptic integral of the first kind, whatever that is.” But K is like the log function in the story above, something that can be bootstrapped to compute other things.
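
As a quick numerical check of this relation, here is a sketch in Python. Note that SciPy’s ellipk takes the parameter m = k² rather than the modulus k; the values of x and y are arbitrary.

    from math import pi, sqrt
    from scipy.special import ellipk

    def agm(a, b, iterations=10):
        # each iteration roughly doubles the number of correct digits
        for _ in range(iterations):
            a, b = (a + b)/2, sqrt(a*b)
        return a

    x, y = 3.0, 7.0
    k = (x - y)/(x + y)
    print(agm(x, y))                    # AGM of 3 and 7
    print(pi*(x + y)/(4*ellipk(k**2)))  # should print the same value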

The aliens Gauss, Lagrange, and Legendre gave the Borweins the AGM black box, and the Borweins figured out how to use it to compute π, but also log, exp, cos, etc. The Borwein algorithms may not be the most efficient if you only want, say, 16 digits of precision. But as you need more precision, eventually they become the algorithms to use.

See the next post for an example using the AGM to compute logarithms to 1000 digits.

And see the one after that for a way to compute π with the AGM.

Lambert W strikes again

I was studying a statistics paper the other day in which the author said to solve

t log( 1 + n/t ) = k

for t as part of an algorithm. Assume 0 < k < n.

Is this well posed?

First of all, can this equation be solved for t? Second, if there is a solution, is it unique?

It’s not hard to show that as a function of t, the left side approaches 0 as t approaches 0, and it approaches n as t goes to infinity. So there is a solution for all k between 0 and n. The restriction on k is necessary since the left side cannot exceed n.

With a little more work one can show that the derivative is always positive, so the left side is a monotonically increasing function, and so the solution for each value of k is unique.
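
For the record, here is the one-line calculation behind that claim, added here for completeness:

\frac{d}{dt}\, t \log\left(1 + \frac{n}{t}\right) = \log\left(1 + \frac{n}{t}\right) - \frac{n}{t + n}

The right side is positive for t > 0 because log(1 + u) > u/(1 + u) for all u > 0; here u = n/t.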

Analytic solution

Now if we fix n, we can think of the equation above as defining t as a function of k. Can we solve for t exactly? I suspected that there might be a solution in terms of the Lambert W function because if you divide by t and exponentiate, you get an equation that smells like the equation

z = w exp(w)

defining the function W(z). It turns out this is indeed the case.

If we ask Mathematica

    Solve[t Log[1 + n/t] ==  k, t]

we get

t\to -\frac{k}{ W\left(-\cfrac{k \exp(-k/n)}{n}\right)+\cfrac{k}{n}}

Great! There’s a closed-form solution, if you accept using the W function as being closed form.

Problems with the solution

I found the solution in Mathematica, but I tested it in Python to make sure both platforms define W the same way.

    from numpy import log, exp
    from scipy.special import lambertw

    def f(t, n):
        return t*log(1 + n/t)

    def g(k, n):
        r = k/n
        return -k/(lambertw(-r*exp(-r)) + r)

    n, k = 10, 8
    t = g(k, n)
    print(f(t, n))

This should print k, so it prints 8, right? No, it prints 10.

What’s up with that?

If we look back at the equation for t above, we see that the W function is being evaluated at x exp(x) where x = –k/n, so we should get –k/n back since W(x exp(x)) = x by definition. But that means our denominator is zero, and so the equation doesn’t make sense!

Things are getting worse. At first we had a wrong value, but at least it was finite!

The problem is not a difference between Mathematica and Python.

Resolution

The problem is we’ve glossed over a branch cut of the W function. To make a long story short, we were using the principal branch of the W function, but we should have used a different branch.

Let’s go back to where I asked Mathematica

    Solve[t Log[1 + n/t] == k, t]

I ran the solution through TeXForm to get the TeX code that produced the image for the solution equation. I made a few aesthetic changes to the TeX, but it was essentially Mathematica’s output.

Without the TeXForm, Mathematica’s solution was in terms of ProductLog, not in terms of W; the TeXForm function turned ProductLog into W. If you look up ProductLog, it will tell you

ProductLog[z] gives the principal solution for w in z = w e^w.

The principal solution. So we should be on the alert for difficulties with branches. There are two real solutions to z = w e^w for some values of z, and we have to choose the right one. For example, if z = -0.1, then w could be -0.1118 or -3.5772.

Mathematica gave me the wrong branch. But to be fair, it did try to warn me.

Solve::ifun: Inverse functions are being used by Solve, so some solutions may not be found; use Reduce for complete solution information.

The solution is to use the -1 branch. In Mathematica’s notation, the branch comes before the argument. In SciPy, the branch comes second. So to fix the code above, we need to change

    lambertw(-r*exp(-r))

to

    lambertw(-r*exp(-r), -1)

and then the code will be correct.

If -1 < x < 0 and we use the -1 branch of W, then

W₋₁(x exp(x)) ≠ x

and so we’re not dividing by zero in our solution.
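
Putting the pieces together, here is a quick end-to-end check that the -1 branch really does solve the original equation. This is a sketch, not code from the paper or the original post.

    from numpy import log, exp, real
    from scipy.special import lambertw

    def f(t, n):
        return t*log(1 + n/t)

    def g(k, n):
        # solution using the -1 branch of the Lambert W function
        r = k/n
        return real(-k/(lambertw(-r*exp(-r), -1) + r))

    n, k = 10, 8
    print(f(g(k, n), n))   # prints approximately 8, as it should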

Simple approximation for surface area of an ellipsoid

After writing the previous post, I wondered where else you might be able to use r-means to create accurate approximations. I thought maybe this would apply to the surface area of an ellipsoid, and a little searching around showed that Knud Thomsen thought of this in 2004.

The general equation for the surface of an ellipsoid is

\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1

If two of the denominators {a, b, c} are equal then there are formulas for the area in terms of elementary functions. But in general, the area requires special functions known as “incomplete elliptic integrals.” For more details, see this post.

Here I want to focus on Knud Thomsen’s approximation for the surface area and its connection to the previous post.

The previous post reported that the perimeter of an ellipse is 2πr where r is the effective radius. An ellipse doesn’t have a radius unless it’s a circle, but we can define something that acts like a radius by defining it to be the perimeter over 2π. It turns out that

r \approx \mathfrak{M}_{3/2}(x)

makes a very good approximation for the effective radius where x = (a, b).

Here

\mathfrak{M}_r(x) = \left( \frac{1}{n} \sum_{i=1}^n x_i^r \right)^{1/r}

using the notation of Hardy, Littlewood, and Pólya.

This suggests we define an effective radius for an ellipsoid and look for an approximation to the effective radius in terms of an r-mean. The area of a sphere is 4πr², so we define the effective radius of an ellipsoid as √(A/4π), where A is its surface area. With this definition the effective radius of a sphere is its radius.

Thomsen found that

r^2 \approx \mathfrak{M}_p(x)

where x = (ab, bc, ac) and p = 1.6. More explicitly,

\text{Area} \approx 4\pi \left(\frac{(ab)^{1.6} + (bc)^{1.6} + (ac)^{1.6}}{3} \right )^{1/1.6}

Note that it’s the square of the effective radius that is an r-mean, and that the argument to the mean is not simply (a, b, c). I would have naively tried looking for a value of p so that the r-mean of (a, b, c) gives a good approximation to the effective radius. In hindsight, it makes sense in terms of dimensional analysis that the inputs to the mean have units of area, and so the output has units of area.

The maximum relative error in Thomsen’s approximation is 1.178% with p = 1.6. You can tweak the value of p to reduce the worst-case error, but the value of 1.6 is optimal for approximately spherical ellipsoids.

For an example, let’s compute the surface area of Saturn, the planet with the largest equatorial bulge in our solar system. Saturn is an oblate spheroid with equatorial diameter about 11% greater than its polar diameter.

Assuming Saturn is a perfect ellipsoid, Thomsen’s approximation over-estimates its surface area by about 4 parts per million. By comparison, approximating Saturn as a sphere under-estimates its area by 2 parts per thousand. The code for these calculations, based on the code here, is given at the bottom of the post.

Appendix: Python code

    from numpy import pi, sin, cos, arccos
    from scipy.special import ellipkinc, ellipeinc

    # Saturn in km
    equatorial_radius = 60268
    polar_radius = 54364

    a = b = equatorial_radius
    c = polar_radius

    phi = arccos(c/a)
    m = 1

    temp = ellipeinc(phi, m)*sin(phi)**2 + ellipkinc(phi, m)*cos(phi)**2
    ellipsoid_area = 2*pi*(c**2 + a*b*temp/sin(phi))

    def rmean(r, x):
        n = len(x)
        return (sum(t**r for t in x)/n)**(1/r)

    approx_ellipsoid_area = 4*pi*rmean(1.6, (a*b, b*c, a*c))

    r = (a*b*c)**(1/3)
    sphere_area = 4*pi*r**2

    def rel_error(exact, approx):
        return (exact-approx)/approx

    print( rel_error(ellipsoid_area, sphere_area) )
    print( rel_error(ellipsoid_area, approx_ellipsoid_area) )

Simple approximation for perimeter of an ellipse

The perimeter of an ellipse cannot be computed in closed form. That is, no finite combination of elementary functions will give you the exact value. But we will present a simple approximation that is remarkably accurate.

So this post has two parts: exact calculation, and simple approximation.

Exact perimeter

The perimeter can be computed exactly in terms of an elliptic integral. The name’s not a coincidence: elliptic integrals are so named because they were motivated by trying to find the perimeter of an ellipse.

Let a be the length of the semi-major axis and b the length of the semi-minor axis. So a ≥ b, and if our ellipse is actually a circle, a = b = r.

If ε is the eccentricity of our ellipse,

ε² = 1 – b² / a²

and the perimeter is given by

p = 4a E(ε²)

where E is the “complete elliptic integral of the second kind.” This function has popped up several times on this blog, most recently in the post about why serpentine walls save bricks.

You could calculate the perimeter of an ellipse in Python as follows.

    from scipy.special import ellipe

    def perimeter(a, b):
        assert(a > b > 0)
        return 4*a*ellipe(1 - (b/a)**2)

Simple approximation

But what if you don’t have a scientific library like SciPy at hand? Is there a simple approximation to the perimeter?

In fact there is. In [1] the authors present the approximation

p \approx 2 \pi \left(\frac{a^{3/2} + b^{3/2}}{2} \right )^{2/3}

If we define the effective radius as r = p/2π, then the approximation above says

r \approx \mathfrak{M}_{3/2}(x)

in the notation of Hardy, Littlewood, and Pólya, where x = (a, b).

We could code this up in Python as follows.

    from math import pi

    def rmean(r, x):
        n = len(x)
        return (sum(t**r for t in x)/n)**(1/r)

    def approx(a, b):
        return 2*pi*rmean(1.5, (a, b))

It would have been less code to write up approx directly without defining rmean, but we’ll use rmean again in the next post.

For eccentricity less than 0.05, the error is less than 10⁻¹⁵, i.e. the approximation is correct to all the precision in a double precision floating point number. Even for fairly large eccentricity, the approximation is remarkably good.
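
Here is a quick check of these claims, reusing the perimeter and approx functions defined above. The eccentricity values are arbitrary; this is a sketch, not part of the original post.

    for ecc in [0.05, 0.25, 0.5]:
        a = 1.0
        b = a*(1 - ecc**2)**0.5    # semi-minor axis for this eccentricity
        rel = abs(approx(a, b) - perimeter(a, b))/perimeter(a, b)
        print(ecc, rel)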

The eccentricity of Pluto’s orbit is approximately 0.25. So the approximation above could compute the perimeter of Pluto’s orbit to nine significant figures, if the orbit were exactly an ellipse. A bigger source of error would be the failure of the orbit to be perfectly elliptical.

If an ellipse has a major axis several times longer than its minor axis, the approximation here is less accurate, but still perhaps good enough, depending on the use. There is an approximation due to Ramanujan, given below, that works well even for much more eccentric ellipses, though it’s a little more complicated.
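
For reference, Ramanujan’s first approximation, which is not derived in this post, is

p \approx \pi\left(3(a + b) - \sqrt{(3a + b)(a + 3b)}\,\right)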

To my mind, the most interesting thing here is the connection to r-means. I’ll explore that more in the next post, applying r-means to the surface area of an ellipsoid.

[1] Carl E. Linderholm and Arthur C. Segal. An Overlooked Series for the Elliptic Perimeter. Mathematics Magazine, June 1995, pp. 216-220

Bessel determinants

The Bessel functions J and Y are analogous to sine and cosine. Bessel functions come up in polar coordinates the way sines and cosines come up in rectangular coordinates. There are J and Y functions of various orders, conventionally written with a subscript ν.

Plotting Bessel functions J_3 and Y_3

I recently ran across a curious set of relations between these functions and their derivatives. Let’s start with the following terms defined by 2 by 2 determinants.

\begin{align*} p_\nu &= \begin{vmatrix} J_\nu(a) & J_\nu(b) \\ Y_\nu(a) & Y_\nu(b) \\ \end{vmatrix} \\ q_\nu &= \begin{vmatrix} J_\nu(a) & J'_\nu(b) \\ Y_\nu(a) & Y'_\nu(b) \\ \end{vmatrix} \\ r_\nu &= \begin{vmatrix} J'_\nu(a) & J_\nu(b) \\ Y'_\nu(a) & Y_\nu(b) \\ \end{vmatrix} \\ s_\nu &= \begin{vmatrix} J'_\nu(a) & J'_\nu(b) \\ Y'_\nu(a) & Y'_\nu(b) \\ \end{vmatrix} \end{align*}

There are a lot of symmetries in the definitions above. First, every term has a subscript ν, so you can ignore those for now.

Second, every determinant has J’s on the top row and Y’s on the bottom.

 \begin{vmatrix} J & J \\ Y & Y \\ \end{vmatrix}

Third, every determinant has a‘s in the first column and b‘s in the second.

 \begin{vmatrix} a & b \\ a & b \\ \end{vmatrix}

And finally, the primes, indicating derivatives, have the following pattern.

 \begin{align*} p &= \begin{vmatrix} \phantom{\cdot} & \phantom{\cdot} \\ \phantom{\cdot} & \phantom{\cdot} \\ \end{vmatrix} \\ q &= \begin{vmatrix} \phantom{\cdot} & \phantom{\cdot}' \\ \phantom{\cdot} & \phantom{\cdot}' \\ \end{vmatrix} \\ r &= \begin{vmatrix} \phantom{\cdot}' & \phantom{\cdot} \\ \phantom{\cdot}' & \phantom{\cdot} \\ \end{vmatrix} \\ s &= \begin{vmatrix} \phantom{\cdot}' & \phantom{\cdot}' \\ \phantom{\cdot}' & \phantom{\cdot}' \\ \end{vmatrix} \end{align*}

Now that we have all these definitions in place, there are several identities that the p’s, q’s, r’s, and s’s satisfy. Some of them depend on the orders, and that’s why we included the ν subscripts. You can find various relations in A&S equation 9.1.32, but I wanted to point out one that’s particularly elegant, A&S equation 9.1.34:

\begin{vmatrix} p_\nu & q_\nu \\ r_\nu & s_\nu \end{vmatrix} = \frac{4}{\pi^2ab}

This equation, a determinant of terms defined by determinants, reduces to a simple form that depends only on where the Bessel functions are evaluated, i.e. a and b, but not on their order ν.

Python example

The Python code below will show how to call Bessel functions and their derivatives and will illustrate the equations above.

    from math import pi
    from scipy.special import jv, yv, jvp, yvp
    
    def example(n, a, b):
    
        ja = jv(n, a)
        jb = jv(n, b)
        ya = yv(n, a)
        yb = yv(n, b)
        
        jap = jvp(n, a)
        jbp = jvp(n, b)
        yap = yvp(n, a)
        ybp = yvp(n, b)
        
        p = ja  * yb  - jb  * ya
        q = ja  * ybp - jbp * ya
        r = jap * yb  - jb  * yap
        s = jap * ybp - jbp * yap
        
        print(p*s - q*r)
        print(4/(pi*pi*a*b))
    
    example(3, 16, 21)

This prints

    0.0012062045671706876
    0.0012062045671706878

The two results differ slightly in the last decimal place due to rounding error.

The order of a Bessel function can be any real number [1], and the arguments can be any complex number. Here’s another example with more general arguments.

    example(3.14, 16-3j, 21+1.2j)    

This prints

    (0.0011738907214344785 + 0.00015140286689885318j)
    (0.0011738907214345800 + 0.00015140286689880630j)

with the real and imaginary parts agreeing to 15 decimal places.

[1] You can define Bessel functions with complex order, but SciPy doesn’t support that.

Superfactorial

The factorial of a positive integer n is the product of the numbers from 1 up to and including n:

n! = 1 × 2 × 3 × … × n.

The superfactorial of n is the product of the factorials of the numbers from 1 up to and including n:

S(n) = 1! × 2! × 3! × … × n!.

For example,

S(5) = 1! 2! 3! 4! 5! = 1 × 2 × 6 × 24 × 120 = 34560.

Here are three examples of where superfactorial pops up.

Vandermonde determinant

If V is the n by n matrix whose (i, j) entry is i^(j-1) then its determinant is S(n-1). For instance,

\begin{vmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16& 64 \end{vmatrix} = S(3) = 3!\, 2!\, 1! = 12

V is an example of a Vandermonde matrix.
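
Here is a quick numerical check of the identity above, a sketch that is not part of the original post.

    import numpy as np
    from math import factorial

    def superfactorial(n):
        # S(n) = 1! * 2! * ... * n!
        s = 1
        for k in range(1, n + 1):
            s *= factorial(k)
        return s

    n = 4
    V = np.array([[i**(j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)])
    print(round(np.linalg.det(V)))   # 12
    print(superfactorial(n - 1))     # S(3) = 12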

Permutation tensor

One way to define the permutation symbol uses superfactorial:

\epsilon_{i_1 i_2 \cdots i_n} = \frac{1}{S(n-1)} \prod_{1 \leq k < j \leq n} (i_j - i_k)

Barnes G-function

The Barnes G-function extends superfactorial to the complex plane analogously to how the gamma function extends factorial. For positive integers n,

G(n) = \prod_{k=1}^{n-1}\Gamma(k) = S(n-2)

Here’s a plot of G(x)

produced by

    Plot[BarnesG[x], {x, -2, 4}]

in Mathematica.
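
The mpmath library implements both the Barnes G function and the superfactorial, so the identity above is easy to check numerically. This is a sketch, not part of the original post.

    from mpmath import barnesg, superfac

    for n in range(3, 8):
        # G(n) should equal S(n - 2)
        print(n, barnesg(n), superfac(n - 2))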

Negative space graph

Here is a plot of the first 30 Chebyshev polynomials. Notice the interesting patterns in the white space.

Plot of first 30 Chebyshev polynomials
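
Here is one way to reproduce a plot like the one above. The original plotting code isn’t shown in the post, so this is only a sketch.

    import numpy as np
    import matplotlib.pyplot as plt
    from numpy.polynomial.chebyshev import Chebyshev

    x = np.linspace(-1, 1, 1000)
    for n in range(1, 31):
        # T_n on [-1, 1]; thin lines make the negative space visible
        plt.plot(x, Chebyshev.basis(n)(x), 'k', linewidth=0.5)
    plt.axis('off')
    plt.show()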

Forman Acton famously described Chebyshev polynomials as “cosine curves with a somewhat disturbed horizontal scale.” However, plotting cosines with frequencies 1 to 30 gives you pretty much a solid square. Something about the way Chebyshev polynomials disturb the horizontal scale creates the interesting pattern in negative space. (The distortion is different for each polynomial; otherwise the cosine picture would be a rescaling of the Chebyshev picture.)

I found the example above in a book that referenced a book by Theodore Rivlin. There’s a new edition of Rivlin’s book coming out in August, so maybe it will say something about the gaps.

Update: Here are analogous graphs for Legendre polynomials, plotting the even and odd ordered polynomials separately. They also have conspicuous holes, but they don’t fill the unit square the way Chebyshev polynomials do.

Even order Legendre polynomials

Odd order Legendre polynomials

To integrate the impossible integral

Don Quixote

In the Broadway musical Man of La Mancha, Don Quixote sings

To dream the impossible dream
To fight the unbeatable foe
To bear with unbearable sorrow
To run where the brave dare not go

Yesterday my daughter asked me to integrate the impossible integral, and this post has a few thoughts on the quixotic quest to run where the brave calculus student dare not go.

The problem apparently required computing the indefinite integral of the square root of sine:

\int \sqrt{\sin x} \,dx

I say apparently for reasons that will soon be clear.

My first thought was that some sort of clever u-substitution would work. This was coming from a homework problem, so I was in the mindset of expecting every problem to be doable. If I ran across this integral while I had my consultant hat on rather than my father/tutor hat, I would have thought “It probably doesn’t have an elementary closed form because most integrals don’t.”

It turns out I should have had my professional hat on, because the integral indeed does not have an elementary closed form. There is a well-known function which is an antiderivative of √sin, but it’s not an elementary function. (“Well-known” relative to professional mathematicians, not relative to calculus students.)

You can evaluate the integral as

\int \sqrt{\sin x} \,dx = -2 E\left(\frac{\pi - 2x}{4}\, \middle|\, 2 \right) + C

where E is Legendre’s “elliptic integral of the second kind.”
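
As a sanity check, we can differentiate the right-hand side numerically and compare it with √(sin x). Here is a sketch using mpmath, whose incomplete elliptic integral of the second kind accepts the parameter 2; it is not part of the original post.

    from mpmath import mpf, pi, sin, sqrt, ellipe, diff

    def F(x):
        # proposed antiderivative: -2 E((pi - 2x)/4 | 2)
        return -2*ellipe((pi - 2*x)/4, 2)

    x = mpf(1.2)
    print(diff(F, x))     # numerical derivative of F at x
    print(sqrt(sin(x)))   # should agree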

I assume it wasn’t actually necessary to compute the antiderivative, though I didn’t see the full problem. In the artificial world of the calculus classroom, everything works out nicely. And this is a shame.

For one thing, it creates a false impression. Most integrals are “impossible” in the sense of not having an elementary closed form. Students who go on to use calculus for anything more than artificial homework problems may incorrectly assume they’ve done something wrong when they encounter an integral in the wild.

For another, it’s a missed opportunity. Or maybe two or three missed opportunities. These may or may not be appropriate to bring up, depending on how receptive the audience is. (But we said at the beginning we would “run where the brave calculus student dare not go.”)

It’s a missed opportunity to emphasize what integrals really are, to separate meaning from calculation technique. Is the integral of √sin impossible? Not in the sense that it doesn’t exist. Yes, there are functions whose derivative is √sin, at least if we limit ourselves to a range where sine is not negative. No, the integral does not exist in the sense of a finite combination of functions a calculus student would have seen.

It’s also a missed opportunity to show that we can define new functions as the solution to problems we can’t solve otherwise. The elliptic integrals mentioned above are functions defined in terms of integrals that cannot be computed in more elementary terms. By giving the function a name, we can compare where it comes up in other contexts. For example, I wrote a few months ago about the problem of finding the length of a serpentine wall. This also pops out of a calculus problem that doesn’t have an elementary solution, and in fact it also has a solution in terms of elliptic integrals.

Finally, it’s a missed opportunity to give a glimpse of the wider mathematical landscape. If not every elementary function has an elementary antiderivative, do they at least have antiderivatives in terms of special functions such as elliptic integrals? No, but that’s a good question.

Any continuous function has an antiderivative, and that antiderivative might be an elementary function, or it might be a combination of elementary functions and familiar special functions. Or it might be an obscure special function, something not exactly common, but something that has been named and studied before. Or it might be a perfectly valid function that hasn’t been given a name yet. Maybe it’s too specialized to deserve a name, or maybe you’ve discovered something that comes up repeatedly and deserves to be cataloged.

This touches more broadly on the subject of what functions exist versus what functions have been named. Students implicitly assume these two categories are the same. Here’s an example of the same phenomenon in probability. It also touches on the gray area between what has been named and what hasn’t, and how you decide whether something deserves to be named.

Update: The next post gives a simple approximation to the integral in this post.

Chebyshev’s other polynomials

There are two sequences of polynomials named after Chebyshev, and the first is so much more common that when authors say “Chebyshev polynomial” with no further qualification, they mean Chebyshev polynomials of the first kind. These are denoted with Tₙ, so they get Chebyshev’s initial [1]. The Chebyshev polynomials of the second kind are denoted Uₙ.

Chebyshev polynomials of the first kind are closely related to cosines. It would be nice to say that Chebyshev polynomials of the second kind are to sines what Chebyshev polynomials of the first kind are to cosines. That would be tidy, but it’s not true. There are relationships between the two kinds of Chebyshev polynomials, but they’re not that symmetric.

It is true that Chebyshev polynomials of the second kind satisfy a relation somewhat analogous to the relation

\cos n\theta = T_n(\cos\theta)

for his polynomials of the first kind, and it involves sines:

\frac{\sin n\theta}{\sin \theta} = U_{n-1}(\cos\theta)

We can prove this with the equation we’ve been discussing in several posts lately, so there is yet more juice to squeeze from this lemon.

Once again we start with the equation

\cos n\theta + i\sin n\theta = \sum_{j=0}^n {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

and take the imaginary part of both sides. The odd terms of the sum contribute to the imaginary part, so we can assume j = 2k + 1. We make the replacement

(\sin\theta)^{2k+1} = (1 - \cos^2\theta)^k \sin\theta

and so we’re left with a polynomial in cos θ, except for an extra factor of sin θ in every term.

This shows that sin nθ / sin θ is a polynomial in cos θ, and in fact a polynomial of degree n-1. Given the theorem that

\frac{\sin n\theta}{\sin \theta} = U_{n-1}(\cos\theta)

it follows that the polynomial in question must be Uₙ₋₁.
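
A quick numerical check of this identity, a sketch using SciPy’s eval_chebyu, which evaluates Uₙ:

    import numpy as np
    from scipy.special import eval_chebyu

    n, theta = 7, 0.9
    print(np.sin(n*theta)/np.sin(theta))       # sin(nθ)/sin(θ)
    print(eval_chebyu(n - 1, np.cos(theta)))   # U_{n-1}(cos θ), should agree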

[1] Chebyshev has been transliterated from the Russian as Chebysheff, Chebychov, Chebyshov, Tchebychev, Tchebycheff, Tschebyschev, Tschebyschef, Tschebyscheff, Chebychev, etc. It is conventional now to use “Chebyshev” as the name, at least in English, and to use “T” for the polynomials.