Solutions to tan(x) = x

I read something recently that said in passing that the solutions to the equation tan x = x are the zeros of the Bessel function J3/2. That brought two questions to mind. First, where have I seen the equation tan x = x before? And second, why should its solutions be the roots of a Bessel function?

The answer to the first question is that I wrote about the local maxima of the sinc function three years ago. That post shows that the derivative of the sinc function sin(x)/x is zero if and only if x is a fixed point of the tangent function.

As for why that should be connected to zeros of a Bessel function, that one’s pretty easy. In general, Bessel functions cannot be expressed in terms of elementary functions. But the Bessel functions whose order is an integer plus ½ can.

For integer n,

J_{n+{\frac{1}{2}}}(x)= (-1)^n \left(\frac{2}{\pi}\right)^{\frac{1}{2}} x^{n+{\frac{1}{2}}} \left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\sin x}{x}\right)

So when n = 1, we’ve got the derivative of sinc right there in the definition.
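As a quick numerical check of this connection, the sketch below finds the first few positive solutions of tan x = x and evaluates J3/2 at each one. It assumes SciPy is available. The helper name tan_fixed_point is made up for illustration, and the root is bracketed using sin x − x cos x rather than tan x − x to stay away from the poles of tangent.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import jv

def tan_fixed_point(k):
    """k-th positive solution of tan(x) = x, for k = 1, 2, 3, ...

    tan(x) = x is equivalent to sin(x) - x*cos(x) = 0 wherever cos(x) != 0,
    and that expression changes sign on the interval (k*pi, (k + 1/2)*pi).
    """
    f = lambda x: np.sin(x) - x*np.cos(x)
    return brentq(f, k*np.pi, (k + 0.5)*np.pi)

for k in range(1, 4):
    x = tan_fixed_point(k)
    # The last column should be zero to within rounding error.
    print(k, x, jv(1.5, x))
```

The same roots are the locations of the local extrema of sinc away from the origin, which ties the two questions together.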

Hypergeometric function of a large negative argument

It’s occasionally necessary to evaluate a hypergeometric function at a large negative argument. I was working on a project today that involved evaluating F(a, b; c; z) where z is a large negative number.

The hypergeometric function F(a, b; c; z) is defined by a power series in z whose coefficients are functions of a, b, and c. However, this power series has radius of convergence 1. This means you can’t use the series to evaluate F(a, b; c; z) for z < −1.

It’s important to keep in mind the difference between a function and its power series representation. The former may exist where the latter does not. A simple example is the function f(z) = 1/(1 − z). The power series for this function has radius 1, but the function is defined everywhere except at z = 1.
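The distinction is easy to see numerically. The following sketch (the function names are illustrative) compares 1/(1 − z) with partial sums of its power series, once inside the unit disk and once outside it.

```python
def geometric_partial_sum(z, n):
    """Sum of z**k for k = 0, ..., n-1: a partial sum of the series for 1/(1 - z)."""
    return sum(z**k for k in range(n))

def f(z):
    """The function itself, defined for every z except z = 1."""
    return 1/(1 - z)

# Inside the unit disk the partial sums settle down to f(z).
print(f(0.5), geometric_partial_sum(0.5, 50))

# Outside the unit disk f is perfectly well defined, but the partial sums blow up.
print(f(-2.0), [geometric_partial_sum(-2.0, n) for n in (5, 10, 15)])
```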

Although the series defining F(a, b; c; z) is confined to the unit disk, the function itself is not. It can be extended analytically beyond the unit disk, usually with a branch cut along the real axis for z ≥ 1.

It’s good to know that our function can be evaluated for large negative z, but how do we evaluate it?

Linear transformation formulas

Hypergeometric functions satisfy a huge number of identities, the simplest of which are known as the linear transformation formulas even though they are not linear transformations of z. They involve bilinear transformations of z, a.k.a. fractional linear transformations, a.k.a. Möbius transformations. [1]

One such transformation is the following, found in A&S 15.3.4 [2].

F(a, b; c; z) = (1-z)^{-a} F\left(a, c-b; c; \frac{z}{z-1} \right)

If z < 0, then 0 < z/(z − 1) < 1, which is inside the radius of convergence. However, as z goes off to −∞, z/(z − 1) approaches 1, and the convergence of the power series will be slow.

A more complicated, but more efficient, formula is A&S 15.3.7, a linear transformation formula that relates F at z to two other hypergeometric functions evaluated at 1/z. Now when z is large, 1/z is small, and these series will converge quickly.

\begin{align*} F(a, b; c; z) &= \frac{\Gamma(c) \Gamma(b-a)}{\Gamma(b) \Gamma(c-a)} \,(-z)^{-a\phantom{b}} F\left(a, 1-c+a; 1-b+a; \frac{1}{z}\right) \\ &+ \frac{\Gamma(c) \Gamma(a-b)}{\Gamma(a) \Gamma(c-b)} \,(-z)^{-b\phantom{a}} F\left(\,b, 1-c+b; 1-a+b; \,\frac{1}{z}\right) \\ \end{align*}
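Here is a minimal sketch of how this formula could be used, not a production implementation. It sums the defining power series directly at 1/z with a fixed number of terms, and it assumes z is real with z < −1 and that a − b is not an integer, since otherwise the gamma functions in the numerators hit poles. The function names are made up for illustration. The first check uses the closed form F(½, 1; 3/2; −x²) = arctan(x)/x, and the second compares against SciPy’s hyp2f1.

```python
from math import atan
from scipy.special import gamma, hyp2f1

def hyp2f1_series(a, b, c, z, terms=60):
    """Partial sum of the defining power series; only sensible for |z| < 1."""
    total = term = 1.0
    for n in range(terms):
        # Ratio of consecutive terms of the hypergeometric series.
        term *= (a + n)*(b + n)*z/((c + n)*(n + 1))
        total += term
    return total

def hyp2f1_large_negative(a, b, c, z):
    """F(a, b; c; z) for real z < -1 via A&S 15.3.7.

    Assumes a - b is not an integer.
    """
    w = 1/z
    first = (gamma(c)*gamma(b - a)/(gamma(b)*gamma(c - a))
             * (-z)**(-a) * hyp2f1_series(a, 1 - c + a, 1 - b + a, w))
    second = (gamma(c)*gamma(a - b)/(gamma(a)*gamma(c - b))
              * (-z)**(-b) * hyp2f1_series(b, 1 - c + b, 1 - a + b, w))
    return first + second

# Independent check at z = -100 using the arctangent closed form.
print(hyp2f1_large_negative(0.5, 1.0, 1.5, -100.0), atan(10.0)/10.0)

# Comparison with the library routine at z = -50.
print(hyp2f1_large_negative(0.3, 0.8, 1.7, -50.0), hyp2f1(0.3, 0.8, 1.7, -50.0))
```

With |1/z| this small, a handful of terms would already suffice; the default of 60 is deliberately generous.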


[1] It turns out these transformations are linear, but not as functions of a complex argument. They’re linear as transformations on a projective space. More on that here.

[2] A&S refers to the venerable Handbook of Mathematical Functions by Abramowitz and Stegun.

Bessel zero spacing

Bessel functions are to polar coordinates what sines and cosines are to rectangular coordinates. This is why Bessel functions often arise in applications with radial symmetry.

The locations of the zeros of Bessel functions are important in applications, and so you can find software for computing these zeros in mathematical libraries. In days gone by you could find them in printed tables, such as Table 9.5 in A&S.

Bessel functions are solutions to Bessel’s differential equation,

x^2 y'' + x y' + (x^2 - \nu^2) y = 0

For each ν the functions Jν and Yν, known as the Bessel functions of the first and second kind respectively, form a basis for the solutions to Bessel’s equation. These functions are analogous to cosine and sine.

As x → ∞, Bessel functions asymptotically behave like damped sinusoidal waves. Specifically,

\begin{align*} J_\nu(x) &\sim \sqrt{\frac{2}{\pi x}} \cos(x - \pi\nu/2 - \pi/4) \\ Y_\nu(x) &\sim \sqrt{\frac{2}{\pi x}} \sin(x - \pi\nu/2 - \pi/4) \end{align*}
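To get a feel for how good these approximations are, the sketch below compares them with SciPy’s jv and yv at a moderately large argument. The helper name bessel_asymptotic is illustrative, and x = 100 is an arbitrary choice.

```python
import numpy as np
from scipy.special import jv, yv

def bessel_asymptotic(nu, x):
    """Leading-order large-x approximations to J_nu(x) and Y_nu(x)."""
    amplitude = np.sqrt(2/(np.pi*x))
    phase = x - np.pi*nu/2 - np.pi/4
    return amplitude*np.cos(phase), amplitude*np.sin(phase)

x = 100.0
for nu in (0, 1):
    j_approx, y_approx = bessel_asymptotic(nu, x)
    # Print exact values next to their approximations.
    print(nu, jv(nu, x), j_approx, yv(nu, x), y_approx)
```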

So if for large x Bessel functions of order ν behave something like sin(x), you’d expect the spacing between the zeros of the Bessel functions to approach π, and this is indeed the case.

We can say more. If ν² > ¼ then the spacing between zeros decreases toward π, and if ν² < ¼ the spacing between zeros increases toward π. This is not just true of Jν and Yν but also of their linear combinations, i.e. of any solution of Bessel’s equation with parameter ν.

If you look carefully, you can see this in the plots of J0 and J1 below. The solid blue curve, the plot of J0, crosses the x-axis at points closer together than π, and the dashed orange curve, the plot of J1, crosses the x-axis at points further apart than π.
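The spacing can also be checked directly. This short sketch, assuming SciPy’s jn_zeros, prints how far each gap between consecutive zeros of J0 and J1 is from π.

```python
import numpy as np
from scipy.special import jn_zeros

for n in (0, 1):
    zeros = jn_zeros(n, 8)   # first eight positive zeros of J_n
    gaps = np.diff(zeros)    # spacing between consecutive zeros
    # Negative entries mean the gap is smaller than pi, positive means larger.
    print(n, gaps - np.pi)
```

For J0 the differences should be negative and shrinking in magnitude, and for J1 positive and shrinking.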

For more on the spacing of Bessel zeros see [1].


[1] F. T. Metcalf and Milos Zlamal. On the Zeros of Solutions to Bessel’s Equation. The American Mathematical Monthly, Vol. 73, No. 7, pp. 746–749

Addition theorems for Dixon functions

The last couple blog posts have been about Dixon elliptic functions, functions which are analogous in some ways to sine and cosine functions. Whereas sine and cosine satisfy a Pythagorean identity

\sin^2(z) + \cos^2(z) = 1

the Dixon functions sm and cm satisfy what you might call a Fermat identity

\text{sm}^3(z) + \text{cm}^3(z) = 1

alluding to Fermat’s last theorem.

The functions sm and cm also satisfy addition identities, but these look very different than the addition identities for sine and cosine.

\begin{align*} \text{sm}(x + y) &= \frac{ \text{sm}^2(x)\,\text{cm}(y) - \text{cm}(x)\,\text{sm}^2(y)}{ \text{sm}(x)\,\text{cm}^2(y) - \text{cm}^2(x)\,\text{sm}(y)} \\ \text{cm}(x + y) &= \frac{ \text{sm}(x)\,\text{cm}(x) - \text{sm}(y)\,\text{cm}(y)}{ \text{sm}(x)\,\text{cm}^2(y) - \text{cm}^2(x)\,\text{sm}(y)} \end{align*}
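These identities can be spot-checked numerically. The sketch below is one way to do that under some assumptions: it computes sm on part of the real line by numerically inverting sm⁻¹(s) = s · 2F1(⅓, ⅔; 4/3; s³), and gets cm from sm³ + cm³ = 1. The root-finding approach and the restriction to 0 ≤ u ≤ 1.7 are choices made for this illustration, not a general-purpose implementation.

```python
from scipy.optimize import brentq
from scipy.special import hyp2f1

def sm_inv(s):
    """Inverse of sm for real 0 <= s < 1, via its hypergeometric representation."""
    return s*hyp2f1(1/3, 2/3, 4/3, s**3)

def sm(u):
    """sm(u) for real 0 <= u <= 1.7, found by numerically inverting sm_inv."""
    return brentq(lambda s: sm_inv(s) - u, 0.0, 1.0 - 1e-9, xtol=1e-14)

def cm(u):
    """cm(u) on the same range, from sm^3 + cm^3 = 1 (cm is nonnegative there)."""
    return (1.0 - sm(u)**3)**(1/3)

x, y = 0.3, 0.5
denominator = sm(x)*cm(y)**2 - cm(x)**2*sm(y)
sm_sum = (sm(x)**2*cm(y) - cm(x)*sm(y)**2)/denominator
cm_sum = (sm(x)*cm(x) - sm(y)*cm(y))/denominator

# Each pair should agree to within rounding error.
print(sm(x + y), sm_sum)
print(cm(x + y), cm_sum)
```

Note that the formulas as written require x ≠ y, since the denominator vanishes when the two arguments coincide.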

Once you’ve seen the binomial theorem and the addition identities for trig functions, you might come away with the impression that it is common to be able to simply relate the value of a function at x + y to its values at x and at y. It is not.

There are only three classes of functions that satisfy addition theorems. (See this post for a precise definition of what is meant by an addition theorem.) And once you’ve seen the binomial theorem and the sum angle identities, you’ve seen representatives of two of the three classes. The three classes of functions with addition theorems for functions of z are

  1. Rational functions of z
  2. Rational functions of exp(λz)
  3. Elliptic functions of z

The binomial theorem is an example of the first category and sum angle identities are examples of the second category (with λ = i). Dixon functions are examples of the third category.

Conformal map between disk and equilateral triangle

The Dixon elliptic functions sm and cm are in some ways analogous to sine and cosine. However, whereas sine and cosine satisfy

\sin^2(z) + \cos^2(z) = 1

the Dixon functions satisfy

\text{sm}^3(z) + \text{cm}^3(z) = 1

The exponent 3 foreshadows the fact that these functions have a sort of three-fold symmetry. In particular, the function sm maps an equilateral triangle in the complex plane to the unit circle. The function sm gives a conformal map from the interior of this triangle to the interior of the unit disk.

In this post we will work with sm⁻¹ rather than sm, mapping the unit circle to an equilateral triangle. An advantage of working with the inverse function is that we can start with the unit circle and see what triangle it maps to; if we started with the triangle it might seem arbitrary. Also, the function sm is not commonly part of mathematical software libraries (it’s not in Mathematica or SciPy), but you can compute its inverse via

\text{sm}^{-1}(z) = {}_2F_1(\tfrac{1}{3}, \tfrac{2}{3}; \tfrac{4}{3}; z^3) \, z

using the hypergeometric function 2F1, which is a common part of mathematical libraries.

The following image shows concentric circles in the z plane and their image under sm⁻¹ in the w plane, w = sm⁻¹(z).

Conformal map of unit disk to equilateral triangle using the inverse of the Dixon elliptic function sm

If we were to use this in applications, we’d need to know the vertices of the image triangle so we could do a change of variables to transform this triangle into a particular triangle we’re interested in.

The centroid of the image is at the origin, and the right-most vertex is at approximately 1.7666. To be exact, the vertex is at

v = ⅓ B(⅓, ⅓)

where B is the beta function. (Notice all the 3’s in the formula for v.) The other two vertices are at exp(2πi/3) v and exp(4πi/3) v.
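The following sketch shows one way to compute the map and the vertices with SciPy, using the hypergeometric formula for the inverse of sm given above. The function name sm_inv and the sample radius 0.9 are arbitrary choices for illustration.

```python
import numpy as np
from scipy.special import beta, hyp2f1

def sm_inv(z):
    """Inverse of the Dixon function sm for real or complex z inside the unit disk."""
    return z*hyp2f1(1/3, 2/3, 4/3, z**3)

# Right-most vertex of the image triangle, and all three vertices.
v = beta(1/3, 1/3)/3
vertices = v*np.exp(2j*np.pi*np.arange(3)/3)
print(v)
print(vertices)

# Image of a few points on the circle of radius 0.9.
theta = np.linspace(0, 2*np.pi, 7)[:-1]
print(sm_inv(0.9*np.exp(1j*theta)))
```

Because the formula depends on z only through z³ and an overall factor of z, rotating z by a third of a turn rotates the image by a third of a turn, which is the three-fold symmetry mentioned earlier.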

One way this conformal map could arise in practice is solving Laplace’s equation on a triangle. You can solve Laplace’s equation on a disk in closed form, and transform that solution into a solution on the triangle.


Python code for means

The last couple of articles have looked at various kinds of mean. The Python code for four of these means is trivial:

gm  = lambda a, b: (a*b)**0.5
am  = lambda a, b: (a + b)/2
hm  = lambda a, b: 2*a*b/(a+b)
chm = lambda a, b: (a**2 + b**2)/(a + b)

But the arithmetic-geometric mean (AGM) is not trivial:

from numpy import pi
from scipy.special import ellipk

agm = lambda a, b: 0.25*pi*(a + b)/ellipk((a - b)**2/(a + b)**2) 

The arithmetic-geometric mean is defined by iterating the arithmetic and geometric means and taking the limit. This iteration converges very quickly, and so writing code that directly implements the definition is efficient.
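For comparison, here is a minimal sketch of the direct approach, iterating the two means until they agree. The name agm_loop and the relative tolerance are choices made for this example; the elliptic integral version is repeated as agm_ellipk so the two can be compared side by side.

```python
from math import pi, sqrt
from scipy.special import ellipk

def agm_loop(a, b, rtol=1e-14):
    """AGM of positive numbers a and b, computed straight from the definition."""
    while abs(a - b) > rtol*max(a, b):
        # Replace (a, b) by their arithmetic and geometric means.
        a, b = (a + b)/2, sqrt(a*b)
    return (a + b)/2

def agm_ellipk(a, b):
    """AGM via the complete elliptic integral K, in SciPy's parameterization."""
    return 0.25*pi*(a + b)/ellipk((a - b)**2/(a + b)**2)

print(agm_loop(1, 2), agm_ellipk(1, 2))
```

Starting from 1 and 2, the loop needs only a handful of iterations to reach double precision accuracy.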

But the AGM can also be computed via a special function K, the “complete elliptic integral of the first kind,” which makes the code above more compact. This is conceptually nice because we can think of the AGM as a simple function, not an iterative process.

But how is K evaluated? In some sense it doesn’t matter: it’s encapsulated in the SciPy library. But someone has to write SciPy. I haven’t looked at the SciPy source code, but usually K is calculated numerically using the AGM because, as we said above, the AGM converges very quickly.

Bell curve meme: How to calculate the AGM? The left and right tails say to use a while loop. The middle says to evaluate a complete elliptic integral of the first kind.

This fits the pattern of a bell curve meme: the novice and expert approaches are the same, but for different reasons. The novice uses an iterative approach because that directly implements the definition. The expert knows about the elliptic integral, but also knows that the iterative approach suggested by the definition is remarkably efficient and eliminates the need to import a library.

Although it’s easy to implement the AGM with a while loop, the code above does have some advantages. For one thing, it pushes the responsibility for validation and exception handling onto the library. On the other hand, the code is easy to get wrong because there are two conventions on how to parameterize K and you have to be sure to use the same one your library uses.

Addition theorems

Earlier this week I wrote about several ways to generalize trig functions. Since trig functions have addition theorems like

\begin{align*} \sin(\theta \pm \varphi) &= \sin\theta \cos\varphi \pm \cos\theta \sin\varphi \\ \cos(\theta \pm \varphi) &= \cos\theta \cos\varphi \mp \sin\theta \sin\varphi \\ \tan(\theta \pm \varphi) &= \frac{\tan\theta \pm \tan\varphi}{1 \mp \tan\theta \tan\varphi} \end{align*}

a natural question is whether generalized trig functions also have addition theorems.

Hyperbolic functions have well-known addition theorems analogous to the addition theorems above. This isn’t too surprising since circular and hyperbolic functions are fundamentally two sides of the same coin.

I mentioned that the lemniscate functions satisfy many identities but didn’t give any examples. Here are addition theorems satisfied by the lemniscate sine sl and the lemniscate cosine cl.

\begin{aligned} \text{cl}\,(x+y) &= \frac{\text{cl}\,x\, \text{cl}\,y - \text{sl}\,x\, \text{sl}\,y} {1 + \text{sl}\,x\, \text{cl}\,x\, \text{sl}\,y\, \text{cl}\,y} \\ \text{sl}\,(x+y) &= \frac{\text{sl}\,x\, \text{cl}\,y + \text{cl}\,x\, \text{sl}\,y} {1 - \text{sl}\,x\, \text{cl}\,x\, \text{sl}\,y\, \text{cl}\,y} \end{aligned}
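These identities are easy to test numerically. The sketch below relies on a relation not stated in this post: the lemniscate functions can be written in terms of Jacobi elliptic functions with parameter m = ½, namely sl(x) = sd(√2 x)/√2 and cl(x) = cn(√2 x). SciPy’s ellipj returns sn, cn, and dn, and sd is sn/dn. The function names are illustrative.

```python
from math import sqrt
from scipy.special import ellipj

def sl(x):
    """Lemniscate sine via Jacobi functions with parameter m = 1/2 (assumed relation)."""
    sn, cn, dn, _ = ellipj(sqrt(2)*x, 0.5)
    return sn/(sqrt(2)*dn)

def cl(x):
    """Lemniscate cosine via Jacobi functions with parameter m = 1/2 (assumed relation)."""
    sn, cn, dn, _ = ellipj(sqrt(2)*x, 0.5)
    return cn

def cl_sum(x, y):
    """Right side of the addition theorem for cl."""
    return (cl(x)*cl(y) - sl(x)*sl(y))/(1 + sl(x)*cl(x)*sl(y)*cl(y))

def sl_sum(x, y):
    """Right side of the addition theorem for sl."""
    return (sl(x)*cl(y) + cl(x)*sl(y))/(1 - sl(x)*cl(x)*sl(y)*cl(y))

x, y = 0.3, 0.5
print(cl(x + y), cl_sum(x, y))
print(sl(x + y), sl_sum(x, y))
```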

Addition theorems for sinp and friends are harder to come by. In [1] the authors say “no addition formula for sinp is known to us” but they did come up with a double-argument theorem for a special case of sinp,q:

\sin_{4/3, 4}(2x) = \frac{2 \sin_{4/3, 4}(x)\, (\cos_{4/3, 4}(x))^{1/3}}{\left( 1 + 4(\sin_{4/3, 4}(x))^4 \,(\cos_{4/3, 4}(x))^{4/3} \right)^{1/2}}

There is a deep reason why the lemniscate and hyperbolic functions have addition theorems and sinp does not, namely a theorem of Weierstrass. This theorem says that a meromorphic function has an algebraic addition theorem if and only if it is an elliptic function of z, a rational function of z, or a rational function of exp(λz).

The lemniscate functions have addition theorems because they are elliptic functions. Circular and hyperbolic functions have addition theorems because they are rational functions of exp(iz). But sinp does not have an addition theorem because it is not elliptic, rational, or a rational function of exp(λz). It’s possible that sinp has some sort of addition theorem that falls outside of Weierstrass’ theorem, i.e. an addition theorem using a non-algebraic function.

You may have noticed that the addition rule for sine involves not only sine but also cosine. But using the Pythagorean identity we can turn an addition rule involving sines and cosines into one only involving sines. Similarly, we can use a Pythagorean-like theorem to turn the identities involving sl and cl into identities involving only one of these functions.

Elliptic functions satisfy addition theorems, and functions satisfying addition theorems are elliptic (or the other two cases of Weierstrass’ theorem). Rational functions of z and rational functions of exp(λz) are easy to spot, so if you see an unfamiliar function that has an algebraic addition theorem, you know it’s an elliptic function. If you saw the addition theorems for sl and cl before knowing what these functions are, you could say to yourself that these are probably elliptic functions.

You may see other theorems called addition theorems. For example, the gamma function satisfies an addition theorem, although it is not elliptic or rational. But this is a restricted kind of addition theorem: it applies to x + 1 and not to general x + y. Also, the Bessel functions have addition theorems, but these theorems involve infinite sums; they are not algebraic addition theorems.

[1] David E. Edmunds, Petr Gurka, Jan Lang. Properties of generalized trigonometric functions. Journal of Approximation Theory 164 (2012) 47–56.

p-norm trig functions and “squigonometry”

This is the fourth post in a series on generalizations of sine and cosine.

The first post looked at defining sine as the inverse of the inverse sine. The reason for this unusual approach is that the inverse sine is given in terms of an arc length and an integral. We can generalize sine by generalizing this arc length and/or generalizing the integral.

The first post mentioned that you could generalize the inverse sine by replacing “2” with “p” in an integral. Specifically, the function

F_p(x) = \int_0^x (1 - |t|^p)^{-1/p} \,dt

is the inverse sine when p = 2 and in general is the inverse of the function sinp. Unfortunately, there are two different ways to define sinp. We next present a generalization that includes both definitions as special cases.

Edmunds, Gurka, and Lang [1] define the function

F_{p,q}(x) = \int_0^x (1 - t^q)^{-1/p} \,dt

and define sinp,q to be its inverse.

The definition of sinp at the top of the post corresponds to sinp,q with p = q in the definition of Edmunds et al.

The other definition, and the one we’ll use for the rest of the post, corresponds to sinr,s where s = p and r = p/(p − 1).

This second definition of sinp has a geometric interpretation analogous to that in the previous post for hyperbolic functions [2]. That is, we start at (1, 0) and move counterclockwise along the p-norm circle until we sweep out an area of α/2. When we have swept out that much area, we are at the point (cosp α, sinp α).

When p = 4, the p-norm circle is also known as a “squircle,” and the p-norm sine and cosine analogs are sometimes placed under the heading “squigonometry.”


[1] David E. Edmunds, Petr Gurka, Jan Lang. Properties of generalized trigonometric functions. Journal of Approximation Theory 164 (2012) 47–56.

[2] Chebolu et al. Trigonometric functions in the p-norm

Lemniscate functions

In the previous post I said that you could define the inverse sine as the function that gives the arc length along a circle, then define sine to be the inverse of the inverse sine. The purpose of such a backward definition is that it generalizes to other curves besides the circle. For example, it generalizes to the lemniscate, a curve studied by Bernoulli.

The lemniscate in rectangular coordinates satisfies

(x^2 + y^2)^2 = x^2 - y^2

and in polar coordinates

r^2 = \cos 2\theta

The function arcsl(x), analogous to arcsin(x), is defined as the length of the arc along the lemniscate from the origin to the point (x, y). The length of the arc from (x, y) to the x-axis is arccl(x).

\begin{align*} \mbox{arcsl}(x) &= \int_0^x \frac{dt}{\sqrt{1 - t^4}} \\ \mbox{arccl}(x) &= \int_x^1 \frac{dt}{\sqrt{1 - t^4}} \\ \end{align*}

The lemniscate sine, sl, is the inverse of arcsl, and the lemniscate cosine, cl, is the inverse of arccl. These functions were first studied by Giulio Fagnano three centuries ago.

The lemniscate functions sl and cl are elliptic functions, and so they have a lot of nice properties and satisfy a lot of identities. See Wikipedia, for example. Update: see this follow up post on addition theorems.

Lemniscate constant

As mentioned in the previous post, generalizations of the sine and cosine functions have corresponding generalizations of π.

Just as the period of sine and cosine is 2π, the period of lemniscate sine and lemniscate cosine is 2ϖ.

The number ϖ is called the lemniscate constant. It is written with Unicode character U+03D6, GREEK PI SYMBOL. The LaTeX command is \varpi.

The lemniscate constant ϖ is related to Gauss’ constant G by ϖ = πG.

The area of a squircle is √2 ϖ.

There is also a connection to the beta function: 2ϖ = B(1/4, 1/2).
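The relations above can be verified numerically. In the sketch below, the arc length integral for arcsl(1) is rewritten with the substitution t = sin θ to remove the endpoint singularity, and Gauss’ constant is computed as the reciprocal of the arithmetic-geometric mean of 1 and √2. Variable names are illustrative, and the squircle is taken to be the curve x⁴ + y⁴ = 1.

```python
from math import pi, sin, sqrt
from scipy.integrate import quad
from scipy.special import beta

# Lemniscate constant from the beta function.
varpi = beta(0.25, 0.5)/2

# arcsl(1), the integral of 1/sqrt(1 - t^4) over [0, 1], after substituting t = sin(theta).
quarter_period, _ = quad(lambda th: 1/sqrt(1 + sin(th)**2), 0, pi/2)

# Gauss' constant G = 1/AGM(1, sqrt(2)); twenty iterations is far more than needed.
a, b = 1.0, sqrt(2)
for _ in range(20):
    a, b = (a + b)/2, sqrt(a*b)
G = 1/a

# Area enclosed by the squircle x^4 + y^4 = 1, four times the first-quadrant area.
quadrant_area, _ = quad(lambda x: (1 - x**4)**0.25, 0, 1)
squircle_area = 4*quadrant_area

print(varpi, 2*quarter_period, pi*G)
print(squircle_area, sqrt(2)*varpi)
```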

Generalized trigonometry

In a recent post I mentioned in passing that trigonometry can be generalized from functions associated with a circle to functions associated with other curves. This post will go into that a little further.

The equation of the unit circle is

x^2 + y^2 = 1

and so in the first quadrant

y = \sqrt{1 - x^2}

The length of an arc from (1, 0) to (cos θ, sin θ) is θ. If we write the arc length as an integral we have

\int_0^{\sin \theta} (1 -t^2)^{-1/2} \,dt = \theta

and so

F(x) = \int_0^x (1 - t^2)^{-1/2} \,dt

is the inverse sine of x. Sine is the inverse of the inverse of sine, so we could define the sine function to be the inverse of F.

This would be a complicated way to define the sine function, but it suggests ways to create variations on sine: take the length of an arc along a curve other than the circle, and call the inverse of this function a new kind of sine. Or tinker with the integral defining F, whether or not the resulting integral corresponds to the length along a familiar curve, and use that to define a generalized sine.

Example: sinp

We can replace the 2’s in the integral above with p’s, defining Fp as

F_p(x) = \int_0^x (1 - |t|^p)^{-1/p} \,dt

and defining sinp to be the inverse of Fp. When p = 2, sinp(x) = sin(x). This idea goes back to E. Lundberg in 1879.

The function sinp has its applications. For example, just as the sine function is an eigenfunction of the Laplacian, sinp is an eigenfunction of the p-Laplacian.

We can extend sinp to be a periodic function with period 4Fp(1). The constants πp are defined as 2Fp(1) so that sinp has period 2πp and π2 = π.
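Here is a rough sketch of how sinp could be computed on its first quarter period. It leans on two facts not derived in this post, so treat them as assumptions of the example: integrating the binomial series term by term gives Fp(x) = x · 2F1(1/p, 1/p; 1 + 1/p; xᵖ) for 0 ≤ x < 1, and evaluating the integral at 1 as a beta function gives πp = 2π / (p sin(π/p)). The function names are made up, and the inversion is done by root finding rather than anything clever.

```python
from math import pi, sin
from scipy.optimize import brentq
from scipy.special import hyp2f1

def F(x, p):
    """F_p(x) for 0 <= x < 1, using the hypergeometric form of the integral."""
    return x*hyp2f1(1/p, 1/p, 1 + 1/p, x**p)

def pi_p(p):
    """The constant pi_p = 2 F_p(1), in closed form."""
    return 2*pi/(p*sin(pi/p))

def sin_p(y, p):
    """sin_p(y) for 0 <= y < pi_p/2 (not too close to the endpoint), by inverting F_p."""
    return brentq(lambda x: F(x, p) - y, 0.0, 1.0 - 1e-12, xtol=1e-14)

print(sin_p(0.7, 2), sin(0.7))   # p = 2 should reproduce the ordinary sine
print(pi_p(2), pi_p(4))
print(sin_p(0.7, 4))
```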

Future posts

I intend to explore several generalizations of sine and cosine. What happens if you replace a circle with an ellipse or a hyperbola? Or a squircle? How do these variations on sine and cosine compare to the originals? Do they satisfy analogous identities? How do they appear in applications? I’d like to address some of these questions in future posts.