Computing sine and cosine of complex arguments with only real functions

Suppose you have a calculator or math library that only handles real arguments but you need to evaluate sin(3 + 4i). What do you do?

If you’re using Python, for example, and you don’t have NumPy installed, you can use the built-in math library, but it will not accept complex inputs.

>>> import math
>>> math.sin(3 + 4j)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: must be real number, not complex

You can use the following identities to calculate sine and cosine for complex arguments using only real functions.

\begin{align*} \sin(x + iy) &= \sin x \cosh y + i \cos x \sinh y \\ \cos(x + iy) &= \cos x \cosh y - i \sin x \sinh y \end{align*}

The proof is very simple: just use the addition formulas for sine and cosine, and the following identities.

\begin{align*} \sin iz &= i \sinh z \\ \cos iz &= \cosh z \end{align*}

The following code implements sine and cosine for complex arguments using only the built-in Python functions that accept real arguments. It then tests these against the NumPy versions that accept complex arguments.

from math import *
import numpy as np

def complex_sin(z):
    x, y = z.real, z.imag
    return sin(x)*cosh(y) + 1j*cos(x)*sinh(y)

def complex_cos(z):
    x, y = z.real, z.imag
    return cos(x)*cosh(y) - 1j*sin(x)*sinh(y)

z = 3 + 4j
mysin = complex_sin(z)
mycos = complex_cos(z)
npsin = np.sin(z)
npcos = np.cos(z)
assert abs(mysin - npsin) < 1e-14
assert abs(mycos - npcos) < 1e-14
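Incidentally, the standard library also ships a complex math module, cmath, so you can check the formulas even without NumPy. Here is a minimal sketch of that alternative test (the value printed is approximate):

```python
import cmath
from math import sin, cos, sinh, cosh

def complex_sin(z):
    # sin(x + iy) = sin x cosh y + i cos x sinh y
    x, y = z.real, z.imag
    return sin(x)*cosh(y) + 1j*cos(x)*sinh(y)

print(complex_sin(3 + 4j))  # roughly (3.8537 - 27.0168j)
print(cmath.sin(3 + 4j))    # agrees to floating point accuracy
```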

Related posts

Lebesgue constants

I alluded to Lebesgue constants in the previous post without giving them a name. There I said that the bound on order n interpolation error has the form

ch^{n+1} + \lambda \delta

where h is the spacing between interpolation points and δ is the error in the tabulated values. The constant c depends on the function f being interpolated, and to a lesser extent on n. The constant λ is independent of f but depends on n and on the relative spacing between the interpolation nodes. This post will look closer at λ.

Given a set of n + 1 nodes T

a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b

define

\ell_j(x) := \prod_{\begin{smallmatrix}i=0\\ j\neq i\end{smallmatrix}}^{n} \frac{x-x_i}{x_j-x_i}

Then the Lebesgue function is defined by

\lambda_n(x) = \sum_{j=0}^n |\ell_j(x)|

and the Lebesgue constant for the grid is the maximum value of the Lebesgue function

\Lambda_n(T)=\max_{x\in[a,b]} \lambda_n(x)

The values of Λ are difficult to compute, but there are nice asymptotic expressions for Λ when the grid is evenly spaced:

\Lambda_n \sim \frac{2^{n+1}}{n \log n}

When the grid points are at the roots of a Chebyshev polynomial then

\Lambda_n \approx \frac{2}{\pi} \log(n + 1) + 1

The previous post mentioned the cases n = 11 and n = 29 for evenly spaced grids. The corresponding values of Λ are approximately 155 and 10995642. So 11th order interpolation is amplifying the rounding error in the tabulated points by a factor of 155, which might be acceptable. But 29th order interpolation is amplifying the rounding error by a factor of over 10 million.

The corresponding values of Λ for Chebyshev-spaced nodes are 2.58 and 3.17. Chebyshev spacing is clearly better for high-order interpolation, when you have that option.
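The definitions above are easy to turn into code. Here is a minimal sketch that estimates Λ by maximizing the Lebesgue function over a fine grid (the function name lebesgue_constant is mine; sampling only approximates the max, and the asymptotic formula is rough at small n, so expect agreement in order of magnitude rather than digits):

```python
import numpy as np

def lebesgue_constant(nodes, num_samples=5000):
    # Approximate Lambda_n by maximizing the Lebesgue function
    # lambda_n(x) = sum_j |ell_j(x)| over a fine grid on [-1, 1].
    xs = np.linspace(-1, 1, num_samples)
    total = np.zeros_like(xs)
    for j, xj in enumerate(nodes):
        ell = np.ones_like(xs)  # Lagrange basis polynomial ell_j at xs
        for i, xi in enumerate(nodes):
            if i != j:
                ell *= (xs - xi) / (xj - xi)
        total += np.abs(ell)
    return total.max()

n = 11
equi = np.linspace(-1, 1, n + 1)                             # evenly spaced nodes
cheb = np.cos((2*np.arange(n + 1) + 1)*np.pi / (2*(n + 1)))  # Chebyshev roots
print(lebesgue_constant(cheb))  # about 2.5, near the (2/pi) log(n+1) + 1 estimate
print(lebesgue_constant(equi))  # far larger, and it grows exponentially with n
```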

How much precision can you squeeze out of a table?

Richard Feynman said that almost everything becomes interesting if you look into it deeply enough. Looking up numbers in a table is certainly not interesting, but it becomes more interesting when you dig into how well you can fill in the gaps.

If you want to know the value of a tabulated function between values of x given in the table, you have to use interpolation. Linear interpolation is often adequate, but you could get more accurate results using higher-order interpolation.

Suppose you have a function f(x) tabulated at x = 3.00, 3.01, 3.02, …, 3.99, 4.00 and you want to approximate the value of the function at π. You could approximate f(π) using the values of f(3.14) and f(3.15) with linear interpolation, but you could also take advantage of more points in the table. For example, you could use cubic interpolation to calculate f(π) using f(3.13), f(3.14), f(3.15), and f(3.16). Or you could use 29th degree interpolation with the values of f at 3.00, 3.01, 3.02, …, 3.29.
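To make the cubic example concrete, here is a sketch that interpolates at π from four surrounding table points, using the natural log as a stand-in for the tabulated function (the helper name lagrange_interp is mine):

```python
import math

def lagrange_interp(xs, ys, x):
    # Evaluate the Lagrange interpolating polynomial through (xs, ys) at x.
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        ell = 1.0
        for i, xi in enumerate(xs):
            if i != j:
                ell *= (x - xi) / (xj - xi)
        total += yj * ell
    return total

xs = [3.13, 3.14, 3.15, 3.16]
ys = [math.log(x) for x in xs]  # stand-in for looked-up table values
approx = lagrange_interp(xs, ys, math.pi)
print(abs(approx - math.log(math.pi)))  # cubic interpolation error, tiny
```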

The Lagrange interpolation theorem lets you compute an upper bound on your interpolation error. However, the theorem assumes the values at each of the tabulated points are exact. And for ordinary use, you can assume the tabulated values are exact. The biggest source of error is typically the size of the gap between tabulated x values, not the precision of the tabulated values. Tables were designed so this is true [1].

The bound on order n interpolation error has the form

ch^{n+1} + \lambda \delta

where h is the spacing between interpolation points and δ is the error in the tabulated values. The value of c depends on the derivatives of the function you’re interpolating [2]. The value of λ is at least 1 since λδ is the “interpolation” error at the tabulated points.

The accuracy of an interpolated value cannot be better than δ in general, and so you pick the value of n that makes chⁿ⁺¹ less than δ. Any higher value of n is not helpful. And in fact higher values of n are harmful since λ grows exponentially as a function of n [3].
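In code, this rule of thumb might look like the following sketch. The helper name and the default c = 1 are my assumptions; in practice c depends on the derivatives of the function being interpolated.

```python
def max_useful_order(h, delta, c=1.0):
    # Smallest n with c * h**(n+1) <= delta. Beyond this n, the total error
    # is dominated by the error delta in the tabulated values themselves.
    n = 1
    while c * h**(n + 1) > delta:
        n += 1
    return n

# Table spaced at h = 0.001 with values good to 10 decimal places:
print(max_useful_order(1e-3, 1e-10))  # 3, i.e. cubic interpolation suffices
```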

See the next post for mathematical details regarding the λs.

Examples

Let’s look at a specific example. Here’s a piece of a table for natural logarithms from A&S.

Here h = 10⁻³, and so linear interpolation would give you an error on the order of h² = 10⁻⁶. You’re never going to get error less than 10⁻¹⁵ since that’s the error in the tabulated values, so 4th order interpolation gives you about as much precision as you’re going to get. Carefully bounding the error would require using the values of c and λ above that are specific to this context. In fact, the interpolation error is on the order of 10⁻⁸ using 5th order interpolation, and that’s the best you can do.

I’ll briefly mention a couple more examples from A&S. The book includes a table of sine values, tabulated to 23 decimal places, in increments of h = 0.001 radians. A rough estimate would suggest 7th order interpolation is as high as you should go, and in fact the book indicates that 7th order interpolation will give you 9 figures of accuracy.

Another table from A&S gives values of the Bessel function J₀ to 15 decimal places in increments of h = 0.1. It says that 11th order interpolation will give you four decimal places of precision. In this case, fairly high-order interpolation is useful and even necessary. A large number of decimal places are needed in the tabulated values relative to the output precision because the spacing between points is so wide.


[1] I say were because of course people rarely look up function values in tables anymore. Tables and interpolation are still widely used, just not directly by people; computers do the lookup and interpolation on their behalf.

[2] For functions like sine, the value of c doesn’t grow with n, and in fact decreases slowly as n increases. But for other functions, c can grow with n, which can cause problems like the Runge phenomenon.

[3] The constant λ grows exponentially with n for evenly spaced interpolation points, and values in a table are evenly spaced. The constant λ grows only logarithmically for Chebyshev spacing, but this isn’t practical for a general purpose table.

From Mendeleev to Fourier

The previous post looked at an inequality discovered by Dmitri Mendeleev and generalized by Andrey Markov:

Theorem (Markov): If P(x) is a real polynomial of degree n, and |P(x)| ≤ 1 on [−1, 1] then |P′(x)| ≤ n² on [−1, 1].

If P(x) is a trigonometric polynomial then Bernstein proved that the bound decreases from n² to n.

Theorem (Bernstein): If P(x) is a trigonometric polynomial of degree n, and |P(x)| ≤ 1 on [−π, π] then |P′(x)| ≤ n on [−π, π].

Now a trigonometric polynomial is a truncated Fourier series

T(x) = a_0 + \sum_{n=1}^N a_n \cos nx + \sum_{n=1}^N b_n \sin nx

and so by Bernstein’s theorem the max norm of T′ is no more than N times the max norm of T.
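As a quick numerical sanity check, here is a sketch with a random trigonometric polynomial; the max norms are only approximated by dense sampling, and the random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
a = rng.normal(size=N + 1)  # cosine coefficients; a[0] is the constant term
b = rng.normal(size=N + 1)  # sine coefficients; b[0] is unused

x = np.linspace(-np.pi, np.pi, 100001)
T  = a[0] + sum(a[k]*np.cos(k*x) + b[k]*np.sin(k*x) for k in range(1, N + 1))
dT = sum(k*(-a[k]*np.sin(k*x) + b[k]*np.cos(k*x)) for k in range(1, N + 1))

# Bernstein: max |T'| <= N max |T|
print(np.max(np.abs(dT)) <= N * np.max(np.abs(T)))  # True
```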

This post and the previous one were motivated by Terence Tao’s latest post on Bernstein theory.


Mendeleev’s inequality

Dmitri Mendeleev is best known for creating the first periodic table of chemical elements. He also discovered an interesting mathematical theorem. Empirical research led him to a question about interpolation, which in turn led him to a theorem about polynomials and their derivatives.

I ran across Mendeleev’s theorem via a paper by Boas [1]. The opening paragraph describes what Mendeleev was working on.

Some years after the chemist Mendeleev invented the periodic table of the elements he made a study of the specific gravity of a solution as a function of the percentage of the dissolved substance. This function is of some practical importance: for example, it is used in testing beer and wine for alcoholic content, and in testing the cooling system of an automobile for concentration of anti-freeze; but present-day physical chemists do not seem to find it as interesting as Mendeleev did.

Mendeleev fit his data by patching together quadratic polynomials, i.e. he used quadratic splines. A question about the slopes of these splines led to the following.

Theorem (Mendeleev): Let P(x) be a quadratic polynomial on [−1, 1] such that |P(x)| ≤ 1. Then |P′(x)| ≤ 4.

Mendeleev showed his result to mathematician Andrey Markov who generalized it to the following.

Theorem (Markov): If P(x) is a real polynomial of degree n, and |P(x)| ≤ 1 on [−1, 1] then |P′(x)| ≤ n² on [−1, 1].

Both inequalities are sharp, with equality if and only if P(x) = ±Tₙ(x), the nth Chebyshev polynomial. In the special case of Mendeleev’s inequality, equality holds for

T₂(x) = 2x² − 1.
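Here is a quick numerical check of the equality case using NumPy’s Chebyshev polynomial class (the grid size is arbitrary, and sampling only approximates the maxima):

```python
import numpy as np

# Markov's inequality is sharp for T_n: |T_n| <= 1 on [-1, 1],
# while |T_n'| reaches n^2 at the endpoints.
n = 5
Tn = np.polynomial.chebyshev.Chebyshev.basis(n)  # T_5
dTn = Tn.deriv()

x = np.linspace(-1, 1, 10001)
print(np.max(np.abs(Tn(x))))   # 1.0
print(np.max(np.abs(dTn(x))))  # 25.0 = n**2
```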

Andrey Markov’s brother Vladimir proved an extension of Andrey’s theorem to higher derivatives.


[1] R. P. Boas, Jr. Inequalities for the Derivatives of Polynomials. Mathematics Magazine, Vol. 42, No. 4 (Sep., 1969), pp. 165–174

A lesser-known characterization of the gamma function

The gamma function Γ(z) extends the factorial function from integers to complex numbers. (Technically, Γ(z + 1) extends factorial.) There are other ways to extend the factorial function, so what makes the gamma function the right choice?

The most common answer is the Bohr-Mollerup theorem. This theorem says that if f: (0, ∞) → (0, ∞) satisfies

  1. f(x + 1) = x f(x)
  2. f(1) = 1
  3. log f is convex

then f(x) = Γ(x). The theorem applies on the positive real axis, and there is a unique meromorphic continuation of this function to the complex plane.

But the Bohr-Mollerup theorem is not the only theorem characterizing the gamma function. Another is due to Helmut Wielandt. His theorem says that if f is holomorphic in the right half-plane and

  1. f(z + 1) = z f(z)
  2. f(1) = 1
  3. f(z) is bounded for {z: 1 ≤ Re z ≤ 2}

then f(z) = Γ(z). In short, Wielandt replaces the log-convexity for positive reals with the requirement that f is bounded in a strip of the complex plane.

You might wonder what the bound alluded to in Wielandt’s theorem is. You can show from the integral definition of Γ(z) that

|Γ(z)| ≤ Γ(Re z)

for z in the right half-plane. So the bound on the complex strip {z: 1 ≤ Re z ≤ 2} equals the bound on the real interval [1, 2], which is 1.
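Here is a numerical spot check of that bound, assuming SciPy is available for the complex gamma function (the sample sizes and ranges are my arbitrary choices):

```python
import numpy as np
from scipy.special import gamma

rng = np.random.default_rng(1)
x = rng.uniform(1, 2, 1000)     # real parts in the strip 1 <= Re z <= 2
y = rng.uniform(-50, 50, 1000)  # imaginary parts
z = x + 1j*y

print(np.all(np.abs(gamma(z)) <= gamma(x)))  # True: |Gamma(z)| <= Gamma(Re z)
print(np.all(gamma(x) <= 1.0))               # True: Gamma <= 1 on [1, 2]
```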

Tighter bounds on alternating series remainder

The alternating series test is part of the standard calculus curriculum. It says that if you truncate an alternating series, the remainder is bounded by the first term that was left out. This fact goes by in a blur for most students, but it becomes useful later if you need to do numerical computing.

To be more precise, assume we have a series of the form

  \sum_{i=1}^\infty (-1)^i a_i

where the aᵢ are positive and monotonically converge to zero. Then the tail of the series is bounded by its first term:

\left|R_n\right| = \left| \sum_{i=n+1}^\infty (-1)^i a_i \right| \leq a_{n+1}

The more we can say about the behavior of the aᵢ, the more we can say about the remainder. So far we’ve assumed that these terms go monotonically to zero. If their differences

\Delta a_i = a_i - a_{i+1}

also go monotonically to zero, then we have an upper and lower bound on the truncation error:

\frac{a_{n+1}}{2} \leq |R_n| \leq \frac{a_n}{2}

If the differences of the differences,

\Delta^2 a_i = \Delta (\Delta a_i)

also converge monotonically to zero, we can get a larger lower bound and a smaller upper bound on the remainder. In general, if the differences up to order k of the aᵢ go to zero monotonically, then the remainder term can be bounded as follows.

\frac{a_{n+1}}{2} + \frac{\Delta a_{n+1}}{2^2} + \cdots + \frac{\Delta^k a_{n+1}}{2^{k+1}} < \left|R_n\right| < \frac{a_n}{2} - \left\{ \frac{\Delta a_n}{2^2} + \cdots + \frac{\Delta^k a_n}{2^{k+1}} \right\}

Source: Mark B. Villarino. The Error in an Alternating Series. American Mathematical Monthly, April 2018, pp. 360–364.
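As a sanity check, here is the simpler two-sided bound aₙ₊₁/2 ≤ |Rₙ| ≤ aₙ/2 applied to the alternating harmonic series, whose differences of every order decrease monotonically:

```python
import math

# a_i = 1/i, and sum_{i>=1} (-1)^i / i = -log 2
n = 10
partial = sum((-1)**i / i for i in range(1, n + 1))
remainder = abs(-math.log(2) - partial)

lower, upper = 1/(2*(n + 1)), 1/(2*n)  # a_{n+1}/2 and a_n/2
print(lower <= remainder <= upper)     # True
```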


Powers don’t clear fractions

If a number has a finite but nonzero fractional part, so do all its powers. I recently ran across a proof in [1] that is shorter than I expected.

Theorem: Suppose r is a real number that is not an integer, and the decimal part of r terminates. Then rᵏ is not an integer for any positive integer k.

Proof: The number r can be written as a reduced fraction a / 10ᵐ for some positive m. If s = rᵏ were an integer, then

10ᵐᵏ s = aᵏ.

Now the left side of this equation is divisible by 10 but the right side is not, because the fraction a / 10ᵐ is reduced and so 10 does not divide a. Therefore s = rᵏ is not an integer. QED.

The only thing special about base 10 is that we most easily think in terms of base 10, but you could replace 10 with any other base.

What about repeating decimals, like 1/7 = 0.142857142857…? They’re only repeating decimals in our chosen base. Pick the right base, i.e. 7 in this case, and they terminate. So the theorem above extends to repeating decimals.
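The theorem is easy to illustrate with exact rational arithmetic. Here is a sketch using the truncation of √2 from the reference:

```python
from fractions import Fraction

# r = 1.41421356237 has a terminating decimal, so r = a / 10^11 in
# lowest terms, and by the theorem no power of r can be an integer.
r = Fraction(141421356237, 10**11)
for k in range(1, 6):
    print((r**k).denominator == 1)  # always False: r**k is never an integer
```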

[1] Eli Leher. √2 is Not 1.41421356237 or Anything of the Sort. The American Mathematical Monthly, Vol. 125, No. 4 (APRIL 2018), page 346.

Inverse cosine

In the previous two posts, we looked at why Mathematica and SymPy did not simplify sinh(arccosh(x)) to √(x² − 1) as one might expect. After understanding why sinh(arccosh(x)) doesn’t simplify nicely, it’s natural to ask why sin(arccos(x)) does simplify nicely.

In this post I sketched a proof of several identities including

sin(arccos(x)) = √(1 − x²)

saying the identities could be proved geometrically. Let x be between 0 and 1. Construct a right triangle with hypotenuse of length 1 and one side of length x. Call the acute angle formed by these two sides θ. Then cos θ = x, and so arccos(x) = θ, and sin θ = √(1 − x²). This proves the identity above, but only for 0 < x < 1.

If we make branch cuts along (−∞, −1] and [1, ∞) we can extend arccos(z) uniquely by analytic continuation. We can extend the definition to the branch cuts by continuity, but from one direction. We either have to choose the extension to be continuous from above the branch cuts or from below; we have to choose one or the other because the two limits are not equal. As far as I know, everyone chooses continuity from above, i.e. continuity with quadrant II, by convention.

In any case, we can define arccos(z) for any complex number z, and the result is a number whose cosine is z. Therefore the square of its cosine is z², and the square of its sine is 1 − z². So we have

sin²(arccos(z)) = 1 − z².

But does that mean

sin(arccos(z)) = √(1 − z²)?

Certainly it does if we’re working with real numbers, but does it for complex numbers?

Recall what square root means for complex numbers: it is the analytic function with branch cut (−∞, 0] that agrees with the real square root function along the real line, and is defined along the branch cut to be continuous with quadrant II. Since

sin(arccos(z)) = √(1 − z²)

for real −1 < z < 1, the two sides of the equation are equal on a set with a limit point, and so as analytic functions they are equal on their common domain.

The only remaining detail is whether the two functions are equal along the branch cut (−∞, −1], where we’ve extended both functions by continuity. Since we’ve defined both the arccos and square root functions by continuous extension from the second quadrant, equality on the branch cut also follows by continuity.
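Here is a numerical spot check using the standard library’s cmath module, whose acos and sqrt use the branch cuts described above. The test points are my arbitrary choices, including one point on the cut (−∞, −1]:

```python
import cmath

for z in [0.5, 2 + 3j, -1 - 1j, -2 + 0j]:
    lhs = cmath.sin(cmath.acos(z))
    rhs = cmath.sqrt(1 - z*z)
    print(abs(lhs - rhs) < 1e-12)  # True at each point
```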


Simplifying expressions in SymPy

The previous post looked at why Mathematica does not simplify the expression Sinh[ArcCosh[x]] the way you might think it should. This post will be a sort of Python analog of the previous post.

SymPy is a Python library that among other things will simplify mathematical expressions. As before, we seek to verify the entries in the table below, this time using SymPy.

\renewcommand{\arraystretch}{2.2} \begin{array}{c|c|c|c} & \sinh^{-1} & \cosh^{-1} & \tanh^{-1} \\ \hline \sinh & x & \sqrt{x^{2}-1} & \dfrac{x}{\sqrt{1-x^2}} \\ \hline \cosh & \sqrt{x^{2} + 1} & x & \dfrac{1}{\sqrt{1 - x^2}} \\ \hline \tanh & \dfrac{x}{\sqrt{x^{2}+1}} & \dfrac{\sqrt{x^{2}-1}}{x} & x \\ \end{array}

Here’s the code:

from sympy import *

x = symbols('x')

print( simplify(sinh(asinh(x))) )
print( simplify(sinh(acosh(x))) )
print( simplify(sinh(atanh(x))) )
print( simplify(cosh(asinh(x))) )
print( simplify(cosh(acosh(x))) )
print( simplify(cosh(atanh(x))) )
print( simplify(tanh(asinh(x))) )
print( simplify(tanh(acosh(x))) )
print( simplify(tanh(atanh(x))) )

As before, the results are mostly as we’d expect:

x
sqrt(x - 1)*sqrt(x + 1)
x/sqrt(1 - x**2)
sqrt(x**2 + 1)
x
1/sqrt(1 - x**2)
x/sqrt(x**2 + 1)
sqrt(x - 1)*sqrt(x + 1)/x
x

Also as before, sinh(acosh(x)) and tanh(acosh(x)) return more complicated expressions than in the table above. Why doesn’t

√(x − 1) √(x + 1)

simplify to

√(x² − 1)

as you’d expect? Because the equation

√(x − 1) √(x + 1) = √(x² − 1)

does not hold for all x. See the previous post for the subtleties of defining arccosh and sqrt for complex numbers. The equation above does not hold, for example, when x = −2.

As in Mathematica, you can specify the range of variables in SymPy. If we specify that x ≥ 0 we get the result we expect. The code

x = symbols('x', real=True, nonnegative=True)
print( simplify(sinh(acosh(x))) )

prints

sqrt(x**2 - 1)

as expected.