When zeros at natural numbers implies zero everywhere

Suppose a function f(z) equals 0 at z = 0, 1, 2, 3, …. Under what circumstances might you be able to conclude that f is zero everywhere?

Clearly you need some hypothesis on f. For example, the function sin(πz) is zero at every integer but certainly not constantly zero.

Carlson’s theorem says that if f is analytic and bounded for z with non-negative real part, and equals zero at non-negative integers, then f is constantly zero.

Carlson’s theorem doesn’t apply to sin(πz) because this function is not bounded in the right half-plane. It is bounded on the real axis, but that’s not enough. The identity

sin(z) = ( exp(iz) – exp(-iz) ) / 2i

shows that the sine function grows exponentially in the vertical direction.

Liouville’s theorem says that if a function is analytic and bounded everywhere then it must be constant. Carlson’s theorem does not require that the function f be bounded everywhere, only in the right half-plane.

In fact, the boundedness requirement can be weakened to requiring f(z) be O( exp(k|z|) ) for some k < π. This, in combination with having zeros at 0, 1, 2, 3, …, is enough to conclude that f is zero.
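To see why the constant π matters, here is a small numerical sketch. It evaluates sin(πz) along the imaginary axis and compares its size to exp(k|z|) for k = π and for k = 3, an arbitrary value just below π chosen for illustration. The function name is made up for this example.

```python
import cmath
import math

def sin_pi_on_imaginary_axis(y):
    """Return |sin(pi * i * y)|, which equals sinh(pi * |y|)."""
    return abs(cmath.sin(math.pi * 1j * y))

for y in [1, 5, 10, 20]:
    size = sin_pi_on_imaginary_axis(y)
    # First ratio: against exp(pi |z|). Second ratio: against exp(3 |z|).
    print(y, size / math.exp(math.pi * y), size / math.exp(3.0 * y))
```

The first ratio settles near 1/2, while the second keeps growing. So sin(πz) is O( exp(π|z|) ) but not O( exp(k|z|) ) for this k < π, and it just misses the hypothesis of the theorem.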


Conformal map between disk and equilateral triangle

The Dixon elliptic functions sm and cm are in some ways analogous to sine and cosine. However, whereas sine and cosine satisfy

\sin^2(z) + \cos^2(z) = 1

the Dixon functions satisfy

\text{sm}^3(z) + \text{cm}^3(z) = 1

The exponent 3 foreshadows the fact that these functions have a sort of three-fold symmetry. In particular, the function sm maps an equilateral triangle in the complex plane to the unit circle. The function sm gives a conformal map from the interior of this triangle to the interior of the unit disk.

In this post we will work with sm⁻¹ rather than sm, mapping the unit circle to an equilateral triangle. An advantage of working with the inverse function is that we can start with the unit circle and see what triangle it maps to; if we started with the triangle it might seem arbitrary. Also, the function sm is not commonly part of mathematical software libraries (it’s not in Mathematica or SciPy), but you can compute its inverse via

\text{sm}^{-1}(z) = {}_2F_1(\tfrac{1}{3}, \tfrac{2}{3}; \tfrac{4}{3}; z^3) \, z

using the hypergeometric function ₂F₁, which is a common part of mathematical libraries.

The following image shows concentric circles in the z plane and their image under sm⁻¹ in the w plane, w = sm⁻¹(z).

Conformal map of unit disk to equilateral triangle using the inverse of the Dixon elliptic function sm

If we were to use this in applications, we’d need to know the vertices of the image triangle so we could do a change of variables to transform this triangle into a particular triangle we’re interested in.

The centroid of the image is at the origin, and the right-most vertex is at approximately 1.7666. To be exact, the vertex is at

v = ⅓ B(⅓, ⅓)

where B is the beta function. (Notice all the 3’s in the formula for v.) The other two vertices are at exp(2πi/3) v and exp(4πi/3) v.
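Here is a minimal sketch of how one might compute sm⁻¹ and the vertex v in Python, assuming SciPy is available. It uses `scipy.special.hyp2f1` and `scipy.special.beta`; the function name `inverse_sm` is made up for this example, and the formula is the hypergeometric one given above.

```python
from scipy.special import beta, hyp2f1

def inverse_sm(z):
    """Inverse of the Dixon function sm, via z * 2F1(1/3, 2/3; 4/3; z^3)."""
    return hyp2f1(1/3, 2/3, 4/3, z**3) * z

# Right-most vertex of the image triangle, v = B(1/3, 1/3) / 3.
v = beta(1/3, 1/3) / 3
print(v)

# Image of a point on the circle of radius 0.5 in the z plane.
print(inverse_sm(0.5j))
```

Since the formula depends on z only through z³ and an overall factor of z, multiplying z by exp(2πi/3) multiplies the result by the same factor, which is the three-fold symmetry mentioned earlier.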

One way this conformal map could arise in practice is solving Laplace’s equation on a triangle. You can solve Laplace’s equation on a disk in closed form, and transform that solution into a solution on the triangle.


Rectangles to Rectangles

There is a conformal map between any two simply connected open proper subsets of the complex plane. This means, for example, there is a one-to-one analytic map from the interior of a square onto the interior of a circle. Or from the interior of a triangle onto the interior of a pentagon. Or from the Mickey Mouse logo to the Batman logo (see here).

So we can map (the interior of) a rectangle conformally onto a very different shape. Can we map a rectangle onto a rectangle? Yes, clearly we can do this with a linear polynomial, f(z) = az + b. Are there any other possibilities? Surprisingly, the answer is no: if an analytic function takes any rectangle to another rectangle, that analytic function must be a linear polynomial.

Since a linear polynomial is the composition of a scaling, a rotation, and a translation, this says that if a conformal map takes a rectangle to a rectangle, it must take it to a similar rectangle.

These statements are proved in [1]. Furthermore, the authors prove that “An analytic function mapping some closed convex n-gon R onto another closed convex n-gon S is a linear polynomial.”


[1] Joseph Bak and Pisheng Ding. Shape Distortion by Analytic Functions. The American Mathematical Monthly. Feb. 2009, Vol. 116, No. 2.

Bounding complex roots by a positive root

Suppose you have an nth degree polynomial with complex coefficients

p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0

and you want to find some circle that is guaranteed to contain all the zeros of p.

Cauchy found such a circle in 1829. The zeros of p lie inside the circle |z| ≤ r where r is the unique positive root of

f(z) = |a_n| z^n - |a_{n-1}| z^{n-1} - \cdots - |a_0|

This value of r is known as the Cauchy radius of the polynomial p.

This may not seem like much of an improvement: you started with wanting to find the roots of an nth degree polynomial and you end with finding the roots of an nth degree polynomial. But Cauchy’s theorem reduces the problem of finding all roots of a complex polynomial to finding one root of a real polynomial. Furthermore, the positive root we’re after is guaranteed to be unique.

If a₀ = 0 then p(z) has a factor of z and so we can reduce the problem to bounding the zeros of p(z)/z. Otherwise, f(0) < 0. Eventually f(z) must be positive because the zⁿ term will overtake the rest of the terms for large enough z. So we only need to find some value of z where f(z) > 0 and then we could use the bisection method to find r.

Since our goal is to bound the zeros of p, we don’t need to find r exactly: an upper bound on r will do, though the smaller the upper bound the better. The bisection method gives us a sequence of upper bounds, so we could work in rational arithmetic and have rigorously provable upper bounds.

As for how to find a real value of z where f is positive, we could try z = 2^k for successive values of k until we find one that works.

For example, let’s bound the roots of

p(z) = 12z^5 + 2z^2 + 23i = 0.

Cauchy’s theorem says we need to find the unique positive root of

f(z) = 12z^5 - 2z^2 - 23.

Now f(0) = −23 and f(2) = 353. So we know right away that the roots of p have absolute value less than 2.

Next we evaluate f(1), which turns out to be −13, and so the Cauchy radius is larger than 1. This doesn’t necessarily mean that p has a root with absolute value greater than 1, only that the Cauchy radius is greater than 1. An upper bound on the Cauchy radius is an upper bound on the absolute values of the roots of p; a lower bound on the Cauchy radius is not necessarily a lower bound on the largest root.

Carrying out two steps of the bisection method by hand was easy, but let’s automate the process of carrying it out further.

>>> from scipy.optimize import bisect
>>> bisect(lambda x: 12*x**5 - 2*x*x - 23, 1, 2)
1.1646451258329762

So Python tells us r = 1.1646451258329762.

Here’s a plot of the roots and the Cauchy radius.

In this example the roots of p are located very near a circle with the Cauchy radius. The roots range in absolute value between 1.1145600699993699 and 1.1634197192917954. The roots nearly lie on a circle because the quadratic term in our polynomial is small and so we are approximately finding the fifth roots of −23i/12.
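The whole procedure described above (take absolute values of the coefficients, try z = 1, 2, 4, … until f is positive, then bisect) can be sketched as a short function. This is an illustration rather than a polished routine: the name `cauchy_radius` and the fixed number of bisection steps are my own choices, and it assumes the constant and leading coefficients are non-zero. NumPy is used only to compute the roots for comparison.

```python
import numpy as np

def cauchy_radius(coeffs, steps=60):
    """Upper bound on the Cauchy radius of a polynomial.

    coeffs lists a_0, a_1, ..., a_n (lowest degree first),
    assuming a_0 != 0 and a_n != 0.
    """
    mags = [abs(c) for c in coeffs]
    n = len(mags) - 1

    def f(x):
        # |a_n| x^n - |a_{n-1}| x^{n-1} - ... - |a_0|
        lower = sum(m * x**k for k, m in enumerate(mags[:-1]))
        return mags[-1] * x**n - lower

    lo, hi = 0.0, 1.0
    while f(hi) <= 0:          # try 1, 2, 4, 8, ... until f is positive
        lo, hi = hi, 2 * hi
    for _ in range(steps):     # bisection keeps f(lo) <= 0 < f(hi)
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return hi

coeffs = [23j, 0, 2, 0, 0, 12]     # 12 z^5 + 2 z^2 + 23i
r = cauchy_radius(coeffs)
roots = np.roots(coeffs[::-1])     # np.roots expects highest degree first
print(r, max(abs(roots)))
```

The function returns the upper end of the final bracket, so the value it reports is an upper bound on r, which is what we want when bounding roots.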

Let’s do another example with randomly generated coefficients to get a better idea of how Cauchy’s theorem works in general. The coefficients of our polynomial, from 0th to 5th, are

0.126892 + 0.689356i, −0.142366 + 0.260969, −0.918873 + 0.489906i, 0.0599824 − 0.679312i, −0.222055 + 0.273651, 0.154408 + 0.733325i

The roots have absolute value between 0.7844606228243709 and 1.2336256274024142, and the Cauchy radius is 1.5088421845957782. Here’s a plot.


Schwarz lemma, Schwarz-Pick theorem, and Poincare metric

Let D be the open unit disk in the complex plane. The Schwarz lemma says that if f is an analytic function from D to D with f(0) = 0, then

|f(z)| \leq |z|

for all z in D. The lemma also says more, but this post will focus on just this portion of the theorem.

The Schwarz-Pick theorem generalizes the Schwarz lemma by not requiring the origin to be fixed. That is, it says that if f is an analytic function from D to D then

\left| \frac{f(z) - f(w)}{1 - f(z)\,\overline{f(w)}} \right| \leq \left| \frac{z - w}{1 - z\,\overline{w}}\right|

The Schwarz-Pick theorem also concludes more, but again we’re focusing on part of the theorem here. Note that if f(0) = 0, then setting w = 0 in the Schwarz-Pick theorem gives the Schwarz lemma.

The Schwarz lemma is a sort of contraction theorem. Assuming f(0) = 0, the lemma says

|f(z) - f(0)| \leq |z - 0|

This says applying f to a point cannot move the point further from 0. That’s interesting, but it would be more interesting if we could say f is a contraction in general, not just with respect to 0. That is indeed what the Schwarz-Pick theorem does, though with respect to a new metric.

For any two points z and w in the open unit disk D, define the Poincaré distance between z and w by

d(z,w) = \tanh^{-1}\left( \left| \frac{z - w}{1 - z\overline{w}}\right| \right)

It’s not obvious that this is a metric, but it really is. As is often the case, most of the properties of a metric are simple to confirm, but proving the triangle inequality is the hard part.

If we apply the monotone function tanh⁻¹ to both sides of the Schwarz-Pick theorem, then we have that any analytic function f from D to D is a contraction on D with respect to the Poincaré metric.

Here we’re using “contraction” in the loose sense. It would be more explicit to say that f is a non-expansive map. Applying f to a pair of points may not bring the points closer together, but it cannot move them any further apart (with respect to the Poincaré metric).
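A quick numerical sanity check of the non-expansive property might look like the following. The map f(z) = z² is just one convenient analytic function from D to D chosen for illustration, and the sample points are kept away from the boundary so that tanh⁻¹ stays well behaved in floating point.

```python
import numpy as np

def poincare_distance(z, w):
    """Poincaré distance between points z and w of the open unit disk."""
    return np.arctanh(abs((z - w) / (1 - z * np.conj(w))))

def f(z):
    # An illustrative analytic map of the unit disk into itself.
    return z**2

rng = np.random.default_rng(0)

def random_point():
    # Random point in the disk of radius 0.95.
    return 0.95 * np.sqrt(rng.random()) * np.exp(2j * np.pi * rng.random())

pairs = [(random_point(), random_point()) for _ in range(1000)]
worst = max(poincare_distance(f(z), f(w)) - poincare_distance(z, w)
            for z, w in pairs)
print(worst)  # should not be positive, up to rounding error
```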

By using the Poincaré metric, we turn the unit disk into a hyperbolic space. That is, D with the metric d is a model of the hyperbolic plane.


When a function cannot be extended

The relation between a function and its power series is subtle. In a calculus class you’ll see equations of the form “series = function” which may need some footnotes. Maybe the series only represents the function over part of its domain: the function extends further than the power series representation.

Starting with the power series, we can ask whether the function it represents extends further than the series representation.

This video does a nice job of explaining why a particular function cannot be extended beyond the disk on which the series converges.

Toward the end, the video explains how its main example is a member of a broader class of functions that have no analytic continuation. The technical term, which the video does not use, is lacunary series [1]. When the gaps in a power series grow faster than linearly, the series cannot be extended beyond its radius of convergence.

Lacunary series make interesting images since the behavior of the function becomes complicated toward the edge of the domain. The video gives some nice examples. The image above comes from this post and the following image comes from this post.

Differential equations

The video mentions Hadamard’s gap theorem. I believe his gap theorem was a spin-off of his work on Laplace’s equation. See this post on Hadamard’s counterexample to the Dirichlet principle for the Laplacian.

The motivation for a LOT of classical math was differential equations. I didn’t realize this as a student. Years later I’d run into something and think “So that is why this person was interested in that problem,” such as why Hadamard would care about whether power series could be extended.

Hadamard wanted to solve a differential equation on a disk with boundary conditions specified on the rim. It’s going to be a problem if the series representation of the solution doesn’t extend to the rim.


[1] Lacuna is the Latin word for a hole or a pit. The word came to be used metaphorically for a gap, such as a gap in a manuscript. Later mathematicians used this term for power series with increasing gaps between non-zero terms.

Today’s star

Exponential sum of the day 10/2/2023

The star-like image above is today’s exponential sum.

The exponential sum page on my site generates a new image each day by putting the numbers of the day’s month, day, and year into the equation

\sum_{n=0}^N \exp\left( 2\pi i \left( \frac{n}{m} + \frac{n^2}{d} + \frac{n^3}{y} \right ) \right )

and connecting the partial sums in the complex plane. Here m is the month, d is the day, and y is the last two digits of the year.
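As a rough sketch of the computation, the partial sums can be generated as follows. The function name and the choice N = 200 are illustrative assumptions; this is not the code behind the page, only a direct transcription of the sum above.

```python
import cmath

def exponential_sum_points(m, d, y, N):
    """Partial sums of exp(2 pi i (n/m + n^2/d + n^3/y)) for n = 0, ..., N."""
    points = []
    total = 0
    for n in range(N + 1):
        total += cmath.exp(2j * cmath.pi * (n / m + n**2 / d + n**3 / y))
        points.append(total)
    return points

# Parameters for 10/2/2023; N = 200 is an arbitrary illustrative choice.
points = exponential_sum_points(m=10, d=2, y=23, N=200)
print(points[:3])
```

Plotting the real parts against the imaginary parts and joining consecutive points with line segments gives the image. Every term has absolute value 1, so every segment has unit length.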

Some people have asked why I use American date order: month, day, year. The flippant answer is I use American date order because I’m American. But I did experiment with other date orders, and I prefer the sequence of images produced by the order above. There’s more contrast between consecutive images by associating the day with the quadratic term rather than the linear term inside the exponential.

The exponential sum page is about six years old [1], and I still enjoy checking in on it each day. Short of making the plot, it’s not possible to imagine what an image will look like based on the date, other than the very rough rule that larger numbers tend to produce more complicated images. For example, images are much more intricate on New Year’s Eve than on New Year’s Day.

The images are often highly symmetric, as today’s image is. But occasionally they have no symmetry, as will be the case on 10/10/23.

The page lets you scroll back and forth by day, but you can put in any parameters you’d like by editing the page URL. For example, the link to today’s image is

   https://www.johndcook.com/expsum/?y=23&m=10&d=2

but you can change y, m, and d to any numbers you wish. There’s nothing that constrains m, for example, to be a number between 1 and 12. You could set it to 17 if you’d like. And although thirty days hath September, you can see what the image for September 31st would have looked like.

[1] The page was launched October 9, 2017, so its sixth anniversary is a week from today.

Continued fractions as matrix products

A continued fraction of the form

\cfrac{a_1}{b_1 + \cfrac{a_2}{b_2 + \cfrac{a_3}{b_3 + \ddots}}}

with n terms can be written as the composition

f_1 \circ f_2 \circ f_3 \circ \cdots \circ f_n

where

f_i(z) = \frac{a_i}{b_i + z}

As discussed in the previous post, a Möbius transformation can be associated with a matrix. And the composition of Möbius transformations is associated with the product of corresponding matrices. So the continued fraction at the top of the post is associated with the following product of matrices.

\begin{pmatrix} 0 & a_1 \\ 1 & b_1\end{pmatrix} \begin{pmatrix} 0 & a_2 \\ 1 & b_2\end{pmatrix} \begin{pmatrix} 0 & a_3 \\ 1 & b_3\end{pmatrix} \cdots \begin{pmatrix} 0 & a_n \\ 1 & b_n\end{pmatrix}

The previous post makes precise the terms “associated with” above: Möbius transformations on the complex plane ℂ correspond to linear transformations on the projective line P(ℂ). This allows us to include ∞ in the domain and range without resorting to hand waving.

Matrix products are easier to understand than continued fractions, and so moving to the matrix product representation makes it easier to prove theorems.
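The correspondence is easy to check numerically. The sketch below evaluates a finite continued fraction two ways: directly from the inside out, and by multiplying the matrices above and applying the resulting Möbius transformation to z = 0. The function names are made up for this example, and exact rational arithmetic is used so the two answers can be compared for equality.

```python
from fractions import Fraction

def continued_fraction(a, b):
    """Evaluate a_1/(b_1 + a_2/(b_2 + ... + a_n/b_n)) from the inside out."""
    value = Fraction(0)
    for a_i, b_i in zip(reversed(a), reversed(b)):
        value = Fraction(a_i) / (b_i + value)
    return value

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )

def continued_fraction_via_matrices(a, b):
    """Multiply the matrices [[0, a_i], [1, b_i]] and apply the result to z = 0."""
    M = ((1, 0), (0, 1))
    for a_i, b_i in zip(a, b):
        M = matmul2(M, ((0, a_i), (1, b_i)))
    # The Möbius transformation (p z + q) / (r z + s) sends 0 to q / s.
    return Fraction(M[0][1], M[1][1])

a = [1] * 10
b = [1] * 10
print(continued_fraction(a, b), continued_fraction_via_matrices(a, b))
```

With all aᵢ and bᵢ equal to 1, the matrices are powers of the Fibonacci matrix, and both methods return a ratio of consecutive Fibonacci numbers.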


Fractional linear and linear

A function of the form

g(z) = \frac{az + b}{cz + d}

where ad − bc ≠ 0 is sometimes called a fractional linear transformation or a bilinear transformation. I usually use the name Möbius transformation.

In what sense are Möbius transformations linear transformations? They’re nonlinear functions unless b = c = 0. And yet they’re analogous to linear transformations. For starters, the condition ad − bc ≠ 0 appears to be saying that a determinant is non-zero, i.e. that a matrix is non-singular.

The transformation g is closely associated with the matrix

\begin{pmatrix} a & b \\ c & d \end{pmatrix}

but there’s more going on than a set of analogies. The reason is that Möbius transformations are linear transformations, but not on the complex numbers ℂ.

When you’re working with Möbius transformations, you soon want to introduce ∞. Things get complicated if you don’t. Once you add ∞ theorems become much easier to state, and yet there’s a nagging feeling that you may be doing something wrong by informally introducing ∞. This feeling is justified because tossing around ∞ without being careful can lead to wrong conclusions.

So how can we rigorously deal with ∞? We could move from numbers (real or complex) to pairs of numbers, as with fractions. We replace the complex number z with the equivalence class of all pairs of complex numbers whose ratio is z. The advantage of this approach is that you get to add one special number, the equivalence class of all pairs whose second number is 0, i.e. fractions with zero in the denominator. This new number system is called P(ℂ), where “P” stands for “projective.”

Möbius transformations are projective linear transformations. They’re linear on P(ℂ), though not on ℂ.

When we multiply the matrix above by the column vector (z, 1)ᵀ we get

\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} z \\ 1 \end{pmatrix} = \begin{pmatrix} az + b \\ cz + d \end{pmatrix}

and since our vectors are essentially fractions, the right hand side corresponds to g(z) if the second component of the vector, cz + d, is not zero.

If cz + d = 0, that’s OK. Everything is fine while we’re working in P(ℂ), but we get an element of P(ℂ) that does not correspond to an element of ℂ, i.e. we get ∞.

We’ve added ∞ to the domain and range of our Möbius transformations without any handwaving. We’re just doing linear algebra on finite complex numbers.

There’s a little bit of fine print. In P(ℂ) we can’t allow both components of a pair to be 0, and non-zero multiples of the same vector are equivalent, so we’re not quite doing linear algebra. Strictly speaking a Möbius transformation is a projective linear transformation, not a linear transformation.
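Here is a small sketch of this idea in code, assuming NumPy. A point z is represented by the pair (z, 1), the point ∞ by the pair (1, 0), and a Möbius transformation is applied by ordinary matrix multiplication. The function name `mobius_apply` and the example matrices are illustrative choices.

```python
import numpy as np

def mobius_apply(M, z):
    """Apply the Möbius transformation with 2x2 matrix M to z; z may be np.inf."""
    # Homogeneous coordinates: (z, 1) for finite z, (1, 0) for infinity.
    if z == np.inf:
        v = np.array([1, 0], dtype=complex)
    else:
        v = np.array([z, 1], dtype=complex)
    num, den = M @ v
    # A zero second component is the point at infinity.
    return np.inf if den == 0 else num / den

G = np.array([[1, 2], [3, 4]], dtype=complex)   # g(z) = (z + 2)/(3z + 4)
H = np.array([[0, 1], [1, 0]], dtype=complex)   # h(z) = 1/z

print(mobius_apply(H, 0))                        # h(0) is infinity
print(mobius_apply(G, mobius_apply(H, 0)))       # g(infinity) = a/c = 1/3
print(mobius_apply(G @ H, 0))                    # same value via the matrix product
```

The last two lines agree, which illustrates that composing transformations corresponds to multiplying matrices, even when an intermediate value is ∞.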

It takes a while to warm up to the idea of moving from complex numbers to equivalence classes of pairs of complex numbers. The latter seems unnecessarily complicated. And it often is unnecessary. In practice, you can work in P(ℂ) by thinking in terms of ℂ until you need to think about ∞. Then you go back to thinking in terms of P(ℂ). You can think of P(ℂ) as ℂ with a safety net for working rigorously with ∞.

Textbooks usually introduce higher dimensional projective spaces before speaking later, if ever, of one-dimensional projective space. (Standard notation would write P¹(ℂ) rather than P(ℂ) everywhere above.) But one-dimensional projective space is easier to understand by analogy to fractions, i.e. fractions whose denominator is allowed to be zero, provided the numerator is not also zero.

I first saw projective coordinates as an unmotivated definition. “Good morning everyone. We define Pⁿ(ℝ) to be the set of equivalence classes of ℝⁿ⁺¹ where ….” There had to be some payoff for this added complexity, but we were expected to delay the gratification of knowing what that payoff was. It would have been helpful if someone had said “The extra coordinate is there to let us handle points at infinity consistently. These points are not special at all if you present them this way.” It’s possible someone did say that, but I wasn’t ready to hear it at the time.


Geometric mean on unit circle

Warm up

The geometric mean of two numbers is the square root of their product. For example, the geometric mean of 9 and 25 is 15.

More generally, the geometric mean of a set of n numbers is the nth root of their product.

Alternatively, the geometric mean of a set of n numbers is the exponential of their average logarithm.

\left(\prod_{i=1}^n x_i\right)^{1/n} = \exp\left(\frac{1}{n} \sum_{i=1}^n \log x_i\right)

The advantage of the alternative definition is that it extends to integrals. The geometric mean of a function over a set is the exponential of the average value of its logarithm. And the average of a function over a set is its integral over that set divided by the measure of the set.

Mahler measure

The Mahler measure of a polynomial is the geometric mean over the unit circle of the absolute value of the polynomial.

M(p) = \exp\left( \int_0^1 \log \left|p(e^{2\pi i \theta})\right| \, d\theta\right)

The Mahler measure equals the product of the absolute values of the leading coefficient and roots outside the unit circle. That is, if

p(z) = a \prod_{i=1}^n(z - a_i)

then

M(p) = |a| \prod_{i=1}^n\max(1, |a_i|)

Example

Let p(z) = 7(z − 2)(z − 3)(z + 1/2). Based on the leading coefficient and the roots, we would expect M(p) to be 42. The following Mathematica code shows this is indeed true by returning 42.

    z = Exp[2 Pi I theta]
    Exp[Integrate[Log[7 (z - 2) (z - 3) (z + 1/2)], {theta, 0, 1}]]
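For readers without Mathematica, here is a numerical version of the same check in Python, assuming NumPy. It computes the Mahler measure two ways: from the leading coefficient and roots, and by averaging log |p| over equally spaced points on the unit circle as an approximation to the integral. The function names and the number of sample points are illustrative choices.

```python
import numpy as np

def mahler_measure_from_roots(leading, roots):
    """|a| times the product of max(1, |root|) over the roots."""
    return abs(leading) * np.prod([max(1.0, abs(r)) for r in roots])

def mahler_measure_numeric(coeffs, samples=4096):
    """Approximate exp of the average of log|p| over the unit circle.

    coeffs lists the coefficients with the highest degree first.
    """
    theta = np.arange(samples) / samples
    values = np.polyval(coeffs, np.exp(2j * np.pi * theta))
    return np.exp(np.mean(np.log(np.abs(values))))

roots = [2, 3, -0.5]
coeffs = 7 * np.poly(roots)   # coefficients of 7(z - 2)(z - 3)(z + 1/2)
print(mahler_measure_from_roots(7, roots), mahler_measure_numeric(coeffs))
```

The numerical average is only reliable when p has no roots on or very near the unit circle, since log |p| blows up at a root.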

Multiplication and heights

Mahler measure is multiplicative: for any two polynomials p and q, the measure of their product is the product of their measures.

M(pq) = M(p)\,M(q)

A few days ago I wrote about height functions for rational numbers. Mahler measure is a height function for polynomials, and there are theorems bounding Mahler measure by other height functions, such as the sum or maximum of the absolute values of the coefficients.
