More Laguerre images

A week or two ago I wrote about Laguerre’s root-finding method and made some associated images. This post gives a couple more examples.

Laguerre’s method is very robust in the sense that it is likely to converge to a root, regardless of the starting point. However, it may be difficult to predict which root the method will end up at. To visualize this, we color points according to which root they converge to.
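For reference, here’s a minimal sketch in Python of the Laguerre update, in the standard formulation found in numerical analysis texts. The function name, tolerance, and iteration cap are arbitrary choices, not the exact code behind the plots below.

    import numpy as np
    from numpy.polynomial import Polynomial

    def laguerre(p, z, tol=1e-10, maxiter=100):
        "Laguerre's iteration for a root of the polynomial p, starting from z."
        n = p.degree()
        dp = p.deriv()
        d2p = dp.deriv()
        z = complex(z)
        for _ in range(maxiter):
            pz = p(z)
            if abs(pz) < tol:
                break
            G = dp(z)/pz
            H = G*G - d2p(z)/pz
            root = np.sqrt((n - 1)*(n*H - G*G) + 0j)
            # choose the sign that makes the denominator larger in magnitude
            denom = G + root if abs(G + root) >= abs(G - root) else G - root
            z -= n/denom
        return z

For example, laguerre(Polynomial.fromroots([2, 4, 24]), 3 + 7j) should return something very close to one of the roots 2, 4, and 24.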

First, let’s look at the polynomial

(x − 2)(x − 4)(x − 24)

which clearly has roots at 2, 4, and 24. We’ll generate random starting points and color them blue, orange, or green depending on whether they converge to 2, 4, or 24. Here’s the result.

To make this easier to see, let’s split it into each color: blue, orange, and green.

Now let’s change our polynomial by moving the root at 4 to 4i.

(x − 2)(x − 4i)(x − 24)

Here’s the combined result.

And here is each color separately.

As we explained last time, the area taken up by the separate colors seems to exceed the total area. That is because the colors are so intermingled that many of the dots in the images cover some territory that belongs to another color, even though the dots are quite small.

Laguerre’s root finding method

Edmond Laguerre (1834–1886) came up with a method for finding zeros of polynomials. Unlike Newton’s method for finding zeros of general functions, Laguerre’s method is specialized for polynomials. Laguerre’s method also converges an order faster than Newton’s method: its convergence is cubic rather than quadratic, i.e. the error is roughly cubed on each step rather than squared.

The most interesting thing about Laguerre’s method is that it nearly always converges to a root, no matter where you start. Newton’s method, on the other hand, converges quickly if you start sufficiently close to a root. If you don’t start close enough to a root, the method might cycle or go off to infinity.

The first time I taught numerical analysis, the textbook we used began the section on Newton’s method with the following nursery rhyme:

There was a little girl,
Who had a little curl,
Right in the middle of her forehead.
When she was good,
She was very, very good,
But when she was bad, she was horrid.

When Newton’s method is good, it is very, very good, but when it is bad it is horrid.

Laguerre’s method is not well understood. Experiments show that it nearly always converges, but there’s not much theory to explain what’s going on.

The method is robust in that it is very likely to converge to some root, but it may not converge to the root you expected unless you start sufficiently close. The rest of the post illustrates this.

Let’s look at the polynomial

p(x) = 3 + 22x + 20x² + 24x³.

This polynomial has roots at −0.15392 and at −0.3397 ± 0.83468i.

We’ll generate 2,000 random starting points for Laguerre’s method and color each one according to which root it converges to. Points converging to the real root are colored blue, points converging to the root with positive imaginary part are colored orange, and points converging to the root with negative imaginary part are colored green.
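Here is a sketch of that kind of experiment, reusing the laguerre function sketched near the top of this page. The sampling region, random seed, and dot size are guesses, and the color assigned to each root here is simply the order numpy returns the roots in, so it won’t necessarily match the blue/orange/green convention of the plots.

    import numpy as np
    import matplotlib.pyplot as plt
    from numpy.polynomial import Polynomial

    p = Polynomial([3, 22, 20, 24])      # 3 + 22x + 20x^2 + 24x^3
    roots = p.roots()                    # one real root and a complex conjugate pair
    colors = ["tab:blue", "tab:orange", "tab:green"]

    rng = np.random.default_rng(0)
    starts = rng.uniform(-2, 2, size=(2000, 2))   # sampling region is a guess

    for u, v in starts:
        z = laguerre(p, complex(u, v))
        k = np.argmin(np.abs(roots - z))          # index of the nearest root
        plt.plot(u, v, ".", markersize=2, color=colors[k])

    plt.gca().set_aspect("equal")
    plt.show()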

Here’s what we get:

This is hard to see, but we can tell that there aren’t many blue dots, that there are about equal numbers of orange and green dots, and that the orange and green dots are thoroughly mixed together. This means the method is unlikely to converge to the real root, and about equally likely to converge to either of the complex roots.

Let’s look at just the starting points that converge to the real root, its basin of attraction. To get more resolution, we’ll generate 100,000 starting points and make the dots smaller.

The convergence region is pinched near the root; you can start fairly near the root along the real axis but converge to one of the complex roots. Notice also that there are scattered points far from the real root that nevertheless converge to it.

Next let’s look at the points that converge to the complex root in the upper half plane.

Note that the basin of attraction appears to take up over half the area. But the corresponding basin of attraction for the root in the lower half plane also appears to take up over half the area.

They can’t both take up over half the area. In fact, both take up about 48%. But the two regions are thoroughly intertwined. Due to the width of the dots used in plotting, each green dot covers a tiny bit of area that belongs to orange, and vice versa. That is, the fact that both appear to take up over half the area shows how commingled they are.


The essence of chaos

logistic bifurcation diagram

Linear systems can show sensitive dependence on initial conditions, but they cannot be chaotic. Only nonlinear systems can be chaotic.

George Datseris and Ulrich Parlitz explain this well in their book Nonlinear Dynamics:

… Sensitive dependence is not sufficient for a definition of chaos. … the state space is first stretched and then folded within itself. This is the defining characteristic of chaotic systems. The stretching part is where sensitive dependence comes from … as state space volumes increase in size … nonlinear dynamics takes over and folds the state space back in on itself. … This is why only nonlinear systems can be chaotic: linear systems with positive Lyapunov exponents always escape to infinity.


Hénon’s dynamical system

This post will reproduce three plots from a 1969 paper of Hénon on dynamical systems [1].

Let α be a constant, and pick some starting point in the plane, (x0, y0), then update x and y according to

xₙ₊₁ = xₙ cos α − (yₙ − xₙ²) sin α
yₙ₊₁ = xₙ sin α + (yₙ − xₙ²) cos α

When I tried out Hénon’s system for random starting points, I didn’t get plots that were as interesting as those in the original paper. Hénon clearly did a lot of experiments to produce the plots chosen for his paper. Because his system is chaotic, small changes to the initial conditions can make enormous differences to the resulting plots.

Hénon lists the starting points used for each plot, and the number of iterations for each. I imagine the number of iterations was chosen in order to stop when points wandered too far from the origin. Even so, I ran into some overflow problems.

I suspect the original calculations were carried out using single precision (32-bit) floating point numbers, while my calculations were carried out with double precision (64-bit) floating point numbers [2]. Since the system is chaotic, this minor difference in the least significant bits could make a difference.

Here is my reproduction of Hénon’s Figure 4:

Here is my reproduction of Figure 5:

And finally Figure 7:

Update: You could use colors to distinguish the trajectories of different starting points. Here’s a color version of Figure 5:

Related posts

For more posts like this, see Dynamical systems & chaos.

Python code

Here is the code that produced the plots. The starting points and numbers of iterations were taken from Table 1 of [1].

import matplotlib.pyplot as plt

# Hénon's dynamical system
# M. Hénon. Numerical study of quadratic area-preserving mappings. Quarterly of Applied Mathematics. October 1969 pp. 291--312

def makeplot(c, data):
    # c is cos(alpha); recover the corresponding sin(alpha)
    s = (1 - c*c)**0.5
    for d in data:
        x, y, n = d
        for _ in range(n):
            # stop once an orbit leaves the unit square to avoid overflow
            if abs(x) >= 1 or abs(y) >= 1:
                break
            x, y = x*c - (y - x**2)*s, x*s + (y - x**2)*c
            plt.scatter(x, y, s=1, color='b')
    plt.gca().set_aspect("equal")
    plt.show()

fig4 = [
    [0.1,   0,    200],
    [0.2,   0,    360],
    [0.3,   0,    840],
    [0.4,   0,    871],
    [0.5,   0,    327],
    [0.58,  0,   1164],
    [0.61,  0,   1000],
    [0.63,  0.2,  180],
    [0.66,  0.22, 500],
    [0.66,  0,    694],
    [0.73,  0,    681],
    [0.795, 0,    657],
]

makeplot(0.4, fig4)

fig5 = [
    [0.2,   0,    651],
    [0.35,  0,    187],
    [0.44,  0,   1000],
    [0.60, -0.1, 1000],
    [0.68,  0,    250],
    [0.718, 0,   3000],
    [0.75,  0,   1554],
    [0.82,  0,    233],
]

makeplot(0.24, fig5)


fig7 = [
    [0.1, 0, 182],
    [0.15, 0, 1500],
    [0.35, 0, 560],
    [0.54, 0, 210],
    [0.59, 0, 437],
    [0.68, 0, 157],
]

makeplot(-0.01, fig7)

[1] Michel Hénon. Numerical study of quadratic area-preserving mappings. Quarterly of Applied Mathematics, October 1969, pp. 291–312.

[2] “Single” and “double” are historical artifacts. “Double” is now ordinary, and “single” is exceptional.

Newton’s method: The Good, The Bad, and The Ugly

This post will give examples where Newton’s method gives good results, bad results, and really bad results.

Our example problem is to solve Kepler’s equation

M = E − e sin E

for E, given M and e, assuming 0 ≤ M ≤ π and 0 < e < 1. We will apply Newton’s method using E = M as our starting point.

The Good

A couple days ago I wrote about solving this problem with Newton’s method. That post showed that for e < 0.5 we can guarantee that Newton’s method will converge for any M, starting with the initial guess E = M.

Not only does it converge, but it converges monotonically: each iteration brings you closer to the exact result, and typically the number of correct decimal places doubles at each step.

You can extend this good behavior to larger values of e by choosing a better initial guess for E, such as the one given by Machin’s method, but in this post we will stick with the starting guess E = M.

The Bad

Newton’s method often works when it’s not clear that it should. There’s no reason why we should expect it to converge for large values of e starting from the initial guess E = M. And yet it often does. If we choose e = 0.991 and M = 0.13π then Newton’s method converges, though it doesn’t start off well.

In this case Newton’s method converges to full floating point precision after a few iterations, but the road to convergence is rocky. After one iteration we have E = 5, even though we know a priori that E < π. Despite initially producing a nonsensical result, however, the method does converge.
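Here’s a small Python sketch of the iteration if you’d like to reproduce this; the function name and the number of iterations printed are arbitrary.

    from math import sin, cos, pi

    def kepler_newton(M, e, steps=12):
        "Newton's method for Kepler's equation M = E - e sin E, starting from E = M."
        E = M
        for n in range(steps):
            print(n, E)
            E -= (E - e*sin(E) - M)/(1 - e*cos(E))
        print(steps, E)

    kepler_newton(0.13*pi, 0.991)

Dropping e below 0.5 in the same call shows the tame, monotone convergence described above, and setting e = 0.992 (with more steps) shows the wandering iterates discussed next.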

The Ugly

The example above was taken from [1]. I wasn’t able to track down the original paper, but I saw it referenced second hand. The authors fixed M = 0.13π and let e = 0.991, 0.992, and 0.993. The results for e = 0.993 are similar to those for 0.991 above. But the authors reported that the method apparently doesn’t converge for e = 0.992. That is, the method works for two nearby values of e but not for a value in between.

Here’s what we get for the first few iterations.

The method produces a couple spectacularly bad estimates for a solution known to be between 0 and π, and generally wanders around with no sign of convergence. If we look further out, it gets even worse.

The previous bad iterates on the order of ±1000 pale in comparison to iterates roughly equal to ±30,000. Remarkably though, the method does eventually converge.

Commentary

The denominator in Newton’s method applied to Kepler’s equation is 1 – e cos E. When e is on the order of 0.99, this denominator can occasionally be on the order of 0.01, and dividing by such a small number can pull wild iterates closer to where they need to be. In this case Newton’s method is wandering around randomly, but if it ever wanders into a basin of attraction for the root, it will converge.

For practical purposes, if Newton’s method does not converge predictably then it doesn’t converge. You wouldn’t want to navigate a spacecraft using a method that in theory will eventually converge though you have no idea how long that might take. Still, it would be interesting to know whether there are any results that find points where Newton’s method applied to Kepler’s equation never converges even in theory.

The results here say more about Newton’s method and dynamical systems than about practical ways of solving Kepler’s equation. It’s been known for about three centuries that you can do better than always starting Newton’s method with E = M when solving Kepler’s equation.

[1] Charles, E. D. and Tatum, J.B. The convergence of Newton-Raphson iteration with Kepler’s Equation. Celestial Mechanics and Dynamical Astronomy. 69, 357–372 (1997).

Adding a tiny trap to stop chaos

The tent map is a famous example of a chaotic function. We will show how a tiny modification of the tent map retains continuity of the function but prevents chaos.

The tent map is the function f: [0, 1] → [0, 1] defined by

f(x) = \left\{ \begin{array}{ll} 2x & \mbox{if } x \leq 1/2 \\ 2 - 2x & \mbox{if } x \geq 1/2 \end{array} \right.

This map has an unstable fixed point at x0 = 2/3.  If you start exactly at x0, then you’ll stay there forever as you keep iteratively applying f. But if you start a tiny bit away from x0, you’ll move away from your starting point and bounce around throughout the interval [0, 1].

Now pick some positive ε less than 1/6 [1] and modify the function f on the interval

I = [x0 − ε, x0 + ε]

as described in [2]. To do this, we define fε to be the same as f outside the interval I. We then create a flat spot in the middle of the interval by defining fε to be x0 on

[x0 – ε/2, x0 + ε/2]

and extend fε linearly on the rest of I.

Here’s a plot of fε with ε = 0.05.

Now here’s a cobweb plot of the iterates of f0.05 starting at π – 3.

For almost every starting point, the iterates of fε converge to x0 in finite time. If the iterates ever wander into the interval [x0 − ε/2, x0 + ε/2], then they get trapped at x0. And because the tent map is ergodic, nearly all sequences do eventually wander into the entrapping interval.

Here’s Python code to make the construction of fε explicit.

    def tent(x):
        if x < 0.5:
            return 2*x 
        else:
            return 2*(1 - x) 
    
    def interpolate(x, x1, y1, x2, y2):
        "Linearly interpolate with f(x1) = y1 and f(x2) = y2"
        m = (y2 - y1)/(x2 - x1)
        return y1 + m*(x - x1)
    
    def tent2(x, epsilon):
        x0 = 2/3
        # same as the tent map outside I = [x0 - epsilon, x0 + epsilon]
        if x < x0 - epsilon or x > x0 + epsilon:
            return tent(x)

        x1 = x0 - epsilon/2
        x2 = x0 + epsilon/2

        # flat spot in the middle of I: the trap
        if x1 <= x <= x2:
            return x0

        # connect the flat spot to the tent map linearly on the rest of I
        if x < x1:
            return interpolate(x, x0 - epsilon, tent(x0 - epsilon), x1, x0)
        return interpolate(x, x2, x0, x0 + epsilon, tent(x0 + epsilon))
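As a quick check, you can iterate the modified map starting at π − 3 and see where it ends up. Whether a given floating point starting value lands in the trap or drifts toward 0 depends on the floating point issues described next.

    import math

    x = math.pi - 3
    for _ in range(100):
        x = tent2(x, 0.05)
    print(x)   # 2/3 if the orbit fell into the trap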

Revenge of floating point

In theory, almost all starting points should lead to sequences that converge to x0 in finite time. But it took a fair amount of trial and error to come up with the plot above that illustrates this. For most starting points that I tried, the sequence of iterates converged to 0.

Now 0 is an unstable fixed point. This is easy to see: if x ever gets close to 0, the next iterate is 2x, twice as far from 0. Iterates keep getting doubled until they're larger than 1/2. How can this be?

Computers can only represent fractions with some power of 2 in the denominator. And I believe the tent map (not the modified tent map with a trap) will always converge to 0 when starting with a floating point number.
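That claim is easy to test empirically. Here’s a quick check, using the tent function defined above, that iterates the unmodified tent map from a batch of random floating point starting values and reports whether every one of them reaches 0.

    import random

    def hits_zero(x, max_iter=100):
        "Iterate the (unmodified) tent map and report whether x reaches 0."
        for _ in range(max_iter):
            if x == 0:
                return True
            x = tent(x)
        return False

    random.seed(0)
    print(all(hits_zero(random.random()) for _ in range(10_000)))   # should print True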

Here are the iterates of the tent map starting at (the floating point representation of) π - 3:

0 0.28318530717958623
1 0.5663706143591725
2 0.8672587712816551
3 0.26548245743668986
...
29 0.13050460815429688
30 0.26100921630859375
31 0.5220184326171875
32 0.955963134765625
33 0.08807373046875
34 0.1761474609375
35 0.352294921875
36 0.70458984375
37 0.5908203125
38 0.818359375
39 0.36328125
40 0.7265625
41 0.546875
42 0.90625
43 0.1875
44 0.375
45 0.75
46 0.5
47 1.0
48 0.0

Note that the sequence doesn't sneak up on 0: it leaps from 1 to 0. So this does not contradict the argument above that points near zero are repelled away.

Related posts: More on dynamical systems

[1] Why less than 1/6? So we stay in the same branch of the definition of f. The distance from x0 to 1/2 is 1/6.

[2] Achim Clausing, Ducci Matrices. American Mathematical Monthly, December 2018.

The tent map

Yesterday I said that Lyapunov exponents can’t be calculated exactly except in the case of toy problems. One such toy model is the tent map.

x_{n+1}= \left\{ \begin{array}{ll} 2rx_n & \mbox{if } x_n < 1/2 \\ 2r - 2rx_n & \mbox{if } x_n \geq 1/2 \end{array} \right.

The graph of the function on the right-hand side looks like a tent. It’s zero at x = 0 and at x = 1 and rises to a height of r in the middle. The Lyapunov exponent of this system has been calculated [1] to be

λ = log 2r.

For r < 1/2 the Lyapunov exponent is negative and the system is stable.

For r > 1/2 the Lyapunov exponent is positive and the system is chaotic. The larger r is, the faster uncertainty in the future values of x grows. In fact the uncertainty grows in proportion to (2r)ⁿ.
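Here’s a small sketch illustrating that growth, not part of the original argument: iterate two nearby starting points and compare their separation to the prediction (2r)ⁿ times the initial separation. The starting points and the value of r are arbitrary, and the two columns should agree only until the separation becomes comparable to the orbit’s distance from 1/2.

    def tent_r(x, r):
        return 2*r*x if x < 0.5 else 2*r*(1 - x)

    r = 0.9
    x, y = 0.3, 0.3 + 1e-12        # two nearby starting points
    for n in range(1, 21):
        x, y = tent_r(x, r), tent_r(y, r)
        if n % 5 == 0:
            print(n, abs(x - y), (2*r)**n * 1e-12)   # actual vs predicted separation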

Suppose r = 1 and we know the floating point representation of x0. Suppose we compute the iterations of the tent map exactly. There is no rounding error, and the only uncertainty comes from the initial uncertainty in x0. Assuming we’re using an IEEE 754 double precision number, our initial uncertainty is 2⁻⁵³. (See details here.)

We lose one bit of precision in the value of x at each iteration. After 53 iterations, we’ve lost all information: the true value of x53 could be anywhere in [0, 1], and our calculated value gives us no clue where in the interval x actually is.

Here’s a cobweb plot of the iterations starting with x0 = 4 – π.

The solid black lines are the tent map. Vertical lines connect each iterate to its next value. Horizontal lines bring each iterate back to the line y = x to start the next iteration.

Incidentally, after 49 iterations the computed value x reaches 0 and stays there. This would not happen if we started with exactly x0 = 4 – π and carried out each iteration in exact arithmetic because then all the values of x are irrational.
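Here are a few lines of Python to check that. The count printed should match the 49 iterations mentioned above, give or take details of how the double precision representation of 4 − π is counted.

    from math import pi

    x = 4 - pi
    n = 0
    while x != 0:
        x = 2*x if x < 0.5 else 2*(1 - x)   # tent map with r = 1
        n += 1
    print(n)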

Related posts: Chaos and dynamical systems

[1] Lichtenberg and Lieberman. Regular and Stochastic Motion. Springer-Verlag, 1983.

Everywhere chaotic map on the sphere

Let f be a complex-valued function of a complex variable. The Julia set of f is the set of points where f is chaotic. Julia sets are often complicated fractals, but they can be simple. In this post I want to look at the function

f(z) = (z² + 1)² / (4z(z² − 1)).

I learned about this function from the latest 3Blue1Brown video. This function is a Lattès example. (I misread this at first and thought it was Latte’s example, but it is one of a family of examples by the French mathematician Samuel Lattès.)

What makes this function interesting is that its Julia set is the entire plane. That is, iterates of the function are sensitive to starting point no matter where you start.

I wanted to play around with this function, but plotting iterates would not be practical if they wander all over the complex plane: no matter how big a region I plot, the points will leave that region, and possibly leave quickly. So instead of looking at iterates in the complex plane, I’ll project them onto a sphere so we can see them all at once.

Riemann sphere

This is a good time to come clean about a detail I left out. From the beginning I should have said that f is a map not from the complex plane ℂ to itself but from the Riemann sphere

ℂ ∪ {∞}

to itself. I didn’t gloss over much, just one point, the point at infinity. In our example, we’ll define f(0) = ∞ and the resulting extension is continuous as a map from the sphere to itself.

Not only will moving to the sphere make things easier to see, it’s actually where our function naturally lives.

Stereographic projection

Imagine a unit sphere sitting on top of the complex plane. Stereographic projection maps every point on the sphere, except the north pole, to a point in the plane. Draw a line between the north pole and a point on the sphere, and its stereographic projection is where the line intersects the plane. The north pole itself has nowhere to go. When we extend the complex plane by adding a point ∞, we can map the north pole there.

We’re actually interested in the inverse of stereographic projection because we want to go from the plane to the sphere. Let’s define a function p[u, v] that carries out our inverse stereographic projection in Mathematica.

    p[u_, v_] := ResourceFunction[
        "InverseStereographicProjection"][{u, v}]

Plotting iterates

We want to pick some arbitrary starting point z0 and look at the sequence

z0, f(z0), f(f(z0)), f(f(f(z0))), …

We can do this in Mathematica with the NestList function. It takes three arguments: the function to iterate, a starting point, and the number of times to apply the function. For example, if we define

    latte[z_] := (z^2 + 1)^2/(4 z (z^2 - 1))

then

    NestList[latte, 0.1 + 2 I, 5]

gives us the first six elements of the sequence above: the starting point followed by its first five iterates.

The projection function p[u, v] above takes x and y coordinates, not complex numbers, so we define our own that takes complex numbers.

    projection[z_] := p[Re[z], Im[z]]

Now we can plot our iterates. I chose 10 + 1.9i as a starting point based on today’s date, October 19, but you could pick any non-zero starting point. And that’s kinda the point: any place you start will have the same chaotic dynamics [1].

Here’s our plotting code.

    ListPointPlot3D[
        Map[projection, NestList[latte, 10 + 1.9 I, 1000]], 
        AspectRatio -> 1]

And here’s the plot:

We could plot a sphere with

    SphericalPlot3D[1, {t, 0, Pi}, {ph, 0, 2 Pi}]

and stack the two plots to make it clear that the points are indeed on a sphere.


[1] If you start with 0, you’ll next go to ∞ and stay there. But 0 and ∞ are part of the Julia set too because the points in the neighborhoods of these points lead to chaos, no matter how small you take the open neighborhoods to be.

3D bifurcation diagram

The following 2D bifurcation diagram is famous. You’ve probably seen it elsewhere.

logistic bifurcation diagram

If you have seen it, you probably know that it has something to do with chaos, iterated functions, fractals, and all that. If you’d like to read in more detail about what exactly the plot means, see this post.

I was reading Michael Trott’s Mathematica Guidebook for Numerics and ran across a 3D version of the above diagram. I’d never seen that before. He plots the iterations of the system

x ← a − y(βx + (1 − β)y)
y ← x + y²/100

The following plot is for β = 1/4 using the code included in the book. (I changed the parameter α in the book to β because the visual similarity between a and α was a little confusing.)
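I don’t have Trott’s code, but here is a rough sketch of how one might build such a picture: sweep the parameter a, iterate the map for each value, discard a transient, and plot the surviving (a, x, y) triples as points in 3D. The parameter range, the starting point, the iteration counts, and the simultaneous update of x and y are all guesses, so this is only an outline of the idea, not a reproduction of the book’s plot.

    import numpy as np
    import matplotlib.pyplot as plt

    beta = 0.25
    ax = plt.figure().add_subplot(projection="3d")

    for a in np.linspace(0, 2, 201):          # parameter range is a guess
        x, y = 0.1, 0.1                       # arbitrary starting point
        pts = []
        for i in range(500):
            x, y = a - y*(beta*x + (1 - beta)*y), x + y*y/100
            if abs(x) > 1e6 or abs(y) > 1e6:  # discard orbits that blow up
                pts = []
                break
            if i > 300:                       # discard the transient
                pts.append((x, y))
        if pts:
            xs, ys = zip(*pts)
            ax.scatter([a]*len(xs), xs, ys, s=0.1, color="b")

    ax.set_xlabel("a"); ax.set_ylabel("x"); ax.set_zlabel("y")
    plt.show()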

Trott cites four papers regarding this iteration. I looked at a couple of the papers, and they contain similar systems but aren’t quite the same. Maybe his example is a sort of synthesis of the examples he found in the literature.

Related posts: More on dynamical systems

Mathematical stability vs numerical stability

Is 0 a stable fixed point of f(x) = 4x (1-x)?

If you set x = 0 and compute f(x) you will get exactly 0. Apply f a thousand times and you’ll never get anything but zero.

But this does not mean 0 is a stable attractor, and in fact it is not stable.

It’s easy to get mathematical stability and numerical stability confused, because the latter often illustrates the former. In this post I point out that the points on a particular unstable orbit cannot be represented exactly in floating point numbers, so iterations that start at the floating point representations of these points will drift away.

In the example here, 0 can be represented exactly, and so we do have a computationally stable fixed point. But 0 is not a stable fixed point in the mathematical sense. Any starting point close to 0 but not exactly 0 will eventually go all over the place.

If we start at some very small ε > 0, then f(ε) ≈ 4ε. Every iteration multiplies the result by approximately 4, and eventually the result is large enough that the approximation no longer holds.

For example, suppose we start with x = 10⁻¹⁰⁰. After 100 iterations we have

x = 4¹⁰⁰ · 10⁻¹⁰⁰ ≈ 1.6 × 10⁻⁴⁰.

If we were to plot this, we wouldn’t see any movement. But by the time we iterate our function 200 times, the result is chaotic. I get the following:
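Here’s a short sketch that reproduces the experiment numerically. As noted above, after 100 iterations the value is still about 1.6 × 10⁻⁴⁰, but by 200 iterations the iterates have entered the chaotic regime.

    x = 1e-100
    for n in range(1, 201):
        x = 4*x*(1 - x)
        if n in (100, 150, 200):
            print(n, x)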