# The essence of chaos Linear systems can show sensitive dependence on initial conditions, but they cannot be chaotic. Only nonlinear systems can be chaotic.

George Datseris and Ulrich Parlitz explain this well in their book Nonlinear Dynamics:

… Sensitive dependence is not sufficient for a definition of chaos. … the state space is first stretched and then folded within itself. This is the defining characteristic of chaotic systems. The stretching part is where sensitive dependence comes from … as state space volumes increase in size … nonlinear dynamics takes over and folds the state space back in on itself. … This is why only nonlinear systems can be chaotic: linear systems with positive Lyapunov exponents always escape to infinity.

# Hénon’s dynamical system

This post will reproduce a three plots from a paper of Hénon on dynamical systems from 1969 .

Let α be a constant, and pick some starting point in the plane, (x0, y0), then update x and y according to

xn+1 = xn cos α − (ynxn²) sin α
yn+1 = xn sin α + (ynxn²) cos α

When I tried out Hénon’s system for random starting points, I didn’t get plots that were as interesting as those in the original paper. Hénon clearly did a lot of experiments to produce the plots chosen for his paper. Because his system is chaotic, small changes to the initial conditions can make enormous differences to the resulting plots.

Hénon lists the starting points used for each plot, and the number of iterations for each. I imagine the number of iterations was chosen in order to stop when points wandered too far from the origin. Even so, I ran into some overflow problems.

I suspect the original calculations were carried out using single precision (32-bit) floating point numbers, while my calculations were carried out with double precision (64-bit) floating point numbers . Since the system is chaotic, this minor difference in the least significant bits could make a difference.

Here is my reproduction of Hénon’s Figure 4: Here is my reproduction of Figure 5: And finally Figure 7: Update: You could use colors to distinguish the trajectories of different starting points. Here’s a color version of Figure 5: ## Python code

Here is the code that produced the plots. The data starting points and number of iterations were taken from Table 1 of .

import matplotlib.pyplot as plt

# Hénon's dynamical system
# M. Hénon. Numerical study of squadratic area-preserving mappings. Quarterly of Applied Mathematics. October 1969 pp. 291--312

def makeplot(c, data):
s = (1 - c*c)**0.5
for d in data:
x, y, n = d
for _ in range(n):
if abs(x) < 1 and abs(y) < 1:
x, y = x*c - (y - x**2)*s, x*s + (y - x**2)*c
plt.scatter(x, y, s=1, color='b')
plt.gca().set_aspect("equal")
plt.show()

fig4 = [
[0.1,   0,    200],
[0.2,   0,    360],
[0.3,   0,    840],
[0.4,   0,    871],
[0.5,   0,    327],
[0.58,  0,   1164],
[0.61,  0,   1000],
[0.63,  0.2,  180],
[0.66,  0.22, 500],
[0.66,  0,    694],
[0.73,  0,    681],
[0.795, 0,    657],
]

makeplot(0.4, fig4)

fig5 = [
[0.2,   0,    651],
[0.35,  0,    187],
[0.44,  0,   1000],
[0.60, -0.1, 1000],
[0.68,  0,    250],
[0.718, 0,   3000],
[0.75,  0,   1554],
[0.82,  0,    233],
]

makeplot(0.24, fig5)

fig7 = [
[0.1, 0, 182],
[0.15, 0, 1500],
[0.35, 0, 560],
[0.54, 0, 210],
[0.59, 0, 437],
[0.68, 0, 157],
]

makeplot(-0.01, fig7, "fig7.png")


 Michel Hénon. Numerical study of quadratic area-preserving mappings. Quarterly of Applied Mathematics. October 1969 pp. 291–312

 “Single” and “double” are historical artifacts. “Double” is now ordinary, and “single” is exceptional.

# Newton’s method: The Good, The Bad, and The Ugly

This post will give examples where Newton’s method gives good results, bad results, and really bad results.

Our example problem is to solve Kepler’s equation

M = Ee sin E

for E, given M and e, assuming 0 ≤ M ≤ π and 0 < e < 1. We will apply Newton’s method using E = M as our starting point.

## The Good

A couple days ago I wrote about solving this problem with Newton’s method. That post showed that for e < 0.5 we can guarantee that Newton’s method will converge for any M, starting with the initial guess E = M.

Not only does it converge, but it converges monotonically: each iteration brings you closer to the exact result, and typically the number of correct decimal places doubles at each step.

You can extend this good behavior for larger values of e by choosing a better initial guess for E, such as Machin’s method, but in this post we will stick with the starting guess E = M.

Newton’s method often works when it’s not clear that it should. There’s no reason why we should expect it to converge for large values of e starting from the initial guess E = M. And yet it often does. If we choose e = 0.991 and M = 0.13π then Newton’s method converges, though it doesn’t start off well. In this case Newton’s method converges to full floating point precision after a few iterations, but the road to convergence is rocky. After one iteration we have E = 5, even though we know a priori that E < π. Despite initially producing a nonsensical result, however, the method does converge.

## The Ugly

The example above was taken from . I wasn’t able to track down the original paper, but I saw it referenced second hand. The authors fixed M = 0.13π and let e = 0.991, 0.992, and 0.993. The results for e = 0.993 are similar to those for 0.991 above. But the authors reported that the method apparently doesn’t converge for e = 0.992. That is, the method works for two nearby values of e but not for a value in between.

Here’s what we get for the first few iterations. The method produces a couple spectacularly bad estimates for a solution known to be between 0 and π, and generally wanders around with no sign of convergence. If we look further out, it gets even worse. The previous bad iterates on the order of ±1000 pale in comparison to iterates roughly equal to ±30,000. Remarkably though, the method does eventually converge.

## Commentary

The denominator in Newton’s method applied to Kepler’s equation is 1 – e cos E. When e is on the order of 0.99, this denominator can occasionally be on the order of 0.01, and dividing by such a small number can pull wild iterates closer to where they need to be. In this case Newton’s method is wandering around randomly, but if it ever wanders into a basin of attraction for the root, it will converge.

For practical purposes, if Newton’s method does not converge predictably then it doesn’t converge. You wouldn’t want to navigate a spacecraft using a method that in theory will eventually converge though you have no idea how long that might take. Still, it would be interesting to know whether there are any results that find points where Newton’s method applied to Kepler’s equation never converges even in theory.

The results here say more about Newton’s method and dynamical systems than about practical ways of solving Kepler’s equation. It’s been known for about three centuries that you can do better than always starting Newton’s method with E = M when solving Kepler’s equation.

 Charles, E. D. and Tatum, J.B. The convergence of Newton-Raphson iteration with Kepler’s Equation. Celestial Mechanics and Dynamical Astronomy. 69, 357–372 (1997).

# Adding a tiny trap to stop chaos

The tent map is a famous example of a chaotic function. We will show how a tiny modification of the tent maps retains continuity of the function but prevents chaos.

The tent map is the function f: [0, 1] → [0, 1] defined by This map has an unstable fixed point at x0 = 2/3.  If you start exactly at x0, then you’ll stay there forever as you keep iteratively applying f. But if you start a tiny bit away from x0, you’ll move away from your starting point and bounce around throughout the interval [0, 1].

Now pick some positive ε less than 1/6  and modify the function f on the interval

I =[x0 – ε, x0 + ε]

as described in . To do this, we define fε to be the same as f outside the interval I. We then create a flat spot in the middle of the interval by defining fε to be x0 on

[x0 – ε/2, x0 + ε/2]

and extend fε linearly on the rest of I.

Here’s a plot of fε with ε = 0.05. Now here’s a cobweb plot of the iterates of f0.05 starting at π – 3. The iterates of fε always converge to x0 in finite time. If the iterates ever wander into the interval [x0 – ε/2, x0 + ε/2]  then they get trapped at x0. And because the tent map is ergodic, nearly all sequences will wander into the entrapping interval.

Here’s Python code to make the construction of fε explicit.

    def tent(x):
if x < 0.5:
return 2*x
else:
return 2*(1 - x)

def interpolate(x, x1, y1, x2, y2):
"Linearly interpolate with f(x1) = y1 and f(x2) = y2"
m = (y2 - y1)/(x2 - x1)
return y1 + m*(x - x1)

def tent2(x, epsilon):
x0 = 2/3
if x < x0 - epsilon or x > x0 + epsilon:
return tent(x)

x1 = x0 - epsilon/2
x2 = x0 + epsilon/2

if x > x1 and x < x2:
return x0

return interpolate(x, x1, tent(x1), x2, tent(x2))


## Revenge of floating point

In theory, almost all starting points should lead to sequences that converge to x0 in finite time. But it took a fair amount of trial and error to come up the plot above that illustrates this. For most starting points that I tried, the sequence of iterates converged to 0.

Now 0 is an unstable fixed point. This is easy to see: if x ever gets close to 0, the next iterate is 2x, twice as far from 0. Iterates keep getting doubled until they're larger than 1/2. How can this be?

Computers can only represent fractions with some power of 2 in the denominator. And I believe the tent map (not the modified tent map with a trap) will always converge to 0 when starting with a floating point number.

Here are the iterates of the tent map starting at (the floating point representation of) π - 3:

0 0.28318530717958623
1 0.5663706143591725
2 0.8672587712816551
3 0.26548245743668986
...
29 0.13050460815429688
30 0.26100921630859375
31 0.5220184326171875
32 0.955963134765625
33 0.08807373046875
34 0.1761474609375
35 0.352294921875
36 0.70458984375
37 0.5908203125
38 0.818359375
39 0.36328125
40 0.7265625
41 0.546875
42 0.90625
43 0.1875
44 0.375
45 0.75
46 0.5
47 1.0
48 0.0

Note that the sequence doesn't sneak up on 0: it leaps from 1 to 0. So this does not contract the argument above that points near zero are repelled away.

Related posts: More on dynamical systems

 Why less than 1/6? So we stay in the same branch of the definition of f. The distance from x0 to 1/2 is 1/6.

 Achim Clausing, Ducci Matrices. American Mathematical Monthly, December 2018.

# The tent map

Yesterday I said that Lyapunov exponents can’t be calculated exactly except in the case of toy problems. One such toy model is the tent map. The graph of function on the right hand side looks a tent. It’s zero at x = 0 and at x = 1 and rises to a height of r in the middle. The Lyapunov exponent of this system has been calculated  to be

λ = log 2r.

For r < 1/2 the Lyapunov exponent is negative and the system is stable.

For r > 1/2 the Lyapunov exponent is positive and the system is chaotic. The larger r is, the faster uncertainty in the future values of x grows. In fact the uncertainty grows in proportion to (2r)n.

Suppose r = 1 and we know the floating point representation of x0. Suppose we compute the iterations of the tent map exactly. There is no rounding error, and the only uncertainty comes from the initial uncertainty in x0. Assuming we’re using an IEEE 754 double precision number, our initial uncertainty is 2-53. (See details here.)

We lose one bit of precision in the value of x at each iteration. After 53 iterations, we’ve lost all information: the true value of x53 could be anywhere in [0, 1], and our calculated value gives us no clue where in the interval x actually is.

Here’s a cobweb plot of the iterations starting with x0 = 4 – π. The solid black lines are the tent map. Vertical lines connect each iterate to its next value. Horizontal lines bring each iterate back to the line y = x to start the next iteration.

Incidentally, after 49 iterations the computed value x reaches 0 and stays there. This would not happen if we started with exactly x0 = 4 – π and carried out each iteration in exact arithmetic because then all the values of x are irrational.

Related posts: Chaos and dynamical systems

 Lichtenberg and Lieberman. Regular Stochastic Motion. Springer-Verlag 1983.

# Everywhere chaotic map on the sphere

Let f be a complex-valued function of a complex variable. The Julia set of f is the set of points where f is chaotic. Julia sets are often complicated fractals, but they can be simple. In this post I want to look at the function

f(z) = (z² + 1)² / 4z(z² – 1).

I learned about this function from the latest 3Blue1Brown video. This function is a Lattès example. (I misread this at first and thought it was Latte’s example, but it is one of a family of examples by the French mathematician Samuel Lattès.)

What makes this function interesting is that its Julia set is the entire plane. That is, iterates of the function are sensitive to starting point no matter where you start.

I wanted to play around with this function, but plotting iterates would not be practical if they wander all over the complex plane: no matter how big a region I plot, the points will leave that region, and possibly leave quickly. So instead of looking at iterates on the complex plane, I’ll look project them onto a sphere so we can see them all at once.

## Riemann sphere

This is a good time to come clean about a detail I left out. From the beginning I should have said that f is a map not from the complex plane ℂ to itself but from the Riemann sphere

ℂ ∪ {∞}

to itself. I didn’t gloss over much, just one point, the point at infinity. In our example, we’ll define f(0) = ∞ and the resulting extension is continuous as a map from the sphere to itself.

Not only will moving to the sphere make things easier to see, it’s actually where our function naturally lives.

## Stereographic projection

Imagine a unit sphere sitting on top of the complex plane. Stereographic projection maps every point on the sphere, except the north pole, to a point in the plane. Draw a line between the north pole and a point on the sphere, and its stereographic projection is where the line intersects the plane. The north pole itself has nowhere to go. When we extend the complex plane by adding a point ∞, we can map the north pole there.

We’re actually interested in the inverse of stereographic projection because we want to go from the plane to the sphere. Let’s define a function p[u, v] that carries out our inverse stereographic projection in Mathematica.

    p[u_, v_] := ResourceFunction[
"InverseStereographicProjection"][{u, v}]

## Plotting iterates

We want to pick some arbitrary starting point z0 and look at the sequence

z0, f(z0), f(f(z0)), f(f(f(z0))), …

We can do this in Mathematica with the NestList function. It takes three arguments: the function to iterate, a starting point, and the number of elements in the sequence to produce. For example, if we define

    latte[z_] := (z^2 + 1)^2/(4 z (z^2 - 1))

then

    NestList[latte, 0.1 + 2 I, 5]

gives us the first five elements in the sequence above.

The projection function p[u, v] above takes x and y coordinates, not complex numbers, so we define our own that takes complex numbers.

    projection[z_] := p[Re[z], Im[z]]

Now we’re can plot our iterates. I chose 10 + 1.9i as a starting point based on today’s date, October 19, but you could pick any non-zero starting point. And that’s kinda the point: any place you start will have the same chaotic dynamics .

Here’s our plotting code.

    ListPointPlot3D[
Map[projection, NestList[latte, 10 + 1.9 I, 1000]],
AspectRatio -> 1]


And here’s the plot: We could plot a sphere with

    SphericalPlot3D[1, {t, 0, Pi}, {ph, 0, 2 Pi}]

and stack the two plots to make it clear that the points are indeed on a sphere. ## Related posts

 If you start with 0, you’ll next go to ∞ and stay there. But 0 and ∞ are parts of the Julia set too because the points in the neighborhoods of these points lead to chaos, no matter how small you take the open neighborhoods to be.

# 3D bifurcation diagram

The following 2D bifurcation diagram is famous. You’ve probably seen it elsewhere. If you have seen it, you probably know that it has something to do with chaos, iterated functions, fractals, and all that. If you’d like to read in more detail about what exactly the plot means, see this post.

I was reading Michael Trott’s Mathematica Guidebook for Numerics and ran across a 3D version of the above diagram. I’d never seen that before. He plots the iterations of the system

x ← a – y(β x + (1 – β) y)
yx + y²/100

The following plot is for β = 1/4 using the code included in the book. (I changed the parameter α in the book to β because the visual similarity between a and α was a little confusing.) Trott cites four papers regarding this iteration. I looked at a couple of the papers, and they contain similar systems but aren’t quite the same. Maybe his example is a sort of synthesis of the examples he found in the literature.

Related posts: More on dynamical systems

# Mathematical stability vs numerical stability

Is 0 a stable fixed point of f(x) = 4x (1-x)?

If you set x = 0 and compute f(x) you will get exactly 0. Apply f a thousand times and you’ll never get anything but zero.

But this does not mean 0 is a stable attractor, and in fact it is not stable.

It’s easy to get mathematical stability and numerical stability confused, because the latter often illustrates the former. In this post I point out that the points on a particular unstable orbit cannot be represented exactly in floating point numbers, so iterations that start at the floating  point representations of these points will drift away.

In the example here, 0 can be represented exactly, and so we do have a computationally stable fixed point. But 0 is not a stable fixed point in the mathematical sense. Any starting point close to 0 but not exactly 0 will eventually go all over the place.

If we start at some very small ε > 0, then f(ε) ≈ 4ε. Every iteration multiplies the result by approximately 4, and eventually the result is large enough that the approximation no longer holds.

For example, suppose we start with x = 10-100. After 100 iterations we have

x = 4100 10-100 = 1.6 × 10-40.

If we were to plot this, we wouldn’t see any movement. But by the time we iterate our function 200 times, the result is chaotic. I get the following: # Does chaos imply period 3?

Sharkovskii’s theorem says that if a continuous map f from an interval I to itself has a point with period 3, then it has a point with period 5. And if it has a point with period 5, then it has points with order 7, etc. The theorem has a chain of implications that all begins with 3. If you have points with period 3, you have points of all periods.

Now suppose you want to go “upstream” in Sharkovskii’s chain of implications. If you have a point with period 5, do you have a point with period 3? The answer is no: a map can have points with period 5 without having points of period 3, though it necessarily has points with all other positive integer periods except 3.

There’s an example illustrating the claim above that I’ve seen in multiple places, but I haven’t seen it presented graphically. You could work through the example analytically, but here I present it graphically.

This is the example function written in Python.

    def f(x):
assert(1 <= x <= 5)

if x < 2:
return 1 + 2*x
if x < 3:
return 7 - x
if x < 4:
return 10 - 2*x
if x <= 5:
return 6 - x


Here’s a graphical demonstration that f has a fixed point, but no points of period 3. The only point fixed under applying f three times is the point that was already fixed under applying f once.

This graph shows that f has points with period 5: By Sharkovskii’s theorem f must have points with all other periods, except 3. Here’s a demonstration that it has points with period 6. The map f is chaotic, but it does not have a point with period 3.

## Logistic map

Let’s look at the most famous chaotic map, the logistic map.

f(x) = rx (1 – x)

where x is in [0, 1] and r is in [0. 4]. The images above shows orbits as r ranges over [0, 4]. Clearly f has points with period 2. There’s a whole interval of values of r that lead to points with period 2, roughly for r between 3 and 3.5. And we can see for r a little bigger there are points of period 4. But is there any point with period 3?

We can look for points of period 3 at the end of the plot, where r = 4, using Mathematica.

Define

    f[x_] := 4 x (1 - x)

and look for points where f³(x) = x using

    Factor[f[f[f[x]]] - x]

This shows that the solutions are the roots of

x (-3 + 4 x) (-7 + 56x – 112x² + 64 x³) (-3 + 36x – 96x² + 64x³)

The first two roots are fixed points, points of period 1, but the roots of the two cubic factors are points with period 3. The cubics clearly have all their roots in the interval [0,1] and we could find their numerical values with

    NSolve[f[f[f[x]]] == x, x]

Although the roots are fixed points, they are unstable fixed points, as demonstrated at the bottom the next post.

# Sarkovsky’s theorem

The previous post explained what is meant by period three implies chaos. This post is a follow-on that looks at Sarkovsky’s theorem, which is mostly a generalization of that theorem, but not entirely .

First of all, Mr. Sarkovsky is variously known Sharkovsky, Sharkovskii, etc. As with many Slavic names, his name can be anglicized multiple ways. You might use the regular expression Sh?arkovsk(ii|y) in a search.

The theorem in the previous post, by Li and Yorke, says that if a continuous function from a closed interval to itself has a point with period three, it has points with all positive periods. This was published in 1975.

Unbeknownst to Li and Yorke, and everyone else in the West at the time, Sarkovsky had published a more general result in 1964 in a Ukrainian journal. He demonstrated a total order on the positive integers so that the existence of a point with a given period implies the existence of points with all periods further down the sequence. The sequence starts with 3, and every other positive integer is in the sequence somewhere, so period 3 implies the rest.

Sarkivsky showed that period 3 implies period 5, period 5 implies period 7, period 7 implies period 9, etc. If a continuous map of an interval to itself has a point of odd period n > 1, it has points with order given by all odd numbers larger than n. That is, Sarkovsky’s order starts out

3 > 5 > 7 > …

The sequence continues

… 2×3 > 2×5 > 2×7 > …

then

… 2²×3 > 2²×5 > 2²×7 > …

then

… 2³×3 > 2³×5 > 2³×7 > …

and so on for all powers of 2 times odd numbers greater than 1.

The sequence ends with the powers of 2 in reverse order

… 2³ > 2² > 1.

Here’s Python code to determine whether period m implies period n, assuming m and n are not equal.

from sympy import factorint

# Return whether m comes befor n in Sarkovsky order
def before(m, n):

assert(m != n)

if m == 1 or n == 1:
return m > n

m_factors = factorint(m)
n_factors = factorint(n)

m_odd = 2 not in m_factors
n_odd = 2 not in n_factors

m_power_of_2 = len(m_factors) == 1 and not m_odd
n_power_of_2 = len(n_factors) == 1 and not n_odd

if m_odd:
return m < n if n_odd else True if m_power_of_2: return m > n if n_power_of_2 else False

# m is even and not a power of 2
if n_odd:
return False
if n_power_of_2:
return True
if m_factors < n_factors: return True if m_factors > n_factors:
return False
return m < n


 There are two parts to the paper of Li and Yorke. First, that period three implies all other periods. This is a very special case of Sarkovsky’s theorem. But Li and Yorke also proved that period three implies an uncountable number of non-periodic points, which is not part of Sarkovsky’s paper.