The contraction mapping theorem says that if a function moves points closer together, then there must be some point the function doesn’t move. We’ll make this statement more precise and give a historically important application.
Definitions and theorem
A function f on a metric space X is a contraction if there exists a constant q with 0 ≤ q < 1 such that for any pair of points x and y in X,
d( f(x), f(y) ) ≤ q d(x, y)
where d is the metric on X.
A point x is a fixed point of a function f if f(x) = x.
Banach’s fixed point theorem, also known as the contraction mapping theorem, says that every contraction on a complete metric space has a fixed point. The proof is constructive: start with any point in the space and repeatedly apply the contraction. The sequence of iterates will converge to the fixed point.
Application: Kepler’s equation
Kepler’s equation in for an object in an elliptical orbit says
M + e sin E = E
where M is the mean anomaly, e is the eccentricity, and E is the eccentric anomaly. These “anomalies” are parameters that describe the location of an object in orbit. Kepler solved for E given M and e using the contraction mapping theorem, though he didn’t call it that.
Kepler speculated that it is not possible to solve for E in closed form—he was right—and used a couple iterations [1] of
f(E) = M + e sin E
to find an approximate fixed point. Since the mean anomaly is a good approximation for the eccentric anomaly, M makes a good starting point for the iteration. The iteration will converge from any starting point, as we will show below, but you’ll get a useful answer sooner starting from a good approximation.
Proof of convergence
Kepler came up with his idea for finding E around 1620, and Banach stated his fixed point theorem three centuries later. Kepler had the idea of Banach’s theorem, but he didn’t have a rigorous formulation of the theorem or a proof.
In modern terminology, the real line is a complete metric space and so we only need to prove that the function f above is a contraction. By the mean value theorem, it suffices to show that the absolute value of its derivative is less than 1. That is, we can use an upper bound on |f ‘| as the q in the definition of contraction.
Now
f ‘ (E) = e cos E
and so
|f ‘(E)| ≤ e
for all E. If our object is in an elliptical orbit, e < 1 and so we have a contraction.
Example
The following example comes from [2], though the author uses Newton’s method to solve Kepler’s equation. This is more efficient, but anachronistic.
Consider a satellite on a geocentric orbit with eccentricity e = 0.37255. Determine the true anomaly at three hours after perigee passage, and calculate the position of the satellite.
The author determines that M = 3.6029 and solves Kepler’s equation
M + e sin E = E
for E, which she then uses to solve for the true anomaly and position of the satellite.
The following Python code shows the results of the first 10 iterations of Kepler’s equation.
from math import sin
M = 3.6029
e = 0.37255
E = M
for _ in range(10):
E = M + e*sin(E)
print(E)
This produces
3.437070
3.494414
3.474166
3.481271
3.478772
3.479650
3.479341
3.479450
3.479412
3.479425
and so it appears the iteration has converged to E = 3.4794 to four decimal places.
Note that this example has a fairly large eccentricity. Presumably Kepler would have been concerned with much smaller eccentricities. The eccentricity of Jupiter’s orbit, for example, is around 0.05. For such small values of e the iteration would converge more quickly.
Update: See this post for more efficient ways to solve Kepler’s equation.
Related posts
[1] Bertil Gustafsson saying in his book Scientific Computing: A Historical Perspective that Kepler only used two iterations. Since M gives a good starting approximation to E, two iterations would give a good answer. I imagine Kepler would have done more iterations if necessary but found empirically that two was enough. Incidentally, it appears Gustaffson has a sign error in his statement of Kepler’s equation.
[2] Euler Celestial Analysis by Dora Musielak.