Music of the spheres

The idea of “music of the spheres” dates back to the Pythagoreans. They saw an analogy between orbital frequency ratios and musical frequency ratios.

HD 110067 is a star 105 light years away that has six known planets in orbital resonance. The orbital frequencies of the planets are related to each other by small integer ratios.

The planets, starting from the star, are labeled b, c, d, e, f, and g. In 9 “years”, from the perspective of g, the planets complete 54, 36, 24, 16, 12, and 9 orbits respectively. So the ratios of orbital frequencies between consecutive planets are all either 3:2 or 4:3. In musical terms, these ratios are fifths and fourths.
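The consecutive ratios can be checked directly from the orbit counts. Here is a minimal Python sketch; the variable names are my own.

```python
from fractions import Fraction

# Orbits completed by planets b through g during 9 orbits of planet g
orbits = {"b": 54, "c": 36, "d": 24, "e": 16, "f": 12, "g": 9}

names = list(orbits)
# Frequency ratio of each planet to the next one out
ratios = {
    f"{inner}:{outer}": Fraction(orbits[inner], orbits[outer])
    for inner, outer in zip(names, names[1:])
}

for pair, ratio in ratios.items():
    print(pair, ratio)
```

Using `Fraction` keeps the ratios exact, so each one prints in lowest terms rather than as a decimal.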

In the chord below, the musical frequency ratios are the same as the orbital frequency ratios in the HD 110067 system.

Here’s what the chord sounds like on a piano:

hd11067.wav


Set of orbits with the same average distance to sun

Suppose a planet is in an elliptical orbit around the sun with semimajor axis a and semiminor axis b. Then the average distance of the planet to the sun over time equals

a(1 + e²/2)

where the eccentricity e satisfies

e² = 1 − b²/a².

You can find a proof of this statement in [1].

This post will look at the set of all orbits with a fixed average distance r to the sun. Without loss of generality we can choose our units so that r = 1.

Clearly one possibility is to set a = b = 1 so the orbit is a circle. The distance is constantly 1, so the average is 1.

We can also maintain an average distance of 1 by reducing a but increasing the eccentricity e. The possible orbits of average distance 1 satisfy

a(1 + e²/2) = 1

with 0 < b ≤ a ≤ 1. A little algebra shows that

b = √(3a² – 2a),

and that 2/3 < a ≤ 1. As a approaches 2/3, b approaches 0.
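As a sanity check, here is a short Python sketch that computes b and e for a given a and then confirms numerically that the time-averaged distance is 1. The function names are my own. The numerical average relies on two standard facts that are assumptions as far as this post is concerned: the distance to the sun is r = a(1 − e cos E), and time advances in proportion to (1 − e cos E) dE, where E is the eccentric anomaly.

```python
import numpy as np

def orbit_with_unit_mean_distance(a):
    """Return (b, e) for the orbit with semimajor axis a and average distance 1."""
    if not (2/3 < a <= 1):
        raise ValueError("need 2/3 < a <= 1")
    b = np.sqrt(3*a**2 - 2*a)
    e = np.sqrt(max(0.0, 1 - b**2/a**2))  # guard against tiny negative rounding
    return b, e

def mean_distance(a, e, n=100_000):
    """Time-averaged distance to the sun, computed on a uniform grid in E."""
    E = np.linspace(0, 2*np.pi, n, endpoint=False)
    w = 1 - e*np.cos(E)            # time spent per unit of eccentric anomaly
    return np.sum(a*w*w) / np.sum(w)

for a in [1.0, 0.9, 0.8, 0.7]:
    b, e = orbit_with_unit_mean_distance(a)
    print(a, b, e, mean_distance(a, e))
```

The last column should be 1 up to rounding error for every admissible a.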

Let’s put the center of our coordinate system at the sun and assume the other focus of the elliptical orbits is somewhere along the positive x-axis. When e is 0 we have a unit circle orbit. As e approaches 1, the orbits approach a horizontal line with the sun on one end.


[1] Sherman K. Stein. “Mean Distance” in Kepler’s Third Law. Mathematics Magazine, Vol. 50, No. 3 (May, 1977), pp. 160–162

Dutton’s Navigation and Piloting

This morning Eric Berger posted a clip from The Hunt for Red October as a meme, and that made me think about the movie.

I watched Red October this evening, for the first time since around the time it came out in 1990, and was surprised by a detail in one of the scenes. I recognized one of the books: Dutton’s Navigation and Piloting.

Screen shot with Dutton's Navigation and Piloting

I have a copy of that book, the 14th edition. The spine looks exactly the same. The first printing was in 1985, and I have the second printing from 1989. So it is probably the same edition and maybe even the same printing as in the movie. I bought the book last year because it was recommended for something I was working on. Apparently it’s quite a classic since someone thought that adding a copy in the background would help make a realistic set for a submarine.

My copy has a gold sticker inside, indicating that the book came from Fred L. Woods Nautical Supplies, though I bought my copy used from Alibris.

Here’s a clip from the movie featuring Dutton’s.

Dutton’s has a long history. From the preface:

Since the first edition of Navigation and Nautical Astronomy (as it was then titled) was written by Commander Benjamin Dutton, U. S. Navy, and published in 1926, this book has been updated and revised. The title was changed after his death to more accurately reflect its focus …

The 14th edition contains a mixture of classical and electronic navigation, navigating by stars and by satellites. It does not mention GPS; that is included in the latest edition, the 15th edition published in 2003.


Oval orbits?

Johannes Kepler thought that planetary orbits were ellipses. Giovanni Cassini thought they were ovals. Kepler was right, but Cassini wasn’t far off.

In everyday speech, people use the words ellipse and oval interchangeably. But in mathematics these terms are distinct. There is one definition of an ellipse, and several definitions of an oval. To be precise, you have to say what kind of oval you have in mind, and in the context of this post by oval I will always mean a Cassini oval.

Ellipses and ovals each have two foci, f1 and f2. Let d1(p) and d2(p) be the distances from a point p to each of the foci. For an ellipse, the sum d1(p) + d2(p) is constant. For an oval, the product d1(p) d2(p) is constant.

In [1] the authors argue that just as planetary orbits are nearly circles, they’re also nearly ovals. This post will look at how far the earth’s orbit is from a circle and from an oval.

We need a way to specify which oval we want to compare to the ellipse of earth’s orbit. We’ll do this by equating the major and minor semi-axes of the two curves. These are usually denoted a and b, but the same variables have a different meaning in the context of ovals, so I’ll denote them by M for major and m for minor.

The equation of an ellipse is

(x/M)² + (y/m)² = 1

and the equation of an oval is

((x + a)² + y²) ((x − a)² + y²) = b².

Setting x = 0 in the equation of an oval tells us

m² = b − a²

and setting y = 0 tells us

M² = b + a².

So

b = (M² + m²)/2

and

a² = (M² – m²)/2.

For the earth’s orbit, M = 1.00000011 and m = 0.99986048 measured in AU, astronomical units. So our oval has parameters

a = 0.011816102

and

b = 0.99986060.
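These parameters are easy to reproduce. The sketch below computes a and b from M and m, then checks that the points (M, 0) and (0, m) lie on the resulting oval. The helper name `oval` is my own.

```python
from math import sqrt

M = 1.00000011   # semi-major axis of earth's orbit, AU
m = 0.99986048   # semi-minor axis of earth's orbit, AU

# Cassini oval parameters with the same semi-axes as the ellipse
b = (M**2 + m**2) / 2
a = sqrt((M**2 - m**2) / 2)
print(f"a = {a:.9f}")
print(f"b = {b:.8f}")

def oval(x, y):
    """Left side minus right side of the oval equation; zero on the curve."""
    return ((x + a)**2 + y**2) * ((x - a)**2 + y**2) - b**2

# Both should be zero up to rounding error
print(oval(M, 0), oval(0, m))
```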

If you plot Kepler’s ellipse and Cassini’s oval for earth’s orbit at the same time, you can’t see the difference.

Planet orbits are nearly circular. If we compare a circle of radius 1 AU with Kepler’s ellipse we get a maximum error of about 1 part in 10,000.

But if we compare Cassini’s oval with Kepler’s ellipse we get a maximum error of about 1 part in 100,000,000.


In short, a circle is a good approximation to earth’s orbit, but a Cassini oval is four orders of magnitude better.

It would be difficult to empirically distinguish an ellipse from an oval as the shape of earth’s orbit, but theory is clearly on Kepler’s side since his ellipses fall out of Newton’s laws. Cassini’s error was more qualitative than quantitative.


[1] B. Morgado and V. Soares. Kepler’s ellipse, Cassini’s oval and the trajectory of planets. Eur. J. Phys. 35 (2014) 025009. DOI 10.1088/0143-0807/35/2/025009

Solar Day vs Sidereal Day

How long does it take the earth to complete one rotation on its axis? The answer depends on your frame of reference. A solar day is the time it takes for the sun to appear at the same position in the sky. A sidereal day is the time it takes for a distant star to appear in the same position. These are not the same.

This post will illustrate the difference between a solar day and a sidereal day. To make things a little simpler, assume the earth has a perfectly circular orbit around the sun, and that the earth’s axis of rotation is perpendicular to the orbital plane, i.e. there is no axial tilt.

Sidereal day

Imagine an astronomer Alice observing our solar system from the vantage point of a distant star. How distant? We’ll show below that it doesn’t make much difference, but we’ll assume for now that she is “infinitely” far away, which means “far enough away that we don’t have to worry about exactly how far.”

The time it takes for Alice to observe one rotation of the earth on its axis is 1 sidereal day = 24 sidereal hours. We will suppose that from Alice’s perspective it takes 360 (sidereal) days for the earth to orbit the sun.

Here comes the sun

Let’s set the origin of our coordinate system at the sun. Assume that at time t = 0 the earth is located at (1, 0) and that an observer Bob is at the bottom of a deep well on the equator looking up at the sun. The time between Bob’s observations of the sun is one solar day.

Twenty four (sidereal) hours later, the earth is located at (cos 1°, sin 1°) and Bob’s well is parallel to the x-axis, but not looking directly at the sun. He will be looking at the sun a few minutes later when the earth’s rotation brings the sun into view.

Just how long will Bob have to wait to see the sun again? About 4 minutes, because the sun is 1° away from his line of sight, and he needs the earth to turn 1°, which takes 24/360 hours, or 4 minutes. This is not exactly correct though, because the earth has moved in its orbit during that 4 minutes.

So a solar day is about 4 minutes longer than the time it takes for the earth to rotate on its axis (from Alice’s perspective). Even though we’ve made several simplifying assumptions, our estimate only differs from the exact value by about 4 seconds.

Equations

Let the distance from the center of the earth to the sun be 1 and the radius of the earth be ε. Let t be time in (sidereal) days. The position of the center of the earth as a function of time will be

(cos t°, sin t°)

and Bob’s position is

(cos t° − ε cos 360t° , sin t° − ε sin 360t°).

Bob sees the sun at time t = 0. When will he see the sun next? When the slope of the line from the center of the earth to his position equals the slope of the line from the sun to the center of the earth, i.e.

tan t° = tan 360t°

The value of ε doesn’t matter.

There are two solutions to this equation, one when Bob is facing the sun and another when he is on the opposite side of the earth from the sun. We know t ≈ 1 and the solution near 1 is on the correct side of the sun, i.e.

t = 1.002786 days = 1 day + 4.011142 minutes.
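Because the tangent function repeats every 180°, the equation says 360t = t + 180k for some integer k, so t = 180k/359. Here is a small Python sketch of that reasoning; the function name is my own.

```python
from math import tan, radians

def tangent_solutions(k_max=2):
    """Solutions of tan t° = tan 360t°, i.e. 360 t = t + 180 k, for k = 1..k_max."""
    return [180*k/359 for k in range(1, k_max + 1)]

# k = 1: Bob is on the far side of the earth from the sun
# k = 2: Bob is facing the sun again
t_far, t_sun = tangent_solutions()

extra_minutes = (t_sun - 1) * 24 * 60
print(f"t = {t_sun:.6f} days = 1 day + {extra_minutes:.6f} minutes")

# Confirm that t_sun satisfies the original equation
print(tan(radians(t_sun)), tan(radians(360*t_sun)))
```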

Distant stars

Now let’s imagine that at time t = 0, while Bob is looking up at the sun, Charlie is on the opposite side of the earth at the bottom of another well, looking up at a star a distance R away, located at (R + 1, 0).

Charlie’s position as a function of t is

(cos t° + ε cos 360t° , sin t° + ε sin 360t°).

When will Charlie see his star next? When the line from the center of the earth through his position has the same slope as the line from Charlie to the star. This happens when

tan 360t° = sin t° / R,

the right side above being the tangent of the angle at R of a right triangle with base on the x-axis and hypotenuse running from the star to Charlie. As R goes to infinity, the right side goes to 0, and the solutions for t are integer numbers of days.

Charlie will see the star at time slightly less than 1, so let t = 1 − x. So we need to solve

tan 360(1 − x)° = − tan 360x° = − sin (1 − x)°/R.

Using the approximation sin θ ≈ θ ≈ tan θ for small angles θ (in radians) we have

−2πx ≈ −2π(1 − x)/(360R)

and so

x = 1/(360R + 1).

Now R is very large. For the nearest star, Proxima Centauri, R is about 270,000 AU. So x is on the order of 10⁻⁸ or smaller. This is why it doesn’t matter which star Alice is located near: a sidereal day is essentially the same whatever distant star you use as your reference point.
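To put a number on this, here is the approximation x = 1/(360R + 1) evaluated for Proxima Centauri. It is only a sketch: it uses the round value of R from this post, and the conversion to seconds treats a day as 24 × 3600 seconds.

```python
R = 270_000            # distance to Proxima Centauri in AU, round number from the post

x = 1 / (360*R + 1)    # correction to the sidereal day, as a fraction of a day
seconds = x * 24 * 3600

print(f"x = {x:.3e} days, or about {seconds:.1e} seconds")
```

Any star farther away gives a smaller x, so the choice of reference star changes the result by less than a millisecond in this model.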

Sphere of influence

Apollo 11 orbit diagram via NASA

Suppose a spaceship is headed from the earth to the moon. At some point we say that the ship has left the earth’s sphere of influence and is now in the moon’s sphere of influence (SOI). What does that mean exactly?

Wrong explanation #1

One way you’ll hear it described is that the moon’s sphere of influence is the point at which the earth is no longer pulling on the spaceship, but that’s nonsense. Everything has some pull on everything else, so how do you objectively say the earth’s pull is small enough that we’re now going to call it zero? And as we’ll see below, the earth’s pull is still significant even when the spaceship leaves earth’s SOI.

Wrong explanation #2

Another explanation you’ll hear is the moon’s sphere of influence is the point at which the moon is pulling on the spaceship harder than the earth is. That’s a better explanation, but still not right.

The distance from the earth to the moon is about 240,000 miles, and the radius of the moon’s SOI is about 40,000 miles. So when a spaceship first enters the moon’s SOI, it is five times closer to the moon than to the earth.

Newton’s law of gravity says gravitational force between two bodies is proportional to the product of their masses and inversely proportional to the square of the distance. The mass of the earth is about 80 times that of the moon. So at the moon’s SOI boundary, the pull of the earth is 80/25 times as great as that of the moon, about three times greater.

Correct explanation

So what does sphere of influence mean? The details are a little complicated, but essentially the moon’s sphere of influence is the point at which it’s more accurate to say the ship is orbiting the moon than to say it is orbiting the earth.

How can we say it’s better to think of the ship orbiting the moon than the earth when the earth is pulling on the ship three times as hard as the moon is? What matters is not so much the force of earth’s gravity as the effect of that force on the equations of motion.

The motion of an object between the earth and the moon could be viewed as an orbit around earth, with the moon exerting a perturbing influence, or as an orbit around the moon, with the earth exerting a perturbing influence.

At the boundary of the moon’s SOI the effect of the earth perturbing the ship’s orbit around the moon is equal to the effect of the moon perturbing its orbit around the earth. It’s a point at which it is convenient to switch perspectives. It’s not a physical boundary [1]. Also, the “sphere” of influence is not exactly a sphere but an approximately spherical region.

The moon has an effect on the ship’s motion when it’s on our side of the moon’s SOI, and the earth still has an effect on its motion after it has crossed into the moon’s SOI.

Calculating the SOI radius

As a rough approximation, the SOI boundary is where the ratio of the distances to the two bodies, e.g. moon and earth, equals the ratio of their masses to the exponent 2/5:

r/R = (m/M)^(2/5).

This approximation is better when the mass M is much larger than the mass m. For the earth and the moon, the equation is good enough for back-of-the-envelope calculations but not accurate enough for planning a mission to the moon. Using the round numbers in this post, the left side of the equation is 1/5 = 0.2 and the right side is (1/80)^0.4 = 0.17.

Context

Everything above has been in the context of the earth-moon system. Sphere of influence is defined relative to two bodies. When we spoke of a spaceship leaving the earth’s sphere of influence, we implicitly meant that it was leaving the earth’s sphere of influence relative to the moon.

Relative to the sun, the earth’s sphere of influence reaches roughly 600,000 miles. You could calculate this distance using the equation above. A spaceship like Artemis leaves the earth’s sphere of influence relative to the moon at some point, but never leaves the earth’s sphere of influence relative to the sun.
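Here is a rough Python sketch of that calculation, using the approximation r = R (m/M)^(2/5). The earth-moon numbers are the round figures from this post. The earth-sun numbers (93 million miles, and a sun about 333,000 times as massive as the earth) are values I am assuming for illustration.

```python
def soi_radius(R, mass_ratio):
    """Approximate SOI radius of the smaller body.

    R is the distance between the two bodies and mass_ratio is m/M,
    the smaller mass divided by the larger.
    """
    return R * mass_ratio**0.4

# Moon relative to the earth, round numbers from the post (miles)
moon_soi = soi_radius(240_000, 1/80)

# Earth relative to the sun, assumed round numbers (miles)
earth_soi = soi_radius(93_000_000, 1/333_000)

print(f"moon SOI:  {moon_soi:,.0f} miles")
print(f"earth SOI: {earth_soi:,.0f} miles")
```

The first result lands near the 40,000 miles quoted above, and the second is a little under 600,000 miles.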


[1] The sphere of influence sounds analogous to a continental divide, where rain falling on one side of the line ends up in one ocean and rain falling on the other side ends up in another ocean. But it’s not that way. I suppose you could devise an experiment to determine which side of the SOI you’re on, but it would not be a simple experiment. An object placed between the earth and the moon at the SOI boundary would fall to the earth unless it had sufficient momentum toward the moon.

Lagrange’s quintic and Descartes’ rule

Do fifth degree polynomial equations come up in applications? Yes, and this post will give an example.

In general the three-body problem, describing the motion of three objects interacting under gravity, does not have a closed-form solution. However, Euler and Lagrange discovered a few special cases that do have closed-form solutions. We will look at Lagrange’s quintic, an equation that came out of Lagrange’s elaboration on Euler’s solution involving three masses moving so that they remain colinear.

Lagrange’s quintic equation is

\begin{align*} (m_1 + m_2)x^5 &+ (3m_1 + 2m_2)x^4 + (3m_1 + m_2)x^3 \\ &- (m_2 + 3m_3)x^2 - (2m_2 + 3m_3)x - (m_2+m_3) = 0 \end{align*}

where m1, m2, and m3 are the masses of the three bodies and x represents the ratio of the distances from the second body to the other two bodies.

Descartes’ rule of signs says that the equation above has exactly one positive root. This is because the signs of the coefficients only change once: the first three coefficients are positive and the next three are negative. Only positive roots are physically meaningful since x represents a ratio of (unsigned) distances, so Lagrange’s quintic has a unique meaningful solution. Note that this argument places no restrictions on the relative masses of the three objects.

We can set x = −w and apply Descartes’ rule to the polynomial equation in w. This tells us that the equation has 0, 2, or 4 solutions for positive w, i.e. for negative x. But this doesn’t tell us anything we wouldn’t know from more general principles. Fifth degree polynomials have five roots, and complex roots to equations with real coefficients come in conjugate pairs, so Lagrange’s equation either has two complex roots (and thus two negative real roots) or four complex roots (and no negative real roots).
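The root count is easy to confirm numerically. The following sketch finds all five roots with NumPy’s `roots` function and picks out the positive real one. The masses are arbitrary values chosen for illustration, and the helper names are my own.

```python
import numpy as np

def lagrange_quintic_roots(m1, m2, m3):
    """All five roots of Lagrange's quintic for the given masses."""
    coeffs = [
        m1 + m2,
        3*m1 + 2*m2,
        3*m1 + m2,
        -(m2 + 3*m3),
        -(2*m2 + 3*m3),
        -(m2 + m3),
    ]
    return np.roots(coeffs)

def positive_real_roots(roots, tol=1e-9):
    """Roots with negligible imaginary part and positive real part."""
    return [r.real for r in roots if abs(r.imag) < tol and r.real > 0]

roots = lagrange_quintic_roots(1.0, 2.0, 3.0)   # arbitrary illustrative masses
positive = positive_real_roots(roots)

print(roots)
print("positive real roots:", positive)
```

Changing the masses moves the roots around, but Descartes’ rule says the list of positive real roots should always have exactly one entry.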


Artemis lunar orbit

I haven’t been able to find technical details of the orbit of Artemis I, and some of what I’ve found has been contradictory, but here are some back-of-the-envelope calculations based on what I’ve pieced together. If someone sends me better information I can update this post.

Artemis is in a highly eccentric orbit around the moon, coming within 130 km (80 miles) of the moon’s surface at closest pass, and this orbit will take 14 days to complete. The weak link in this data is “14 days.” Surely this number has been rounded for public consumption.

If we assume Artemis is in a Keplerian orbit, i.e. we can ignore the effect of the Earth, then we can calculate the shape of the orbit using the information above. This assumption is questionable because as I understand it the reason for such an eccentric orbit has something to do with Lagrange points, which means the Earth’s gravity matters. Still, I imagine the effect of Earth’s gravity is a smaller source of error than the lack of accuracy in knowing the period.

Solving for axes

Artemis is orbiting the moon similarly to how the Mars Orbiter Mission orbited Mars. We can use Kepler’s equation for period T to solve for the semi-major axis a of the orbit.

T = 2π √(a³/μ)

Here μ = GM, with G being the gravitational constant and M being the mass of the moon. Now

G = 6.674 × 10⁻¹¹ N m²/kg²

and

M = 7.3459 × 10²² kg.

If we assume T is 14 × 24 × 3600 seconds, then we get

a = 56,640 km

or 35,200 miles. The value of a is rough since the value of T is rough.

Assuming a Keplerian orbit, the moon is at one focus of the orbit, located a distance c from the center of the ellipse. If Artemis is 130 km from the surface of the moon at perilune, and the radius of the moon is 1737 km, then

c = a − (130 + 1737) km = 54,770 km

or 34,000 miles. The semi-minor axis b satisfies

b² = a² − c²

and so

b = 14,422 km

or 8962 miles.
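The arithmetic above fits in a few lines of Python. This sketch uses the same inputs as the text, including the rough 14-day period, so its output is only as good as that assumption.

```python
from math import pi, sqrt

G = 6.674e-11          # gravitational constant, N m²/kg²
M_moon = 7.3459e22     # mass of the moon, kg
T = 14 * 24 * 3600     # orbital period in seconds (rough)

mu = G * M_moon

# Solve T = 2π √(a³/μ) for a, then convert meters to kilometers
a = (mu * (T / (2*pi))**2) ** (1/3) / 1000

# Perilune distance from the moon's center: 130 km altitude + 1737 km radius
c = a - (130 + 1737)
b = sqrt(a**2 - c**2)
e = c / a

print(f"a = {a:,.0f} km")
print(f"c = {c:,.0f} km")
print(f"b = {b:,.0f} km")
print(f"eccentricity = {e:.3f}, aspect ratio b/a = {b/a:.3f}")
```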

Orbit shape

The eccentricity is c/a = 0.967. As I’ve written about before, eccentricity is hard to interpret intuitively. Aspect ratio is much easier to imagine than eccentricity, and the relation between the two is highly nonlinear.

Assuming everything above, here’s what the orbit would look like. The distances on the axes are in kilometers.

Artemis moon orbit

The orbit is highly eccentric: the center of the orbit is far from the foci of the orbit. But the aspect ratio is about 1/4. The orbit is only about 4 times wider in one direction than the other. It’s obviously an ellipse, but it’s not an extremely thin ellipse.

Lagrange points

In an earlier post I showed how to compute the Lagrange points for the Sun-Earth system. We can use the same equations for the Earth-Moon system.

The equations for the distance r from the Lagrange points L1 and L2 to the moon are

\frac{M_1}{(R\pm r)^2} \pm \frac{M_2}{r^2}=\left(\frac{M_1}{M_1+M_2}R \pm r\right)\frac{M_1+M_2}{R^3}

The equation for L1 corresponds to taking ± as − and the equation for L2 corresponds to taking ± as +. Here M1 and M2 are the masses of the Earth and Moon respectively, and R is the distance between the two bodies.

If we modify the code from the earlier post on Lagrange points we get

L1 = 54784 km
L2 = 60917 km

where L1 is on the near side of the moon and L2 on the far side. We estimated the semi-major axis a to be 56,640 km. This is about 3% larger than the distance from the moon to L1. So the orbit of Artemis passes near or through L1. This assumes the axis of the Artemis orbit is aligned with a line from the moon to Earth, which I believe is at least approximately correct.
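For readers who don’t have the earlier post at hand, here is a self-contained sketch that solves the equation above by bisection. It is not the code from that post. The Earth’s mass and the Earth-Moon distance R below are values I am assuming, and the distances it prints depend on the R chosen, so they need not match the figures quoted above.

```python
def lagrange_residual(r, sign, M1, M2, R):
    """Left side minus right side of the equation; sign = -1 for L1, +1 for L2."""
    lhs = M1/(R + sign*r)**2 + sign*M2/r**2
    rhs = (M1/(M1 + M2)*R + sign*r) * (M1 + M2)/R**3
    return lhs - rhs

def bisect(f, lo, hi, iterations=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = (lo + hi)/2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi)/2

M1 = 5.9722e24    # mass of the Earth in kg (assumed value)
M2 = 7.3459e22    # mass of the Moon in kg (value used in this post)
R = 384_400       # Earth-Moon distance in km (assumed mean value)

# The residual changes sign exactly once between r = 1 km and r = R/2
L1 = bisect(lambda r: lagrange_residual(r, -1, M1, M2, R), 1.0, R/2)
L2 = bisect(lambda r: lagrange_residual(r, +1, M1, M2, R), 1.0, R/2)

print(f"L1 = {L1:.0f} km from the moon")
print(f"L2 = {L2:.0f} km from the moon")
```

Bisection is slower than a library root finder, but it needs no dependencies and cannot wander outside the bracket.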

Python code to solve Kepler’s equation

The previous post looked at solving Kepler’s equation using Newton’s method. The problem with using Newton’s method is that it may not converge when the eccentricity e is large unless you start very close to the solution. As discussed at the end of that post, John Machin came up with a clever way to start close. His starting point is defined as follows.

\begin{align*} n &= \sqrt{5 + \sqrt{16 + \frac{9}{e}}} \\ M &= n \left((1-e)s + \frac{e(n^2 - 1) + 1}{6}s^3 \right) \\ x_0 &= n \arcsin s \end{align*}

The variable s is implicitly defined as the root of a cubic polynomial. This could be messy. Maybe there are three real roots and we have to decide which one to use. Fortunately this isn’t the case.

The discriminant of our cubic equation is negative, so there is only one real root. And because our cubic equation for s has no s² term the expression for the root isn’t too complicated.

Here’s Python code to solve Kepler’s equation using Newton’s method with Machin’s starting point.

    from numpy import sqrt, cbrt, pi, sin, cos, arcsin, random
    
    # This will solve the special form of the cubic we need.
    def solve_cubic(a, c, d):
        assert(a > 0 and c > 0)
        p = c/a
        q = d/a
        k = sqrt( q**2/4 + p**3/27 )
        return cbrt(-q/2 - k) + cbrt(-q/2 + k)
    
    # Machin's starting point for Newton's method
    # See johndcook.com/blog/2022/11/01/kepler-newton/
    def machin(e, M):
        n = sqrt(5 + sqrt(16 + 9/e))
        a = n*(e*(n**2 - 1)+1)/6
        c = n*(1-e)
        d = -M
        s = solve_cubic(a, c, d)
        return n*arcsin(s)    
    
    def solve_kepler(e, M):
        "Find E such that M = E - e sin E."
        assert(0 <= e < 1)
        assert(0 <= M <= pi) 
        if e == 0:
            # Circular orbit: E = M, and Machin's formula would divide by zero.
            return M
        f = lambda E: E - e*sin(E) - M 
        E = machin(e, M) 
        tolerance = 1e-10 

        # Newton's method 
        while (abs(f(E)) > tolerance):
            E -= f(E)/(1 - e*cos(E))
        return E

To test this code, we’ll generate a million random values of e and M, solve for the corresponding value of E, and verify that the solution satisfies Kepler’s equation.

    random.seed(20221102)
    N = 1_000_000
    e = random.random(N)
    M = random.random(N)*pi
    for i in range(N):
        E = solve_kepler(e[i], M[i])
        k = E - e[i]*sin(E) - M[i]
        assert(abs(k) < 1e-10)
    print("Done")

All tests pass.

Machin’s starting point is very good, and could make an adequate solution on its own if e is not very large and if you don’t need a great deal of accuracy. Let’s illustrate by solving Kepler’s equation for the orbit of Mars with eccentricity e = 0.09341.

Error in Machin's starting guess as a function of M

Here the maximum error is 0.01675 radians and the average error is 0.002486 radians. The error is especially small for small values of M. When M = 1, the error is only 1.302 × 10⁻⁵ radians.
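Here is a sketch of how error figures like these could be computed. It is not the code behind the plot: it re-implements Machin’s starting point in vectorized form, uses a fixed number of Newton steps as the reference solution, and samples M on a grid of my own choosing, so the average in particular depends on the grid.

```python
import numpy as np

def machin_start(e, M):
    """Machin's starting point, vectorized over M."""
    n = np.sqrt(5 + np.sqrt(16 + 9/e))
    a = n*(e*(n**2 - 1) + 1)/6
    c = n*(1 - e)
    # Solve a s^3 + c s - M = 0 for its single real root
    p, q = c/a, -M/a
    k = np.sqrt(q**2/4 + p**3/27)
    s = np.cbrt(-q/2 - k) + np.cbrt(-q/2 + k)
    return n*np.arcsin(s)

def reference_solution(e, M, iterations=20):
    """Accurate solution of Kepler's equation via Newton's method from E = M."""
    E = np.array(M, dtype=float)
    for _ in range(iterations):
        E -= (E - e*np.sin(E) - M)/(1 - e*np.cos(E))
    return E

e = 0.09341                       # eccentricity of Mars
M = np.linspace(0, np.pi, 1001)   # sample grid (my choice)
error = np.abs(machin_start(e, M) - reference_solution(e, M))

print("max error: ", error.max())
print("mean error:", error.mean())
```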

Solving Kepler’s equation with Newton’s method

Postage stamps featuring Kepler and Newton

In the introduction to his book Solving Kepler’s Equation Over Three Centuries, Peter Colwell says

In virtually every decade from 1650 to the present there have appeared papers devoted to the Kepler problem and its solution.

This is remarkable because Kepler’s equation isn’t that hard to solve. It cannot be solved in closed form using elementary functions, but it can be solved in many other ways, enough ways for Peter Colwell to write a survey about. One way to find a solution is simply to guess a solution, stick it back in, and iterate. More on that here.
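The guess-and-iterate approach can be sketched in a few lines of Python. Rewrite Kepler’s equation M = E − e sin E as E = M + e sin E and apply the right side repeatedly. The function name, tolerance, and iteration cap below are my own choices.

```python
from math import sin

def kepler_fixed_point(e, M, tol=1e-12, max_iter=10_000):
    """Solve M = E - e sin E by iterating E <- M + e sin E, starting from E = M."""
    E = M
    for _ in range(max_iter):
        E_next = M + e*sin(E)
        if abs(E_next - E) < tol:
            return E_next
        E = E_next
    raise RuntimeError("fixed point iteration did not converge")

E = kepler_fixed_point(e=0.5, M=1.0)
print(E, E - 0.5*sin(E) - 1.0)   # solution and residual
```

Each step shrinks the error by a factor of at most e, so this converges for any 0 ≤ e < 1, but slowly when e is close to 1.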

Researchers keep writing about Kepler’s equation, not because it’s hard, but because it’s important. It’s so important that a slightly more efficient solution is significant. Even today with enormous computing resources at our disposal, people are still looking for more efficient solutions. Here’s one that was published last year.

Kepler’s equation

What is Kepler’s equation, and why is it so important?

Kepler’s problem is to solve

M = E - e \sin E

for E, given M and e, assuming 0 ≤ M ≤ π and 0 < e < 1.

This equation is important because it essentially tells us how to locate an object in an elliptical orbit. M is mean anomaly, e is eccentricity, and E is eccentric anomaly. Mean anomaly is essentially time. Eccentric anomaly is not exactly the position of the orbiting object, but the position can be easily derived from E. See these notes on mean anomaly and eccentric anomaly. Efficient solutions still matter because we’re using our increased computing power to track more objects, such as debris in low earth orbit or things that might impact Earth some day.

Newton’s method

A fairly obvious approach to solving Kepler’s equation is to use Newton’s method. I think Newton himself applied his eponymous method to this equation.

Newton’s method is very efficient when it works. As it starts converging, the number of correct decimal places doubles on each iteration. The problem is, however, that it may not converge. When I taught numerical analysis at Vanderbilt, I used a textbook that quoted this nursery rhyme at the beginning of the chapter on Newton’s method.

There was a little girl who had a little curl
Right in the middle of her forehead.
When she was good she was very, very good
But when she was bad she was horrid.

To this day I think of that rhyme every time I use Newton’s method. When Newton’s method is good, it is very, very good, converging quadratically. When it is bad, it can be horrid, pushing you far from the root, and pushing you further away with each iteration. Finding exactly where Newton’s method converges or diverges can be difficult, and the result can be quite complicated. Some fractals are made precisely by separating converging and diverging points.

Newton’s method solves f(x) = 0 by starting with an initial guess x0 and iteratively applying

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}


A sufficient condition for Newton’s method to converge is for x0 to belong to a disk around the root where

\left| \frac{f(x)f''(x)}{f'(x)^2}\right| < 1

throughout the disk.

In our case f is the function

f(x; e, M) = x - e \sin(x) - M

where I’ve used x instead of E because Mathematica reserves E for the base of natural logarithms. To see whether the sufficient condition for convergence given above applies, let’s define

g(x; e, M) = \left| \frac{e(x - e \sin x - M) \sin x}{(1 - e\cos x)^2} \right|

Notice that the denominator goes to 0 as e approaches 1, so we should expect difficulty as e increases. That is, we expect Newton’s method might fail for objects in a highly elliptical orbit. However, we are looking at a sufficient condition, not a necessary condition. As I noted here, I used Newton’s method to solve Kepler’s equation for a highly eccentric orbit, hoping for the best, and it worked.

Starting guess

Newton’s method requires a starting guess. Suppose we start by setting E = M. How bad can that guess be? We can find out using Lagrange multipliers. We want to maximize

(E-M)^2

subject to the constraint that M and E satisfy Kepler’s equation. (We square the difference to get a differentiable objective function to maximize. Maximizing the squared difference maximizes the absolute difference.)

Lagrange’s theorem tells us

\begin{pmatrix} 2x \\ -2M\end{pmatrix} = \lambda \begin{pmatrix} 1 - e \cos x \\ -1\end{pmatrix}

and so λ = 2M and

2x = 2M(1 - e\cos x)

We can conclude that

|x - M| \leq \frac{|e \cos x|}{2} \leq \frac{e}{2} \leq \frac{1}{2}

This says that an initial guess of M is never further than a distance of 1/2 from the solution x, and it’s even closer when the eccentricity e is small.

If e is less than 0.55 then Newton’s method will converge. We can verify this in Mathematica with

    NMaximize[{g[x, e, M], 0 < e < 0.55, 0 <= M <= Pi, 
        Abs[M - x] < 0.5}, {x, e, M}]

which returns a maximum value of 0.93. The maximum value is an increasing function of the upper bound on e, so converging for e = 0.55 implies converging for e < 0.55. On the other hand, we get a maximum of 1.18 when we let e be as large as 0.60. This doesn’t mean Newton’s method won’t converge, but it means our sufficient condition cannot guarantee that it will converge.
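To see the iteration in action in the range where convergence is guaranteed, here is a Python sketch of Newton’s method applied to f(x) = x − e sin x − M with starting guess x0 = M. The function name and the particular values e = 0.5 and M = 1 are my own choices for illustration.

```python
from math import sin, cos

def newton_kepler(e, M, tol=1e-13, max_iter=50):
    """Newton's method for x - e sin x - M = 0, starting from x0 = M.

    Returns the final iterate and the list of residuals |f(x_n)|.
    """
    x = M
    residuals = []
    for _ in range(max_iter):
        fx = x - e*sin(x) - M
        residuals.append(abs(fx))
        if abs(fx) < tol:
            break
        x -= fx/(1 - e*cos(x))
    return x, residuals

x, residuals = newton_kepler(e=0.5, M=1.0)
print("solution:", x)
for n, r in enumerate(residuals):
    print(n, r)
```

Once the iterates are close to the root, the exponent of the residual roughly doubles at each step, which is the quadratic convergence described earlier.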

A better starting point

John Machin (1686–1751) came up with a very clever, though mysterious, starting point for solving Kepler’s equation with Newton’s method. Machin didn’t publish his derivation, though later someone was able to reverse engineer how Machin must have been thinking. His starting point is as follows.

\begin{align*} n &= \sqrt{5 + \sqrt{16 + \frac{9}{e}}} \\ M &= n \left((1-e)s + \frac{e(n^2 - 1) + 1}{6}s^3 \right) \\ x_0 &= n \arcsin s \end{align*}

This produces an adequate starting point for Newton’s method even for values of e very close to 1.

Notice that you have to solve a cubic equation to find s. That’s messy in general, but it works out cleanly in our case. See the next post for a Python implementation of Newton’s method starting with Machin’s starting point.

There are simpler starting points that are better than starting with M but not as good as Machin’s method. It may be more efficient to spend less time on the best starting point and more time iterating Newton’s method. On the other hand, if you don’t need much accuracy, and e is not too large, you could use Machin’s starting point as your final value and not use Newton’s method at all. If e < 0.3, as it is for every planet in our solar system, then Machin’s starting point is accurate to 4 decimal places (See Appendix C of Colwell’s book).