A couple more variations on an ancient theme

I’ve written a couple posts on the approximation

\cos x \approx \frac{\pi^2 - 4x^2}{\pi^2 + x^2}

by the Indian astronomer Aryabhata (476–550). The approximation is accurate for x in [−π/2, π/2].

The first post collected a Twitter thread about the approximation. The second looked at how far the coefficients in Aryabhata’s approximation are from those of the optimal approximation by a ratio of quadratics.

This post will answer a couple of questions. First, what value of π did Aryabhata have, and how would that affect the approximation error? Second, how bad would Aryabhata’s approximation be if we used the approximation π² ≈ 10?

Using Aryabhata’s value of π

Aryabhata knew the value 3.1416 for π. We know this because he said that a circle of diameter 20,000 would have circumference 62,832. We don’t know for certain, but it’s plausible that he knew π to more accuracy and rounded it to the implied value.

Substituting 3.1416 for π changes the approximation in the sixth decimal place, but the approximation is only good to three decimal places, so 3.1416 works as well as a more accurate value of π as far as the error in approximating cosine is concerned.

Using π² ≈ 10

Substituting 10 for π² in Aryabhata’s approximation gives an approximation that’s convenient to evaluate by hand.

\cos x \approx \frac{10 - 4x^2}{10 + x^2}

It’s very accurate for small values of x but the maximum error increases from 0.00163 to 0.01091. Here’s a plot of the error.
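The two error figures can be checked numerically. Here’s a quick sketch (mine, not part of the original post) that evaluates both approximations on a fine grid over [−π/2, π/2]:

```python
from math import cos, pi

def aryabhata(x):
    # Aryabhata's approximation with the exact value of pi**2
    return (pi**2 - 4*x**2) / (pi**2 + x**2)

def aryabhata10(x):
    # The same approximation with pi**2 replaced by 10
    return (10 - 4*x**2) / (10 + x**2)

# Maximum absolute error of each approximation over [-pi/2, pi/2]
N = 100_000
xs = [-pi/2 + pi*i/N for i in range(N + 1)]
err1 = max(abs(aryabhata(x) - cos(x)) for x in xs)
err2 = max(abs(aryabhata10(x) - cos(x)) for x in xs)

print(f"{err1:.5f}")  # about 0.00163
print(f"{err2:.5f}")  # about 0.01091
```

A grid search like this is crude but sufficient here, since both error functions are smooth and small.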

Ancient accurate approximation for sine

This post started out as a Twitter thread. The text below is the same as that of the thread after correcting an error in the first part of the thread. I also added a footnote on a theorem the thread alluded to.

***

The following approximation for sin(x) is remarkably accurate for 0 < x < π.

\sin(x) \approx \frac{16x(\pi - x)}{5\pi^2 - 4x(\pi - x)}

The approximation is so good that you can’t see the difference between the exact value and the approximation until you get outside the range of the approximation.

Here’s a plot of just the error.
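To put a number on “remarkably accurate,” here is a small sketch (mine, not from the thread) that estimates the maximum error over [0, π] on a fine grid:

```python
from math import sin, pi

def approx_sin(x):
    # Aryabhata's rational approximation for sin(x) on [0, pi]
    return 16*x*(pi - x) / (5*pi**2 - 4*x*(pi - x))

# The denominator is at least 4*pi**2 on [0, pi], so no division by zero.
N = 100_000
xs = [pi*i/N for i in range(N + 1)]
max_err = max(abs(approx_sin(x) - sin(x)) for x in xs)
print(f"{max_err:.5f}")  # about 0.00163
```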

This is a very old approximation, dating back to Aryabhata I, around 500 AD.

In modern terms, it is a rational approximation, quadratic in the numerator and denominator. It’s not quite optimal since the ripples in the error function are not of equal height [1], but the coefficients are nice numbers.

***

As pointed out in the comments, replacing x with π/2 − x in order to get an approximation for cosine gives a much nicer equation.

\cos x \approx \frac{\pi^2 - 4x^2}{\pi^2 + x^2}

***

[1] The equioscillation theorem says that the optimal approximation will have ripples of equal positive and negative amplitude. This post explores the equioscillation theorem further and finds how far Aryabhata’s approximation is from optimal.

New Twitter account: ElementFact

I started a new Twitter account this morning: @ElementFact. I’m thinking the account will post things like scientific facts about each element but also some history around how the element was discovered and named and other lore associated with the element.

@ElementFact icon

We’ll see how this goes. I’ve started many Twitter accounts over the years. Some have taken off and some have not.

Six months ago I started @TensorFact as an experiment to see what would happen with a narrowly focused account. It did moderately well, but I ran out of ideas for content. This will be the last week for that account.

You can see a list of my current Twitter accounts here.

It seemed like a good idea at the time

“Things are the way they are because they got that way … one logical step at a time.” — Gerald Weinberg

English spelling is notoriously difficult. It is the result of decisions that, while often unfortunate, were not unreasonable at the time.

Sometimes pronunciation simplified but spelling remained unchanged. For example, originally all the letters in knife were pronounced. In software development lingo, some spellings were retained for backward compatibility.

Sometimes spelling was chosen to reflect etymology. This seems like a strange choice now, but it made more sense at a time when Latin and French were far more commonly known in England, and a time when loan words were pouring into English. These choices have turned out to be unfortunate, but they were not malicious.

For more on this story, see Episode 153: Zombie Letters from The History of English Podcast.

Aquinas on epicycles

C. S. Lewis quotes Thomas Aquinas in The Discarded Image:

In astronomy an account is given of eccentricities and epicycles on the ground that if their assumption is made the sensible appearances as regards celestial motion can be saved. But this is not a strict proof since for all we know they could also be saved by some different assumption.

A warped perspective on math history

Yesterday I posted on @TopologyFact

The uniform limit of continuous functions is continuous.

John Baez replied that this theorem was proved by his “advisor’s advisor’s advisor’s advisor’s advisor’s advisor.” I assume he was referring to Christoph Gudermann.

The impressive thing is not that Gudermann was able to prove this simple theorem. The impressive thing is that he saw the need for the concept of uniform convergence. My impression from reading the Wikipedia article on uniform convergence is that Gudermann alluded to uniform convergence in passing and didn’t explicitly define it or formally prove the theorem above. He had the idea and applied it but didn’t see the need to make a fuss about it. His student Karl Weierstrass formalized the definition and saw how generally useful the concept was.

It’s easy for a student to get a warped perspective of math history. You might implicitly assume that mathematics was developed in the order that you learn it. If as a student you learn about uniform convergence and that the term was coined around 1840, you might reasonably conclude that in 1840 mathematicians were doing what is now sophomore-level math, which is far from true.

Gudermann tossed off the idea of uniform convergence in passing while working on elliptic functions, a topic I wasn’t exposed to until sometime after graduate school. My mathematics education was more nearly reverse-chronological than chronological. I learned 20th century mathematics in school and 19th century mathematics later. Much of the former was a sort of dehydrated abstraction of the latter. Much of my career has been rehydrating, discovering the motivation for and application of ideas I was exposed to as a student.

Related posts

Proto-calculus

David Bressoud has written a new book entitled Calculus Reordered: A History of the Big Ideas. He presents the major themes of calculus in historical order, which is very different from the order in which it is now taught. We now begin with limits, then differentiation, integration, and infinite series. Historically, integration came first and the rigorous definition of limits came last.

I wanted to quote a short excerpt from the book discussing a manuscript of Archimedes. In 1906, Johan Heiberg discovered that a medieval prayer book had recycled vellum that had contained an account of the methods used by Archimedes to compute areas and volumes, a sort of proto-calculus book. Researchers were able to reconstruct much of the original text that had been scraped off in order to reuse the vellum.

In 2003 a NOVA documentary speculated that calculus could have been developed centuries earlier had the manuscript not been lost, and that technology would be far ahead of where it is now, and that “We could have been on Mars today.”

Bressoud does not agree.

That is nonsense. As we shall see, Archimedes’ other works were perfectly sufficient to lead the way toward the development of calculus. The delay was not caused by an incomplete understanding of Archimedes’ methods but by the need to develop other mathematical tools. In particular, scholars needed the modern symbolic language of algebra and its application to curves before they could make substantial progress toward calculus as we know it.

Calculus Reordered book cover

Related post: History of the Central Limit Theorem

Kepler and the contraction mapping theorem

Johannes Kepler

The contraction mapping theorem says that if a function moves points closer together, then there must be some point the function doesn’t move. We’ll make this statement more precise and give a historically important application.

Definitions and theorem

A function f on a metric space X is a contraction if there exists a constant q with 0 ≤ q < 1 such that for any pair of points x and y in X,

d(f(x), f(y)) ≤ q d(x, y)

where d is the metric on X.

A point x is a fixed point of a function f if f(x) = x.

Banach’s fixed point theorem, also known as the contraction mapping theorem, says that every contraction on a complete metric space has a fixed point. The proof is constructive: start with any point in the space and repeatedly apply the contraction. The sequence of iterates will converge to the fixed point.
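As a toy illustration of the constructive proof, the map x ↦ cos x is a contraction on [0, 1]: its derivative is bounded in absolute value by sin 1 ≈ 0.84 there, and it maps the interval into itself. Iterating it from any starting point converges to the unique fixed point. A minimal sketch:

```python
from math import cos

# Iterating the contraction x -> cos(x) on [0, 1].
# |d/dx cos(x)| = |sin(x)| <= sin(1) < 1 on [0, 1], so the
# contraction mapping theorem guarantees a unique fixed point,
# and repeated application converges to it.
x = 0.0
for _ in range(100):
    x = cos(x)

print(x)  # approaches 0.739085..., the solution of cos(x) = x
```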

Application: Kepler’s equation

Kepler’s equation for an object in an elliptical orbit says

M + e sin E = E

where M is the mean anomaly, e is the eccentricity, and E is the eccentric anomaly. These “anomalies” are parameters that describe the location of an object in orbit. Kepler solved for E given M and e using the contraction mapping theorem, though he didn’t call it that.

Kepler speculated that it is not possible to solve for E in closed form—he was right—and used a couple iterations [1] of

f(E) = M + e sin E

to find an approximate fixed point. Since the mean anomaly is a good approximation for the eccentric anomaly, M makes a good starting point for the iteration. The iteration will converge from any starting point, as we will show below, but you’ll get a useful answer sooner starting from a good approximation.

Proof of convergence

Kepler came up with his idea for finding E around 1620, and Banach stated his fixed point theorem three centuries later. Kepler had the idea of Banach’s theorem, but he didn’t have a rigorous formulation of the theorem or a proof.

In modern terminology, the real line is a complete metric space and so we only need to prove that the function f above is a contraction. By the mean value theorem, it suffices to show that the absolute value of its derivative is less than 1. That is, we can use an upper bound on |f′| as the q in the definition of contraction.

Now

f′(E) = e cos E

and so

|f′(E)| ≤ e

for all E. If our object is in an elliptical orbit, e < 1 and so we have a contraction.

Example

The following example comes from [2], though the author uses Newton’s method to solve Kepler’s equation. This is more efficient, but anachronistic.

Consider a satellite on a geocentric orbit with eccentricity e = 0.37255. Determine the true anomaly at three hours after perigee passage, and calculate the position of the satellite.

The author determines that M = 3.6029 and solves Kepler’s equation

M + e sin E = E

for E, which she then uses to solve for the true anomaly and position of the satellite.

The following Python code shows the results of the first 10 iterations of Kepler’s equation.

    from math import sin

    M = 3.6029   # mean anomaly
    e = 0.37255  # eccentricity

    E = M  # the mean anomaly is a good starting approximation
    for _ in range(10):
        E = M + e*sin(E)
        print(f"{E:.6f}")

This produces

    3.437070
    3.494414
    3.474166
    3.481271
    3.478772
    3.479650
    3.479341
    3.479450
    3.479412
    3.479425

and so it appears the iteration has converged to E = 3.4794 to four decimal places.

Note that this example has a fairly large eccentricity. Presumably Kepler would have been concerned with much smaller eccentricities. The eccentricity of Jupiter’s orbit, for example, is around 0.05. For such small values of e the iteration would converge more quickly.
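To illustrate, here’s a sketch (the tolerance of 10⁻⁶ is my choice, not from the text) counting how many fixed-point iterations are needed for the eccentricity above versus a Jupiter-like eccentricity:

```python
from math import sin

def kepler_iterations(M, e, tol=1e-6):
    """Count iterations of E -> M + e*sin(E) until successive
    iterates differ by less than tol."""
    E, count = M, 0
    while True:
        E_new = M + e*sin(E)
        count += 1
        if abs(E_new - E) < tol:
            return count
        E = E_new

M = 3.6029
print(kepler_iterations(M, 0.37255))  # on the order of a dozen iterations
print(kepler_iterations(M, 0.05))     # far fewer
```

Since the error contracts by roughly a factor of e at each step, the count drops sharply as the eccentricity shrinks.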

Update: See this post for more efficient ways to solve Kepler’s equation.

Related posts

[1] Bertil Gustafsson says in his book Scientific Computing: A Historical Perspective that Kepler only used two iterations. Since M gives a good starting approximation to E, two iterations would give a good answer. I imagine Kepler would have done more iterations if necessary but found empirically that two was enough. Incidentally, it appears Gustafsson has a sign error in his statement of Kepler’s equation.

[2] Euler Celestial Analysis by Dora Musielak.

John Napier

Julian Havil has written a new book John Napier: Life, Logarithms, and Legacy.

I haven’t read more than the introduction yet — a review copy arrived just yesterday — but I imagine it’s good judging by who wrote it. Havil’s book Gamma is my favorite popular math book. (Maybe I should say “semi-popular.” Havil’s books have more mathematical substance than most popular books, but they’re still aimed at a wide audience. I think he strikes a nice balance.) His latest book is a scientific biography, a biography with an unusual number of equations and diagrams.

Napier is best known for his discovery of logarithms. (People debate endlessly whether mathematics is discovered or invented. Logarithms are so natural — pardon the pun — that I say they were discovered. I might describe other mathematical objects, such as Grothendieck’s schemes, as inventions.) He is also known for his work with spherical trigonometry, such as Napier’s mnemonic. Maybe Napier should be known for other things I won’t know about until I finish reading Havil’s book.

Fermat’s proof of his last theorem

Fermat famously claimed to have a proof of his last theorem that he didn’t have room to write down. Mathematicians have speculated ever since what this proof must have been, though everyone is convinced the proof must have been wrong.

The usual argument for Fermat being wrong is that since it took over 350 years, and some very sophisticated mathematics, to prove the theorem, it’s highly unlikely that Fermat had a simple proof. That’s a reasonable argument, but somewhat unsatisfying because it’s risky business to speculate on what a proof must require. Who knows how complex the proof of FLT in The Book is?

André Weil offers what I find to be a more satisfying argument that Fermat did not have a proof, based on our knowledge of Fermat himself. Dana Mackenzie summarizes Weil’s argument as follows.

Fermat repeatedly bragged about the n = 3 and n = 4 cases and posed them as challenges to other mathematicians … But he never mentioned the general case, n = 5 and higher, in any of his letters. Why such restraint? Most likely, Weil argues, because Fermat had realized that his “truly wonderful proof” did not work in those cases.

Dana comments:

Every mathematician has had days like this. You think you have a great insight, but then you go out for a walk, or you come back to the problem the next day, and you realize that your great idea has a flaw. Sometimes you can go back and fix it. And sometimes you can’t.

The quotes above come from The Universe in Zero Words. I met Dana Mackenzie in Heidelberg a few weeks ago, and when I came home I looked for this book and his book on the formation of the moon, The Big Splat.

More on Fermat’s last theorem