How medieval astronomers made trig tables

How would you create a table of trig functions without calculators or calculus?

It’s not too hard to create a table of sines at multiples of 3°. You can use the sum-angle formula for sines

sin(α+β) = sin α cos β + sin β cos α.

to bootstrap your way from known values to other values. Elementary geometry gives you the sines of 45° and 30°, and the sum-angle formula will then give you the sine of 75°. From Euclid’s construction of a 5-pointed star you can find the sine of 72°. Then, since 3° = 75° − 72°, you can apply the sum-angle formula with β = −72° to find the sine of 3° from the sines and cosines of 75° and 72°. Ptolemy figured this out in the 2nd century AD.
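The bootstrapping step can be checked numerically. The following is only a sketch: the closed forms for sin 72° and cos 72° are the standard pentagon values, the cosine sum formula is the usual companion to the sine formula above, and the helper names `sin_sum` and `cos_sum` are made up for this illustration.

```python
from math import sqrt, sin, radians

def sin_sum(sin_a, cos_a, sin_b, cos_b):
    """sin(a + b) from the sines and cosines of a and b."""
    return sin_a * cos_b + sin_b * cos_a

def cos_sum(sin_a, cos_a, sin_b, cos_b):
    """cos(a + b) from the sines and cosines of a and b."""
    return cos_a * cos_b - sin_a * sin_b

# Values from elementary geometry
sin45 = cos45 = sqrt(2) / 2
sin30, cos30 = 1 / 2, sqrt(3) / 2

# Values from the regular pentagon (5-pointed star)
sin72 = sqrt(10 + 2 * sqrt(5)) / 4
cos72 = (sqrt(5) - 1) / 4

# 75° = 45° + 30°
sin75 = sin_sum(sin45, cos45, sin30, cos30)
cos75 = cos_sum(sin45, cos45, sin30, cos30)

# 3° = 75° + (-72°), using sin(-72°) = -sin 72° and cos(-72°) = cos 72°
sin3 = sin_sum(sin75, cos75, -sin72, cos72)

# Compare with the library value
print(sin3, sin(radians(3)))
```

The two printed values should agree to within floating point rounding.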

But if you want a table of trig values at every degree, you need to find the sine of 1°. If you had that, you could bootstrap your way to every other integer number of degrees. Ptolemy had an approximate solution to this problem, but it wasn’t very accurate or elegant.

The Persian astronomer Jamshīd al-Kāshī had a remarkably clever solution to the problem of finding the sine of 1°. Using the sum-angle formula you can find that

sin 3θ = 3 sin θ – 4 sin³ θ.

Setting θ = 1° gives you a cubic equation for the unknown value of sin 1° involving the known value of sin 3°. However, the cubic formula wasn’t discovered until over a century after al-Kāshī. Instead, he used a numerical algorithm more widely useful than the cubic formula: finding a fixed point of an iteration!

Define f(x) = (sin 3° + 4x³)/3. Then sin 1° is a fixed point of f. Start with an approximate value for sin 1°, a natural choice being (sin 3°)/3, and iterate. Al-Kāshī used this procedure to compute sin 1° to 16 decimal places.

Here’s a little Python code to play with this algorithm.

from numpy import sin, deg2rad

sin3deg = sin(deg2rad(3))

def f(x):
    return (sin3deg + 4*x**3)/3

x = sin3deg/3
for i in range(4):
    x = f(x)
    print(x)  # show each iterate so the convergence is visible

This shows that after only three iterations the method has converged to floating point precision, which coincidentally is about 16 decimal places, the same as al-Kāshī’s calculation.

Source: Heavenly Mathematics: The Forgotten Art of Spherical Trigonometry


Roughly speaking, an ergodic system is one that mixes well. You get the same result whether you average its values over time or over space.

This morning I ran across the etymology of the word:

In the late 1800s, the physicist Ludwig Boltzmann needed a word to express the idea that if you took an isolated system at constant energy and let it run, any one trajectory, continued long enough, would be representative of the system as a whole. Being a highly-educated nineteenth century German-speaker, Boltzmann knew far too much ancient Greek, so he called this the “ergodic property”, from ergon “energy, work” and hodos “way, path.” The name stuck.

Found here, footnote on page 479.



Miscellaneous math notes

This web site started as static HTML files. Later I added a WordPress blog, but still wrote some things as static HTML pages for various reasons. Now I’ve moved most of those static pages to WordPress pages so that they’ll have the same style as the blog.

There’s not a good way to find these pages except through search. So I plan to categorize them and write a short post each Wednesday for the next few weeks listing some related pages. This post starts the series with math notes that didn’t fall into any other category.

See also posts tagged math.

Next week: Emacs resources

Googol and googolplex

Numericon gives the history of the words googol and googolplex:

… the famous googol, 10¹⁰⁰ (a 1 followed by 100 zeros), defined in 1929 by American mathematician Edward Kasner and named by his nine-year-old nephew, Milton Sirotta. Milton went even further and came up with the googolplex, now defined as 10^googol but initially defined by Milton as a 1, followed by writing zeros until you get tired.

Related post: There isn’t a googol of anything

Four brief reviews

Princeton University Press and No Starch Press both sent me a couple books this week. Here are a few brief words about each.

The first from Princeton was The Best Writing on Mathematics 2014. My favorite chapters were The Beauty of Bounded Gaps by Jordan Ellenberg and The Lesson of Grace in Teaching by Francis Su. The former is a very high-level overview of recent results regarding gaps in prime numbers. The latter is taken from Francis Su’s Haimo Teaching Award lecture. A recording of the lecture and a transcript are available here.

The second book from Princeton was a new edition of Andrew Hodges’ book Alan Turing: The Enigma. This edition has a new cover and the new subtitle “The Book That Inspired the Film ‘The Imitation Game.’” Unfortunately I’m not up to reading a 768-page biography right now.

The first book from No Starch Press was a new edition of The Book of CSS3: A Developer’s Guide to the Future of Web Design by Peter Gasston. The book says from the beginning that it is intended for people who have a lot of experience with CSS, including some experience with CSS 3. I tend to ignore such warnings; many books are more accessible to beginners than they let on. But in this case I do think that someone with more CSS experience would get more out of the book. This looks like a good book, and I expect I’ll get more out of it later.

The final book was a new edition of How Linux Works: What Every Superuser Should Know by Brian Ward. I’ve skimmed through this book and would like to go back and read it carefully, a little at a time. Most Unix/Linux books I’ve seen either dwell on shell commands or dive into system APIs. This one, however, seems to live up to its title and give the reader an introduction to how Linux works.

Uniformitarian or Paretoist

A uniformitarian view is that everything is equally important. For example, there are 118 elements in the periodic table, so all 118 are equally important to know about.

The Pareto principle would say that importance is usually very unevenly distributed. The universe is essentially hydrogen and helium, with a few other elements sprinkled in. From an earthly perspective things aren’t quite so extreme, but still a handful of elements make up the large majority of the planet. The most common elements are orders of magnitude more abundant than the least.

The uniformitarian view is a sort of default, not often a view someone consciously chooses. It’s a lazy option. No need to think. Just trudge ahead with no particular priorities.

The uniformitarian view is common in academia. You’re given a list of things to learn, and they all count the same. For example, maybe you have 100 vocabulary words in your Spanish class. Each word contributes one point to your grade on a quiz. The quiz measures what portion of the list you’ve learned, not what portion of the language you’ve learned. A quiz designed to test the latter would weight words according to their frequency.

It’s easy to slip into a uniformitarian mindset, or a milder version of the same, underestimating how unevenly things are distributed. I’ve often fallen into the latter. I expect things to be unevenly distributed, but then I’m surprised just how uneven they are once I look at some data.


Cyclic fractions

Somewhere along the way you may have noticed that the digits in the decimal expansion of multiples of 1/7 are all rotations of the same digits:

1/7 = 0.142857142857…
2/7 = 0.285714285714…
3/7 = 0.428571428571…
4/7 = 0.571428571428…
5/7 = 0.714285714285…
6/7 = 0.857142857142…

We can make the pattern clearer by vertically aligning the sequences of digits:

1/7 = 0.142857142857…
2/7 =   0.2857142857…
3/7 =  0.42857142857…
4/7 =     0.57142857…
5/7 =      0.7142857…
6/7 =    0.857142857…

Are there more cyclic fractions like that? Indeed there are. Another example is 1/17. The following shows that 1/17 is cyclic:

 1/17 = 0.05882352941176470588235294117647…
 2/17 =           0.1176470588235294117647…
 3/17 =            0.176470588235294117647…
 4/17 =     0.2352941176470588235294117647…
 5/17 =        0.2941176470588235294117647…
 6/17 =      0.352941176470588235294117647…
 7/17 =          0.41176470588235294117647…
 8/17 =               0.470588235294117647…
 9/17 =       0.52941176470588235294117647…
10/17 =  0.5882352941176470588235294117647…
11/17 =              0.6470588235294117647…
12/17 =                0.70588235294117647…
13/17 =             0.76470588235294117647…
14/17 =    0.82352941176470588235294117647…
15/17 =   0.882352941176470588235294117647…
16/17 =         0.941176470588235294117647…

The next denominator to exhibit this pattern is 19. After finding 17 and 19 by hand, I typed “7, 17, 19” into the On-Line Encyclopedia of Integer Sequences and found a list of denominators of cyclic fractions: OEIS A001913. These numbers are called “full reptend primes” and according to MathWorld “No general method is known for finding full reptend primes.”
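Although there is no known formula for these primes, a brute-force search is easy. A prime p other than 2 or 5 is full reptend exactly when the decimal period of 1/p has length p − 1, that is, when the multiplicative order of 10 mod p is p − 1. Here is a rough sketch of such a search; the function names are made up for this illustration and the trial-division primality test is only meant for small inputs.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def order_of_10(p):
    """Smallest k > 0 with 10**k == 1 (mod p); requires gcd(10, p) == 1."""
    k, x = 1, 10 % p
    while x != 1:
        x = (x * 10) % p
        k += 1
    return k

def full_reptend_primes(limit):
    """Primes p < limit whose decimal period for 1/p has length p - 1."""
    return [p for p in range(3, limit)
            if p != 5 and is_prime(p) and order_of_10(p) == p - 1]

print(full_reptend_primes(100))
```

The list begins 7, 17, 19, matching the denominators found by hand above.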

“Hello world” is the hard part

Kernighan and Ritchie’s classic book The C Programming Language began with a sample C program that printed “hello world.” Since then “hello world” has come to describe the first program you write with any technology, even if it doesn’t literally print “hello world.”

Hello-world programs are often intimidating. People think “I must be a dufus because I find hello-world hard. At this rate I’ll never get to anything interesting.”

The problem is that we confuse the first task with the easiest task. Hello-world programs are almost completely arbitrary. You can’t deduce what a compiler is named, where files must be located, how they must be formatted, etc. You have to be told. The amount of arbitrary material you need to learn is greatest up-front and slowly decreases.

When I started programming I thought I’d quickly get past the hello-world stage and only write substantial programs from then on. Instead, it seems I’ve spent a good chunk of my career writing hello-world programs with no end in sight.


No discussion of hello-world programs would be complete without mentioning possibly the most intimidating hello-world program: the first Windows program in Charles Petzold’s Programming Windows book. I was only able to find the program from the Windows 98 edition of his book. I don’t recall how much it differs from the program in his first edition, but I vaguely remember the original being worse.

/*------------------------------------------------------------
   HELLOWIN.C -- Displays "Hello, Windows 98!" in client area
                 (c) Charles Petzold, 1998
  ------------------------------------------------------------*/

#include <windows.h>

LRESULT CALLBACK WndProc (HWND, UINT, WPARAM, LPARAM) ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    PSTR szCmdLine, int iCmdShow)
{
     static TCHAR szAppName[] = TEXT ("HelloWin") ;
     HWND         hwnd ;
     MSG          msg ;
     WNDCLASS     wndclass ;

     wndclass.style         = CS_HREDRAW | CS_VREDRAW ;
     wndclass.lpfnWndProc   = WndProc ;
     wndclass.cbClsExtra    = 0 ;
     wndclass.cbWndExtra    = 0 ;
     wndclass.hInstance     = hInstance ;
     wndclass.hIcon         = LoadIcon (NULL, IDI_APPLICATION) ;
     wndclass.hCursor       = LoadCursor (NULL, IDC_ARROW) ;
     wndclass.hbrBackground = (HBRUSH) GetStockObject (WHITE_BRUSH) ;
     wndclass.lpszMenuName  = NULL ;
     wndclass.lpszClassName = szAppName ;

     if (!RegisterClass (&wndclass))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"), 
                      szAppName, MB_ICONERROR) ;
          return 0 ;
     }

     hwnd = CreateWindow (szAppName,                  // window class name
                          TEXT ("The Hello Program"), // window caption
                          WS_OVERLAPPEDWINDOW,        // window style
                          CW_USEDEFAULT,              // initial x position
                          CW_USEDEFAULT,              // initial y position
                          CW_USEDEFAULT,              // initial x size
                          CW_USEDEFAULT,              // initial y size
                          NULL,                       // parent window handle
                          NULL,                       // window menu handle
                          hInstance,                  // program instance handle
                          NULL) ;                     // creation parameters

     ShowWindow (hwnd, iCmdShow) ;
     UpdateWindow (hwnd) ;

     while (GetMessage (&msg, NULL, 0, 0))
     {
          TranslateMessage (&msg) ;
          DispatchMessage (&msg) ;
     }
     return msg.wParam ;
}

LRESULT CALLBACK WndProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     HDC         hdc ;
     PAINTSTRUCT ps ;
     RECT        rect ;

     switch (message)
     {
     case WM_CREATE:
          PlaySound (TEXT ("hellowin.wav"), NULL, SND_FILENAME | SND_ASYNC) ;
          return 0 ;

     case WM_PAINT:
          hdc = BeginPaint (hwnd, &ps) ;
          GetClientRect (hwnd, &rect) ;
          DrawText (hdc, TEXT ("Hello, Windows 98!"), -1, &rect,
                    DT_SINGLELINE | DT_CENTER | DT_VCENTER) ;
          EndPaint (hwnd, &ps) ;
          return 0 ;

     case WM_DESTROY:
          PostQuitMessage (0) ;
          return 0 ;
     }
     return DefWindowProc (hwnd, message, wParam, lParam) ;
}

Help wanted

I’m looking for people to help with some miscellaneous tasks. I don’t expect one person to do everything, but if you’re excellent at any of the following and interested in small projects please let me know.

  • CSS / responsive design
  • WordPress customization
  • Emacs customization
  • Advanced LaTeX
  • Data cleaning and visualization
  • Python (miscellaneous automation scripts)

I don’t have an immediate project to outsource, but these tasks come up occasionally and I’d like to have someone to contact when they do. Mostly these would be small self-contained projects, though data cleaning and visualization could be larger.



People want Swiss Army Knives

I ran across this graphic this morning on Twitter:

[Image: a scalpel compared with a Swiss Army Knife]

Obviously the intended message is that scalpels are better than Swiss Army Knives. Certainly the scalpel looks simpler.

But most people would rather have a Swiss Army Knife than a scalpel. Many people, myself included, own a Swiss Army Knife but not a scalpel. (I also have a Leatherman multi-tool that the folks at Snow gave me and I like it even better than my Swiss Army Knife.)

People like simplicity, at least a certain kind of simplicity, more in theory than in practice. Minimalist products that end up in the MoMA generally don’t fly off the shelves at Walmart.

The simplicity of a scalpel is superficial. The realistic alternative to a Swiss Army Knife, for ordinary use, is a knife, two kinds of screwdriver, a bottle opener, etc. The Swiss Army Knife is the simpler alternative in that context.

A surgeon would rightfully prefer a scalpel, but not just a scalpel. A surgeon would have a tray full of specialized instruments, collectively more complicated than a Swiss Army Knife.

I basically agree with the Unix philosophy that tools should do one thing well, but even Unix doesn’t follow this principle strictly in practice. One reason is that “thing” and “well” depend on context. The “thing” that a toolmaker has in mind may not exactly be the “thing” the user has in mind, and the user may have a different idea of when a tool has served well enough.

Blue Bonnet Bayes

Blue Bonnet™ used to run commercials with the jingle “Everything’s better with Blue Bonnet on it.” Maybe they still do.

Perhaps in reaction to knee-jerk antipathy toward Bayesian methods, some statisticians have adopted knee-jerk enthusiasm for Bayesian methods. Everything’s better with Bayesian analysis on it. Bayes makes it better, like a little dab of margarine on a dry piece of bread.

There’s much that I prefer about the Bayesian approach to statistics. Sometimes it’s the only way to go. But Bayes-for-the-sake-of-Bayes can expend a great deal of effort, by human and computer, to arrive at a conclusion that could have been reached far more easily by other means.

Related: Bayes isn’t magic

Image via Gallery of Graphic Design

How well does sample range estimate range?

I’ve been doing some work with Focused Objective lately, and today the following question came up in our discussion. If you’re sampling from a uniform distribution, how many samples do you need before your sample range has an even chance of covering 90% of the population range?

This is a variation on a problem I’ve blogged about before. As I pointed out there, we can assume without loss of generality that the samples come from the unit interval. Then the sample range has a beta(n – 1, 2) distribution. So the probability that the sample range is greater than a value c is

\int_c^1 n(n-1) x^{n-2} (1-x) \,dx = 1 - c^{n-1} (n - c(n-1))

Setting c = 0.9, here’s a plot of the probability that the sample range contains at least 90% of the population range, as a function of sample size.

The answer to the question at the top of the post is 16 or 17. These two values of n yield probabilities 0.485 and 0.518 respectively. This means that a fairly small sample is likely to give you a fairly good estimate of the range.
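These numbers are easy to reproduce from the closed form above. The snippet below is a small sketch; the function name `prob_range_exceeds` is made up for this illustration.

```python
def prob_range_exceeds(c, n):
    """P(sample range > c) for n samples from uniform(0, 1).

    Closed form of the beta(n - 1, 2) tail integral given above.
    """
    return 1 - c**(n - 1) * (n - c * (n - 1))

# Probabilities for the two sample sizes that bracket 1/2
for n in (16, 17):
    print(n, round(prob_range_exceeds(0.9, n), 3))

# Smallest sample size with at least an even chance of covering 90%
n = 2
while prob_range_exceeds(0.9, n) < 0.5:
    n += 1
print(n)
```

The probability first exceeds 1/2 at n = 17, with n = 16 falling just short.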

Integration trick

Here’s a clever example from Paul Nahin’s new book Inside Interesting Integrals. Suppose you want to evaluate

\int_{-1}^1 \frac{\cos(x)}{\exp(1/x) + 1}\,dx

Since the range of integration is symmetric around zero, you might think to see whether the integrand is an odd function, in which case the integral would be zero. (More on such symmetry tricks here.) Unfortunately, the integrand is not odd, so that trick doesn’t work directly. However, it does help indirectly.

You can split any function f(x) into its even and odd parts.

f_e(x) = \frac{f(x) + f(-x)}{2} \\ f_o(x) = \frac{f(x) - f(-x)}{2}

The integral of a function over a symmetric interval is the integral of its even part because its odd part integrates to zero. The even part of the integrand above works out to be simply cos(x)/2 and so the integral evaluates to sin(1).
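Here is a quick numerical sanity check of that result, written as a sketch rather than a careful quadrature routine. It uses a plain midpoint rule with an even number of subintervals so that x = 0 is never evaluated, and it rewrites 1/(exp(t) + 1) as (1 − tanh(t/2))/2 to avoid overflow when x is tiny. The function names are made up for this illustration.

```python
from math import cos, sin, tanh

def integrand(x):
    """cos(x) / (exp(1/x) + 1), written to avoid overflow near x = 0.

    Uses the identity 1/(exp(t) + 1) == (1 - tanh(t/2)) / 2 with t = 1/x.
    """
    return cos(x) * 0.5 * (1.0 - tanh(0.5 / x))

def even_part(f, x):
    """Even part of f evaluated at x."""
    return 0.5 * (f(x) + f(-x))

def midpoint_rule(f, a, b, n):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The even part should match cos(x)/2 at any nonzero point
print(even_part(integrand, 0.3), cos(0.3) / 2)

# The integral should match sin(1)
approx = midpoint_rule(integrand, -1.0, 1.0, 2000)
print(approx, sin(1))
```

Because the midpoints are symmetric about zero, the numerical sum pairs each f(x) with f(−x), which is the same cancellation the even/odd split exploits analytically.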