Firsthand knowledge

From C. S. Lewis:

It has always therefore been one of my main endeavors as a teacher to persuade the young that firsthand knowledge is not only more worth acquiring than secondhand knowledge, but is usually much easier and more delightful to acquire.

This quote comes from the essay On the Reading of Old Books, part of the collection God in the Dock: Essays on Theology and Ethics. Lewis says here that it is easier to read Plato or St. Paul, for example, than to read books about Plato or St. Paul. He says that the fear of reading great authors

… springs from humility. The student is half afraid to meet one of the great philosophers face to face. He feels himself inadequate and thinks he will not understand him. But if he only knew, the great man, just because of his greatness, is much more intelligible than his modern commentators.

This does not only apply to literature. I see the same theme in math. Sometimes early math papers are easier to read because they are more concrete. When I was a postdoc at Vanderbilt I asked Richard Arenstorf about a theorem attributed to him in a book I was reading. He scoffed that he didn’t recognize it. He had done his work in a relatively concrete setting and did not approve of the fancy window dressing the author had placed around his theorem. I sat in on a few lectures by Arenstorf and found them amazingly clear.

The same theme appears in software development. Sometimes you can dive to the bottom of an abstraction hierarchy and find that things are simpler there than you would have supposed. The intervening layers obscure the substance of the program, making its core seem unduly mysterious. Like a mediocre mind commenting on the work of a great mind, developers who build layers of software around core functionality intend to make things easier but may do the opposite.


Daylight Saving Time is a huge mess

Daylight Saving Time (DST) is a huge mess. For example, what is the time difference between Paris and Houston? It’s seven hours today. Yesterday it was six. A couple weeks ago it was seven.

Countries observe DST at different times or not at all. Even within one country it can be complicated. For example, the US generally observes DST. But Arizona, Puerto Rico, Hawaii, U.S. Virgin Islands, and American Samoa do not. And even within Arizona, the Navajo Nation does observe DST.

I wish everyone would take a few minutes to learn about universal time, UTC. It is essentially the time in Greenwich, England, except that it ignores DST. Learn how your time relates to UTC and use UTC when communicating with people who don’t share your time system. This allows, for example, a group to announce the time of a conference call in UTC and let everyone convert that to their local time. This is so much simpler than “your time”, “her time”, “his time”, etc.

It’s also convenient to let people know how your time zone relates to UTC. For example, Houston time is UTC-6, i.e. six hours behind UTC. When someone asks in an email what my time zone is, maybe so they could call me during normal business hours, I may say Central Time, UTC-6. (Or in the summer, UTC-5.)
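Here’s a minimal sketch of the idea in Python (assuming Python 3.9 or later for the standard zoneinfo module; the date and the zone names are just illustrations I picked). A time announced once in UTC converts cleanly to everyone’s local time, with the time zone database handling DST:

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo  # standard library in Python 3.9+

    # A conference call announced as 15:00 UTC on a particular date
    call = datetime(2012, 3, 20, 15, 0, tzinfo=timezone.utc)

    # Each participant converts to local time; the tz database handles DST
    for name in ["America/Chicago", "Europe/Paris", "America/Phoenix"]:
        local = call.astimezone(ZoneInfo(name))
        print(name, local.strftime("%Y-%m-%d %H:%M %Z (UTC%z)"))

Each person only has to know how their own zone relates to UTC; nobody has to reason about anyone else’s DST rules.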

[The abbreviation UTC is an odd compromise. The French wanted to use the abbreviation TUC (temps universel coordonné) and the English wanted to use CUT (coordinated universal time). The compromise was UTC, which doesn’t actually abbreviate anything.]

Update: I’m not suggesting that people set their watches to UTC. Time zones are necessary. And if part of the world wants to have DST and another part doesn’t, vive la différence. I’m only suggesting that people use UTC when communicating with others outside their time zone.

Related post: Software engineering and alarm clocks

Shortest network in 3D

Imagine a set of points in three dimensions. You want to connect the points with the shortest possible network. Sometimes you can make the network shorter by adding extra points. (These extra points are called Steiner points.) How much can extra points help? The answer is worth $1,000.

Here’s an example. Suppose our points are the corners of a unit cube. You can connect these points with a network of length 7. If you add a point in the center of the cube and connect every point to the center, you get a network of length 4√3 ≈ 6.928. So in this case, adding an extra point made it possible to reduce the length of the minimal spanning network by about 1%. You could do better by adding more points.
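If you’d like to check the arithmetic, here’s a quick sketch in Python (nothing assumed beyond the cube example above):

    from math import sqrt

    tree = 7.0              # spanning tree along 7 unit edges of the cube
    star = 8 * sqrt(3) / 2  # center point joined to all 8 corners: 4*sqrt(3)

    print(star)                  # about 6.928
    print((tree - star) / tree)  # about 0.0103, roughly a 1% reduction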

What is the most you can reduce the length of the minimum spanning network in three dimensions by adding extra points? The question concerns all possible sets of points, not just a particular set like the example above. It is conjectured that the most you can save is about 21.6%. That is, for any set of points, the ratio of the length of the shortest network with extra points to that of the shortest network without extra points is bounded below by

\sqrt{ \frac{283-3\sqrt{21}}{700} + \frac{9\sqrt{11 - \sqrt{21}}\sqrt{2}}{140}}
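As a sanity check, the expression above evaluates numerically to about 0.784, consistent with the 21.6% savings mentioned earlier (a short Python sketch):

    from math import sqrt

    ratio = sqrt((283 - 3*sqrt(21)) / 700 + 9 * sqrt(11 - sqrt(21)) * sqrt(2) / 140)

    print(ratio)      # about 0.7842
    print(1 - ratio)  # about 0.2158, i.e. the 21.6% savings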

In their new book Magical Mathematics, Persi Diaconis and Ron Graham say “We currently offer a thousand dollars for a proof (or disproof) that this ratio is the best possible.”

As unwieldy as the number above appears, it makes some sense. It looks like the square roots come from repeated applications of the Pythagorean theorem. Someone may be able to use the form of the proposed lower bound to reverse-engineer the example the conjecture is based on.

(Diaconis and Graham say that the corresponding problem in two dimensions has been solved and the optimal ratio is √3 / 2. However, this paper says that the conjecture is still unresolved, contrary to popular belief.)

You ought to give the kid a chance

Martin Gardner (1914 – 2010) was best known for his articles on recreational mathematics, especially his column in Scientific American, which he wrote from 1956 to 1981. Once Gardner wrote a letter of recommendation for a young man applying to graduate school at Harvard.

I don’t know a lot about mathematics, but this kid invented two of the best card tricks of the last ten years. You ought to give him a chance.

The kid was Persi Diaconis. At the time, Diaconis, like Gardner, was something of a mathematical outsider, someone with more creativity than credentials. Diaconis went on to become a mathematics professor at Harvard and won a MacArthur genius award. He is now a professor at Stanford.

Source: Magical Mathematics (ISBN 0691151644)

Floating point error is the least of my worries

“Nothing brings fear to my heart more than a floating point number.” — Gerald Jay Sussman

The context of the above quote was Sussman’s presentation We really don’t know how to compute. It was a great presentation and I’m very impressed by Sussman. But I take exception to his quote.

I believe what he meant by his quote was that he finds floating point arithmetic unsettling because it is not as easy to rigorously understand as integer arithmetic. Fair enough. Floating point arithmetic can be tricky. Things can go spectacularly wrong for reasons that catch you off guard if you’re unprepared. But I’ve been doing numerical programming long enough that I believe I know where the landmines are and how to stay away from them. And even if I’m wrong, I have bigger worries.

Nothing brings fear to my heart more than modeling error.

The weakest link in applied math is often the step of turning a physical problem into a mathematical problem. We begin with a raft of assumptions that are educated guesses. We know these assumptions can’t be exactly correct, but we suspect (hope) that the deviations from reality are small enough that they won’t invalidate the conclusions. In any case, these assumptions are usually far more questionable than the assumption that floating point arithmetic is sufficiently accurate.

Modeling error is usually several orders of magnitude greater than floating point error. People who nonchalantly model the real world and then sneer at floating point as just an approximation strain at gnats and swallow camels.

In between modeling error and floating point error on my scale of worries is approximation error. As Nick Trefethen has said, if computers were suddenly able to do arithmetic with perfect accuracy, 90% of numerical analysis would remain important.

To illustrate the difference between modeling error, approximation error, and floating point error, suppose you decide that some quantity can be described by a normal distribution. This is actually two assumptions: that the process is random, and that as a random variable it has a normal distribution. Those assumptions won’t be exactly true, so this introduces some modeling error.

Next we have to compute something about a normal distribution, say the probability of a normal random variable being in some range. This probability is given by an integral, and some algorithm estimates this integral and introduces approximation error. The approximation error would exist even if the steps in the algorithm could be carried out in infinite precision. But the steps are not carried out with infinite precision, so there is some error introduced by implementing the algorithm with floating point numbers.
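Here’s a rough sketch of the last two layers in Python. The function names and the choice of the trapezoid rule are mine, just for illustration; the point is that the quadrature error would remain even with exact arithmetic, while floating point error rides along in both computations:

    from math import erf, exp, pi, sqrt

    def normal_pdf(x):
        return exp(-x * x / 2) / sqrt(2 * pi)

    def prob_trapezoid(a, b, n=100):
        # Estimate P(a < X < b) for standard normal X with the trapezoid rule.
        # This approximation error would remain even with exact arithmetic.
        h = (b - a) / n
        total = 0.5 * (normal_pdf(a) + normal_pdf(b))
        for i in range(1, n):
            total += normal_pdf(a + i * h)
        return total * h

    def prob_erf(a, b):
        # Same probability via the error function: accurate up to floating point.
        return 0.5 * (erf(b / sqrt(2)) - erf(a / sqrt(2)))

    approx = prob_trapezoid(0, 1)
    exact = prob_erf(0, 1)
    print(approx, exact, abs(approx - exact))  # the gap is mostly approximation error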

For a simple example like this, approximation error and floating point error will typically be about the same size, both extremely small. But in a more complex example, say something involving a high-dimensional integral, the approximation error could be much larger than floating point error, but still smaller than modeling error. I imagine approximation error is often roughly the geometric mean of modeling error and floating point error, i.e. somewhere around the middle of the two on a log scale.

In Sussman’s presentation he says that people worry too much about correctness. Often correctness is not that important. It’s often good enough to produce a correct answer with reasonably high probability, provided the consequences of an error are controlled. I agree, but in light of that it seems odd to worry so much about inaccuracy from floating point arithmetic. I suspect he’s not that worried about floating point and that the opening quote was just an entertaining way to say that floating point math can be tricky.

More on floating point computing