From the monthly archives:

April 2009

Joel Spolsky coined the term leaky abstraction for programming concepts that usually shield you from messy details but sometimes break down. A perfect abstraction is a black box that you never have to open. A leaky abstraction is a black box that you have to open up occasionally.

Floating point numbers, the computer representations of real numbers,  are leaky abstractions. They work remarkably well: you can usually pretend that a floating point type is a mathematical real number. But sometimes you can’t. The abstraction leaks, though not very often.

Most explanations I’ve heard for the limitations of machine numbers are pedantic. “There are only a finite number of floating point numbers so they can’t represent real numbers well.” That’s not much help. It doesn’t explain why floating point numbers actually do represent real numbers sufficiently well for most applications, and it doesn’t suggest where the abstraction might leak.

A standard floating point number has roughly 16 decimal places of precision and a maximum value on the order of 10308, a 1 followed by 308 zeros. (According to IEEE standard 754, the typical floating point implementation.)

Sixteen decimal places is a lot. Hardly any measured quantity is known to anywhere near that much precision. For example, the constant in Newton’s Law of Gravity is only known to four significant figures. The charge of an electron is known to 11 significant figures, much more precision than Newton’s gravitational constant, but still less than a floating point number. So when are 16 figures not enough? One problem area is subtraction. The other elementary operations — addition, multiplication, division — are very accurate. As long as you don’t overflow or underflow,  these operations often produce results that are correct to the last bit. But subtraction can be anywhere from exact to completely inaccurate.  If two numbers agree to n figures, you can lose up to n figures of precision in their subtraction. This problem can show up unexpectedly in the middle of other calculations. For an example, see this post on calculating standard deviation. See also computing derivatives, the second example in  Five Tips for Floating Point Programming.

What about overflow or underflow? When do you need numbers bigger than 10308? Often you don’t. But in probability calculations, for example, you need them all the time unless you’re clever. It’s common in probability to compute a medium-sized number that is the product of an astronomically large number and an infinitesimally small number. The final result fits into a computer just fine, but the intermediate numbers might not due to overflow or underflow. For example, the maximum floating point number on most computers is somewhere between 170 factorial and 171 factorial. Such large factorials often appear in applications, often in ratios with other large factorials. See Avoiding Overflow, Underflow, and Loss of Precision for tricks on how to work with factorials that would overflow if computed directly.

Often you can afford to be blissfully ignorant of the details of floating point arithmetic, but sometimes you cannot. A great place to learn more is David Goldberg’s paper What Every Computer Scientist Should Know About Floating-Point Arithmetic.

Update: See follow-up post, Anatomy of a floating point number.

Related posts:

How to compute binomial probabilities
NaN, 1.#IND, 1.#INF, and all that

{ 23 comments }

Living within chosen limits

by John on April 2, 2009

The latest EconTalk podcast is an interview with Brink Lindsey, author of The Age of Abundance. Lindsey said that in the 1980’s and 90’s we learned how to live with the freedoms gained in the 1960’s and 70’s. Many negative social indicators soared in the 60’s and 70’s: crime, divorce, drug use, abortion, etc. But during the 80’s and 90’s many of these indicators reversed direction, and Lindsey believes it is because many people have learned to replace legal and societal limits with chosen limits.

I don’t know whether I agree with Lindsey’s sweeping sociological analysis, but I do see some truth to it. I like his phrase “living within chosen limits.” I see a movement toward living within chosen limits on technology. The most obvious example may be Twitter. About 8,000,000 people at this point see some value in limiting their correspondence to 140 character messages. Some other ways I hear of people placing voluntary limits on their technology:

  • Unplugging from the Internet to work
  • Using terminal-style text editors to minimize distraction
  • Using browser-based applications with limited functionality to avoid installing software
  • Setting a five-sentence limit on email messages
  • Paper organizers, e.g. the Hipster PDA

I imagine the people who adopt these limitations will moderate their approach over time. Instead of unplugging from the Internet, they’ll make better use of it and become more disciplined. They may decide that some modern word processor features are worthwhile but still chose something more streamlined than Microsoft Word.

It may take a generation or more to learn how to take advantage of the new possibilities. We’re in a period of excess now, analogous to the culture of the 1960’s. It will be interesting to see what the analogy of the 80’s and 90’s will be.

Related posts from Kevin Kelly:

Neo-Amish Drop Outs
Amish Hackers

Related posts here:

Strategy for dealing with information overload
Selective use of technology
Getting to the bottom of things

{ 3 comments }

Interpolation errors

by John on April 1, 2009

Say you have a function f(x) and you want to find a polynomial p(x) that agrees with f(x) at several points. Given a set of points x0, x1, x2, … xn you can always find a polynomial of degree n such that p(xi) = f(xi) for i = 0, 1, 2, …, n. It seems reasonable that the more points you pick, the better the interpolating polynomial p(x) will match the function f(x). If the two functions match exactly at a lot of points, they should match well everywhere. Sometimes that’s true and sometimes it’s not.

Here is a famous example due to Carle Runge. Let f(x) = 1/(1 + x2) and let pn be the polynomial that interpolates f(x) at n+1 evenly spaced nodes in the interval [-5, 5]. As n becomes larger, the fit becomes worse.

Here’s a graph of f(x) and p9(x). The smooth blue line is f(x) and the wiggly red line is p9(x).

graph of f(x) and p9(x)

Here’s the analogous graph for p16(x).

graph of f(x) and p16(x)

The fit is improving in the middle. In fact, the curves agree to within the thickness of the plot line from say -1 to 1. But the fit is so bad in the tails that the graph had to be cut off. Here’s another plot of f(x) and p16(x) on a different scale to show how far negative the polynomial dips.

graph of f(x) and p16(x) on different scale

The problem is the spacing of the nodes. Interpolation errors are bad for evenly spaced nodes. If we interpolate f(x) at different points, at the Chebyshev nodes, then the fit is good.

interpolating at Chebyshev nodes

The Chebyshev nodes on [-1, 1] are xi = cos( π i / n ). Here we multiplied these nodes by 5 to scale to the interval [-5, 5].

If the function f(x) is absolutely continuous, as in our example, then the interpolating polynomials converge uniformly when you interpolate at Chebyshev nodes. However, ordinary continuity is not enough. Given any sequence of nodes, there exists a continuous function such that the polynomial interpolation error grows like log(n) as the number of nodes n increases.

Some numerical integration methods are based on interpolating polynomials: fit a polynomial to the integrand, then integrate the polynomial exactly to approximate the original integral. The examples above suggest that increasing the order of such integration methods might not improve accuracy and might even make things worse.

{ 11 comments }