Floating point numbers are a leaky abstraction

Joel Spolsky coined the term leaky abstraction for programming concepts that usually shield you from messy details but sometimes break down. A perfect abstraction is a black box that you never have to open. A leaky abstraction is a black box that you have to open up occasionally.

Floating point numbers, the computer representations of real numbers, are leaky abstractions. They work remarkably well: you can usually pretend that a floating point type is a mathematical real number. But sometimes you can’t. The abstraction leaks, though not very often.

Most explanations I’ve heard for the limitations of machine numbers are pedantic. “There are only a finite number of floating point numbers so they can’t represent real numbers well.” That’s not much help. It doesn’t explain why floating point numbers actually do represent real numbers sufficiently well for most applications, and it doesn’t suggest where the abstraction might leak.

A standard floating point number has roughly 16 decimal places of precision and a maximum value on the order of 10^308, a 1 followed by 308 zeros. (These figures come from IEEE standard 754, the standard that typical floating point implementations follow.)

Sixteen decimal places is a lot. Hardly any measured quantity is known to anywhere near that much precision. For example, the constant in Newton’s Law of Gravity is only known to six significant figures. The charge of an electron is known to 11 significant figures, much more precision than Newton’s gravitational constant, but still less than a floating point number.

So when are 16 figures not enough? One problem area is subtraction. The other elementary operations — addition, multiplication, division — are very accurate. As long as you don’t overflow or underflow, these operations often produce results that are correct to the last bit. But subtraction can be anywhere from exact to completely inaccurate. If two numbers agree to n figures, you can lose up to n figures of precision in their subtraction. This problem can show up unexpectedly in the middle of other calculations. For an example, see this post on calculating standard deviation. See also computing derivatives, the second example in Five Tips for Floating Point Programming.
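
To make the cancellation concrete, here is a small sketch in Python (my illustration, not from the original post). The two formulas below are mathematically identical; the first loses every significant figure to the subtraction of nearly equal numbers, the second avoids that subtraction entirely.

```python
from math import cos, sin

x = 1.2e-8

# Naive formula: cos(x) agrees with 1 to more than 16 figures,
# so the subtraction cancels essentially every significant digit.
naive = 1.0 - cos(x)

# Algebraically equivalent rewrite with no subtraction of nearly equal terms.
stable = 2.0 * sin(x / 2.0) ** 2

print(naive)   # 0.0      -- all precision lost to cancellation
print(stable)  # ~7.2e-17 -- accurate to nearly full precision
```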

What about overflow or underflow? When do you need numbers bigger than 10^308? Often you don’t. But in probability calculations, for example, you need them all the time unless you’re clever. It’s common in probability to compute a medium-sized number that is the product of an astronomically large number and an infinitesimally small number. The final result fits into a computer just fine, but the intermediate numbers might not, due to overflow or underflow. For example, the maximum floating point number on most computers is somewhere between 170 factorial and 171 factorial. Such large factorials often appear in applications, typically in ratios with other large factorials. See Avoiding Overflow, Underflow, and Loss of Precision for tricks on how to work with factorials that would overflow if computed directly.
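
As a rough illustration (mine, not the post’s) of both the overflow and the standard workaround, here is a Python sketch that uses lgamma, the log of the gamma function, to keep intermediate values in log space:

```python
from math import factorial, lgamma, exp, log

# 170! is the largest factorial that fits in a double; 171! does not.
print(float(factorial(170)))   # about 7.26e306
# float(factorial(171))        # raises OverflowError (exceeds ~1.8e308)

# Probability of exactly 1000 heads in 2000 fair coin flips:
# C(2000, 1000) * 0.5**2000. The answer is a modest ~0.018, but
# C(2000, 1000) is around 1e600 and 0.5**2000 around 1e-603, so the
# naive intermediate values overflow and underflow a double.
n, k = 2000, 1000

# Log-space version: lgamma(n + 1) equals log(n!), so assemble the whole
# expression in logs and exponentiate only at the end.
log_p = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) + n * log(0.5)
print(exp(log_p))              # about 0.0178
```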

Often you can afford to be blissfully ignorant of the details of floating point arithmetic, but sometimes you cannot. A great place to learn more is David Goldberg’s paper What Every Computer Scientist Should Know About Floating-Point Arithmetic.

Update: See follow-up post, Anatomy of a floating point number.

Related posts:

How to compute binomial probabilities
NaN, 1.#IND, 1.#INF, and all that

Posted in Computing, Math
17 comments on “Floating point numbers are a leaky abstraction”
  1. John Moeller says:

    Nice post. This is a huge problem in sums too, especially when the addends are different by many orders of magnitude.

  2. Don’t forget a Bayesian classifier — it involves multiplying large numbers of small probabilities, then normalizing them. After getting strange results, and switching to BigDecimal, I learned that the usual solution is to do all calculations in log space.

  3. David Eisner says:

    Did anybody else notice the error in Figure D-1 of David Goldberg’s paper (http://docs.sun.com/source/806-3568/ncg_goldberg.html#1374)? The “0” should be at the left of the number line, not at the hash mark corresponding to 1/2.

  4. John Cowan says:

    Not to mention the problems that arise because we think in decimal but compute in binary. Does 0.1 + 0.2 = 0.3? No, not in binary floats, it doesn’t.
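
    A quick Python illustration of that point (my addition, not part of the comment):

    ```python
    print(0.1 + 0.2)           # 0.30000000000000004
    print(0.1 + 0.2 == 0.3)    # False
    print(f"{0.1:.20f}")       # 0.10000000000000000555, the nearest double to 0.1
    ```

    None of 0.1, 0.2, or 0.3 has an exact binary representation, so each is silently rounded to the nearest double before the arithmetic even starts.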

  5. EastwoodDC says:

    This also ties in with your post a while back on accurately calculating the variance.

  6. John says:

    David Eisner: I updated the links to point to an updated version of Goldberg’s paper. Thanks for pointing out the error.

  7. Jack says:

    I disagree with your idea that floating-point numbers are supposed to represent “real numbers”. They represent rational numbers.

    If you want to approximate reals with rationals, that’s your choice, but you should at least understand the difference, and why you can get into trouble thereby.

  8. John says:

    Jack: Every floating point number does represent a rational number, so I could see your point. But I’d still say that floating point numbers are trying to approximate real numbers. Arithmetic operations are designed to be approximately correct in the analytical sense of the real line, not exactly correct in the algebraic sense of a quotient field. I think fixed-point numbers are a better model for rational numbers.

  9. Tomas says:

    I’m surprised to not see the “move to the logspace” basic trick… which is to compute exp(sum log x_i) in place of prod(x_i). Works great for probabilities – they tend to be from some distribution and then central limit works for you…
    Also normalizing small (or large) numbers to sum up to, say, 1.0 is better accomplished by first subtracting the max of all the logarithms.
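
    A minimal sketch of that normalization trick (my illustration; the helper name normalize_from_logs is made up):

    ```python
    import math

    def normalize_from_logs(log_weights):
        # Subtracting the maximum keeps every exponent <= 0, so exp() cannot
        # overflow and the largest term is exactly 1.
        m = max(log_weights)
        unnormalized = [math.exp(lw - m) for lw in log_weights]
        total = sum(unnormalized)
        return [w / total for w in unnormalized]

    # Exponentiating these log weights directly would underflow them all to 0.0.
    print(normalize_from_logs([-1000.0, -1001.0, -1002.0]))
    # [0.6652..., 0.2447..., 0.0900...]
    ```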

  10. John says:

    Tomas: I agree that the “move to the logspace” trick is very important. I talk about it here in the context of computing binomial probabilities.

  11. Tomas says:

    And of course Kevin did talk about it first, my oversight.

  12. Tom Ochs wrote a series of articles about exactly this sort of thing in “Computer Language” magazine some years ago.

  13. When your errors are multiplicative, and you have choice in the order in which to perform operations, another “trick” is to arrange the computation so that the greatest norm operators are applied last. Here’s an example http://yaroslavvb.blogspot.com/2010/09/order-matters.html

  14. Cfr says:

    Also, operations with floating point numbers are not pure functions across different machines/compilers: the same expression can give different results on different platforms.

  15. Al Chou says:

    There’s a whole discipline — numerical computation — about calculating with floating point numbers efficiently and maximizing both accuracy and precision. A lot of the hard work is done with pencil and paper analysis before any code is written. The techniques discussed above are just a few of the things done in numerical analysis.

  16. Chas Emerick says:

    Many links to sun.com are now useless…including yours to What Every Computer Scientist Should Know About Floating-Point Arithmetic (corrected).

  17. John says:

    Thanks! I’ve updated the link with the Oracle one.
