The quadratic formula and low-precision arithmetic

What could be interesting about the lowly quadratic formula? It’s a formula after all. You just stick numbers into it.

Well, there’s an interesting wrinkle. When the linear coefficient b is large relative to the other coefficients, the quadratic formula can give wrong results when implemented in floating point arithmetic.

Quadratic formula and loss of precision

The quadratic formula says that the roots of

ax^2 + bx + c = 0

are given by

x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}

That’s true, but let’s see what happens when we have ac = 1 and b = 10⁸.

    from math import sqrt

    def quadratic(a, b, c):
        r = sqrt(b**2 - 4*a*c)
        return ((-b + r)/(2*a), (-b - r)/(2*a))

    print( quadratic(1, 1e8, 1) )

This returns

    (-7.450580596923828e-09, -100000000.0)

The first root is wrong by about 25%, though the second one is correct.

What happened? The quadratic formula violated the cardinal rule of numerical analysis: avoid subtracting nearly equal numbers. The more similar two numbers are, the more precision you can lose from subtracting them. In this case √(b² – 4ac) is very nearly equal to b.

If we ask Python to evaluate

    1e8 - sqrt(1e16-4)

we get 1.49e-8 when the correct answer would be 2.0e-8.
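
You can check this by redoing the subtraction in higher precision. Here’s one way using Python’s decimal module (50 digits is an arbitrary choice, just much more than double precision provides):

    from decimal import Decimal, getcontext

    # Carry 50 significant digits instead of double precision's ~16
    getcontext().prec = 50
    print( Decimal(10)**8 - (Decimal(10)**16 - 4).sqrt() )

This prints a value close to 2.0e-8.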

Improving precision for the quadratic formula

The way to fix the problem is to rationalize the numerator of the quadratic formula by multiplying by 1 in the form

\frac{-b \mp \sqrt{b^2 - 4ac}}{-b \mp \sqrt{b^2 - 4ac}}

(The symbol ∓ is much less common than ±. It just means that if you take the + sign in the quadratic formula, take the – sign above, and vice versa.)

When we multiply by the expression above, the numerator becomes b² – (b² – 4ac) = 4ac, and simplifying we get

\frac{2c}{-b \mp \sqrt{b^2 - 4ac}}

Let’s code this up in Python and try it out.

    def quadratic2(a, b, c):
        r = sqrt(b**2 - 4*a*c)
        return (2*c/(-b - r), 2*c/(-b + r))

    print( quadratic2(1, 1e8, 1) )

This returns

    (-1e-08, -134217728.0)

So is our new quadratic formula better? It gives the right answer for the first root, exact to within machine precision. But now the second root is wrong by 34%. Why is the second root wrong? Same reason as before: we subtracted two nearly equal numbers!

The familiar version of the quadratic formula computes the larger root correctly, and the new version computes the smaller root correctly. Neither version is better overall: we’d be no better or worse off always using the new formula than the old one. Each one is better when it avoids subtracting nearly equal numbers.

The solution is to use both quadratic formulas, using the appropriate one for the root you’re trying to calculate.
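
Here’s a sketch of what that might look like, assuming the name quadratic3 and ignoring edge cases such as a = 0 or a repeated root. It tests the sign of b and, for each root, picks the form that adds numbers of the same sign:

    from math import sqrt

    def quadratic3(a, b, c):
        r = sqrt(b**2 - 4*a*c)
        if b >= 0:
            # -b and -r are both negative: no cancellation in -b - r
            return (2*c/(-b - r), (-b - r)/(2*a))
        else:
            # -b and r are both positive: no cancellation in -b + r
            return ((-b + r)/(2*a), 2*c/(-b + r))

    print( quadratic3(1, 1e8, 1) )

On the example above this prints (-1e-08, -100000000.0): the small root comes from the rationalized form and the large root from the familiar form, neither one subtracting nearly equal numbers.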

Low precision arithmetic

Is this a practical concern? Yes, and here’s why: Everything old is new again.

The possible inaccuracy in the quadratic formula was serious before double precision (64-bit floating point) arithmetic became common. And back in the day, engineers were more likely to be familiar with the alternate form of the quadratic formula. You can still run into quadratic equations that give you trouble even in double precision arithmetic, like the example above, but it’s less likely to be a problem when you have more bits at your disposal.

Now we’re interested in low-precision arithmetic again. CPUs have gotten much faster, but moving bits around in memory has not. Relative to CPU speed, memory manipulation has gotten slower. That means we need to be more concerned with memory management and less concerned about floating point speed.

Not only is memory juggling slower relative to CPU, it also takes more energy. According to Gustafson, reading 64 bits from DRAM requires 65 times as much energy as doing a floating point combined multiply-add because it takes place off-chip. The table below, from page 6 of Gustafson’s book, gives the details. Using lower precision floating point saves energy because more numbers can be read in from memory in the same number of bits. (Here pJ = picojoules.)

    Operation                                     Energy consumed   Where
    Perform a 64-bit floating point multiply-add        64 pJ       on-chip
    Load or store 64 bits of register data               6 pJ       on-chip
    Read 64 bits from DRAM                             4200 pJ      off-chip

So we might be interested in low-precision arithmetic to save energy in a battery powered mobile device, or to save clock cycles in a server application manipulating a lot of data. This means that the numerical tricks that most people have forgotten about are relevant again.


6 thoughts on “The quadratic formula and low-precision arithmetic”

  1. I’ve always had reservations when it came to obtaining roots of polynomials with a formula with more than one value.

  2. Ondřej Čertík

    One other source of loss of accuracy comes from calculating the discriminant when b**2 is close to 4*a*c. This document has Matlab code with an accurate implementation that should work in all cases:

    https://people.eecs.berkeley.edu/~wkahan/Qdrtcs.pdf

    Given how complicated it is, a viable option is to just use Newton’s method, with any of the quadratic formulas above as an initial guess for the roots (see the first sketch after these comments).

  3. Another trick is to always compute one root with the formula where you only add, and to compute the second root using the fact that the product of the roots is c/a (see the second sketch after these comments).

  4. Also see our article “Numerical Difficulties in Pre-University Informatics Education and Competitions”, which appeared in Informatics in Education, Vol. 2, Nr. 1, pp.21-38: https://www.mii.lt/informatics_in_education/pdf/INFE012.pdf
    It discusses the quadratic equation next to three other typical numerical problems that are often underestimated, and provides some annotated pointers to accessible literature on this topic.

    Erratum: The example below equation (6) should read “fl(1.06 + 3.06) = fl(4.12) = 4.1 != 4.2 = 1.1 +^+ 3.1 = fl(1.06) +^+ fl(3.06)”, where +^+ is one-decimal precision addition.
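
Here’s a minimal sketch of the Newton polishing idea from the second comment above; the function name and the fixed two steps are my own choices:

    def newton_refine(a, b, c, x, steps=2):
        # Newton's method on p(x) = a*x**2 + b*x + c,
        # starting from a rough estimate x of a root
        for _ in range(steps):
            x -= (a*x*x + b*x + c)/(2*a*x + b)
        return x

    # Polish the inaccurate small root from the naive formula above
    print( newton_refine(1, 1e8, 1, -7.450580596923828e-09) )  # about -1e-08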
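
And a sketch of the trick in the third comment, assuming the name quadratic_vieta: copysign gives the square root the same sign as b, so the two are always added, and the second root comes from the fact that the product of the roots is c/a.

    from math import sqrt, copysign

    def quadratic_vieta(a, b, c):
        r = sqrt(b**2 - 4*a*c)
        q = -(b + copysign(r, b))/2   # b and r have the same sign: addition only
        return (q/a, c/q)             # product of roots is c/a, so the other root is c/q

    print( quadratic_vieta(1, 1e8, 1) )  # (-100000000.0, -1e-08)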
