There are variations on Newton's root-finding method that use higher derivatives and converge faster. Alston Householder developed a sequence of such methods of the form

    H_n(x) = x + (n − 1) (1/f)^(n−2)(x) / (1/f)^(n−1)(x)

where the superscripts in parentheses indicate derivatives. When n = 2, Householder's method reduces to Newton's method.
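To see the reduction, note that (1/f)′ = −f′/f², so the n = 2 case gives

    H_2(x) = x + (1/f)(x) / (1/f)′(x) = x − f(x)/f′(x),

which is exactly Newton's iteration.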
Once Newton’s method is close enough to a root, the error is squared at each iteration. Said another way, the number of correct decimal places roughly doubles. With the Householder method of order n, the number of correct decimal places roughly increases by a factor of n.
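In symbols: if e_k = |x_k − r| is the error at step k, a method of convergence order n satisfies e_{k+1} ≈ C e_k^n near the root, so −log10 e_{k+1} ≈ n (−log10 e_k); the count of correct decimals is multiplied by roughly n at each iteration.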
The Householder method Hn can be written as a rational function of f and its derivatives up to order n − 1, though the formulas are complicated.
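As a sketch of how these formulas can be generated mechanically (the update at the end of the post takes this approach), the following assumes the formula above; it is an illustration, not the post's actual code:

    # Sketch: generate H_n symbolically from
    # H_n(x) = x + (n-1) (1/f)^(n-2)(x) / (1/f)^(n-1)(x)
    from sympy import symbols, diff, simplify

    x, a, b = symbols('x a b')
    f = x**5 + a*x + b

    def householder(f, n):
        g = 1/f  # derivatives of the reciprocal of f appear in the formula
        return simplify(x + (n - 1)*diff(g, x, n - 2)/diff(g, x, n - 1))

    print(householder(f, 2))  # recovers Newton's step x - f/f'

Passing n = 3 or n = 4 should reproduce, after simplification, the same rational functions as the generated code below.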
Maybe it’s time for Householder’s methods to be more widely used: with automatic differentiation [1], the complexity of computing higher order derivatives is not the drawback it was when these computations were done solely by hand.
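For instance, with JAX (not used in this post; as the footnote explains, the Mathematica computations here are symbolic rather than automatic), higher derivatives come from nesting grad. The coefficients below are illustrative, not the post's:

    import jax

    f = lambda x: x**5 + 2.0*x + 1.0        # example coefficients
    fppp = jax.grad(jax.grad(jax.grad(f)))  # third derivative via nested AD
    print(fppp(1.0))                        # 60.0, since f'''(x) = 60 x^2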
Demonstration
We’ll illustrate the convergence of the Householder methods by finding roots of the quintic polynomial
f(x) = x^5 + ax + b.
I used Mathematica to calculate expressions for the Householder methods and export the results as code. Mathematica does not have a function to export Python code, but it does have a function CForm to export expressions as C code. I started to use this, converting exponents from C to Python form with search and replace. Then I tried using FortranForm instead, and that was much easier because Fortran and Python both represent powers with the ** operator.
def f(x, a, b):
    return x**5 + a*x + b

def H2(x, a, b):
    return x - (b + a*x + x**5)/(a + 5*x**4)

def H3(x, a, b):
    num = (a + 5*x**4)*(b + a*x + x**5)
    den = a**2 - 10*b*x**3 + 15*x**8
    return x - num/den

def H4(x, a, b):
    num = (b + a*x + x**5)*(-a**2 + 10*b*x**3 - 15*x**8)
    den = a**3 + 10*b**2*x**2 + 5*a**2*x**4 - 80*b*x**7 - 25*a*x**8 + 35*x**12
    return x + num/den
To set values of a and b, I arbitrarily chose a = 2 and solved for b so that π is a root. That way we have a known value of the root for comparison. I used x = 4 as my starting guess at the root.
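The post doesn't list the setup code; a minimal version (with pi and log10 imported for the functions below) might be:

    from math import pi, log10

    a = 2
    b = -(pi**5 + a*pi)  # chosen so that f(pi) = 0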
I first used −log10 |x − π| to measure the number of correct decimals. That didn't quite work: the method would eventually converge exactly (to the limits of machine precision), which led to taking the log of zero. So I modified my accuracy function to the following.
def decimals(x):
    d = abs(x - pi)
    if d != 0:
        return round(-log10(d), 2)
    else:
        return "inf"
Then I tested each of the Householder methods with the following.
epsilon = 1e-14
for H in [H2, H3, H4]:
    x = 4
    for i in range(1, 10):
        if abs(f(x, a, b)) > epsilon:
            x = H(x, a, b)
            print(i, x, decimals(x))
Here are the results in tabular form; the three blocks correspond to H2, H3, and H4 respectively. I replaced all the decimals after the first incorrect decimal with underscores to make it easier to see the convergence.
|---+-------------------+--------|
| i | x                 | digits |
|---+-------------------+--------|
| 1 | 3.4______________ |   0.53 |
| 2 | 3.18_____________ |   1.33 |
| 3 | 3.142____________ |   2.87 |
| 4 | 3.141593_________ |   5.93 |
| 5 | 3.141592653590___ |  12.07 |
| 6 | 3.141592653589793 |    inf |
|---+-------------------+--------|
| 1 | 3.2______________ |   1.11 |
| 2 | 3.1416___________ |   4.03 |
| 3 | 3.1415926535899__ |  12.79 |
| 4 | 3.141592653589793 |    inf |
|---+-------------------+--------|
| 1 | 3.15_____________ |   1.84 |
| 2 | 3.141592654______ |   8.85 |
| 3 | 3.141592653589793 |    inf |
|---+-------------------+--------|
Note that, as promised, the number of correct decimal places for Hn increases by roughly a factor of n with each iteration: for example, the digit counts for H3 (1.11, 4.03, 12.79) roughly triple at each step.
Update: See this post where the symbolic calculations are done in SymPy rather than Mathematica. Everything stays in Python, and there’s no need to edit the code for the Householder methods.
[1] Note that in this post I used Mathematica to do symbolic differentiation, not automatic differentiation. Automatic differentiation differs from both symbolic and numerical differentiation. See an explanation here.
Comments

So is, say, H4 actually better than Newton's method? It takes about half as many iterations to get the same precision, but at least in this example, computing each iteration appears to require more than twice as many operations. Is this only because this is a relatively simple example?
Maybe H4 would be more efficient than H2 when your function is fairly expensive to evaluate but computing derivatives at the same time isn't much more work. This is often the case with automatic differentiation.
The notion that automatic differentiation enables the use of higher-order Newton iterations was the subject of research I did with Jean-Pierre Dussault in the mid-2000s. Another interesting point is that automatic directional differentiation avoids the beyond-quadratic polynomial complexity of higher-order derivative computations, bounding it to O(n). https://www.researchgate.net/publication/267074870_Implementation_issues_for_high-order_algorithms