ODE solver landscape

Posted on 12 June 2020 by John

Many methods for numerically solving ordinary differential equations are either Runge-Kutta methods or linear multistep methods. These methods can either be explicit or implicit.

The table below shows the four combinations of these categories and gives some examples of each.

$\begin{tabular}{|l|ll|} \hline & Runge-Kutta & Linear multistep\\ \hline Explicit & ERK & Adams-Bashforth\\ Implicit & (S)DIRK & Adams-Moulton, BDF\\ \hline \end{tabular}$

Runge-Kutta methods advance the solution of a differential equation one step at a time. That is, these methods approximate the solution at the next time step using only the solution at the current time step and the differential equation itself.

Linear multistep methods approximate the solution at the next time step using the computed solutions at the latest several time steps.

Explicit methods express the solution at the next time step as an explicit function of other information, not including the solution itself. The solution at the next time step appears on only one side of the equations.

Implicit methods express the solution at the next time step as a function of other information including the solution itself. The solution at the next time step appears on both sides of the equations. Additional work needs to be done to solve for the solution.

More on explicit vs implicit methods here.

In the table above, ERK stands for, not surprisingly, explicit Runge-Kutta methods. DIRK stands for diagonally implicit Runge-Kutta. SDIRK stands for singly diagonally implicit Runge-Kutta. BDF stands for backward difference formulas.

How to compute the square root of a complex number

Posted on 9 June 2020 by John

Suppose you’re given a complex number

z = x + iy

and you want to find a complex number

w = u + iv

such that w² = z. If all goes well, you can compute w as follows:

ℓ = √(x² + y²)
u = √((ℓ + x)/2)
v = sign(y) √((ℓ − x)/2).

For example, if z = 5 + 12i, then ℓ = 13, u = 3, and v = 2. A quick calculation confirms

(3 + 2i)² = 5 + 12i.

(That example worked out very nicely. More on why in the next post.)

Note that ℓ ≥ |x|, and so the square roots above have non-negative arguments.

Numerical issues

I said “if all goes well” above because in floating point arithmetic, various things could go wrong. In [1], the authors point out that x² + y² could overflow. They give a more complicated algorithm that addresses the problem of overflow, and they reference a paper with an even more complicated algorithm to cover all the corner cases — NaNs, infinities, etc.

If you have access to a hypot function, it will let you compute √(x² + y²) without overflow. If you don’t have access to a hypot function, you can roll your own quickly using the algorithm here.

This may seem unnecessary. Why would you need to worry about overflow? And why would you have access to a real square root function but not a complex square root function?

If you’re using standard 64-bit floating point numbers, the hypotenuse calculation is not going to overflow as long as your arguments are less than 10¹⁵⁴, and you can use the function csqrt from complex.h.

But if you’re using a 16-bit float or even an 8-bit float, overflow is an ever-present concern, and you might not have the math library support you’re accustomed to.

[1] Jean-Michel Muller et al. Handbook of Floating Point Arithmetic.

Accurately computing a 2×2 determinant

Posted on 31 May 2020 by John

The most obvious way to compute the determinant of a 2×2 matrix can be numerically inaccurate. The biggest problem with computing ad − bc is that if ad and bc are approximately equal, the subtraction could lose a lot of precision. William Kahan developed an algorithm for addressing this problem.

Fused multiply-add (FMA)

Kahan’s algorithm depends on a fused multiply-add function. This function computes xy + z using one rounding operation at the end, where the direct approach would use two.

In more detail, the fused multiply-add behaves as if it takes its the floating point arguments x, y, and z and lifts them to the Platonic realm of real numbers, calculates xy + z exactly, and then brings the result back to the world of floating point numbers. The true value of xy + z may not have an exact representation in the floating point world, and so in general it will be necessary to round the result.

The direct way of computing xy + z would behave as if it first lifts x and y to the Platonic realm, computes xy exactly, then pushes the result back down to the floating point world, rounding the result. Then it takes this rounded version of xy back up to the Platonic realm, possibly to a different value than before, and pushes z up, adds the two numbers, then pushes the sum back down to the world of floating point.

You can get more accuracy if you avoid round trips back and forth to the Platonic realm. If possible, you’d like to push everything up to the Platonic realm once, do as much work as you can up there, then come back down once. That’s what fused multiply-add does.

Kahan’s determinant algorithm

Here is Kahan’s algorithm for computing ad − bc for floating point numbers.

w = RN(bc)
e = RN(w – bc)
f = RN(ad – w)
x = RN(f + e)

The function RN computes its argument exactly, then rounds the result to the nearest floating point value, rounding to even in case of a tie. (This is the default rounding mode for compilers like gcc, so the C implementation of the algorithm will directly follow the more abstract specification above.)

If there were no rounding we would have

w = bc
e = w − bc = 0
f = ad − w = ad − bc
x = f + e = ad − bc + 0 = ad − bc

But of course there is rounding, and so Kahan’s steps that seem redundant are actually necessary and clever. In floating point arithmetic, e is not zero, but it does exactly equal w – bc. This is important to why the algorithm works.

Here is a C implementation of Kahan’s algorithm and a demonstration of how it differs from the direct alternative.

    #include <math.h>
    #include <stdio.h>
    
    double naive_det(double a, double b, double c, double d) {
        return a*d - b*c;
    }
    
    double kahan_det(double a, double b, double c, double d) {
        double w = b*c;
        double e = fma(-b, c, w);
        double f = fma(a, d, -w);
        return (f+e);
    }
    
    int main() {
        double a = M_PI;
        double b = M_E;
        double c = 355.0 / 113.0;
        double d = 23225.0 / 8544.0;
    
        printf("Naive: %15.15g\n", naive_det(a, b, c, d));
        printf("Kahan: %15.15g\n", kahan_det(a, b, c, d));
    }

The values of c and d were chosen to be close to M_PI and M_E so that the naive computation of the determinant would less accurate. See these post on rational approximations to π and rational approximations to e.

The code above prints

    Naive: -7.03944087021569e-07
    Kahan: -7.03944088015194e-07

How do we know that Kahan’s result is accurate? Careful analysis of Kahan’s algorithm in [1] gives bounds on its accuracy. In particular, the absolute error is bounded by 1.5 ulps, units of least precision.

Curse of dimensionality and integration

Posted on 30 May 2020 by John

The curse of dimensionality refers to problems whose difficulty increases exponentially with dimension. For example, suppose you want to estimate the integral of a function of one variable by evaluating it at 10 points. If you take the analogous approach to integrating a function of two variables, you need a grid of 100 points. For a function of three variables, you need a cube of 1000 points, and so forth.

You cannot estimate high-dimensional integrals this way. For a function of 100 variables, using a lattice with just two points in each direction would require 2¹⁰⁰ points.

There are much more efficient ways to approximate integrals than simply adding up values at grid points, assuming your integrand is smooth. But when applying any of these methods to multi-dimensional integrals, the computational effort increases exponentially with dimension.

The methods that first come to mind don’t scale well with dimension, but that doesn’t necessarily mean there aren’t any methods that do scale well.

Are there numerical integration methods whose computational requirement scale slower than exponentially with dimension? In general, no. You can beat the curse of dimension for special integrals. And if you’re content with probabilistic bounds rather than deterministic bounds, you can get around the curse by using Monte Carlo integration, sort of [1].

If you want worst-case error bounds, you cannot escape the curse of dimensionality [2]. You can require that the functions you’re integrating have so many derivatives and that these derivatives are bounded by some constant, but the amount of computation necessary to guarantee that the error is below a specified threshold increases exponentially with dimension.

Computing the area of a thin triangle

Posted on 27 February 2020 by John

Heron’s formula computes the area of a triangle given the length of each side.

$A = \sqrt{s(s-a)(s-b)(s-c)}$

where

$s = \frac{a + b + c}{2}$

If you have a very thin triangle, one where two of the sides approximately equal s and the third side is much shorter, a direct implementation Heron’s formula may not be accurate. The cardinal rule of numerical programming is to avoid subtracting nearly equal numbers, and that’s exactly what Heron’s formula does if s is approximately equal to two of the sides, say a and b.

William Kahan’s formula is algebraically equivalent to Heron’s formula, but is more accurate in floating point arithmetic. His procedure is to first sort the sides in decreasing order, then compute

$A = \frac{1}{4} \sqrt{(a + (b + c))(c - (a - b))(c + (a - b))(a + (b - c))}$

You can find this method, for example, in Nick Higham’s book Accuracy and Stability of Numerical Algorithms.

The algebraically redundant parentheses in the expression above are not numerically redundant. As we’ll demonstrate below, the method is less accurate without them.

Optimizing compilers respect the parentheses: the results are the same when the code below is compiled with gcc with no optimization (-O0) and with aggressive optimization (-O3). The same is true of Visual Studio in Debug and Release mode.

C code demo

First, here is a straightforward implementation of Heron

    #include <math.h>

    float heron1(float a, float b, float c) {
        float s = 0.5 * (a + b + c);
        return sqrt(s*(s - a)*(s - b)*(s - c));
    }

And here’s an implementation of Kahan’s version.

    void swap(float* a, float* b) {
        float t = *b;
        *b = *a;
        *a = t;
    }

    float heron2(float a, float b, float c) {
        // Sort a, b, c into descending order
        if (a < b) swap(&a, &b);
        if (a < c) swap(&a, &c);
        if (b < c) swap(&b, &c);

        float p = (a + (b + c))*(c - (a - b))*(c + (a - b))*(a + (b - c));
        return 0.25*sqrt(p);
}

Finally, here’s an incorrect implementation of Kahan’s method, with “unnecessary” parentheses removed.

    float heron3(float a, float b, float c) {
        // Sort a, b, c into descending order
        if (a < b) swap(&a, &b);
        if (a < c) swap(&a, &c);
        if (b < c) swap(&b, &c);

        float p = (a + b + c)*(c - (a - b))*(c + a - b)*(a + b - c);
        return 0.25*sqrt(p);
    }

Now we call all three methods.

    int main()
    {
        float a = 100000.1, b = 100000.2, c = 0.3;
        printf("%0.8g\n", heron1(a, b, c));
        printf("%0.8g\n", heron2(a, b, c));
        printf("%0.8g\n", heron3(a, b, c));
    }

And for a gold standard, here is an implementation in bc with 40 decimal place precision.

    scale = 40

    a = 10000000.01
    b = 10000000.02
    c = 0.03

    s = 0.5*(a + b + c)
    sqrt(s*(s-a)*(s-b)*(s-c))

Here are the outputs of the various methods, in order of increasing accuracy.

    Heron: 14363.129
    Naive: 14059.268
    Kahan: 14114.293
    bc:    14142.157

Here “naive” means the incorrect implementation of Kahan’s method, heron3 above. The bc result had many more decimals but was rounded to the same precision as the C results.

Related post: How to compute the area of a polygon

A better adaptive Runge-Kutta method

Posted on 19 February 2020 by John

This is the third post in a series on Runge-Kutta methods. The first post in the series introduces Runge-Kutta methods and Butcher tableau. The next post looked at Fehlberg’s adaptive Runge-Kutta method, first published in 1969. This post looks at a similar method from Dormand and Prince in 1980.

Like Fehlberg’s method, the method of Dormand and Prince can be summarized in a big, intimidating tableau, which we will display below. However we will discuss three differences between the methods:

Order 4(5) vs 5(4)
Derivative reuse
Precision / computation ratio

Dormand Prince tableau

Here’s the Butcher tableau for the Dormand-Prince method in all it’s glory:

Dormand-Prince tableau

The only detail of the table that will be important below is that 7th and 8th rows are identical.

Order 4(5) vs order 5(4)

Fehlberg’s method, a.k.a. RKF45, computes each update to the solution using a 4th order Runge-Kutta method, and uses a 5th order Runge-Kutta method to estimate the error.

The method of Dormand and Prince also uses 4th and 5th order Runge-Kutta methods, but in the opposite way. The fifth order method is used to advance the solution, and the 4th order method is used for comparison to estimate error.

Derivative reuse

The work in solving

y' = f(t, y)

by a Runge-Kutta method is roughly proportional to the number of stages. Dormand-Prince is a 7-stage method while Fehlberg is a 6-stage method, so it would seem that the latter is more efficient. However, if you look back at the Dormand-Prince tableau, the last row above the horizontal line equals the first row below the line. That means that the last evaluation of f at one step can be reused at the first evaluation of f at the next step.

Precision per unit work

In their book Solving Differential Equations, vol. 1, Hairer et al compare several adaptive Runge-Kutta methods, including Fehlberg (RKF45) and Dormand-Prince, and conclude that the latter produces more precision per unit work.

We again see that the [Fehlberg] method underestimates the local error. Further, with the use of local extrapolation, the advantage of RKF4(5) melts away to a large extent. The best method of all these is without a doubt the coefficient set of Dormand and Prince.

How to estimate ODE solver error

Posted on 19 February 2020 by John

This post brings together several themes I’ve been writing about lately: caching function evaluations, error estimation, and Runge-Kutta methods.

A few days ago I wrote about how Runge-Kutta methods can all be summarized by a set of numbers called the Butcher tableau. These methods solve

y' = f(t, y)

by evaluating f at some partial step, then evaluating f again at some new partial step and some linear combination of values from previous steps, etc. In the preface to J. C. Butcher’s book on differential equations, JM Sanz-Serna describes this structure as follows.

Runge-Kutta schemes … are highly nonlinear with a remarkable Matrioshka doll structure, where the vector field has to be evaluated an expression that involves the vector field evaluated at and expression that involves the vector field …

Once all the “Matrioshka dolls” are lined up, i.e. all the intermediate results have been calculated, the final estimate is a linear combination of these values.

Here’s the clever idea behind adaptive solvers: Create two Runge-Kutta methods of different orders that depend on the same intermediate results. Then both can be computed without new function evaluations, and the results compared. The difference between the results can be used as an estimate of the local error. Then you can adjust your step size accordingly and try again.

I’m going to present two adaptive Runge-Kutta schemes. I’ll go over Fehlberg’s method in this post and a variation in the next post that has some advantages over Fehlberg’s method.

Runge-Kutta-Felhberg (RKF45)

Fehlberg’s method, commonly known as RKF45, starts with a six-stage Runge Kutta method whose coefficients are given by the following tableau.

$\begin{array} {c|ccccccc} 0\\ 1/4 & 1/4 \\ 3/8 & 3/32 & 9/32 \\ 12/13 & 1932/2197 & -7200/2197 & 7296/2197 \\ 1 & 439/216 & -8 & 3680/513 & -845/4104 \\ 1/2 & -8/27 & 2 & -3544/2565 & 1859/4104 & -11/40 \\ \hline y_1 & 25/216 & 0 & 1408/2565 & 2197/4104 & -1/5 & 0 & \\ \hline \hat{y} & 16/135 & 0 & 6656/12825 & 28561/56430 & -9/50 & 2/55 \end{array}$

The meaning of the tableau is described here. (Imagine the effort it took Erwin Felhberg to derive this in 1969. Presumably he had little more than a desktop calculator to help. Maybe not even that.)

What’s important about the tableau for this post is that the coefficients above the horizontal line are used to create six numbers, the k‘s in the notation of the post referenced above. The k‘s are multiplied by the coefficients below the horizontal line to produce the solution of the differential equation at the next step. This is the “4” of RKF45.

RKF45 then applies another method which reuses all the k‘s in a different linear combination. This is summarized in the following variation of the Butcher tableau.

$\begin{array} {c|ccccccc} 0\\ 1/4 & 1/4 \\ 3/8 & 3/32 & 9/32 \\ 12/13 & 1932/2197 & -7200/2197 & 7296/2197 \\ 1 & 439/216 & -8 & 3680/513 & -845/4104 \\ 1/2 & -8/27 & 2 & -3544/2565 & 1859/4104 & -11/40 \\ \hline y_1 & 25/216 & 0 & 1408/2565 & 2197/1404 & -1/5 & 0 & \\ \hline \hat{y} & 16/135 & 0 & 6656/12825 & 28561/56430 & -9/50 & 2/55 \end{array}$

This is essentially two tableau combined into one. The first is as above. The second is like the one above but with a different bottom row. The bottom row gives the coefficients for a Runge-Kutta method corresponding to the “5” part of RKF45.

The 4 stands for 4th order, i.e. the local error for a step size h is O(h⁴). The 5 stands for 5th order, i.e. local error O(h⁵). RKF45 is two different methods, but they share so much computation that the second one almost comes for free; it does not require any new function evaluations, only taking a linear combination of six numbers.

However, RK45 is only “free” if you’ve gone to the effort of using a six-stage method. The amount of computation is roughly proportional to the number of stages, so we do about 50% more work to have RKF45 with an error estimate than doing the most common 4th order RK method. So if you knew exactly what step size to use, basic RK would be more efficient. But how could you know the optimal step size a priori?

By guiding us to choose the right step size, the extra work in RKF45 more than pays for itself. It could save a lot of computation that would come from using too small a step size, or prevent inaccurate results due to using too large a step size. Or maybe both: maybe a differential equation needs small steps at one period of time and can use larger steps at another period of time.

For more information on Felhberg’s method, see Solving Differential Equations by Hairer et al.

Trapezoid rule and Romberg integration

Posted on 18 February 2020 by John

This post will look at two numerical integration methods, the trapezoid rule and Romberg’s algorithm, and memoization. This post is a continuation of ideas from the recent posts on Lobatto integration and memoization.

Although the trapezoid rule is not typically very accurate, it can be in special instances, and Romberg combined it with extrapolation to create a very accurate method.

Trapezoid rule

The trapezoid is the simplest numerical integration method. The only thing that could be simpler is Riemann sums. By replacing rectangles of Riemann sums with trapezoids, you can make the approximation error an order of magnitude smaller.

The trapezoid rule is crude, and hardly recommended for practical use, with two exceptions. It can be remarkably efficient for periodic functions and for analytic functions that decay double exponentially. The trapezoid rule works so well in these cases that it’s common to transform a general function so that it has one of these forms so the trapezoid rule can be applied.

To be clear, the trapezoid rule for a given step size h may not be very accurate. But for periodic and double exponential functions the error decreases exponentially as h decreases.

Here’s an implementation of the trapezoid rule that follows the derivation directly.

    def trapezoid1(f, a, b, n):
        integral = 0
        h = (b-a)/n
        for i in range(n):
            integral += 0.5*h*(f(a + i*h) + f(a + (i+1)*h))
        return integral

This code approximates the integral of f(x) over [a, b] by adding up the areas of n trapezoids. Although we want to keep things simple, a slight change would make this code twice as efficient.

    def trapezoid2(f, a, b, n):
        integral = 0.5*( f(a) + f(b) )
        h = (b-a)/n
        for i in range(1, n):
            integral += f(a + i*h)
        return h*integral

Now we’re not evaluating f twice at every interior point.

Estimating error

Suppose you’ve used the trapezoid rule once, then you decide to use it again with half as large a step size in order to compare the results. If the results are the same within your tolerance, then presumably you have your result. Someone could create a function where this comparison would be misleading, where the two results agree but both are way off. But this is unlikely to happen in practice. As Einstein said, God is subtle but he is not malicious.

If you cut your step size h in half, you double your number of integration points. So if you evaluated your integrand at n points the first time, you’ll evaluate it at 2n points the second time. But half of these points are duplicates. It would be more efficient to save the function evaluations from the first integration and reuse them in the second integration, only evaluating your function at the n new integration points.

It would be most efficient to write your code to directly save previous results, but using memoization would be easier and still more efficient than redundantly evaluating your integrand. We’ll illustrate this with Python code.

Now let’s integrate exp(cos(x)) over [0, π] with 4 and then 8 steps.

    from numpy import exp, cos, pi

    print( trapezoid2(f, 0, pi, 4) )
    print( trapezoid2(f, 0, pi, 8) )

This prints

    3.97746388
    3.97746326

So this suggests we’ve already found our integral to six decimal places. Why so fast? Because we’re integrating a periodic function. If we repeat our experiment with exp(x) we see that we don’t even get one decimal place agreement. The code

    print( trapezoid2(exp, 0, pi, 4 ) )
    print( trapezoid2(exp, 0, pi, 8 ) )

prints

    23.26
    22.42

Eliminating redundancy

The function trapezoid2 eliminated some redundancy, but we still have redundant function evaluations when we call this function twice as we do above. When we call trapezoid2 with n = 4, we do 5 function evaluations. When we call it again with n = 8 we do 9 function evaluations, 4 of which we’ve done before.

As we did in the Lobatto quadrature example, we will have our integrand function sleep for 10 seconds to make the function calls obvious, and we will add memoization to have Python to cache function evaluations for us.

    from time import sleep, time
    from functools import lru_cache

    @lru_cache()
    def f(x):
        sleep(10)
        return exp(cos(x))

    t0 = time()
    trapezoid2(f, 0, pi, 4)
    t1 = time()
    print(t1 - t0)
    trapezoid2(f, 0, pi, 8)
    t2 = time()
    print(t2 - t1)

This shows that the first integration takes 50 seconds and the second requires 40 seconds. The first integration requires 5 function evaluations and the second requires 9, but the latter is faster because it only requires 4 new function evaluations.

Romberg integration

In the examples above, we doubled the number of integration intervals and compared results in order to estimate our numerical integration error. A natural next step would be to double the number of intervals again. Maybe by comparing three integrations we can see a pattern and project what the error would be if we did more integrations.

Werner Romberg took this a step further. Rather than doing a few integrations and eye-balling the results, he formalized the inference using Richardson extrapolation to project where the integrations are going. Specifically, his method applies the trapezoid rule at 2^m points for increasing values of m. The method stops when either the maximum value of m has been reached or the difference between successive integral estimates is within tolerance. When Romberg’s method is appropriate, it converges very quickly and there is no need for m to be large.

To illustrate Romberg’s method, let’s go back to the example of integrating exp(x) over [0, π]. If we were to use the trapezoid rule repeatedly, we would get these results.

     Steps    Results
         1  37.920111
         2  26.516336
         4  23.267285
         8  22.424495
        16  22.211780
        32  22.158473

This doesn’t look promising. We don’t appear to have even the first decimal correct. But Romberg’s method applies Richardson extrapolation to the data above to produce a very accurate result.

    from scipy.integrate import romberg

    r = romberg(exp, 0, pi, divmax = 5) 
    print("exact:   ", exp(pi) - 1)
    print("romberg: ", r)

This produces

    exact:    22.1406926327792
    romberg:  22.1406926327867

showing that although none of the trapezoid rule estimates are good to more than 3 significant figures, the extrapolated estimate is good 12 figures, almost to 13 figures.

If you pass the argument show=True to romberg you can see the inner workings of the integration, including a report that the integrand was evaluated 33 times, i.e. 1 + 2^m times when m is given by divmax.

It seems mysterious how Richardson extrapolation could take the integral estimates above, good to three figures, and produce an estimate good to twelve figures. But if we plot the error in each estimate on a log scale it becomes more apparent what’s going on.

Plot of error in Romberg integration

The errors follow nearly a straight line, and so the extrapolated error is “nearly” negative infinity. That is, since the log errors nearly follow a straight line going down, polynomial extrapolation produces a value whose log error is very large and negative.

Lobatto integration

Posted on 15 February 2020 by John

A basic idea in numerical integration is that if a method integrates polynomials exactly, it should do well on polynomial-like functions [1]. The higher the degree of polynomial it integrates exactly, the more accurate we expect it will be on functions that behave like polynomials.

The best known example of this is Gaussian quadrature. However, this post shows why for some applications you might want to use Lobatto quadrature instead. Lobatto quadrature is similar in spirit to Gaussian quadrature, but allocates points to solve a slightly different optimization problem.

Gaussian quadrature

If you need to integrate a function over an interval, it matters a great deal where you choose to evaluate the function. The optimal choice, in term of what polynomials you can integrate exactly, is to use Gaussian quadrature. By evaluating the integrand at n optimally chosen points you can integrate polynomials of degree 2n − 1 or less exactly.

So if Gaussian integration is optimal, why would you want to do anything else? The Gaussian integration points are all interior to the interval of integration. In some applications, you have to evaluate your integrand at the end points, and so you want to optimize subject to a constraint: how can you best allocate your integration points subject to the constraint that two of your integration points are the two ends of the interval? The solution is Lobatto quadrature, also called Gauss-Lobatto quadrature.

Lobatto quadrature

By evaluating the integrand at n points, two of which are the end points, Lobatto’s method exactly integrates polynomials of degree 2n-3.

Suppose for whatever reason you already know the value of the integrand evaluated at the end points. You’ve got m more function evaluations to spend. If you use those m points for Gaussian quadrature, you can exactly integrate polynomials of degree

2m − 1

or less. But if you use Lobatto quadrature, your m interior evaluations plus your two known values at the end points give you a total of m+2 function evaluations, and so can integrate polynomials of degree

2(m + 2) − 3 = 2m + 1

or less exactly, two degrees higher than if we had used Gaussian quadrature.

Next, suppose you only know the value of the function you’re integrating at one end point. Say you’ve already integrate f(x) over the interval [1, 2] using Lobatto quadrature, and now you want to integrate over [2, 3]. You already know the value f(2) from your previous integration.

Suppose you have m new function evaluations you can afford, and you don’t know f(3). If you use Lobatto quadrature, f(3) has to come out of your function evaluation budget, so you can afford m − 1 interior integration points. You then know the value of f(x) at m + 1 points: f(2) came for free, you evaluated f(x) at m − 1 interior points and at x = 3. This lets you exactly integrate polynomials of degree

2(m + 1) − 3 = 2m − 1

or less, the same as Gaussian quadrature. But if you then need to integrate over [3, 4], knowing f(3) gives you a head start on the next integration, and so on.

Weights and integration points

You can look up the weights and integration points for Gaussian quadrature and Lobatto quadrature in, for example, Abramowitz and Stegun.

There is a nice symmetry between the two integration methods: Gaussian quadrature uses integration points based on the zeros of Legendre polynomials, and weights that depend on the derivatives of these polynomials. Lobatto quadrature is the other way around: integration points are given by the zeros of derivatives of Legendre polynomials, and the weights involve the Legendre polynomials themselves.

Python example

Here we’ll implement the Gauss and Lobatto rules of order five. Most of the code is data on integration points and weights.

    gauss_points = [
        -0.906179845938664,
        -0.538469310105683,
        0.0,
        +0.538469310105683,
        +0.906179845938664
    ]
    
    gauss_weights = [
        0.23692688505618908,
        0.47862867049936647,
        0.5688888888888889,
        0.47862867049936647,    
        0.23692688505618908
    ]
    
    lobatto_points = [
        -1.0,
        -0.6546536707079771,
        0,
        +0.6546536707079771,
        +1.0
    ]
    
    lobatto_weights = [
        0.1,
        0.5444444444444444,
        0.7111111111111111,
        0.5444444444444444,
        0.1
    ]

The integration points and weights are symmetrical, so you could make the code more compact at the expense of making it a little more complicated. Putting the + in front of positive integration points is a little unconventional, but it emphasizes the symmetry by making the positive and negative weights align vertically.

Here’s our integration code:

    def integrate(f, xs, ws):
        return sum(f(xs[i])*ws[i] for i in range(5))

where we pass it the function to integrate and either Gauss data or Lobatto data.

The following verifies that with 5 integration points, Gauss should be able to exactly integrate a 9th order polynomial, and Lobatto should be able to integrate a 7th order polynomial.

    print( integrate(lambda x: x**9 + 1,
               gauss_points, gauss_weights) )

    print( integrate(lambda x: x**7 + 1,
               lobatto_points, lobatto_weights) )

Both print 2.0 as expected. The integral of an odd function over [−1, 1] is zero, and the integral of 1 over the same interval is 2.

Now let’s use both to integrate cosine over [−1, 1].

    print( integrate(cos, gauss_points, gauss_weights) )
    print( integrate(cos, lobatto_points, lobatto_weights) )

The exact integral is 2 sin(1). Here are the results.

    Exact:   1.682941969
    Gauss:   1.682941970
    Lobatto: 1.682942320

So Gauss is correct to 8, almost 9, decimal places, and Lobatto is correct to 5 decimal places.

Next, let’s hard code a 3rd order Gauss rule for comparison.

    def gauss3(f):
        k = 0.6**0.5
        s = f(-k)*5 + f(0)*8 + f(k)*5
        return s/9

We can verify that it integrates fifth order polynomials exactly:

    print(gauss3(lambda x: x**5 + 1))

and we can use it to integrate cosine:

    print(gauss3(cos))

This returns 1.68300, a little less accurate then the Lobatto rule above, illustrating that typically Lobatto will be more accurate than Gauss with the same number of function evaluations interior to the interval of integration.

More on numerical integration

[1] Polynomials can’t have horizontal asymptotes, for example, and so we should not be surprised that a method that integrates high order polynomials exactly could still do poorly on, say, a normal probability density.

Runge-Kutta methods and Butcher tableau

Posted on 13 February 2020 by John

If you know one numerical method for solving ordinary differential equations, it’s probably Euler’s method. If you know two methods, the second is probably 4th order Runge-Kutta. It’s standard in classes on differential equations or numerical analysis to present Euler’s method as conceptually simple but inefficient introduction, then to present Runge-Kutta as a complicated but efficient alternative.

Runge-Kutta methods are a huge family of numerical methods with a wide variety of trade-offs: efficiency, accuracy, stability, etc. Euler’s method is a member of the Runge-Kutta family as are countless other variations. You could devote a career to studying Runge-Kutta methods, and some people have.

Beneath the complexity and variety, all Runge-Kutta methods have a common form that can be summarized by a matrix and two vectors. For explicit Runge-Kutta methods (ERK) the matrix is triangular, and for implicit Runge-Kutta methods (IRK) the matrix is full.

This summary of an RK method is known as a Butcher tableau, named after J. C. Butcher who classified RK methods.

“The” Runge-Kutta method

For example, let’s start with what students often take to be “the” Runge-Kutta method. This method approximates solutions to a differential equation of the form

y' = f(t, y)

$y_{n+1} = y_n + \frac{h}{6}\left( k_{n1} + 2k_{n2} + 2k_{n3} + k_{n4}\right)$

where

$k_{n1} &=& f(t_n, y_n) \\ k_{n2} &=& f(t_n + 0.5h, y_n + 0.5hk_{n1}) \\ k_{n3} &=& f(t_n + 0.5h, y_n + 0.5hk_{n2}) \\ k_{n4} &=& f(t_n + h, y_n + hk_{n3}) \\$

The Butcher tableau for this ERK method is

$\begin{array} {c|cccc} 0\\ 1/2 & 1/2\\ 1/2 &0 &1/2 \\ 1& 0& 0& 1\\ \hline & 1/6 & 1/3 & 1/3 &1/6 \end{array}$

The numbers along the left side are the coefficients of h in the first argument of f.

The numbers along the bottom are the coefficients of the ks in the expression for the value of y at the next step.

The numbers in the middle of the array are the coefficients of the ks in second argument of f. Because this is an explicit method, each k only depends on the previous ks, and so the table of coefficients has a triangular form.

Runge-Kutta 3/8 rule

The method above is the most common 4th order ERK rule, there is another known as the 3/8 rule. It is a little less efficient and a little more accurate. A step of this rule is given by

$y_{n+1} = y_n + \frac{h}{8}\left( k_{n1} + 3k_{n2} + 3k_{n3} + k_{n4}\right)$

where

$\begin{align*} k_{n1} &= f(t_n, y_n) \\ k_{n2} &= f(t_n + \frac{h}{3}, y_n + \frac{h}{3}k_{n1}) \\ k_{n3} &= f(t_n +\frac{2h}{3}, y_n -\frac{h}{3}k_{n1} + hk_{n2}) \\ k_{n4} &= f(t_n + h, y_n + h k_{n1} - h k_{n2} + hk_{n3}) \end{align*}$

This method is summarized in the following Butcher tableau.

$\begin{array} {c|cccc} 0\\ 1/3 & 1/3\\ 2/3 & -1/3 &1 \\ 1& 1& -1 & 1\\ \hline & 1/8 & 3/8 & 3/8 &1/8 \end{array}$

This example makes it a little easier to see what’s going on since none of the coefficients in the triangular array are zero. Full detail is given in the section below.

General Explicit Runge-Kutta

The most general form of an ERK rule with s steps is

$y_{n+1} = y_n + h \sum_{i-1}^s b_i k_{ni}$

where

$k_{ni} = f\left(x_n + c_i h, y_n + h \sum_{j=1}^{i-1} a_{ij} k_{nj}\right)$

and the Butcher tableau is

$\begin{array} {c|ccccc} 0\\ c_2 & a_{21}\\ c_3 & a_{31} & a_{32} \\ \vdots & \vdots & & \ddots \\ c_s& a_{s1}& a_{s2} & \cdots & a_{s,s-1}\\ \hline & b_1 & b_2 & \cdots & b_{s-1} & b_s \end{array}$

General implicit Runge-Kutta

With explicit (ERK) methods, each k depends only on its predecessors. With implicit (IRK) methods each k potentially depends on each of the others. The matrix in the tableau is full, not triangular, and one must solve for the ks.

Now

$k_{ni} = f\left(x_n + c_i h, y_n + h \sum_{j=1}^s a_{ij} k_{nj}\right)$

with the sum going all the way up to s, and the Butcher tableau is

$\begin{array} {c|ccccc} c_1 & a_{11} & a_{12} & \cdots & a_{1s} \\ c_2 & a_{21} & a_{22} & \cdots & a_{2s} \\ \vdots & \vdots & & \ddots & \vdots \\ c_s& a_{s1}& a_{s2} & \cdots & a_{s,s}\\ \hline & b_1 & b_2 & \cdots & b_{s} \end{array}$

Implicit methods are more complicated to implement, and require more computation for a given step size. However, they are more stable for stiff differential equations and may allow larger steps. Implicit methods are less efficient when they’re not needed, and more efficient when they are needed.

Back to Euler’s method

I said at the top of the post that Euler’s method was a special case of Runge-Kutta. The Butcher tableau for the explicit (forward) Euler method is simply

$\begin{array} {c|c} 0 & 0\\ \hline & 1\end{array}$

and the tableau for the implicit (backward) Euler method is just

$\begin{array} {c|c} 1 & 1\\ \hline & 1\end{array}$

In this post I say more about these two methods and compare their stability.

More posts on ODE solvers

Numerical issues

Related posts

Fused multiply-add (FMA)

Kahan’s determinant algorithm

More floating point posts

More integration posts

C code demo

Dormand Prince tableau

Order 4(5) vs order 5(4)

Derivative reuse

Precision per unit work

More differential equation posts

Runge-Kutta-Felhberg (RKF45)

Related posts

Trapezoid rule

Estimating error

Eliminating redundancy

Romberg integration

More on numerical integration

Gaussian quadrature

Lobatto quadrature

Weights and integration points

Python example

More on numerical integration

“The” Runge-Kutta method

Runge-Kutta 3/8 rule

General Explicit Runge-Kutta

General implicit Runge-Kutta

Back to Euler’s method

More on differential equations