The most obvious way to approximate the derivative of a function numerically is to use the definition of derivative and stick in a small value of the step size *h*.

*f*′ (*x*) ≈ ( *f*(*x* + *h*) − *f*(*x*) ) / *h*.

How small should *h* be? Since the exact value of the derivative is the limit as *h* goes to zero, the smaller *h* is the better. Except that’s not true in computer arithmetic. When *h* is too small, *f*(*x* + *h*) is so close to *f*(*x*) that the subtraction loses precision.

One way to see this is to think of the extreme case of *h* so small that *f*(*x* + *h*) equals *f*(*x*) to machine precision. Then the derivative is approximated by 0, regardless of what the actual derivative is.

As *h* gets smaller, the approximation error decreases, but the numerical error increases, and so there is some optimal value of *h*.
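To see the trade-off concretely, here is a small experiment (my example, using `math.exp`, whose derivative at 1 is *e*, as a stand-in for a generic smooth function):

```python
import math

def forward_diff(f, x, h):
    # One-sided difference quotient straight from the definition of the derivative.
    return (f(x + h) - f(x)) / h

# The derivative of exp at x = 1 is e, so the error is easy to measure.
exact = math.e
for h in [1e-1, 1e-8, 1e-16]:
    approx = forward_diff(math.exp, 1.0, h)
    print(f"h = {h:.0e}   error = {abs(approx - exact):.2e}")
```

For double precision the sweet spot for the one-sided formula is near *h* ≈ √ε ≈ 10⁻⁸. By *h* = 10⁻¹⁶ the computed value of 1 + *h* rounds to exactly 1, and the approximation collapses to 0.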

You can do significantly better by using a symmetric approximation for the derivative:

*f*′ (*x*) ≈ ( *f*(*x* + *h*) − *f*(*x* − *h*) ) / 2*h*.

For a given value of *h*, this formula has about the same numerical error but less approximation error. You still have a trade-off between approximation error and numerical error, but it’s a better trade-off.
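A quick check of the improved trade-off (my example, not from the post): the symmetric formula's approximation error is O(*h*²) rather than O(*h*), so shrinking *h* tenfold should shrink the error roughly a hundredfold, as long as *h* is large enough that roundoff stays negligible:

```python
import math

def central_diff(f, x, h):
    # Symmetric difference quotient: approximation error is O(h**2).
    return (f(x + h) - f(x - h)) / (2 * h)

exact = math.cos(1.0)  # derivative of sin at x = 1
err_coarse = abs(central_diff(math.sin, 1.0, 1e-2) - exact)
err_fine = abs(central_diff(math.sin, 1.0, 1e-3) - exact)
print(err_coarse / err_fine)  # roughly 100, confirming second-order accuracy
```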

If the function *f* that you want to differentiate is analytic, i.e. differentiable as a function of a complex variable, you can take the step *h* to be a complex number. When you do, the numerical difficulty completely goes away, and you can take *h* much smaller.

Suppose *h* is a small real number and you take

*f*′ (*x*) ≈ ( *f*(*x* + *ih*) − *f*(*x* − *ih*) ) / 2*ih*.

Now *f*(*x* − *ih*) is the complex conjugate of *f*(*x* + *ih*) by the Schwarz reflection principle. So in the subtraction the real parts cancel exactly, while the imaginary parts add rather than canceling. We have

*f*(*x* + *ih*) − *f*(*x* − *ih*) = 2*i* Im *f*(*x* + *ih*)

and so

*f*′ (*x*) ≈ Im *f*(*x* + *ih*) / *h*.
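To see how forgiving the complex step is, here is a sketch (my example; the post's gamma example follows below) using `cmath.exp` in place of an arbitrary analytic function. Because nothing is subtracted, the step can be made absurdly small:

```python
import cmath
import math

def complex_step(f, x, h):
    # f'(x) ≈ Im f(x + ih) / h: there is no subtraction, hence no cancellation.
    return f(x + h * 1j).imag / h

# The derivative of exp at 1 is e; even h = 1e-20, far below anything a
# real-step formula could tolerate, gives an answer near machine precision.
print(abs(complex_step(cmath.exp, 1.0, 1e-20) - math.e))
```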

## Example

Let’s take the derivative of the gamma function Γ(*x*) at 1 using each of the three methods above. The exact value is −γ where γ is the Euler-Mascheroni constant. The following Python code shows the accuracy of each approach.

```python
from scipy.special import gamma

def diff1(f, x, h):
    return (f(x + h) - f(x))/h

def diff2(f, x, h):
    return (f(x + h) - f(x - h))/(2*h)

γ = 0.5772156649015328606065
x = 1
h = 1e-7
exact = -γ

approx1 = diff1(gamma, x, h)
approx2 = diff2(gamma, x, h)
approx3 = diff2(gamma, x, h*1j)

print(approx1 - exact)
print(approx2 - exact)
print(approx3 - exact)
```

This prints

```
9.95565755390615e-08
1.9161483510998778e-10
(9.103828801926284e-15-0j)
```

In this example the symmetric finite difference approximation is about 500 times more accurate than the asymmetric approximation, but the complex step approximation is about 10,000,000 times more accurate.

It’s a little awkward that the complex step approximation returns a complex number, albeit one with zero imaginary part. We can eliminate this problem by using the approximation

*f*′ (*x*) ≈ Im *f*(*x* + *ih*) / *h*

which will return a real number. When we implement this in Python as

```python
def diff3(f, x, h):
    return (f(x + h*1j) / h).imag
```

we see that it produces the same result as `diff2` with the complex step, but without the zero imaginary part.

This is a cool technique. I have never seen it before. It does seem to me that there is a bit of an apples-to-oranges comparison because there might be many more arithmetic operations involved in calculating the function of a complex variable versus the function of a real variable. Perhaps a better comparison would be to a higher-order purely-real formula. On the other hand, if one is comparing lines of code and coding effort, this really is an apples-to-apples comparison.
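For the record, one higher-order purely-real candidate for such a comparison is the standard five-point (fourth-order) central difference. A sketch, using `math.exp` rather than the gamma function to keep it self-contained:

```python
import math

def diff4(f, x, h):
    # Five-point central difference: approximation error is O(h**4),
    # at the cost of four real function evaluations.
    return (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12*h)

# The derivative of exp at 1 is e.
print(abs(diff4(math.exp, 1.0, 1e-3) - math.e))
```

Four real evaluations is roughly comparable work to a single complex evaluation, which makes the cost comparison in the comment above concrete.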

Your exposition is hard for me to follow at f(x + ih) − f(x − ih) ≈ 2 f(x + ih)

If f(x) = x, for example, is it not the case that

f(x+ih) - f(x-ih) = (x + ih) − (x − ih) = 2ih ?

and, of course, 2ih / (2ih) = 1 = the derivative of the function f(x) = x

More generally, isn’t f(x + ih) − f(x − ih) = 2 ih (df/dx) + higher order terms?
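That expansion checks out. As a concrete illustration (my example), take f(x) = x³, where everything can be computed in closed form: (x + ih)³ − (x − ih)³ = 2ih(3x² − h²), so dividing by 2ih recovers f′(x) = 3x² up to an O(h²) term:

```python
# Concrete check of f(x + ih) - f(x - ih) = 2ih*f'(x) + higher-order terms
# for f(x) = x**3, where the difference is exactly 2ih*(3x**2 - h**2).
x, h = 2.0, 1e-3
numer = (x + 1j*h)**3 - (x - 1j*h)**3
approx = (numer / (2j*h)).real
# approx is 3*x**2 - h**2, just below the true derivative 12.
print(approx)
```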