SciPy integration misunderstanding

Today I needed to compute an integral similar to this:

\int_{1000}^\infty \frac{dx}{100, x^3}

I used the following SciPy code to compute the integral:

from scipy.integrate import quad

def f(x):
    return 0.01*x**-3

integral, error = quad(f, 1000, sp.inf, epsrel = 1e-6)
print integral, error

My intention was to compute the integral to 6 significant figures. (epsrel is a shortened form of epsilon relative, i.e. relative error.) To my surprise, the estimated error was larger than the value of the integral. Specifically, the integral was computed as 5.15 × 10−9 and the error estimate was 9.07 × 10−9.

What went wrong? The integration routine quad lets you specify either a desired bound on your absolute error (epsabs) or a desired bound on your relative error (epsrel). I assumed that since I specified the relative error, the integration would stop when the relative error requirement was met. But that’s not how it works.

The quad function has default values for both epsabs and epsrel.

def quad(... epsabs=1.49e-8, epsrel=1.49e-8, ...):

I thought that since I did not specify an absolute error bound, the bound was not effective, or equivalently, that the absolute error target was 0. But no! It was as if I’d set the absolute error bound to 1.49 × 10−8. Because my integral is small (the exact value is 5 × 10−9) the absolute error requirement is satisfied before the relative error requirement and so the integration stops too soon.

The solution is to specify an absolute error target of zero. This condition cannot be satisfied, and so the relative error target will determine when the integration stops.

integral, error = quad(f, 1000, sp.inf, epsrel = 1e-6, epsabs = 0)

This correctly computes the integral as 5 × 10−9 and estimates the integration error as 4 ×10−18.

It makes some sense for quad to specify non-zero default values for both absolute and relative error, though I imagine most users expect small relative error rather than small absolute error, so perhaps the latter could be set to 0 by default.

10 thoughts on “SciPy integration misunderstanding

  1. If there was a reliable way to detect whether the user set the parameter or not this’d be easy to solve. Perhaps have both set to negative 1.49e-8 and have a negative value as an indication you should use the positive value only if the other parameter is not positive (and thus set by the user). I would prefer if Python could provide a method to call on the parameter to find out; something like “epsrel.user_set == True”.

  2. A common pattern i have seen to distinguish default values from user specified values, is to default to None.
    instead of:

    def foo(a=0, b=0):
    ....

    do

    def foo(a=None, b=None):
    if (a==None): a = 0
    if (b==None): b = 0

    or in complicated cases you can do:

    def foo(a=None, b=None):
    if (a==None) and (b==None):
    ....
    elif (a==None):
    ....

  3. My a priori expectation, having never used SciPy or looked at the documentation, would be for quad to have to comply with both the epsabs and epsrel constraints so if you didn’t care about the absolute error you should set epsabs to infinite, not zero.

    Saying “tolerate zero absolute error” as a means of saying “don’t care about the error” seems very back to front. Say you’d put epsrel = 1e-6, epsabs = 2e-18 then you’d have got the same result as with epsrel = 1e-6, epsabs = 0 (i.e., a result with an absolute error of 4e-18) which just seems wrong; ignoring a constraint the user has expressed is potentially quite dangerous.

    I think this is an API design failure which arises from thinking about the whole issue the wrong way; where the API should be defined in terms of “what” is wanted (what constraints to put on the result) it is actually defined by “how” it’s to be produced (how should we terminate our loop).

  4. @Ed

    It seems all right to me. I read it as “iterate until the estimated error is epsabs or less”. In this case, you want to iterate until the error reaches as low as possible (and 0 is the minimum), because you only want to stop when a certain relative error is reached.

    The code probably has a while loop that goes on until either epsilon is satisfied.

  5. Typically, the criterion used is to accept an approximate value if

    abs(error_estimate) <= eps_abs + eps_rel * abs(value).

    So setting eps_abs to zero has the desired effect.
    The reason this is not done by default (when eps_rel is specified) is that this causes a problem when the value is zero or very small (which you may not now before doing the calculation). For values close to zero setting eps_abs often makes more sense than setting eps_rel.

  6. @Vlad N: «I read it as “iterate until the estimated error is epsabs or less”. »

    You’re thinking that the specification should say how the function should be implemented, not what its result should be. This is poor abstraction.

  7. @Ed Davies

    It would be great if we could specify that the approximation should be within eps of the exact solution, but unfortunately for many general numerical methods this is not possible. So the abstraction is necessarily leaky and it is useful to be aware of some of the implementation details.
    Unless you do some analysis of the specific problem at hand, the best you can hope for is that eps_abs and eps_rel are knobs that you can turn to get a more precise solution. This is especially the case for a lot of software for approximating the solution of differential equations.
    The documentation and references for QUADPACK [1] are very useful for the case of numerical integration.
    Quote from [2]: “Yet it must be stressed that the programs cannot be totally reliable.”

    [1] http://nines.cs.kuleuven.be/software/QUADPACK/
    [2] http://www.netlib.org/quadpack/doc

  8. You probably could have mapped the integral from 1000 -> infinity of x^-3 to be an integral from0 to 1 of something proportional to x^3 and it would have been more numerically well-posed.

  9. I tried the same integral in R, using the default argument and I arrived at similar values:
    fun <- function(x) (1/(100*x^3)) integrate(fun,lower=1000,upper=Inf)
    5.156078e-09 with absolute error < 9.1e-09

    But unlike in Python, as the doc shows, the abs error is by default equated to the relative error.

  10. Hi guys,
    While i was trying to evaluate following integral
    def i1(x,r,delta):
    alpha=0.1
    kf=np.sqrt(1-delta**2)
    aa=np.cos((2*kf+x)*r)
    bb=sp.exp(-alpha*x)
    return pist(x)*aa*bb
    def result1(r,delta):
    res1,err=integrate.quad(lambda x: i1(x,r,delta),0,sp.inf, epsrel = 1e-6, epsabs = 2e-18)
    return res1
    take for example, delta=0.6. After evaluation what i found that for some values of r, the error is comparable or more than the result. Can any one of you throw light on this issue. Moreover what theoretical suggestion shows that the integral will be independent of alpha (defined in program), but i don’k know what’s happening with the quad routine in python.

Comments are closed.