Here’s a simple integration approximation that works remarkably well in some contexts.
Suppose you have an integrand that looks roughly like a normal density, something with a single peak that drops off fairly quickly on either side of the peak. The majority of integrals that arise in basic applications of probability and statistics fit this description.
You can approximate the value of the integral as the area of a box. The height of the box is the value of the integrand at its peak. The width of the box is the FWHM: full width at half maximum, i.e. the distance between the two points where the integrand takes on half its maximum value.
This post will include several examples to illustrate how well, or how poorly, the technique works.
First let’s look at exp(-x²), the normal density function apart from some scaling factors, integrated over the whole real line. The maximum of the function is 1, occurring at x = 0. So half the maximum is 1/2, and exp(-x²) = 1/2 at x = ± √(log 2). The FWHM is thus 2 √(log 2), and since the maximum is 1, the FWHM itself is our approximation to the integral. This gives 1.665, whereas the exact value is √π ≈ 1.772.
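As a quick sanity check, here is the box approximation applied to exp(-x²). This is my own sketch, not code from the post; the helper name fwhm_approx is invented. It uses scipy’s brentq to locate the half-maximum points and quad to supply the exact value.

```python
from math import exp, log, pi
from scipy.integrate import quad
from scipy.optimize import brentq

def fwhm_approx(f, peak_x, left, right):
    # Approximate the integral of f as (height at the peak) * (full width at half maximum).
    # [left, peak_x] and [peak_x, right] must each bracket a half-maximum point.
    fmax = f(peak_x)
    g = lambda x: f(x) - 0.5*fmax       # zero at the half-maximum points
    lo = brentq(g, left, peak_x)        # left half-max point
    hi = brentq(g, peak_x, right)       # right half-max point
    return fmax * (hi - lo)

f = lambda x: exp(-x**2)
approx = fwhm_approx(f, 0.0, -3.0, 3.0)   # 2*sqrt(log 2) ≈ 1.665
exact, _ = quad(f, -10, 10)               # sqrt(pi) ≈ 1.772
print(approx, exact)
```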
Next, let’s look at integrating 1/(1 + x²ⁿ) for positive integers n. In every case, the maximum is 1. The integrand takes on half its maximum value at x = ± 1, so the FWHM is always 2. So we would approximate the integral as 2 in every case, independent of n.
When n = 1, the correct value of the integral is π, and so our approximation of 2 is pretty bad. But that’s not surprising: we said above that our integrand needs to drop off fairly quickly, and that’s not the case here. We have a fat-tailed distribution, the Cauchy, so it’s no surprise that the approximation is poor.
However, as n increases, the integrand drops off more quickly, and the approximation improves. When n = 2, the exact value of the integral is 2.221. When n = 3, the integral is 2.094. When n = 10 the integral is 2.008. And in the limit as n goes to infinity, the integral is 2 and our approximation is exact.
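To reproduce the numbers above (a quick check of my own, not from the post), compare the constant FWHM approximation of 2 with numerical values of the integrals:

```python
from math import inf
from scipy.integrate import quad

# The FWHM approximation is 2 for every n; compare against numerical integration.
exact = {}
for n in [1, 2, 3, 10]:
    exact[n], _ = quad(lambda x: 1/(1 + x**(2*n)), -inf, inf)
    print(n, exact[n], abs(exact[n] - 2)/exact[n])
```

The relative error shrinks from about 36% at n = 1 to under half a percent at n = 10, matching the figures quoted above.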
The integrands in the examples above were all symmetric about the origin. That may give the misleading impression that our approximation only applies to symmetric integrals. The examples also all had a maximum of 1, which could give the impression that the integral approximation is just the FWHM, not the FWHM times the maximum. We’ll correct both of these impressions in the next example.
Our task is to approximate the integral

∫ exp(ax) / (1 + exp(x))^(a+b) dx

over the whole real line, which comes up in Bayesian logistic regression with an improper prior.

The integrand is symmetric about 0 when a = b. Otherwise the integrand is not symmetric: setting the derivative of ax − (a + b) log(1 + exp(x)) to zero shows the peak occurs at x = log(a/b) rather than at 0, and the value at that peak is

a^a b^b / (a + b)^(a+b).
We’ll use the following Python code to help with our calculations.
from math import exp, log

from scipy.optimize import brentq

def logistic_max(a, b):
    return a**a * b**b / (a + b)**(a + b)

def fwhm(a, b):
    hm = 0.5 * logistic_max(a, b)
    # log of the integrand minus log of the half maximum; zero at the half-max points
    f = lambda x: a*x - (a + b)*log(1 + exp(x)) - log(hm)
    m = log(a/b)  # location of the peak
    left = brentq(f, m - 2, m)
    right = brentq(f, m, m + 2)
    return right - left

def approx(a, b):
    return logistic_max(a, b) * fwhm(a, b)
And now for a couple examples. When a = 3 and b = 2, we get an approximate value of 0.0761, compared to an exact value of 0.0833, a relative error of 8.6%.
When a = 10 and b = 12, we get an approximate value of 2.647 × 10⁻⁷, compared to an exact value of 2.835 × 10⁻⁷, a relative error of 6.6%.
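The two examples can be reproduced end to end with the snippet below (a check of my own; the functions from above are repeated so the snippet is self-contained, the integrand is written in log form so exp(ax) can’t overflow, and the “exact” values come from scipy’s quad rather than a closed form):

```python
from math import exp, log
from scipy.integrate import quad
from scipy.optimize import brentq

def logistic_max(a, b):
    return a**a * b**b / (a + b)**(a + b)

def fwhm(a, b):
    hm = 0.5 * logistic_max(a, b)
    f = lambda x: a*x - (a + b)*log(1 + exp(x)) - log(hm)
    m = log(a/b)  # location of the peak
    return brentq(f, m, m + 2) - brentq(f, m - 2, m)

def approx(a, b):
    return logistic_max(a, b) * fwhm(a, b)

results = {}
for a, b in [(3, 2), (10, 12)]:
    # integrand in log form: exp(a*x - (a+b)*log(1 + exp(x)))
    integrand = lambda x, a=a, b=b: exp(a*x - (a + b)*log(1 + exp(x)))
    exact, _ = quad(integrand, -50, 50)
    est = approx(a, b)
    results[(a, b)] = (est, exact)
    print(a, b, est, exact, abs(est - exact)/exact)
```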
One thought on “Integral approximation trick”
You could think of this as a special case of ye olde trapezoid rule. For simplicity’s sake translate the function so that the maximum occurs at 0. If f(0) = h, f(a) = f(-a) = h/2, and f(na) = 0 for all integers n > 1 or n < -1 (this is why you need the function to decay rapidly), approximating using trapezoids of width a gives
(0 + h/2)/2 · a + (h/2 + h)/2 · a + (h + h/2)/2 · a + (h/2 + 0)/2 · a = 2ah.
If f is asymmetric, you can instead let f(b) = f(-a) = h/2, and use trapezoids of different widths on the left and right, obtaining
(0 + h/2)/2 · a + (h/2 + h)/2 · a + (h + h/2)/2 · b + (h/2 + 0)/2 · b = (a + b)h.
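The commenter’s sums check out; plugging in arbitrary values (my own quick arithmetic check, with h, a, b chosen at random):

```python
# Evaluate the commenter's trapezoid sums for sample values of h, a, b.
h, a, b = 1.0, 2.0, 3.0
sym  = (0 + h/2)/2*a + (h/2 + h)/2*a + (h + h/2)/2*a + (h/2 + 0)/2*a
asym = (0 + h/2)/2*a + (h/2 + h)/2*a + (h + h/2)/2*b + (h/2 + 0)/2*b
print(sym, 2*a*h)        # symmetric case: matches 2ah
print(asym, (a + b)*h)   # asymmetric case: matches (a + b)h
```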