If you present calculus students with a definite integral, their Pavlovian response is “Take the anti-derivative, evaluate it at the limits, and subtract.” They think that’s what it means. But it’s not what a definite integral *means*. It’s how you (usually) *calculate* its value. This is not a pedantic fine point but a practically important distinction. **It pays to distinguish what something means from how you usually calculate it**. Without this distinction, things that are possible may seem impossible. [1]

For example, suppose you want to compute the following integral that comes up frequently in probability.

∫_{−∞}^{∞} exp(−*x*^{2}) d*x*

There is no (elementary) function whose derivative is exp(−*x*^{2}). It’s not just hard to find or ugly. It simply doesn’t exist, not within the universe of elementary functions. There are functions whose derivative is exp(−*x*^{2}), but these functions are not finite algebraic combinations of the kinds of functions you’d see in high school.
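
A computer algebra system reaches the same conclusion. Here's a quick check with SymPy (my illustration, not part of the argument): the antiderivative can only be written in terms of the non-elementary error function erf.

```python
import sympy as sp

x = sp.symbols('x')

# SymPy expresses the antiderivative of exp(-x^2) via erf,
# which is not an elementary function.
antiderivative = sp.integrate(sp.exp(-x**2), x)
print(antiderivative)  # sqrt(pi)*erf(x)/2
```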

If you think of the definite integral above as meaning “the result you get when you find an antiderivative, let its arguments go off to ∞ and −∞, and subtract the two limits” then you’ll never calculate it. And when you hear that the antiderivative doesn’t exist (in the world of functions you’re familiar with) then you might think that not only can you not calculate the integral, no one can.

In fact the integral is easy to calculate. It requires an ingenious trick [2], but once you see that trick it’s not hard.

Let *I* be the value of the integral. Changing the name of the integration variable makes no difference, i.e.

*I* = ∫_{−∞}^{∞} exp(−*x*^{2}) d*x* = ∫_{−∞}^{∞} exp(−*y*^{2}) d*y*

and so

*I*^{2} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} exp(−(*x*^{2} + *y*^{2})) d*x* d*y*

This integral can be converted to polar coordinates. Instead of describing the plane as an infinite square with *x* and *y* each going off to infinity in both directions, we can think of it as an infinite disk, with radius going off to infinity. The advantage of this approach is that the Jacobian of the change of variables gives us an extra factor of *r* that makes the exponential integral tractable.

*I*^{2} = ∫_{0}^{2π} ∫_{0}^{∞} exp(−*r*^{2}) *r* d*r* dθ
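
If you'd like to sanity-check the polar-coordinate step numerically, here's a sketch using SciPy's `dblquad` (my addition, assuming SciPy is available). Note the extra factor of *r* from the Jacobian in the integrand.

```python
import numpy as np
from scipy.integrate import dblquad

# Integrand in polar coordinates: exp(-r^2) times the Jacobian factor r.
# dblquad takes the inner variable (here r) as the first argument.
I_squared, _ = dblquad(lambda r, theta: np.exp(-r**2) * r,
                       0, 2 * np.pi,  # outer: theta from 0 to 2*pi
                       0, np.inf)     # inner: r from 0 to infinity
print(I_squared)  # ≈ 3.14159...
```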

From this we get *I*^{2} = π and so *I* = √π.
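
A one-line numerical check (again just an illustration, assuming SciPy): integrate the original function directly and compare with √π.

```python
import numpy as np
from scipy.integrate import quad

# Numerically integrate exp(-x^2) over the whole real line.
I, _ = quad(lambda x: np.exp(-x**2), -np.inf, np.inf)
print(I, np.sqrt(np.pi))  # the two values agree
```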

This specific trick comes up occasionally. But more generally, it is often the case that definite integrals are easier to compute than indefinite integrals. One of the most common applications of complex analysis is computing such integrals through the magic of contour integration. This leads to a lesson closely related to the one above, namely that **you may not have to do what it looks like you need to do**. In this case, you don’t always need to compute indefinite integrals (anti-derivatives) as an intermediate step to compute definite integrals. [3]
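
As a small illustration of this lesson (assuming SymPy), consider sin(*x*)/*x*: its antiderivative requires the non-elementary sine integral Si, yet the definite integral over (0, ∞) has a simple closed form, which contour integration is one standard way to obtain.

```python
import sympy as sp

x = sp.symbols('x')

# The antiderivative needs the non-elementary sine integral Si(x)...
indefinite = sp.integrate(sp.sin(x) / x, x)
print(indefinite)  # Si(x)

# ...but the definite integral over (0, oo) comes out in closed form.
definite = sp.integrate(sp.sin(x) / x, (x, 0, sp.oo))
print(definite)  # pi/2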

Mathematics is filled with theorems that effectively say that you don’t actually have to compute what you conceptually need to compute. Sometimes you can get by with calculating much less.

* * *

[1] One frustration I’ve had working with statisticians is that many have forgotten the distinction between *what* they want to calculate and *how* they calculate it. This makes it difficult to suggest better ways of computing things.

[2] Lord Kelvin said of this trick “A mathematician is one to whom *that* is as obvious as that twice two makes four is to you. Liouville was a mathematician.”

[3] If you look back carefully, we had to compute the integral of exp(−*r*^{2}) *r*, which you would do by first computing its anti-derivative. But we didn’t have to compute the anti-derivative of the original integrand. We traded a hard (in some sense impossible) anti-derivative problem for an easy one.
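
To make the footnote concrete, here's the easy antiderivative, computed with SymPy (my illustration):

```python
import sympy as sp

r = sp.symbols('r')

# The extra factor of r makes this a routine substitution (u = r^2).
antiderivative = sp.integrate(r * sp.exp(-r**2), r)
print(antiderivative)  # -exp(-r**2)/2
```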

I first came across this lesson in Prof. David Jerison’s lectures in MIT OCW’s Single Variable Calculus. Here’s the video (https://ocw.mit.edu/courses/mathematics/18-01sc-single-variable-calculus-fall-2010/unit-3-the-definite-integral-and-its-applications/part-c-average-value-probability-and-numerical-integration/session-65-bell-curve-conclusion); I’m pretty sure you (and your readers) will find it insightful and fun.

How timely! I’m preparing material for a first course in probability next semester, and needed a good, non-trivial application of the Jacobian. I am SO stealing this lesson. Thanks,

[1] Can you provide an example of this? I’d be interested to see exactly what you mean and what ramifications this has for statistical analysis.

Or you can use the definition of the normal density, which integrates to 1. Make sigma = 1/sqrt(2) (so that 2 sigma^2 = 1) and you immediately get sqrt(pi) for your answer.
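
A quick check of this identity with SciPy (my addition; it takes σ = 1/√2 so that 2σ² = 1):

```python
import numpy as np
from scipy.stats import norm

# With sigma = 1/sqrt(2), the exponent of the normal density is exactly -x^2,
# so the density equals exp(-x^2)/sqrt(pi); since the density integrates to 1,
# exp(-x^2) integrates to sqrt(pi).
sigma = 1 / np.sqrt(2)
x = 0.75  # arbitrary test point
print(np.isclose(norm.pdf(x, scale=sigma), np.exp(-x**2) / np.sqrt(np.pi)))  # True
```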

This problem is also a specific instance of the gamma function, in particular gamma(1/2).
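
Indeed, Γ(1/2) = √π, which you can confirm with Python's standard library:

```python
import math

# Gamma(1/2) equals sqrt(pi), consistent with the integral above.
print(math.gamma(0.5), math.sqrt(math.pi))  # the two values agree
```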

I think this calculation came first, i.e. that’s how we know the normalizing constant for the normal density.

Thank you!