The **Cauchy distribution** is the probability distribution on the real line with density proportional to 1/(1 + *x*²). It comes up often as an example of a fat-tailed distribution, one that can wreak havoc on intuition and on applications. It has no mean and no variance.

The **truncated Cauchy distribution** has the same density, with a different proportionality constant, over a finite interval [*a*, *b*]. It’s a well-behaved distribution, with a mean, variance, and any other moment you might care about.

Here’s a sort of **paradox**. The Cauchy distribution is pathological, but the truncated Cauchy distribution is perfectly well behaved. But if *a* is a big negative number and *b* is a big positive number, surely the Cauchy distribution on [*a*, *b*] is sorta like the full Cauchy distribution.

I will argue here that in some sense the truncated Cauchy distribution isn’t so well behaved after all, and that a Cauchy distribution truncated to a large interval is indeed like a Cauchy distribution.

You can show that the variance of the truncated Cauchy distribution over a large interval is approximately proportional to the length of the interval. So if you want to get around the problems of a Cauchy distribution by truncating it to live on a large interval, you’ve got a problem: it matters very much how you truncate it. For example, you might think “It doesn’t matter, make it [-100, 100]. Or why not [-1000, 1000] just to be sure it’s big enough.” But the variances of the two truncations differ by about a factor of 10.
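For readers who want to check the numbers, here is a minimal sketch of the variance calculation. It assumes the standard Cauchy (location 0, scale 1) and uses the closed-form antiderivatives of the truncated density; the function name `truncated_cauchy_variance` is made up for this illustration.

```python
import math

def truncated_cauchy_variance(a, b):
    """Variance of a standard Cauchy truncated to [a, b] (assumes a < b)."""
    # Normalizing constant: integral of 1/(1 + x^2) over [a, b].
    mass = math.atan(b) - math.atan(a)
    # Integral of x/(1 + x^2) is 0.5 * log(1 + x^2).
    mean = 0.5 * (math.log1p(b * b) - math.log1p(a * a)) / mass
    # x^2/(1 + x^2) = 1 - 1/(1 + x^2), so its integral is (b - a) - mass.
    second_moment = ((b - a) - mass) / mass
    return second_moment - mean ** 2

for half_width in (100, 1000):
    var = truncated_cauchy_variance(-half_width, half_width)
    length = 2 * half_width
    print(f"[-{half_width}, {half_width}]: variance = {var:.2f}, "
          f"length / pi = {length / math.pi:.2f}")
```

The two variances come out near 63 and 636, each close to the interval length divided by π, so the wider truncation has roughly ten times the variance.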

Textbooks usually say that the Cauchy distribution has **no** variance. It’s more instructive to say that it has **infinite** variance. And as the length of the interval you truncate the Cauchy to approaches infinity, so does its variance.

The mean of the Cauchy distribution does not exist. It fails to exist in a different way than the variance. The integral defining the variance of a Cauchy distribution diverges to +∞. But the integral defining the mean of the Cauchy simply does not exist. You can get different values of the integral depending on how you let the end points go off to infinity. In fact, you could get any value you want by specifying a particular way for the truncation interval to grow to the real line.

So once again you can’t simply say “Just truncate it to a big interval” because the mean depends on how you choose your interval.
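A small sketch makes the dependence on the interval concrete. It again assumes the standard Cauchy, and it grows the interval as [-t, ratio·t] for a fixed ratio, which is just one convenient way to let the end points go to infinity. The helper name `truncated_cauchy_mean` is hypothetical.

```python
import math

def truncated_cauchy_mean(a, b):
    """Mean of a standard Cauchy truncated to [a, b] (assumes a < b)."""
    mass = math.atan(b) - math.atan(a)
    # Integral of x/(1 + x^2) is 0.5 * log(1 + x^2).
    return 0.5 * (math.log1p(b * b) - math.log1p(a * a)) / mass

# Let the interval grow as [-t, ratio * t]; the mean settles on a value
# that depends on the ratio, not on how big t is.
for ratio in (1, 2, 10):
    for t in (1e3, 1e6, 1e9):
        mean = truncated_cauchy_mean(-t, ratio * t)
        print(f"ratio = {ratio:2d}, t = {t:.0e}: mean = {mean:.6f}, "
              f"log(ratio) / pi = {math.log(ratio) / math.pi:.6f}")
```

With the ratio held fixed, the mean converges to log(ratio)/π. Since that expression can take any real value as the ratio varies, every number is the limiting mean for some way of growing the interval.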

If we were working with a thin-tailed distribution, like a normal, and truncating it to a big interval, the choice of interval would make little difference. Whether you truncate a normal distribution to [-10, 5], or [-33, 42], or [-2000, 1000] makes almost no difference to the mean or the variance. But for the Cauchy distribution, it makes a substantial difference.
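The thin-tailed case can be checked the same way. This sketch assumes a standard normal (mean 0, variance 1) and uses the usual closed-form expressions for the mean and variance of a truncated normal; the function names are made up for the illustration.

```python
import math

def normal_pdf(x):
    # math.exp underflows quietly to 0.0 for very negative arguments.
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def truncated_normal_mean_var(a, b):
    """Mean and variance of a standard normal truncated to [a, b]."""
    mass = normal_cdf(b) - normal_cdf(a)
    mean = (normal_pdf(a) - normal_pdf(b)) / mass
    var = 1 + (a * normal_pdf(a) - b * normal_pdf(b)) / mass - mean ** 2
    return mean, var

for a, b in [(-10, 5), (-33, 42), (-2000, 1000)]:
    mean, var = truncated_normal_mean_var(a, b)
    print(f"[{a}, {b}]: mean = {mean:.2e}, variance = {var:.8f}")
```

All three intervals give a mean within about a millionth of 0 and a variance within about a hundred-thousandth of 1, even though the intervals are lopsided and differ in length by orders of magnitude.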

***

I’ve experimented with something new in this post: it’s deliberately short on details. Lots of words, no equations. Everything in this post can be made precise, and you may consider it an exercise to make everything precise if you’re so inclined.

By leaving out the details, I hope to focus attention on the philosophical points of the post. I expect this will go over well with people who don’t want to see the details, and with people who can easily supply the details, but maybe not with people in between.

“I expect this will go over well with people who don’t want to see the details, and with people who can easily supply the details, …”

What fun! You’ve just written a great open-ended homework project for upper-class stats majors, thanks! What did you get into (or get into you) to make you start channeling Donald Knuth?

It is a fine model of a blog post. It might be interesting to complete the blog post with the mathematical analysis, possibly as distinct material (e.g., a technical note). An appendix of sorts. The mathematical analysis could only be sketched…

Since the variance is defined in terms of the mean and the mean does not exist for a Cauchy, in what sense can you say the variance is infinite?

Good point. According to the usual definitions you can’t define variance without a mean. Intuitively, the mean doesn’t exist because it wanders around without settling on any particular value. But the variance doesn’t wander around, it goes to infinity. You could formalize that by looking at the sample mean and sample variance as the sample size goes to infinity.

If you’re looking for feedback on your new style, I’ll say that I didn’t notice until you mentioned it. I’m going to tell myself it’s because I’m in the group “who can easily supply the details”.

We might also redefine the mean and variance as the argmin and min, respectively, of the expected squared error, E((X-x)^2) as a function of x.

For the Cauchy distribution, this is everywhere infinity, so it makes sense to say that the mean is undefined and the variance is infinity.

The way I learned about the Cauchy distribution having an undefined mean was to think of it as the distribution of rays emanating from a single point, where the rays’ angles are uniformly distributed. You wouldn’t say that a circular ripple travels outward at a well-defined “mean angle”, and that reasoning carries through the simple coordinate transformation from azimuthal to 1D rectangular coordinates.

I don’t know the limitations of this picture (and I’d be curious to learn!), but it did make the concept of an undefined mean more palatable.

Great post as always. I also didn’t notice the change in style, which perhaps suggests that your usual style was already quite readable and intuitive!

Prior posts: “Here’s some LaTeX.”

This post: “Now use it!”