The Cauchy distribution is the probability distribution on the real line with density proportional to 1/(1 + x²). It comes up often as an example of a fat-tailed distribution, one that can wreak havoc on intuition and on applications. It has no mean and no variance.
The truncated Cauchy distribution has the same density, with a different proportionality constant, over a finite interval [a, b]. It’s a well-behaved distribution, with a mean, variance, and any other moment you might care about.
Here’s a sort of paradox. The Cauchy distribution is pathological, but the truncated Cauchy distribution is perfectly well behaved. But if a is a big negative number and b is a big positive number, surely the Cauchy distribution on [a, b] is sorta like the full Cauchy distribution.
I will argue here that in some sense the truncated Cauchy distribution isn’t so well behaved after all, and that a Cauchy distribution truncated to a large interval is indeed like a Cauchy distribution.
You can show that the variance of the truncated Cauchy distribution over a large interval is approximately proportional to the length of the interval. So if you want to get around the problems of a Cauchy distribution by truncating it to live on a large interval, you’ve got a problem: it matters very much how you truncate it. For example, you might think “It doesn’t matter, make it [-100, 100]. Or why not [-1000, 1000] just to be sure it’s big enough.” But the variances of the two truncations differ by a factor of about 10.
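For readers who would rather check the factor of 10 than take it on faith, here is a minimal Python sketch. It assumes the standard Cauchy (location 0, scale 1) and uses the elementary antiderivatives of the truncated density; the function name truncated_cauchy_variance is made up for this illustration.

```python
import math

def truncated_cauchy_variance(a, b):
    """Variance of a standard Cauchy truncated to [a, b] (illustrative helper)."""
    # Integral of 1/(1 + x^2) over [a, b]; this is the normalizing constant.
    mass = math.atan(b) - math.atan(a)
    # Integral of x/(1 + x^2) over [a, b], normalized: the mean.
    mean = 0.5 * math.log((1 + b * b) / (1 + a * a)) / mass
    # Integral of x^2/(1 + x^2) over [a, b], normalized: the second moment.
    second_moment = ((b - a) - mass) / mass
    return second_moment - mean * mean

v100 = truncated_cauchy_variance(-100, 100)
v1000 = truncated_cauchy_variance(-1000, 1000)
print(v100, v1000, v1000 / v100)
```

The printed ratio should come out close to 10, and each variance should be close to the interval length divided by π.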
Textbooks usually say that the Cauchy distribution has no variance. It’s more instructive to say that it has infinite variance. And as the length of the interval you truncate the Cauchy to approaches infinity, so does its variance.
The mean of the Cauchy distribution does not exist. It fails to exist in a different way than the variance. The integral defining the variance of a Cauchy distribution diverges to +∞. But the integral defining the mean of the Cauchy simply does not exist. You can get different values of the integral depending on how you let the end points go off to infinity. In fact, you could get any value you want by specifying a particular way for the truncation interval to grow to the real line.
So once again you can’t simply say “Just truncate it to a big interval” because the mean depends on how you choose your interval.
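Here is a small Python sketch of that dependence, again assuming the standard Cauchy. It truncates to [-T, kT] for a fixed ratio k and lets T grow; the helper name truncated_cauchy_mean and the particular values of k and T are arbitrary choices for illustration.

```python
import math

def truncated_cauchy_mean(a, b):
    """Mean of a standard Cauchy truncated to [a, b] (illustrative helper)."""
    # Normalizing constant: integral of 1/(1 + x^2) over [a, b].
    mass = math.atan(b) - math.atan(a)
    # Integral of x/(1 + x^2) over [a, b], divided by the normalizing constant.
    return 0.5 * math.log((1 + b * b) / (1 + a * a)) / mass

# Grow the interval [-T, k*T] while holding the ratio k fixed.
for k in (1, 2, 10, 100):
    means = [truncated_cauchy_mean(-T, k * T) for T in (1e3, 1e6, 1e9)]
    print(k, means, math.log(k) / math.pi)
```

For each k the means settle down as T grows, but to a different limit for each k: the last number printed on each line, log(k)/π. Since log(k)/π can be made any real number by choosing k, so can the limiting mean.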
If we were working with a thin-tailed distribution, like a normal, and truncating it to a big interval, the choice of interval would make little difference. Whether you truncate a normal distribution to [-10, 5], or [-33, 42], or [-2000, 1000] makes almost no difference to the mean or the variance. But for the Cauchy distribution, it makes a substantial difference.
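For contrast, here is the same kind of check for a normal distribution. This sketch assumes a standard normal (mean 0, variance 1) and uses the textbook formulas for the moments of a truncated normal; the helper names are made up for this illustration.

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def truncated_normal_mean_var(a, b):
    """Mean and variance of a standard normal truncated to [a, b]."""
    mass = normal_cdf(b) - normal_cdf(a)  # probability the normal lands in [a, b]
    mean = (normal_pdf(a) - normal_pdf(b)) / mass
    var = 1 + (a * normal_pdf(a) - b * normal_pdf(b)) / mass - mean * mean
    return mean, var

intervals = [(-10, 5), (-33, 42), (-2000, 1000)]
results = {iv: truncated_normal_mean_var(*iv) for iv in intervals}
for iv, (mean, var) in results.items():
    print(iv, mean, var)
```

All three intervals should give a mean within a hair of 0 and a variance within a hair of 1, no matter how lopsided the interval is.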
I’ve experimented with something new in this post: it’s deliberately short on details. Lots of words, no equations. Everything in this post can be made precise, and you may consider it an exercise to make everything precise if you’re so inclined.
By leaving out the details, I hope to focus attention on the philosophical points of the post. I expect this will go over well with people who don’t want to see the details, and with people who can easily supply the details, but maybe not with people in between.