Much of what you’ll read about power laws in popular literature is not mathematically accurate, but it is still useful.

A lot of probability distributions besides power laws look approximately linear on a log-log plot, particularly over part of their range. The usual conclusion from this observation is that much of the talk about power laws is rubbish. But you could **take this the other way around** and say that all the talk about power laws applies more generally to other distributions!
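
Here is a rough sketch of that observation, not a rigorous test. It assumes a lognormal distribution with log-standard deviation 3 (my choice for illustration, not taken from any data set), evaluates its survival function P(X > x) over two decades, and fits a straight line on log-log axes. The function names are made up for this example.

```python
import math

def lognormal_survival(x, mu=0.0, sigma=3.0):
    """P(X > x) for a lognormal with log-mean mu and log-sd sigma (assumed values)."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def loglog_fit(xs, survival):
    """Least-squares line through (log x, log S(x)); returns (slope, R^2)."""
    u = [math.log(x) for x in xs]
    v = [math.log(survival(x)) for x in xs]
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxx = sum((a - u_bar) ** 2 for a in u)
    syy = sum((b - v_bar) ** 2 for b in v)
    sxy = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

# 50 log-spaced points from 10 to 1000
xs = [10 ** (1 + 2 * i / 49) for i in range(50)]
slope, r2 = loglog_fit(xs, lognormal_survival)
print(f"slope = {slope:.3f}, R^2 = {r2:.4f}")
```

The R² comes out close to 1 even though a lognormal is not a power law, which is why a straight-looking log-log plot is weak evidence on its own.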

Most appeals to power laws aren’t about power laws per se. For example, someone may say that supposedly rare events are not as rare as is commonly believed, because power laws. The conclusion may be correct even if the explanation isn’t. We routinely underestimate the probability of extreme events because probability distribution tails are often heavier than we suppose. Power laws are an example of heavy-tailed distributions, but other heavy-tailed distributions can thwart our intuition just as effectively as power laws.
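
To put rough numbers on "heavier than we suppose," the sketch below compares the chance of landing more than k standard deviations above the mean under a normal distribution and under a Student t distribution with 3 degrees of freedom. The t distribution is my stand-in for "some heavy-tailed distribution that is not a power law model anyone fitted"; it is an assumption for illustration only, and the function names are hypothetical.

```python
import math

def normal_tail(k):
    """P(Z > k) for a standard normal."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def student_t3_tail(k):
    """P(T > k * sqrt(3)) for Student t with 3 degrees of freedom.

    The variance of t(3) is 3, so k * sqrt(3) is k standard deviations.
    Uses the closed-form CDF available for 3 degrees of freedom.
    """
    return 0.5 - (k / (1.0 + k * k) + math.atan(k)) / math.pi

for k in (3, 4, 5, 6):
    n = normal_tail(k)
    t = student_t3_tail(k)
    print(f"k = {k}: normal {n:.3e}, t(3) {t:.3e}, ratio {t / n:.3g}")
```

The ratio grows quickly with k: the further out the event, the more a normal assumption understates it relative to the heavier tail.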

When people say “power law” in this context, substitute “heavy-tailed distribution” in your mind and see whether everything still makes sense. Often it will.

Similarly, when you hear you should go for the Pareto optimum “because power laws,” interpret this generously. When someone appeals to the Pareto principle (a.k.a. the 80-20 rule) they probably don’t expect exactly 80% of returns to come from 20% of efforts. What matters is that the return on effort is not uniformly distributed. In fact, returns are often very unevenly distributed. Some actions accomplish far more than others. This is true when returns have a power law distribution, but certainly isn’t limited to that case.
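
A small sketch of how uneven returns can be with and without a power law. For a Pareto distribution with shape α > 1, the top fraction p accounts for a share p^(1 − 1/α) of the total, so the 80-20 rule corresponds to α = log 5 / log 4. For a lognormal with log-standard deviation σ, the share follows from its Lorenz curve. The σ values below are arbitrary choices of mine, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def pareto_top_share(p, alpha):
    """Share of the total held by the top fraction p under a Pareto(alpha) law."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1")
    return p ** (1.0 - 1.0 / alpha)

def lognormal_top_share(p, sigma):
    """Share of the total held by the top fraction p under a lognormal(sigma)."""
    std = NormalDist()
    # Lorenz curve of a lognormal: L(F) = Phi(Phi^{-1}(F) - sigma)
    return 1.0 - std.cdf(std.inv_cdf(1.0 - p) - sigma)

alpha_80_20 = math.log(5) / math.log(4)
print(f"Pareto, alpha = {alpha_80_20:.3f}: top 20% share = "
      f"{pareto_top_share(0.2, alpha_80_20):.3f}")

for sigma in (0.5, 1.0, 1.5):  # illustrative values, not fitted to anything
    print(f"lognormal, sigma = {sigma}: top 20% share = "
          f"{lognormal_top_share(0.2, sigma):.3f}")
```

The lognormal is not a power law, yet for moderate σ the top 20% still account for well over half the total.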

Fat tails and diminishing returns are a lot to think about, and power laws help us think about them. Certainly power laws are used when they’re not a very good fit, but this is equally true of normal distributions. Both power laws and normal distributions may be useful tools of thought even when they’re not very accurate models.

Power laws are often useful as cautionary tales. “We don’t know that this situation follows a power law. In fact we don’t have much of a clue *what* distribution it follows. But *what if* it follows a power law? Then what?”

The difference between power laws and other heavy-tailed distributions matters more when you’re trying to make quantitative predictions. But here be dragons. Data that appear to follow a power law will likely follow it only so far and then depart from it. But the same is true of any other distribution you care to fit.
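
One hypothetical way a tail can "follow a power law so far and then depart" is a power law with an exponential cutoff. The sketch below compares a pure power-law survival function with one that has a cutoff at scale `x_cut`. The exponent, the cutoff scale, and the functional form are all assumptions chosen for illustration, not a model of any particular data.

```python
import math

def power_law_survival(x, alpha=1.5, x_min=1.0):
    """P(X > x) for a pure power law tail (assumed exponent alpha)."""
    return (x / x_min) ** (-alpha)

def cutoff_survival(x, alpha=1.5, x_min=1.0, x_cut=1000.0):
    """Same tail multiplied by an exponential cutoff at scale x_cut (assumed)."""
    return power_law_survival(x, alpha, x_min) * math.exp(-(x - x_min) / x_cut)

for x in (10, 100, 1000, 10000):
    pure = power_law_survival(x)
    cut = cutoff_survival(x)
    print(f"x = {x:>6}: power law {pure:.3e}, with cutoff {cut:.3e}, "
          f"ratio {pure / cut:.3g}")
```

Well below the cutoff the two are nearly indistinguishable, so data from that range could not tell them apart. Well above it, extrapolating the pure power law overstates the tail by orders of magnitude.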

I can’t see “power law” and not think about Cosma Shalizi’s article “So You Think You Have a Power Law — Well Isn’t That Special?” (http://bactra.org/weblog/491.html – slide deck: http://www.stat.cmu.edu/~cshalizi/2010-10-18-Meetup.pdf)