In his speech The dog and the Frisbee, Andrew Haldane argues that simple models often outperform complex models in complex situations. He cites as examples sports prediction, diagnosing heart attacks, locating serial criminals, picking stocks, and understanding spending patterns. The gist of his argument is this:

> Complex environments often instead call for simple decision rules. That is because these rules are more robust to ignorance.

And yet behind every complex set of rules is a paper showing that it outperforms simple rules, under conditions of its author’s choosing. That is, the person proposing the complex model picks the scenarios for comparison. Unfortunately, the world throws at us scenarios not of our choosing. Simpler methods may perform better when model assumptions are violated. And model assumptions are always violated, at least to some extent.
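The point can be sketched in a few lines: fit a simple and a complex model to the same noisy data, then score both on fresh data the modeler did not choose. The data-generating process and polynomial degrees below are illustrative assumptions, not from Haldane's speech.

```python
# Overfitting sketch: a 2-parameter line vs a 10-parameter polynomial,
# both fit to 10 noisy samples of a truly linear relationship.
import numpy as np

rng = np.random.default_rng(0)

# Training data: a line plus noise.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + 1 + rng.normal(0, 0.2, x_train.size)

# Fresh data from the same process, for honest out-of-sample evaluation.
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test + 1 + rng.normal(0, 0.2, x_test.size)

simple = np.polyfit(x_train, y_train, 1)    # 2 parameters
complex_ = np.polyfit(x_train, y_train, 9)  # 10 parameters: interpolates the noise

mse = lambda coeffs: np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
print(f"simple  test MSE: {mse(simple):.3f}")
print(f"complex test MSE: {mse(complex_):.3f}")
```

The complex model wins on the training points by construction, yet loses on data it did not choose, which is exactly the asymmetry described above.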

**Related posts**:

- More theoretical power, less real power
- Advantages of crude models
- Canonical examples from robust statistics


In a single word:

Overfitting.

See also: “The Robust Beauty of Improper Linear Models in Decision Making”, the wonderful 1979 paper by Robyn Dawes, which argues against doing *any* tuning of the weights in a linear regression.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.188.5825

In passing, it suggests the following crude but very effective model of marital happiness: rate of lovemaking minus rate of fighting.

There’s a good… well, it’s either an example or a counter-example, depending on how you look at it. The most accurate model for predicting earthquakes is to say that wherever there has just been an earthquake, there will be another one soon (more quantified than that, obviously). The simple model works a lot of the time, but it only predicts the special case of aftershocks following an initial quake.

So this is kind of the opposite of what you’re talking about: a simple model that only works in certain cases. It creates a huge niche for more complex models that apply in more cases. You could say the same about Newtonian vs. Einsteinian models of motion.

Sometimes you don’t want robustness (in the statistical sense).

I.e. you may want to change your mind as the facts (or data) change.

Maybe if you’re dealing with noisy data you want robustness. But that’s another issue.