From John Tukey’s Sunset Salvo:
Our suffering sinuses are now frequently relieved by antihistamines. Our suffering philosophy — whether implicit or explicit — of data analysis, or of statistics, or of science and technology needs to be far more frequently relieved by antihubrisines.
To the Greeks hubris meant the kind of pride that would be punished by the gods. To statisticians, hubris should mean the kind of pride that fosters an inflated idea of one’s powers and thereby keeps one from being more than marginally helpful to others.
Tukey then lists several antihubrisines. The first is this:
The data may not contain the answer. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
3 thoughts on “Antihubrisines”
Not mine, but relevant:
Big Data, n.: the belief that any sufficiently large pile of shit contains a pony with probability approaching 1
While I wouldn’t have phrased it quite that way, there is some truth in that; for is it not true that as a sequence of random data grows without bound, the probability of finding any desired finite substring approaches one?
It would need to be a _very_ large pile of equine effluvium, though.
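To put a rough number on "very large": the sketch below assumes the pile is a string of independent, uniformly random symbols, which is an idealization and not something anyone in this thread specifies. Cutting the string into disjoint blocks of the pattern's length gives a simple lower bound on the chance that the pattern shows up somewhere. The function names are made up for illustration.

```python
import math

def containment_lower_bound(n, m, k):
    """Lower bound on P(a uniform random string of length n over k symbols
    contains one fixed pattern of length m).

    Assumption: symbols are independent and uniform. The string is cut into
    n // m disjoint blocks; each block equals the pattern with probability
    k**-m, independently of the others.
    """
    blocks = n // m
    if blocks == 0:
        return 0.0
    p_block = float(k) ** -m
    if p_block >= 1.0:
        return 1.0
    # 1 - (1 - p)^blocks, computed stably for tiny p
    return -math.expm1(blocks * math.log1p(-p_block))

def length_for_bound(m, k, target=0.5):
    """Smallest multiple of m at which the bound above reaches `target`."""
    p_block = float(k) ** -m
    blocks = math.ceil(math.log1p(-target) / math.log1p(-p_block))
    return blocks * m

if __name__ == "__main__":
    # Hypothetical example: a 4-letter word over a 26-letter alphabet.
    for n in (10**3, 10**6, 10**9):
        print(n, containment_lower_bound(n, 4, 26))
    print(length_for_bound(4, 26, target=0.5))
```

The bound does climb to one as the string grows, but the length at which it becomes useful scales like k**m, so each extra symbol in the pattern multiplies the required pile by the alphabet size. And this only covers random data with a fixed target; it says nothing about whether a real data set contains the answer to a real question.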
@Avi: I think that’s right if you think of a pile of shit informally as information about the world but without much structure. What we really have is piles of, um, data, but with a lot of structure, which makes it worth less than a pile of shit. Plenty of interesting problems can’t be solved through big data because no matter how big the N, it doesn’t inform you about your question. For example, all the Facebook scraping in the world won’t tell you which duck first transmits bird flu to people if the duck belongs to a person living in a hut in North Korea.