The data may not contain the answer

Mark Reid sent me a link to a couple of quotes by John Tukey that I had not seen before. First,

To statisticians, hubris should mean the kind of pride that fosters an inflated idea of one’s powers and thereby keeps one from being more than marginally helpful to others. … The feeling of “Give me (or more likely even, give my assistant) the data, and I will tell you what the real answer is!” is one we must all fight against again and again, and yet again.


Second,

The data may not contain the answer. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.

Here are some more posts about John Tukey:

Approximate problems and approximate solutions
Innovation IV
Tukey tallying
How to linearize data for regression

Posted in Statistics
2 comments on “The data may not contain the answer”
  1. John Venier says:

    Tukey and Deming are the two statisticians I admire the most.

  2. David Judkins says:

    Thanks! I was looking for the exact wording of the “aching desire” quote.
