John Ioannidis stirred up a healthy debate when he published Why Most Published Research Findings Are False. Unfortunately, most of the discussion has been over whether the word “most” is correct, i.e. whether the proportion of false results is more or less than 50 percent. At least there is more awareness that some published results are false and that it would be good to have some estimate of the proportion.
However, a more fundamental point has been lost. At the core of Ioannidis’ paper is the assertion that the proportion of true hypotheses under investigation matters. In terms of Bayes’ theorem, the posterior probability of a result being correct depends on the prior probability of the result being correct. This prior probability is vitally important, and it varies from field to field.
In a field where it is hard to come up with good hypotheses to investigate, most researchers will be testing false hypotheses, and most of their positive results will be coincidences. In another field where people have a good idea what ought to be true before doing an experiment, most researchers will be testing true hypotheses and most positive results will be correct.
For example, it’s very difficult to come up with a better cancer treatment. Drugs that kill cancer in a petri dish or in animal models usually don’t work in humans. One reason is that these drugs may cause too much collateral damage to healthy tissue. Another reason is that treating human tumors is more complex than treating artificially induced tumors in lab animals. Of all cancer treatments that appear to be an improvement in early trials, very few end up receiving regulatory approval and changing clinical practice.
A greater proportion of physics hypotheses are correct because physics has powerful theories to guide the selection of experiments. Experimental physics often succeeds because it has good support from theoretical physics. Cancer research is more empirical because there is little reliable predictive theory. This means that a published result in physics is more likely to be true than a published result in oncology.
Whether “most” published results are false depends on context. The proportion of false results varies across fields. It is high in some areas and low in others.
* * *
For daily tips on data science, follow @DataSciFact on Twitter.