Here’s a little saying that irritates me:
Absence of evidence is not evidence of absence.
It’s the kind of thing a Sherlock Holmes-like character might say in a detective novel. The idea is that we can’t be sure something doesn’t exist just because we haven’t seen it yet.
What bothers me is that the statement misuses the word “evidence.” The statement would be correct if we substituted “proof” for “evidence.” We can’t conclude with absolute certainty that something doesn’t exist just because we haven’t yet proved that it does. But evidence is not the same as proof.
Why do we believe that dodo birds are extinct? Because no one has seen one in three centuries. That is, there is an absence of evidence that they exist. That is tantamount to evidence that they do not exist. It’s logically possible that a dodo bird is alive and well somewhere, but there is overwhelming evidence to suggest this is not the case.
Evidence can lead to the wrong conclusion. Why did scientists believe that the coelacanth was extinct? Because no one had seen one except in fossils. The species was believed to have gone extinct 65 million years ago. But in 1938 a fisherman caught one. Absence of evidence is not proof of absence.
Though it is not proof, absence of evidence is unusually strong evidence due to subtle statistical result. Compare the following two scenarios.
Scenario 1: You’ve sequenced the DNA of a large number prostate tumors and found that not one had a particular genetic mutation. How confident can you be that prostate tumors never have this mutation?
Scenario 2: You’ve found that 40% of prostate tumors in your sample have a particular mutation. How confident can you be that 40% of all prostate tumors have this mutation?
It turns out you can have more confidence in the first scenario than the second. If you’ve tested N subjects and not found the mutation, the length of your confidence interval around zero is proportional to N. But if you’ve tested N subjects and found the mutation in 40% of subjects, the length of your confidence interval around 0.40 is proportional to √N. So, for example, if N = 10,000 then the former interval has length on the order of 1/10,000 while the latter interval has length on the order of 1/100. This is known as the rule of three. You can find both a frequentist and a Bayesian justification of the rule here.
Absence of evidence is unusually strong evidence that something is at least rare, though it’s not proof. Sometimes you catch a coelacanth.