We sometimes speak of data as if data could talk. For example, we say such things as “What do the data say?” and “Let the data speak for themselves.” It turns out there’s a way to take this figure of speech seriously: Evidence can be meaningfully measured in decibels.
In acoustics, the intensity of a sound in decibels is given by
10 log10(P1 / P0)
where P1 is the power of the sound and P0 is a reference value, the power in a sound at the threshold of human hearing.
In Bayesian statistics, the level of evidence in favor of a hypothesis H1 compared to a null hypothesis H0 can be measured in the same way as sound intensity if we take P0 and P1 to be the posterior probabilities of hypotheses H0 and H1 respectively.
Measuring statistical evidence in decibels provides a visceral interpretation. Psychologists have found that human perception of stimulus intensity in general is logarithmic. And while natural logarithms are more mathematically convenient, logarithms base 10 are easier to interpret.
A 50-50 toss-up corresponds to 0 dB of evidence. Belief corresponds to positive decibels, disbelief to negative decibels. If the data from an experiment are 100 times more likely under H1 than under H0, then the experiment increases the evidence in favor of H1 by 20 dB.
A normal conversation is about 60 acoustic dB. Sixty dB of evidence corresponds to million-to-one odds. A train whistle at 500 feet produces 90 acoustic dB. Ninety dB of evidence corresponds to billion-to-one odds, data speaking loudly indeed.
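The conversions above are easy to check numerically. Here is a minimal sketch in Python. The function names `evidence_db` and `odds_from_db` are made up for illustration and do not come from any library or from the book cited below.

```python
import math


def evidence_db(p1, p0):
    """Evidence for H1 over H0 in decibels: 10 * log10(p1 / p0).

    p1 and p0 are the probabilities of H1 and H0 (only their ratio matters).
    """
    return 10.0 * math.log10(p1 / p0)


def odds_from_db(db):
    """Invert the decibel scale: return the odds p1 / p0 for a given dB value."""
    return 10.0 ** (db / 10.0)


if __name__ == "__main__":
    print(evidence_db(0.5, 0.5))   # toss-up
    print(evidence_db(100.0, 1.0)) # 100-to-1 odds
    print(odds_from_db(60))        # odds at "normal conversation" level
    print(odds_from_db(90))        # odds at "train whistle" level
```

Because the scale is logarithmic, multiplying the odds by a factor of 100 always adds 20 dB, whatever the starting point.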
To read more about evidence in decibels, see Chapter 4 of Probability Theory: The Logic of Science.
6 thoughts on “How loud is the evidence?”
It’s a shame this doesn’t seem to be more widely embedded. I don’t care about how likely the data are under some bogus null; I care how likely they are to indicate a departure from accepted models. I think that most non-statisticians expect this as well, rather than the opaque and noisy inference generated by tests with p-values.
Working with log-odds can also be convenient in software. You can sum log-odds without worrying about underflow, whereas taking the product of lots of small probabilities will quickly get you into problems.
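To illustrate the underflow point, here is a rough Python sketch. The numbers are invented for the example: 400 independent observations, each with likelihood 0.02 under H1 and 0.01 under H0. The helper names `naive_product` and `evidence_db_sum` are hypothetical.

```python
import math


def naive_product(likelihoods):
    """Multiply the likelihoods directly; prone to floating-point underflow."""
    product = 1.0
    for value in likelihoods:
        product *= value
    return product


def evidence_db_sum(lik_h1, lik_h0):
    """Sum per-observation evidence in decibels instead of multiplying."""
    return sum(
        10.0 * (math.log10(a) - math.log10(b))
        for a, b in zip(lik_h1, lik_h0)
    )


# Made-up data: every observation is twice as likely under H1 as under H0.
lik_h1 = [0.02] * 400
lik_h0 = [0.01] * 400

if __name__ == "__main__":
    # Both products underflow to 0.0 in double precision,
    # so their ratio cannot be computed.
    print(naive_product(lik_h1), naive_product(lik_h0))
    # The summed evidence stays finite: about 3.01 dB per observation.
    print(evidence_db_sum(lik_h1, lik_h0))
```

Each observation contributes about 3 dB, so the running total stays a modest number even though the raw likelihoods are far too small to represent.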
It’s not just perception. If human cognition is involved, you’re going to see log scale functions.
Now, convert the decibels into a visual intensity measure. Data visualizations should use color to tell us the intensity of the statistic.
This is the amount of surprise over the expected surprisal. Surprisal = log(p1).
information content (entropy)= log(p1)-log(p0) = log(p1/p0).
where S=N*H, so the 10 is a scaling factor that includes N, so Shannon entropy H using p*log(p) instead of 10*log(p) is accounted for. The physical entropy of what you’ve got there is the difference in the observed and the baseline specific entropy, i.e. entropy per second instead of entropy per mole:
The relation between physical entropy and information entropy is Sinfo=N*H and Sphysical=N*Sspecific = kB*ln(states)
Sspecific available from tables
Sphysical=kB*ln(2)*N*H where H in log base 2.
Apparently, the “volume of evidence” is not just a fancy turn of phrase after all.