Suppose you’re drawing random samples uniformly from some interval. How likely are you to see a new value outside the range of values you’ve already seen? The problem is more interesting when the interval is unknown. You may be trying…
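
The excerpt is cut off, so the post's own approach isn't shown here. As a rough sketch of the question it poses: by symmetry, a new draw is equally likely to hold any rank among the n + 1 values, so it falls outside the current range with probability 2/(n + 1), whatever the interval is. The function name and the choice of the unit interval below are illustrative assumptions, not taken from the post.

```python
import random

def prob_outside_range(n, trials=100_000, seed=0):
    """Estimate the chance that one more uniform draw lands outside
    the min/max of n earlier draws (hypothetical helper)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = [rng.random() for _ in range(n)]  # unit interval assumed
        new = rng.random()
        if new < min(seen) or new > max(seen):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for n in (1, 4, 9, 19):
        print(n, prob_outside_range(n), 2 / (n + 1))
```

The simulated frequency should sit close to 2/(n + 1) for each n.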

Cancer research is sometimes criticized for being timid. Drug companies run enormous trials looking for small improvements. Critics say they should run smaller trials and more of them. Which side is correct depends on what’s out there waiting to be…

There’s a theorem in statistics that says $E(\bar{X}) = \mu$. You could read this aloud as “the mean of the mean is the mean.” More explicitly, it says that the expected value of the average of some number of samples from some distribution…
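
A quick way to see “the mean of the mean is the mean” numerically is to average many sample means and compare the result with the distribution's mean. This is only a sketch: the exponential distribution, its mean of 2, and the function name are assumptions chosen for illustration.

```python
import random
import statistics

def mean_of_sample_means(n, trials=20_000, seed=0, mu=2.0):
    """Average of `trials` sample means, each computed from n exponential
    draws with mean mu (distribution chosen only for illustration)."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1 / mu) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.fmean(means)

if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, mean_of_sample_means(n))
```

Whatever the sample size n, the result should hover near mu.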

Suppose a large number of people each have a slightly better than 50% chance of correctly answering a yes/no question. If they answered independently, the majority would very likely be correct. For example, suppose there are 10,000 people, each with…
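
For the independent case described above, the probability that a strict majority is correct is a binomial tail sum. The sketch below computes it in log space to avoid overflow. The excerpt is truncated before it gives each person's probability, so the value 0.51 used here is a hypothetical stand-in, and the function name is likewise made up.

```python
import math

def majority_correct_prob(n, p):
    """P(more than half of n independent voters are correct),
    each voter being correct with probability p."""
    log_p, log_q = math.log(p), math.log(1 - p)
    total = 0.0
    for k in range(n // 2 + 1, n + 1):  # strict majority: k > n / 2
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q
        )
        total += math.exp(log_pmf)
    return total

if __name__ == "__main__":
    # p = 0.51 is an assumed value, not one taken from the post.
    print(majority_correct_prob(10_000, 0.51))
```

With these assumed numbers the result comes out near 0.98, which shows how strongly independence amplifies a small individual edge.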

Russ Roberts had this to say about the proposal to replace the calculus requirement with statistics for students. Statistics is in many ways much more useful for most students than calculus. The problem is, to teach it well is extraordinarily…

David Hogg calls conventional statistical notation a “nomenclatural abomination”: The terminology used throughout this document enormously overloads the symbol p(). That is, we are using, in each line of this discussion, the function p() to mean something different; its meaning…

When I was in college, I overheard two senior faculty arguing over an undergraduate probability homework assignment. This seemed very strange. It occurred to me that I’d never seen faculty argue over something elementary before, and I couldn’t imagine an…

I was culling out books, mostly obsolete technical books, and I remembered that I have an extra copy of Feller’s classic probability text. It’s volume 1, second edition. If you’re a student and would like the book, please send me…

John Ioannidis stirred up a healthy debate when he published Why Most Published Research Findings Are False. Unfortunately, most of the discussion has been over whether the word “most” is correct, i.e. whether the proportion of false results is more…

From Controversies in the Foundations of Statistics by Bradley Efron: Statistics seems to be a difficult subject for mathematicians, perhaps because its elusive and wide-ranging character mitigates against the traditional theorem-proof method of presentation. It may come as some comfort…

The other day I heard someone say something like the following: I can’t believe how people don’t understand probability. They don’t realize that if a coin comes up heads 20 times, on the next flip there’s still a 50-50 chance…

Sometimes you can derive a probability distribution from a list of properties it must have. For example, there are several properties that lead inevitably to the normal distribution or the Poisson distribution. Although such derivations are attractive, they don’t apply…

Stand on a large clock, say on the 1. Now flip a coin and move ahead one hour if the coin turns up heads, and back one hour otherwise. Keep repeating the process until you’ve stood on all 12 numbers.…
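
The walk described above is easy to simulate. The excerpt stops before stating what the post asks about it, so this sketch simply records two natural quantities: the last number reached and the number of flips needed. Function names are illustrative.

```python
import random
from collections import Counter

def simulate_clock_walk(rng, start=1, hours=12):
    """Walk until every number has been visited.
    Returns (last number reached, number of coin flips)."""
    pos = start
    visited = {pos}
    steps = 0
    while len(visited) < hours:
        step = 1 if rng.random() < 0.5 else -1  # heads: ahead, tails: back
        pos = (pos - 1 + step) % hours + 1      # wrap around 12 -> 1 and 1 -> 12
        visited.add(pos)
        steps += 1
    return pos, steps

def summarize(trials=20_000, seed=0):
    """Frequency of each final number and the average number of flips."""
    rng = random.Random(seed)
    last = Counter()
    total_steps = 0
    for _ in range(trials):
        pos, steps = simulate_clock_walk(rng)
        last[pos] += 1
        total_steps += steps
    freqs = {hour: count / trials for hour, count in sorted(last.items())}
    return freqs, total_steps / trials

if __name__ == "__main__":
    freqs, mean_steps = summarize()
    print(freqs)
    print(mean_steps)
```

For a simple random walk on a cycle, each of the other 11 numbers is equally likely to be the last one visited, and the expected number of steps to cover a 12-cycle is 12 · 11 / 2 = 66, so the simulation should land near those values.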

For a set of positive probabilities $p_i$ summing to 1, their entropy is defined as $H = -\sum_i p_i \log p_i$. (For this post, log will mean log base 2, not natural log.) This post looks at a couple questions about computing entropy. First, are there…
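
A direct translation of the definition, using base-2 logs as the post specifies, might look like the sketch below. It is not necessarily the computation the post goes on to discuss, since the excerpt is truncated; the function name and the input checks are assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of positive probabilities summing to 1."""
    if any(p <= 0 for p in probs):
        raise ValueError("probabilities must be positive")
    if not math.isclose(sum(probs), 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs)

if __name__ == "__main__":
    print(entropy([0.5, 0.5]))         # fair coin
    print(entropy([0.25] * 4))         # uniform over four outcomes
    print(entropy([0.9, 0.05, 0.05]))  # skewed distribution
```

A uniform distribution over 2^k outcomes has entropy k bits, which makes a handy sanity check.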

Bayesian statistics is to Python as frequentist statistics is to Perl. Perl has the slogan “There’s more than one way to do it,” abbreviated TMTOWTDI and pronounced “tim toady.” Perl prides itself on variety. Python takes the opposite approach. The…