Blog Archives

Distribution of a range

Suppose you’re drawing random samples uniformly from some interval. How likely are you to see a new value outside the range of values you’ve already seen? The problem is more interesting when the interval is unknown. You may be trying

Posted in Math, Statistics
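
Here is a rough simulation sketch of the question in the excerpt above. It assumes the interval is (0, 1), which loses no generality for the known-interval case, and the function name and parameters are made up for illustration. By symmetry, each of the n + 1 samples is equally likely to be the largest or the smallest, so the estimate should land near 2/(n + 1).

```python
import random

def prob_outside_range(n, trials=100_000, seed=0):
    """Estimate the chance that one more uniform sample falls outside
    the range spanned by the first n samples.

    Assumption: samples come from (0, 1); any other interval gives the
    same answer after rescaling.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = [rng.random() for _ in range(n)]
        new = rng.random()
        # A "new extreme" is a value below the minimum or above the maximum so far.
        if new < min(seen) or new > max(seen):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for n in (1, 4, 9, 19):
        print(n, prob_outside_range(n), 2 / (n + 1))
```

The printed pairs compare the simulated frequency with 2/(n + 1) for a few sample sizes.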

Timid medical research

Cancer research is sometimes criticized for being timid. Drug companies run enormous trials looking for small improvements. Critics say they should run smaller trials and more of them. Which side is correct depends on what’s out there waiting to be

Posted in Science, Statistics

The mean of the mean is the mean

There’s a theorem in statistics that says $E(\bar{X}) = \mu$. You could read this aloud as “the mean of the mean is the mean.” More explicitly, it says that the expected value of the average of some number of samples from some distribution

Posted in Statistics

Independent decision making

Suppose a large number of people each have a slightly better than 50% chance of correctly answering a yes/no question. If they answered independently, the majority would very likely be correct. For example, suppose there are 10,000 people, each with

Posted in Math
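
The majority-vote claim in the excerpt above can be checked with a direct binomial sum. This is only a sketch: the excerpt cuts off before giving the individual probability, so the value p = 0.51 used below is a stand-in assumption, and the function name is invented for illustration. The sum is done in log space because terms like 0.51^10000 underflow ordinary floating point.

```python
import math

def majority_correct_prob(n, p):
    """Probability that a strict majority of n independent voters is correct,
    when each voter is correct with probability p."""
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    # A strict majority means more than n/2 correct answers.
    for k in range(n // 2 + 1, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1)
                    + k * log_p + (n - k) * log_q)
        total += math.exp(log_term)
    return total

if __name__ == "__main__":
    # p = 0.51 is a hypothetical value, not one taken from the post.
    print(majority_correct_prob(10_000, 0.51))
```

Everything here depends on independence. The calculation says nothing about what happens when people influence one another.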

On replacing calculus with statistics

Russ Roberts had this to say about the proposal to replace the calculus requirement with statistics for students. Statistics is in many ways much more useful for most students than calculus. The problem is, to teach it well is extraordinarily

Posted in Statistics

Nomenclatural abomination

David Hogg calls conventional statistical notation a “nomenclatural abomination”: The terminology used throughout this document enormously overloads the symbol p(). That is, we are using, in each line of this discussion, the function p() to mean something different; its meaning

Posted in Statistics

Probability is subtle

When I was in college, I overheard two senior faculty arguing over an undergraduate probability homework assignment. This seemed very strange. It occurred to me that I’d never seen faculty argue over something elementary before, and I couldn’t imagine an

Posted in Uncategorized

Giving away classic probability book

I was culling out books, mostly obsolete technical books, and I remembered that I have an extra copy of Feller’s classic probability text. It’s volume 1, second edition. If you’re a student and would like the book, please send me

Posted in Uncategorized

Some fields produce more false results than others

John Ioannidis stirred up a healthy debate when he published Why Most Published Research Findings Are False. Unfortunately, most of the discussion has been over whether the word “most” is correct, i.e. whether the proportion of false results is more

Posted in Science, Statistics

Elusive statistics

From Controversies in the Foundations of Statistics by Bradley Efron: Statistics seems to be a difficult subject for mathematicians, perhaps because its elusive and wide-ranging character mitigates against the traditional theorem-proof method of presentation. It may come as some comfort

Posted in Statistics

Levels of uncertainty

The other day I heard someone say something like the following: I can’t believe how people don’t understand probability. They don’t realize that if a coin comes up heads 20 times, on the next flip there’s still a 50-50 chance

Posted in Math

Deriving distributions vs fitting distributions

Sometimes you can derive a probability distribution from a list of properties it must have. For example, there are several properties that lead inevitably to the normal distribution or the Poisson distribution. Although such derivations are attractive, they don’t apply

Posted in Statistics

Random walk on a clock

Stand on a large clock, say on the 1. Now flip a coin and move ahead one hour if the coin turns up heads, and back one hour otherwise. Keep repeating the process until you’ve stood on all 12 numbers.

Posted in Math
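
The clock walk described above is easy to simulate. The sketch below counts coin flips until all 12 numbers have been visited; the function names, trial count, and seed are my own choices for illustration. For a symmetric walk on a cycle of n points, the standard result is that the expected number of steps to visit every point is n(n - 1)/2, so the average should come out close to 66.

```python
import random

def steps_to_cover_clock(rng, hours=12, start=1):
    """Count coin flips until the walk has stood on every number of the clock."""
    position = start % hours          # treat 12 as position 0 on the cycle
    visited = {position}
    steps = 0
    while len(visited) < hours:
        # Heads: move ahead one hour. Tails: move back one hour.
        position = (position + rng.choice((1, -1))) % hours
        visited.add(position)
        steps += 1
    return steps

def average_cover_steps(trials=20_000, seed=0):
    """Average the number of flips over many independent walks."""
    rng = random.Random(seed)
    return sum(steps_to_cover_clock(rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(average_cover_steps())
```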

Calculating entropy

For a set of positive probabilities $p_i$ summing to 1, their entropy is defined as $H = -\sum_i p_i \log p_i$. (For this post, log will mean log base 2, not natural log.) This post looks at a couple questions about computing entropy. First, are there

Posted in Computing, Math
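
A direct translation of the entropy definition above into code might look like the following. This is a minimal sketch of the textbook formula with log base 2, not necessarily the approach the full post settles on, and the function name is my own.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of positive probabilities that sum to 1."""
    if any(p <= 0 for p in probs):
        raise ValueError("probabilities must be positive")
    if not math.isclose(sum(probs), 1.0, rel_tol=0.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Direct evaluation of -sum(p_i * log2(p_i)).
    return -sum(p * math.log2(p) for p in probs)

if __name__ == "__main__":
    print(entropy_bits([0.5, 0.5]))    # a fair coin carries one bit
    print(entropy_bits([0.25] * 4))    # four equally likely outcomes carry two bits
```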

Bayes : Python :: Frequentist : Perl

Bayesian statistics is to Python as frequentist statistics is to Perl. Perl has the slogan “There’s more than one way to do it,” abbreviated TMTOWTDI and pronounced “tim toady.” Perl prides itself on variety. Python takes the opposite approach. The

Posted in Statistics