Blog Archives

More sides or more dice?

My previous post looked at rolling 5 six-sided dice as an approximation of a normal distribution. If you wanted a better approximation, you could roll dice with more sides, or you could roll more dice. Which helps more? Whether you

Posted in Math

Rolling dice for normal samples: Python version

A handful of dice can make a decent normal random number generator, good enough for classroom demonstrations. I wrote about this a while ago. My original post included Mathematica code for calculating how close to normal the distribution of the sum

Posted in Math, Python
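A minimal sketch of the kind of calculation the excerpt above describes, not the code from the original post. It computes the exact distribution of the sum of fair dice by repeated convolution and reports the largest gap between its CDF and a normal CDF with the same mean and variance. The function names and the half-unit continuity correction are assumptions made for illustration.

```python
import math

def dice_sum_pmf(num_dice, sides=6):
    """Exact PMF of the sum of fair dice, as a dict {total: probability}."""
    pmf = {0: 1.0}
    for _ in range(num_dice):
        new_pmf = {}
        for total, prob in pmf.items():
            for face in range(1, sides + 1):
                new_pmf[total + face] = new_pmf.get(total + face, 0.0) + prob / sides
        pmf = new_pmf
    return pmf

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def max_cdf_error(num_dice, sides=6):
    """Largest absolute difference between the dice-sum CDF and the matching normal CDF."""
    pmf = dice_sum_pmf(num_dice, sides)
    mean = num_dice * (sides + 1) / 2           # mean of one die is (sides + 1) / 2
    sd = math.sqrt(num_dice * (sides**2 - 1) / 12)  # variance of one die is (sides^2 - 1) / 12
    cumulative = 0.0
    worst = 0.0
    for total in sorted(pmf):
        cumulative += pmf[total]
        # Assumption: evaluate the normal CDF half a unit to the right (continuity correction).
        worst = max(worst, abs(cumulative - normal_cdf(total + 0.5, mean, sd)))
    return worst

if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, "dice:", max_cdf_error(n))
```

Passing a different `sides` value gives a rough way to compare rolling more dice against rolling dice with more sides.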

Bad normal approximation

Sometimes you can approximate a binomial distribution with a normal distribution. Under the right conditions, a Binomial(n, p) has approximately the distribution of a normal with the same mean and variance, i.e. mean np and variance np(1-p). The approximation works

Posted in Statistics
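As a rough illustration of the excerpt above, the following sketch compares the exact Binomial(n, p) CDF with the CDF of a normal distribution with mean np and variance np(1-p). The function names, the continuity correction, and the example values of n and p are assumptions chosen for illustration; they do not come from the original post.

```python
import math

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def max_approximation_error(n, p):
    """Largest absolute CDF difference between Binomial(n, p) and its matching normal."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    # Assumption: use a half-unit continuity correction when evaluating the normal CDF.
    return max(
        abs(binomial_cdf(k, n, p) - normal_cdf(k + 0.5, mean, sd))
        for k in range(n + 1)
    )

if __name__ == "__main__":
    # Hypothetical parameter choices: a symmetric case and a highly skewed case.
    print("n=100, p=0.5 :", max_approximation_error(100, 0.5))
    print("n=100, p=0.01:", max_approximation_error(100, 0.01))
```

With the same n, the error is much larger when p is close to 0, where the binomial distribution is strongly skewed.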

Moments of mixtures

I needed to compute the higher moments of a mixture distribution for a project I’m working on. I’m writing up the code here in case anyone else finds this useful. (And in case I’ll find it useful in the future.)

Posted in Python, Statistics
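The code from the post is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of one way to compute raw moments of a mixture, using the fact that the k-th raw moment of a mixture is the weighted sum of the components' k-th raw moments. The choice of normal components and all function names are assumptions for illustration only.

```python
def normal_raw_moments(mu, sigma, max_order):
    """Raw moments E[X^k], k = 0..max_order, for X ~ Normal(mu, sigma^2).

    Uses the recurrence E[X^k] = mu * E[X^(k-1)] + (k-1) * sigma^2 * E[X^(k-2)].
    """
    moments = [1.0, mu]
    for k in range(2, max_order + 1):
        moments.append(mu * moments[k - 1] + (k - 1) * sigma**2 * moments[k - 2])
    return moments[: max_order + 1]

def mixture_raw_moments(weights, component_moments):
    """Raw moments of a mixture: the weighted sum of the component raw moments.

    weights: mixing weights that sum to 1.
    component_moments: one list of raw moments per component, all the same length.
    """
    num_moments = len(component_moments[0])
    return [
        sum(w * m[k] for w, m in zip(weights, component_moments))
        for k in range(num_moments)
    ]

if __name__ == "__main__":
    # Hypothetical example: equal mixture of Normal(0, 1) and Normal(2, 1).
    parts = [normal_raw_moments(0.0, 1.0, 4), normal_raw_moments(2.0, 1.0, 4)]
    print(mixture_raw_moments([0.5, 0.5], parts))
```

Central moments, if needed, can be obtained from the raw moments afterward; the weighted-sum shortcut applies to raw moments, not directly to central ones.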

Data calls the model’s bluff

I hear a lot of people saying that simple models work better than complex models when you have enough data. For example, here’s a tweet from Giuseppe Paleologo this morning: Isn’t it ironic that almost all known results in asymptotic

Posted in Statistics

Robustness of equal weights

In Thinking, Fast and Slow, Daniel Kahneman comments on The robust beauty of improper linear models in decision making by Robyn Dawes. According to Dawes, or at least Kahneman’s summary of Dawes, simply averaging a few relevant predictors may work

Posted in Statistics

Offended by conditional probability

It’s a simple rule of probability that if A makes B more likely, B makes A more likely. That is, if the conditional probability of A given B is larger than the probability of A alone, then the conditional probability

Posted in Statistics
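The rule stated in the excerpt above follows in a few lines from the definition of conditional probability. The derivation below is a sketch that assumes both P(A) and P(B) are positive, so that the conditional probabilities are defined.

```latex
\begin{align*}
P(A \mid B) > P(A)
  &\iff \frac{P(A \cap B)}{P(B)} > P(A) \\
  &\iff P(A \cap B) > P(A)\,P(B) \\
  &\iff \frac{P(A \cap B)}{P(A)} > P(B) \\
  &\iff P(B \mid A) > P(B).
\end{align*}
```

The middle line is symmetric in A and B, which is why the implication runs in both directions.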

Visualization, modeling, and surprises

This afternoon Hadley Wickham gave a great talk on data analysis. Here’s a paraphrase of something profound he said. Visualization can surprise you, but it doesn’t scale well. Modelling scales well, but it can’t surprise you. Visualization can show you

Posted in Statistics

Statistics stories wanted

Andrew Gelman is trying to collect 365 stories about life as a statistician: So here’s the plan. 365 of you write vignettes about your statistical lives. Get into the nitty gritty—tell me what you do, and why you’re doing it.

Posted in Statistics

Elementary statistics book recommendation

I’ve thought about making a personal FAQ page. If I do, one of the questions would be what elementary statistics book I recommend. Unfortunately, I don’t have an answer for that one. I haven’t seen such a book I’d recommend

Posted in Statistics

Closet Bayesian

When I was a grad student, a statistics postdoc confided to me that he was a “closet Bayesian.” This sounded absolutely bizarre. Why would someone be secretive about his preferred approach to statistics? I could not imagine someone whispering that

Posted in Statistics

Beethoven, Beatles, and Beyoncé: more on the Lindy effect

This post is a set of footnotes to my previous post on the Lindy effect. This effect says that creative artifacts have lifetimes that follow a power law distribution, and hence the things that have been around the longest have

Posted in Uncategorized

The Lindy effect

The longer a technology has been around, the longer it’s likely to stay around. This is a consequence of the Lindy effect. Nassim Taleb describes this effect in Antifragile but doesn’t provide much mathematical detail. Here I’ll fill in some

Posted in Uncategorized
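The excerpt above is cut off before the mathematical detail, so the following is only a sketch under a stated assumption: that lifetimes follow a Pareto distribution with scale x_m > 0 and shape α > 1. This is one common way to make "power law" precise and may not match the post's exact setup.

```latex
\begin{align*}
P(X > x) &= \left(\frac{x_m}{x}\right)^{\alpha}, \qquad x \ge x_m, \\
P(X > x \mid X > t) &= \frac{P(X > x)}{P(X > t)} = \left(\frac{t}{x}\right)^{\alpha}, \qquad x \ge t \ge x_m, \\
E[X \mid X > t] &= \frac{\alpha\, t}{\alpha - 1}, \\
E[X - t \mid X > t] &= \frac{t}{\alpha - 1}.
\end{align*}
```

Under this assumption the expected remaining lifetime is proportional to the current age t, which is the sense in which something that has lasted longer is expected to last longer still.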

Extended distribution chart

Lawrence Leemis published a chart in 1986 showing the relationships between around 20 probability distributions. I made an online version of this chart a few years ago. In 2008 Leemis published a larger version of his original chart. A few

Posted in Uncategorized

How well do moments determine a distribution?

If two random variables X and Y have the same first few moments, how different can their distributions be? Suppose E[X^i] = E[Y^i] for i = 0, 1, 2, …, 2p. Then there is a polynomial P(x) of degree 2p

Posted in Math