An estimator in statistics is a rule for guessing a parameter from data. An estimator is unbiased if its expected value equals the parameter: over many samples, your guesses average out to the thing you’re estimating. Sounds eminently reasonable. But it might not be.
Suppose you’re estimating something like the number of car accidents per week in Texas and you counted 308 the first week. What would you estimate is the probability of seeing no accidents over the next two weeks?
If you use a Poisson model for the number of car accidents, a very common assumption for such data, there is exactly one unbiased estimator based on a single week’s count X, namely (−1)^X: +1 when the count is even, −1 when it’s odd. Since 308 is even, this estimator would put the probability of no accidents during the next two weeks at 1. Worse, had you counted 307 accidents, the estimated probability would be −1! The estimator alternates between two ridiculous values, but in the long run these values average out to the true value. Exact in the limit, useless on the way there. A slightly biased estimator would be much more practical.
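The counterexample is easy to check numerically. Since E[(−1)^X] = Σ_k (−1)^k e^{−λ} λ^k/k! = e^{−2λ}, the estimator (−1)^X is unbiased for e^{−2λ}, the probability of no accidents over two weeks. Here’s a quick simulation sketch; the rate λ = 2 and the little Knuth-style sampler are illustrative choices, not from the article:

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw from Poisson(lam) via Knuth's multiplicative method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1

lam = 2.0                      # illustrative rate, not from the article
rng = random.Random(42)
draws = [poisson_sample(lam, rng) for _ in range(200_000)]

# The unique unbiased estimator: +1 for an even count, -1 for an odd one.
estimates = [(-1) ** x for x in draws]
avg = sum(estimates) / len(estimates)
true_val = math.exp(-2 * lam)  # two-week no-accident probability
print(f"average of estimates: {avg:.4f}, true value e^(-2*lam): {true_val:.4f}")
```

Every individual estimate is ±1, never anything in between; only the long-run average lands near e^{−4} ≈ 0.018.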
See Michael Hardy’s article for more details: An_Illuminating_Counterexample.pdf
For daily tips on data science, follow @DataSciFact on Twitter.