Folk wisdom says that for all practical purposes, a Student-*t* distribution with 30 or more degrees of freedom is a normal distribution. Well, not for *all* practical purposes.

For 30 or more degrees of freedom, the error in approximating the PDF or CDF of a Student-*t* distribution with a normal is less than 0.005. So for many applications, the *n* > 30 rule of thumb is appropriate. (See these notes for details.)

However, sometimes you need the quantiles of a *t* distribution rather than its CDF, as when finding confidence intervals. In that case you don’t evaluate the CDF of a Student-*t* distribution *per se* but rather the *inverse* of that CDF. And there the error in the normal approximation is larger than I expected.

Say you’re computing a 95% confidence interval for the mean of a set of 31 data points. You first find *t*^{*} such that *P*(*t* > *t*^{*}) = 0.025, where *t* is a Student-*t* random variable with 31 − 1 = 30 degrees of freedom. Your confidence interval is the sample mean ± *t*^{*} *s*/√*n*, where *s* is the sample standard deviation. For 30 degrees of freedom, *t*^{*} = 2.04. If you used the normal approximation, you’d get 1.96 instead of 2.04, a relative error of about 4%, meaning the error in the width of your confidence interval is about 4%. While the error in the normal approximation to the CDF is less than 0.005 for *n* > 30, the error in the normal approximation to the *inverse* CDF is an order of magnitude greater. The error also increases with the confidence level: for a 99% confidence interval, it is about 6.3%.
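A quick numerical check of these figures, using SciPy (an assumption; any library exposing the *t* and normal quantile functions would do):

```python
from scipy import stats

# 95% CI for the mean of n = 31 points: t quantile with 30 degrees
# of freedom vs. the standard normal quantile.
t_star = stats.t.ppf(0.975, df=30)    # ≈ 2.042
z_star = stats.norm.ppf(0.975)        # ≈ 1.960
rel_err_95 = (t_star - z_star) / t_star
print(f"t* = {t_star:.3f}, z* = {z_star:.3f}, error = {rel_err_95:.1%}")

# The discrepancy grows with the confidence level: 99% interval.
t99 = stats.t.ppf(0.995, df=30)       # ≈ 2.750
z99 = stats.norm.ppf(0.995)           # ≈ 2.576
print(f"99% interval error = {(t99 - z99) / t99:.1%}")
```

The first relative error comes out near 4% and the second near 6.3%, matching the figures above.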

It may be that none of this is a problem. If you only have 31 data points, there’s a fair amount of uncertainty in your estimate of the mean, and there’s little point in quantifying with great precision an estimate of how uncertain you are! Modeling assumptions are probably a larger source of error than the normal approximation to the Student-*t*. But as a numerical problem, it’s interesting that the approximation error may be larger than expected. For *n* = 300, the error in the normal approximation to *t*^{*} is about 0.4%. In other words, the normal approximation to the inverse CDF is only as good at *n* = 300 as the normal approximation to the CDF itself is at *n* = 30.
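The *n* = 300 figure can be checked the same way (again assuming SciPy, and taking *n* − 1 = 299 degrees of freedom as in the earlier example):

```python
from scipy import stats

# With n = 300 data points (299 degrees of freedom), the normal
# approximation to the t quantile is an order of magnitude better.
t_star = stats.t.ppf(0.975, df=299)
z_star = stats.norm.ppf(0.975)
rel_err = (t_star - z_star) / t_star
print(f"error = {rel_err:.2%}")  # ≈ 0.4%
```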

I think it is also interesting / instructive to think of the slightly wider confidence intervals as the price you pay for having to estimate the variance as well as the mean, instead of just the mean.

I always thought it was hokey how introductory texts ease you into things by first (or second) posing the questions in terms of estimating the population mean from a sample when the population variance is known. Of course it is done this way for pedagogical reasons, but who has ever heard of such a situation, where the variance is known exactly but the mean is unknown?[1]

If I recall correctly, most of my students were more confused than enlightened by this strategy. But they were taking my class because they wanted to avoid as much math as possible. I think the strategy helps more when students are more conversant with math.

[1] I have since then thought of at least one plausible situation. Suppose you are measuring some constant quantity (say, the length or mass of some object) and the measurement device is known to have an absolute error distributed normally with zero mean and some specific, known variance. Then the results you would get with repeated measurements would have an unknown mean (the quantity of interest) but be normally distributed with known variance.

I agree, the one-sample z-test can be left out of a stat class. The t-test is only slightly more complicated. In fact, it may even be easier to teach the t-test first since students wouldn’t be distracted by wondering how you could possibly know the variance without knowing the mean. (Assuming they’re tracking well enough to be confused.)

However, I think it’s worthwhile to teach the two-sample z-test. The difference of two normals is a normal; the difference between two t distributions is only approximately a t distribution, and there’s no simple way to say what the appropriate degrees of freedom are for the approximating t distribution. So in this case, the t-test is sufficiently complicated that it’s worthwhile to derive a z-test for a warm-up.
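The standard way to approximate those degrees of freedom is the Welch–Satterthwaite formula, used by Welch’s t-test. Its data-dependent, non-integer result illustrates why the two-sample t case is messier than the z case (the sample sizes and standard deviations below are made up):

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom
    for the difference of two sample means with unequal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Made-up sample summaries: the result is non-integer and depends
# on the data, unlike the fixed n - 1 of the one-sample case.
df = welch_df(s1=2.0, n1=15, s2=3.5, n2=40)
print(f"approximating df = {df:.2f}")
```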