Comments on: Elementary statistics book recommendation

By: Rafael

Rafael — Thu, 26 Jul 2018 21:13:38 +0000

That’s great. Thank you very much.

By: John

John — Fri, 06 Jul 2018 14:00:26 +0000

In reply to Rafael. My suggestion is to focus on learning probability first. It's easier to present probability correctly, and most books do a fairly good job. Once you know probability well, you'll be sensitive to the false statements in many statistics books if you pay attention.

By: Rafael

Rafael — Fri, 06 Jul 2018 13:10:45 +0000

So what should I do? I currently only know the difference between mean, median, and mode. I really want to understand statistics, mainly for linguistic research, but I also want to avoid all those pitfalls you’ve written about.

By: John

John — Fri, 15 Dec 2017 11:21:56 +0000

In reply to Vegarsti. Sorry, I'm not familiar with that one.

By: Vegarsti

Vegarsti — Fri, 15 Dec 2017 10:52:17 +0000

What do you think of Devore & Berk? We used it in undergrad. Felt decent.

By: Pat Trayers

Pat Trayers — Sun, 19 Jun 2016 19:34:51 +0000

The best book I have used is: College Statistics Made Easy. It claims to be the best book to use and is so different from any other statistics book. Its link is:
https://www.amazon.com/College-Statistics-Made-Easy-Connolly/dp/0993304702

By: Alecco

Alecco — Sat, 06 Apr 2013 16:55:10 +0000

Statistics in Plain English is quite good for intro. Got it out of the top threads in stats.stackexchange.

By: Kent Hunter

Kent Hunter — Mon, 28 Jan 2013 06:29:32 +0000

So what “more advanced books” did you learn from? I have a pretty good grasp of real analysis, but I’ve never managed to get as comfortable with statistics as I would like.

By: BobC

BobC — Sun, 20 Jan 2013 03:24:38 +0000

I’d recommend starting statistics from a real-world perspective, dealing with measurements of real systems. Learning about the characterization of, and detection of, measurement errors and biases. Learning when to discard data, and how to improve how data is taken. Only after the data itself has been well described should one take the next step of performing higher-level analyses.

It’s engineering statistics, for which there are many excellent freshman-level texts. It starts with error estimation, analysis, and propagation. Seeing how errors in different domains (time, magnitude, etc.), and their correlations, affect analytical results.
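To make the propagation step concrete, here is a minimal Python sketch. It assumes the two measurement errors are independent and uses the usual first-order rules (absolute errors add in quadrature for a sum, relative errors add in quadrature for a product). The function names and the readings are hypothetical, chosen only for illustration.

```python
import math

def propagate_sum(x, sx, y, sy):
    """Value and uncertainty of x + y, assuming independent errors."""
    return x + y, math.sqrt(sx ** 2 + sy ** 2)

def propagate_product(x, sx, y, sy):
    """Value and uncertainty of x * y, assuming independent errors."""
    value = x * y
    # Relative errors combine in quadrature for a product.
    relative = math.sqrt((sx / x) ** 2 + (sy / y) ** 2)
    return value, abs(value) * relative

# Hypothetical multimeter readings: 5.0 V +/- 0.1 V and 2.0 A +/- 0.05 A.
power, power_err = propagate_product(5.0, 0.1, 2.0, 0.05)

# Hypothetical lengths: 3.0 +/- 0.3 and 4.0 +/- 0.4 (same units).
total, total_err = propagate_sum(3.0, 0.3, 4.0, 0.4)

print(f"power = {power:.2f} +/- {power_err:.2f} W")
print(f"total = {total:.2f} +/- {total_err:.2f}")
```

If the errors are correlated, these formulas need a covariance term, which is where the correlations mentioned above come in.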

For me, my first single semester course illuminated the spectrum of lies, damned lies, and statistics. It opened the door to understanding control systems and feedback at an intuitive level, guiding me toward the math I needed for a particular situation. Best of all, it let me quickly validate hunches by determining the characteristics of the data I’d need to obtain (quality, quantity, etc.) in order to confirm or refute them.

That wedge of statistics also provided access to other knowledge domains, where I became able to critique papers with significant statistical content, such as in economics, AI, and particularly medicine. There are lots of medical researchers who don’t know their stats from a hole in the ground!

I’m not much of a stats freak, knowing little more than the basics. But I learned them well, and use them regularly. Though doing stats occupies about 3% of my work (in embedded software), it consistently provides the most usable results for the time spent. It especially helps me clarify fuzzy thinking, to craft mini-experiments to help me determine if I’m on the right track.

I believe all stats beginners should start with Engineering Statistics, and expand from there.

For me, it provides clear guidance for deciding what tools and techniques may be used under what conditions, and how to validate the results. Dealing with real-world data obtained using a stopwatch or multimeter or o’scope or logic analyzer provides great clarity: Does the instrument measure the world correctly? What are its limitations? Can those limitations be compensated or allowed for? How can data from different instruments be combined? If I split my data into two sets, and repeat my analysis on each set, how can I expect the results to behave?

Statistics, as a concept, can be baffling to beginners: Anchoring it to the real world (not “soft” data such as surveys) quickly makes the concepts concrete.
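The split-the-data question above can be tried directly. The following is a rough Python sketch, not a formal test: it shuffles the measurements, splits them in half, and checks whether the two half-means agree to within about two combined standard errors. The function name, the factor of two, and the stopwatch timings are all assumptions made for illustration.

```python
import random
import statistics

def split_half_check(data, seed=0):
    """Compare the means of two random halves of the data."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(data)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    halves = (shuffled[:mid], shuffled[mid:])

    summaries = []
    for half in halves:
        mean = statistics.mean(half)
        # Standard error of the mean for this half.
        sem = statistics.stdev(half) / len(half) ** 0.5
        summaries.append((mean, sem))

    (m1, s1), (m2, s2) = summaries
    diff = abs(m1 - m2)
    combined_sem = (s1 ** 2 + s2 ** 2) ** 0.5
    return {
        "means": (m1, m2),
        "diff": diff,
        "combined_sem": combined_sem,
        # Rule of thumb only: halves "agree" if within ~2 standard errors.
        "consistent": diff <= 2 * combined_sem,
    }

# Hypothetical stopwatch timings, in seconds.
timings = [10.2, 9.8, 10.1, 10.0, 9.9, 10.3, 9.7, 10.0, 10.1, 9.9]
result = split_half_check(timings)
print(result)
```

If the halves disagree by much more than their combined standard error, that is a hint that something other than random scatter (drift, a bias, a bad instrument) is in the data.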

By: ezra abrams

ezra abrams — Fri, 18 Jan 2013 00:15:57 +0000

As a non-math, non-stats person, I would say that for most people, the most important thing is understanding the real-world situation and how to apply simple tests.
There are times you need heteroscedastic nonlinear least squares, but 99% of the time, the problem is understanding what the relevant comparison group is, or something really basic like that.

If I may – politely – stats courses are taught by stats guys (or gals) who tend to be math oriented; also, math is an easy way to spread out the class and find winners and losers.
But that doesn’t mean a lot of math is necessary; I think stats courses should spend a lot more time on experimental design and understanding how to put the data into really simple bins.

By: John

John — Wed, 16 Jan 2013 21:04:59 +0000

In reply to David R. David: I enjoyed reading "The Lady Tasting Tea." I like history, but I learned much of my statistics before knowing any historical context. When I read that book, I wished I had read it earlier.

By: David R.

David R. — Wed, 16 Jan 2013 20:49:05 +0000

Depends on what you mean by “elementary”. If you have little or no background in stats and are interested in learning about the history of the field, you might enjoy “The Lady Tasting Tea” by David Salsburg. I also just stumbled upon “Naked Statistics” by Charles Wheelan, which looks promising… I learn better when the subject is presented in an interesting fashion. Textbook authors seem to struggle with that concept.

By: Ed

Ed — Tue, 15 Jan 2013 21:35:50 +0000

I gave my nephew a copy of “The Manga Guide to Statistics” at Christmas, since his JC isn’t giving a course in it. As a real text it wouldn’t work, but for a self-taught introduction it seemed as good as anything else (and got an amused laugh upon unwrapping).

By: RobF

RobF — Tue, 15 Jan 2013 00:12:34 +0000

Hey, man, I love your blog. I’ve been reading it for years. I have also lately come to an amateur study of statistics and I was also searching for a good introductory book. I settled on “The Complete Idiot’s Guide to Statistics”, which was fine for my purposes. I’ll supplement with the other titles mentioned above. Personally, I’ve come to believe that part of the reason we struggle to find a satisfactory “introductory stats” book is that the topic of statistics is simply too huge: broader, deeper, and more sub-specialized than might first be assumed, and the field is expanding rapidly. It’s less like looking for an introductory book on “algebra” and more like looking for an introductory book on “architecture” or “engineering”.

By: Tony Zbaraschuk

Tony Zbaraschuk — Mon, 14 Jan 2013 23:08:40 +0000

I would almost certainly use, in addition, the very short book “How to Lie with Statistics.” You might want something else to provide all the formulas, but that book excels at getting across the basic concepts and the common mistakes.

By: Bret

Bret — Mon, 14 Jan 2013 18:32:55 +0000

One delicious component of an elementary statistics breakfast might be D.S. Sivia’s “Data Analysis: A Bayesian Tutorial”. http://www.amazon.com/dp/0198568320 A wonderful introduction to Bayesian analysis, and I believe it avoids many of your “more-harm-than-good”s.

By: Nicola Ward Petty

Nicola Ward Petty — Mon, 14 Jan 2013 16:43:15 +0000

The question is also, for whom is the book destined? Is it a terminal course in applying statistics or a first year course in the mathematics of statistics? Either way there needs to be a lot of real data, something that most books don’t have. I’m glad you mentioned Cobb’s paper, which was a life-changing find for me. You may also be interested in my post on how textbooks suck the fun out of statistics.
http://learnandteachstatistics.wordpress.com/2012/04/10/statistics-textbooks/
I also think that textbooks are on their way out, and that online learning or apps are preferable.
http://learnandteachstatistics.wordpress.com/2012/01/20/textbooks/
Or try my app AtMyPace: Statistics! Much cheaper than a textbook and far from complete, but fun.

By: Mark Gardener

Mark Gardener — Mon, 14 Jan 2013 16:04:38 +0000

I wrote my first book “Statistics for Ecologists Using R and Excel” because I also hadn’t found any really good books: http://www.amazon.com/Statistics-Ecologists-Using-Excel-Presentation/dp/1907807128/ref=ntt_at_ep_dpt_3

I made plenty of mistakes whilst writing it – especially the non-inclusion of exercises, something that will be rectified in the future. However, I tried to give more than just formulae and included some background information as to “why this works”. I also tried to cover the idea of statistical analysis from beginning to end: planning what you want to do, collecting the data and writing it in a coherent fashion, carrying out the appropriate tests and then reporting the results (including graphs).

Mark Gardener
http://www.dataanalytics.org.uk

By: Bob Mrotek

Bob Mrotek — Sun, 13 Jan 2013 21:42:07 +0000

John, if you write “Elementary Statistics for Dummies” I will be your first customer!

By: l hodge

l hodge — Sun, 13 Jan 2013 16:10:53 +0000

I like Bock, Velleman, and De Veaux, which is a high school (AP) text. They definitely do not reduce statistics to formulas to be applied by rote, but may not have the mathematical rigor you are looking for.

By: human mathematics

human mathematics — Sun, 13 Jan 2013 07:02:20 +0000

My list would be:

• Farnsworth cran.r-project.org/doc/contrib/Farnsworth-EconometricsInR.pdf
• Faraway Linear Models with R http://books.google.com/books?id=fvenzpofkagC&lpg=PP1&dq=%22applied%20linear%20models%20in%20r%22&pg=PP1#v=onepage&q=%22applied%20linear%20models%20in%20r%22&f=false
• Some kind of exploratory data analysis, e.g., http://www.indiana.edu/~wim/docs/9_8_2010_presentation.pdf (cf. http://twitter.com/EdwardTufte/status/284036109426630656)
• Angrist & Pischke’s book looks interesting. http://www.mostlyharmlesseconometrics.com http://www.stat.columbia.edu/~gelman/research/published/angristpischke2.pdf
• Richard Jeffrey http://www.princeton.edu/~bayesway/Book*.pdf for the probability theory

If I were recommending just one book it would be Jeffrey’s. But somehow fit in http://twitter.com/EdwardTufte/status/284036109426630656.

Toward the modelling + residuals end: I think http://en.wikipedia.org/wiki/Income_inequality_in_the_United_States#Race_and_gender_disparities (+ down 1 page) provides a nice microcosm of the reason one wants to do statistical modelling and the pitfalls of trying to do so. Start with the overall disparities, then note things like Blacks being younger on average; see how the disparities change at different education levels. Then the questions shift: why are Blacks getting fewer doctorates? Does the women/men doctoral pay divide reflect the subjects studied? And what’s the difference between running an experiment and a “statistical control” where we “subtract off” some model?

By: Gray Calhoun

Gray Calhoun — Sun, 13 Jan 2013 03:04:19 +0000

I enthusiastically second the recommendation for Freedman, Pisani, and Purves’s book (I read this blog from an RSS reader, so “enthusiastically” means “came to the webpage to comment”). Howard Wainer’s books on statistical graphics (i.e. Visual Revelations) might do a better job of addressing your specific concerns, though.

By: bram

bram — Sun, 13 Jan 2013 01:01:34 +0000

I think your sentiment can be summarized as: “if you think you know statistics, you don’t know statistics”. Elementary books try to give the feeling you know statistics, and that’s where the danger is …

PS: I think I know statistics :-)

By: tom_b

tom_b — Sun, 13 Jan 2013 00:01:51 +0000

I smiled when I read that you do not have an enthusiastic recommendation, mainly because I feel like I (many times) almost emailed to ask you to write a blog posting on a series of intro/elementary stats textbooks appropriate for personal study. I have come to the conclusion that having a fairly “mature” mathematical background (I’m guessing on my own that this may be analysis and linear algebra at the beginning grad level) is step one, then step two is to plow through a number of stats books, harvesting the good while having the taste to throw out the bad.

Sadly, my somewhat traditional CS education never required more than a “stats for engineers” type course. Even though I’m not feeling enamored with traditional credentials and programs of study these days, I am kind of on the fence about whether, if one is going to claim any mastery of statistics, one ought to pursue an MS in either stats or biostats.

By: Justin

Justin — Sat, 12 Jan 2013 23:09:50 +0000

Then write a FAQ, and put that in the section about elementary statistics books. It will be disappointing to some, but helpful to others who will be guided to not waste their time.

By: Harris

Harris — Sat, 12 Jan 2013 22:19:02 +0000

The issue with most introductory statistics books is that they give you formulas without any theoretical background, so people tend to misuse the techniques. The biggest offenders are books marketed to other disciplines, like “Statistics for Computer Scientists,” or “Statistics for Biologists.” I really like Mathematical Statistics With Applications by Wackerly, Mendenhall, and Scheaffer because it introduces probability and statistics with enough theoretical background that you can take a more advanced follow-on course if desired. If you don’t have to take a follow-on course, then hopefully the basic theory will keep you from making the simple mistakes that most scientists make.

By: Jerzy

Jerzy — Sat, 12 Jan 2013 21:03:33 +0000

John, I’ve been in the same boat. If you ever do find such a book I hope you’ll share it with us.

Đani, thanks for suggesting the Cobb article. There’s good advice already on the first page (“judge a book by its exercises, and you cannot go far wrong”).
Here’s a JSTOR link in case other readers are interested too:
http://www.jstor.org/stable/10.2307/2289170

By: Dave

Dave — Sat, 12 Jan 2013 20:09:08 +0000

I’ve been thinking about this recently too. Here’s my idea, which I hope someone steals (I’m probably not qualified to attempt it myself):

I don’t think a traditional textbook is the way to go. Ideally, I think you’d want some web application (or other software) that allows you to dig deeper into the material as needed. For instance, the top layer would just be conceptual. All text and graphs, with math no more complicated than some simple arithmetic and probability for demonstrations. The goal at this layer would be to develop an intuition for the material. Optionally (depending on a student’s goals), the student could drill down on a topic into the underlying math, which is the next layer. This would include all of the formal mathematical definitions of whatever topic is at hand. The focus here is to translate the concepts to math. Finally, the last layer would be a programming layer. This would give examples of how to actually perform the analyses/tests/modeling with Python or R or whatever.

There would be quizzes/homeworks to test the student’s understanding on each topic at each layer. The top layer would ask conceptual questions, the next layer would ask the student to solve problems mathematically by hand, and the last layer would have the student solve more involved problems with software.

If a manager needed to brush up on stats in order to better communicate with his/her analytics team, he/she could just read through the top layer. A mathematician wanting to learn the concepts more deeply could stick to the first two layers. Someone wanting to become a practitioner would go through all three layers. Thinking about it this way would force an author to make sure that each layer is complete and complementary to the others.

By: Ed Davies

Ed Davies — Sat, 12 Jan 2013 18:02:32 +0000

This might be the book, when it comes out: http://tamino.wordpress.com/2012/12/30/hiatus/ I know nothing other than that I've found his blog posts very educational (though I've skipped some of the heavier maths bits).

By: Đani Burić

Đani Burić — Sat, 12 Jan 2013 16:42:10 +0000

I would suggest “Statistics” by Freedman, Pisani, and Purves, and also “The Basic Practice of Statistics” by David S. Moore, or any of his similar books.

George W. Cobb has written an article (JASA, vol. 82, 1987) about introductory statistics textbooks.