I’ve resisted using the term “data science,” and enjoy poking fun at it now and then, but I’ve decided it’s not such a bad label after all.
Here are some of the pros and cons of the term. (Listing “cons” first seems backward, but I’m currently leaning toward the pro side, so I thought I should conclude with it.)
The term “data scientist” is sometimes used to imply more novelty than is there. There’s not a great deal of difference between data science and statistics, though the new term is more fashionable. (Someone quipped that data science is statistics on a Mac.)
Similarly, the term data scientist is sometimes used as an excuse for ignorance, as in “I don’t understand probability and all that stuff, but I don’t need to because I’m a data scientist, not a statistician.”
The big deal about data science isn’t data but the science of drawing inferences from the data. Inference science would be a better term, in my opinion, but that term hasn’t taken off.
Data science could be a useful umbrella term for statistics, machine learning, decision theory, etc. Also, the title data scientist is rightfully associated with people who have better computational skills than statisticians typically have.
While the term data science isn’t perfect, there’s little to recommend the term statistics other than that it is well established. The root of statistics is state, as in a government. This is because statistics was first applied to the concerns of bureaucracies. The term statistics would be equivalent to governmentistics, a historically accurate but otherwise useless term.