Yesterday Simply Statistics linked to a paper with the provocative title Classifier Technology and the Illusion of Progress. I’ve only skimmed the article so far, but here are a few sentences that stood out.
In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm. …
The situation to date thus appears to be one of very substantial theoretical progress … While all of these things are true, it is the contention of this paper that the practical impact of the developments has been inflated; that although progress has been made, it may well not be as great as has been suggested.
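To make the first claim concrete, here is a minimal sketch of the kind of comparison Hand has in mind, assuming scikit-learn and a synthetic dataset (neither comes from the paper): fit a simple model and a more sophisticated one, then look at the mean cross-validated accuracy next to the fold-to-fold spread. Whether the gap between the two models is swamped by that spread, let alone by the sources of uncertainty this toy setup ignores entirely, depends on the problem.

```python
# Sketch only: compare a simple and a more sophisticated classifier
# on synthetic data, reporting mean accuracy and its spread across folds.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data; a real comparison would of course use a real problem.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=10, random_state=0)

simple = LogisticRegression(max_iter=1000)
fancy = GradientBoostingClassifier(random_state=0)

simple_scores = cross_val_score(simple, X, y, cv=10)
fancy_scores = cross_val_score(fancy, X, y, cv=10)

print(f"logistic regression: {simple_scores.mean():.3f} +/- {simple_scores.std():.3f}")
print(f"gradient boosting:   {fancy_scores.mean():.3f} +/- {fancy_scores.std():.3f}")
```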
On the plus side, it appears that this 2006 article has been widely cited:
http://scholar.google.com/scholar?cites=859292737433991399&as_sdt=2005&sciodt=0,5&hl=en
This doesn't surprise me. The same situation holds in IR, with ad hoc searching. According to the Yahoo Learning To Rank results, the best ML methods, which employ hundreds of relevance signals, outperform a relatively simple benchmark (based on dozens of signals) by about 10%:
http://jmlr.csail.mit.edu/proceedings/papers/v14/chapelle11a/chapelle11a.pdf
PS: by less than 10%, sorry.
As a broader comment, this is directly related to the peer review process. Of course, people tend to oversell their work… but we also tend to be careful about claims that can be, and will be, checked. The peer review process, however, is such that if your claim is not impressive enough, your paper will be rejected and get little attention… which pushes people toward more aggressive claims.
This also kills a whole category of research that aims to deepen our understanding of existing methods. Given a choice between something new that claims to be much better and a paper that examines older techniques… people tend to prefer the new stuff… yet maybe the more conservative papers would be more useful in the long run…
To put it another way, new techniques are overrated. Science is mostly built by revisiting old ideas.