I once saw an impressive chart of how the efficiency of numerical linear algebra algorithms has improved over time. I haven't been able to find that chart, but here is something similar. Thanks to Naveen Palli for pointing this out.
Even more remarkable [than Moore’s law] — and even less widely understood — is that in many areas, performance gains due to improvements in algorithms have vastly exceeded even the dramatic performance gains due to increased processor speed.
… Here is just one example … Grötschel, an expert in optimization, observes that a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later — in 2003 — this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms! Grötschel also cites an algorithmic improvement of roughly 30,000 for mixed integer programming between 1991 and 2008.
From a federal report with too long a title, page 71.
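As a back-of-the-envelope sanity check on those factors (my arithmetic, not the report's):

    minutes_per_year = 365.25 * 24 * 60            # about 525,960
    total_speedup = 82 * minutes_per_year / 1.0    # 82 years down to roughly 1 minute
    print(f"{total_speedup / 1e6:.1f} million")    # ~43.1 million
    print(1_000 * 43_000)                          # hardware x algorithms = 43,000,000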
As I mentioned in my previous post, many of the speed-ups in optimization are the result of speed-ups in numerical linear algebra, because, like so much of applied math, it all boils down to linear algebra.
It's worth keeping in mind that technology improvement isn't all about processor speed. Much of the speedup over the past 30 years hinges on memory technology. Huge problems can now be solved almost entirely in memory, which is on the order of a million times faster than disk. Back in the day, a lot of algorithmic effort went into moving small chunks of data in and out of memory efficiently, and into having the right data in memory when it was needed for computation. Even then, processors spent a lot of time waiting for data to become available.
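To make the contrast concrete, here is a toy sketch (my own illustration, with made-up sizes) of the old chunked, out-of-core style of computation next to the all-in-memory version, using a matrix-vector product:

    import numpy as np

    # Toy setup: pretend A is too big to hold in RAM, so it lives on disk.
    n = 2000
    A = np.random.rand(n, n)
    x = np.random.rand(n)
    np.save("A.npy", A)

    # Old style: stream the matrix from disk one block of rows at a time.
    A_on_disk = np.load("A.npy", mmap_mode="r")
    block = 250
    y_chunked = np.empty(n)
    for i in range(0, n, block):
        y_chunked[i:i + block] = A_on_disk[i:i + block] @ x

    # Modern style: the whole matrix fits in memory, one call, no I/O in the loop.
    y_in_memory = A @ x

    print(np.allclose(y_chunked, y_in_memory))  # True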
Proebsting’s Law: improvements to compiler technology double the performance of typical programs every 18 years. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.434
Here is a chart similar to the one you're referring to: the evolution of sorting algorithms, comparing a home computer to a supercomputer.
Chart: https://scontent-bru2-1.xx.fbcdn.net/hphotos-xfp1/v/t35.0-12/12359422_1099812400037880_1451817825_o.jpg?oh=e302084069648b9884d6c71c4bc5eef5&oe=566A8727
Source: http://algs4.cs.princeton.edu/lectures/23Quicksort-2×2.pdf
Thanks. One difference between sorting and linear algebra is that computer scientists quickly discovered O(n log n) sorting algorithms, which is the optimal asymptotic order. Sorting would be more analogous if for years O(n^3) were state of the art, then someone discovered an O(n^2.5) algorithm, then O(n^2) etc.
Nitpick: sorting is O(n log n) for generic data. For quite a few cases, like 32/64-bit values (floats, integers, small strings), there are O(n) sorting algorithms, such as radix sort.
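For what it's worth, here is a minimal LSD radix sort for non-negative 32-bit integers (a sketch to show where the O(n) comes from, not production code):

    def radix_sort_u32(a):
        # Four passes, one per byte. Each pass is a stable bucket sort,
        # so the total work is O(4 * (n + 256)) = O(n).
        for shift in (0, 8, 16, 24):
            buckets = [[] for _ in range(256)]
            for v in a:
                buckets[(v >> shift) & 0xFF].append(v)
            a = [v for bucket in buckets for v in bucket]
        return a

    print(radix_sort_u32([170, 45, 75, 90, 802, 24, 2, 66]))
    # [2, 24, 45, 66, 75, 90, 170, 802]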
coca: That’s actually a good point. Radix sort is analogous to what people have done with linear algebra. For dense matrices, you’re essentially stuck with Gaussian elimination. But most matrices encountered in applications have some exploitable special structure like sparsity. That’s where the real progress has happened in linear algebra.
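To give a flavor of what exploiting structure buys, here is a small comparison using SciPy's sparse CSR format (the size and density are made up for illustration):

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    # A 2000 x 2000 matrix with only a few nonzeros per row, shifted so it's nonsingular.
    n = 2000
    A = sp.random(n, n, density=0.001, format="csr") + 10 * sp.eye(n, format="csr")
    b = np.ones(n)

    # The sparse solver only touches the ~6,000 stored entries; the dense solve
    # below runs Gaussian elimination on all 4,000,000 entries.
    x_sparse = spsolve(A.tocsc(), b)
    x_dense = np.linalg.solve(A.toarray(), b)

    print(np.allclose(x_sparse, x_dense))  # True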
There are two plots in this article that compare the speed-up factors from numerical methods and from hardware improvements.
http://imgur.com/wD4XFvD
“Graduate Education in Computational Science and Engineering” (p. 168)
SIAM Rev., 43(1), 163–177.
http://epubs.siam.org/doi/abs/10.1137/S0036144500379745
Ummm… I don't follow this post at all.
Most of the machine learning-related optimization has been moving, quickly and for at least the past decade, toward first-order methods because (1) the matrices involved in a second-order method would be too huge, and (2) finding the exact optimum doesn't really matter anyway, since the error from only getting close to the optimal solution is small compared to the errors introduced by the approximate model. Also, the second-order-ish methods that have been published haven't really caught on (e.g., L-BFGS-type optimization methods are less popular than, say, clever tuning of SGD).
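(To make the first-order point concrete, here is a bare-bones SGD loop for least squares; a toy sketch with made-up sizes, not any particular library's method. Each step touches a single row of data and never forms anything like a Hessian.)

    import numpy as np

    # Toy least-squares problem: find w minimizing ||X w - y||^2.
    rng = np.random.default_rng(0)
    n, d = 10_000, 50
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.01 * rng.standard_normal(n)

    # Plain SGD: pick a random row, take a gradient step on its squared residual.
    w = np.zeros(d)
    step = 0.01
    for t in range(100_000):
        i = rng.integers(n)
        residual = X[i] @ w - y[i]
        w -= step * residual * X[i]   # O(d) work per step

    print(np.linalg.norm(w - w_true))  # should be small (near the noise floor)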
The MIP-related optimization literature is more about various cuts and branching algorithms than linear algebra. For example, it was in the early '90s that people realized Gomory cuts actually work quite well in practice, which was a major revelation. And that had nothing to do with numerical linear algebra. I think the list of other tricks for MIPs that have been incorporated into (commercial, closed-source) solvers goes on and on. (Bob Bixby has written several papers on this, e.g. MIP Theory and Practice: Closing the Gap.)
It's true that numerical linear algebra improvements have some important performance benefits for optimization algorithms, but that's been most true for interior point methods. And those are very far from being the alpha and omega of numerical optimization techniques. (Though it makes sense that Stephen Boyd would emphasize them, especially in older lectures, since they were a major breakthrough in the '80s, were intensively studied for at least two decades, and are the major focus of his convex optimization textbook.)
Is this the graph that you originally mentioned?
The original source is the President's Information Technology Advisory Committee report dated June 2005.