People often come to me with R code that’s running slower than they’d like. It’s not unusual to make the code 10 or even 100 times faster by rewriting it in C++.
Not all that speed improvement comes from changing languages. Some of it comes from better algorithms, eliminating redundancy, etc.
Why bother optimizing?
If code is running 100 times slower than you’d like, why not just run it on 100 processors? Sometimes that’s the way to go. But maybe the code doesn’t split up easily into pieces that can run in parallel. Or maybe you’d rather run the code on your laptop than send it off to the cloud. Or maybe you’d like to give your code to someone else and you want them to be able to run the code conveniently.
Optimizing vs rewriting R
It’s sometimes possible to tweak R code to make it faster without rewriting it, especially if it is naively using loops for things that could easily be vectorized. And it’s possible to use better algorithms without changing languages.
Beyond these high-level changes, there are a number of low-level changes that may give you a small speed-up. This way madness lies. I’ve seen blog posts to the effect of “I rewrote this part of my code in the following non-obvious way, and for reasons I don’t understand, it ran 30% faster.” Rather than spending hours or days experimenting with such changes and hoping for a small speed-up, I use a technique that is fairly sure to give a 10x speed-up: rewriting (part of) the code in C++.
If the R script is fairly small, and if I have C++ libraries to replace all the necessary R libraries, I’ll rewrite the whole thing in C++. But if the script is long, or has dependencies I can’t replace, or only has a small section where nearly all the time is spent, I may just rewrite that portion in C++ and call it from R using Rcpp.
Simulation vs analysis
The R programs I’ve worked on often compute something approximately by simulation that could be calculated exactly much faster. This isn’t because the R language encourages simulation, but because the language is used by statisticians who are more inclined to use simulation than analysis.
Sometimes a simulation amounts to computing an integral. It might be possible to compute the integral in closed form with some pencil-and-paper work. Or it might be possible to recognize the integral as a special function for which you have efficient evaluation code. Or maybe you have to approximate the integral, but you can do it more efficiently by numerical analysis than by simulation.
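As a minimal sketch of the second case, suppose a simulation is estimating the probability that a standard normal random variable exceeds some threshold. That probability is an integral that can be written in terms of the complementary error function, which the C++ standard library provides as std::erfc. The function names below are made up for illustration, and the example itself is my choice, not taken from any particular client project.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// Exact tail probability P(Z > x) for a standard normal Z,
// using the identity P(Z > x) = erfc(x / sqrt(2)) / 2.
double normal_tail_exact(double x) {
    return 0.5 * std::erfc(x / std::sqrt(2.0));
}

// Monte Carlo estimate of the same probability from n random draws.
// (Hypothetical helper, shown only for comparison.)
double normal_tail_simulated(double x, std::uint64_t n, std::uint64_t seed) {
    assert(n > 0);
    std::mt19937_64 gen(seed);
    std::normal_distribution<double> z(0.0, 1.0);
    std::uint64_t hits = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        if (z(gen) > x) ++hits;  // count draws that land in the tail
    }
    return static_cast<double>(hits) / static_cast<double>(n);
}
```

The exact version is a single function call, while the error in the simulated version shrinks only like one over the square root of the number of draws, so each extra digit of accuracy costs roughly 100 times more samples.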
Redundancy vs memoization
Sometimes it’s possible to speed up code, written in any language, simply by not calculating the same thing unnecessarily. This could be something simple like moving code that doesn’t need to be there out of inner loops, or it could be something more sophisticated like memoization.
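Here is a small sketch of the simple case, with hypothetical function names. The first version recomputes the norm of a vector for every element, which hides a loop inside a loop. The second computes the norm once, before the loop. A compiler may or may not make this change for you, so I would not count on it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean length of a vector; this is itself a loop over v.
double euclidean_norm(const std::vector<double>& v) {
    double sum_sq = 0.0;
    for (double x : v) sum_sq += x * x;
    return std::sqrt(sum_sq);
}

// Redundant: the norm does not change, yet it is recomputed
// on every pass, making the whole function O(n^2).
std::vector<double> normalize_redundant(const std::vector<double>& v) {
    std::vector<double> out(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = v[i] / euclidean_norm(v);
    }
    return out;
}

// Hoisted: the norm is computed once, outside the loop, so the
// function is O(n). The results are the same as above.
std::vector<double> normalize_hoisted(const std::vector<double>& v) {
    const double norm = euclidean_norm(v);
    assert(norm > 0.0);  // a zero vector cannot be normalized
    std::vector<double> out(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = v[i] / norm;
    }
    return out;
}
```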
The first time it sees a function called with a new set of arguments, memoization saves the result and creates a way to associate the arguments with the result in some sort of look-up table, such as a hash table. The next time the function is called with the same arguments, the result is retrieved from memory rather than recomputed.
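The idea can be sketched in a few lines of C++, assuming a function of a single integer argument and using std::unordered_map as the look-up table. The class name Memoized and the stand-in function slow_harmonic are hypothetical, chosen only for illustration. Functions of several arguments would need a key type that combines them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Wraps a function of one int and caches its results.
class Memoized {
public:
    explicit Memoized(std::function<double(int)> f) : f_(std::move(f)) {}

    double operator()(int arg) {
        auto it = cache_.find(arg);
        if (it != cache_.end()) {
            return it->second;        // seen before: look up, don't recompute
        }
        double result = f_(arg);      // new argument: do the real work
        ++evaluations_;
        cache_.emplace(arg, result);  // remember it for next time
        return result;
    }

    // Number of times the wrapped function was actually called.
    std::size_t evaluations() const { return evaluations_; }

private:
    std::function<double(int)> f_;
    std::unordered_map<int, double> cache_;
    std::size_t evaluations_ = 0;
};

// Hypothetical stand-in for an expensive calculation:
// the n-th harmonic number, computed by brute force.
double slow_harmonic(int n) {
    assert(n >= 0);
    double sum = 0.0;
    for (int k = 1; k <= n; ++k) sum += 1.0 / k;
    return sum;
}
```

Constructing Memoized h(slow_harmonic) and calling h(4) twice evaluates slow_harmonic only once. The evaluations() counter is there so you can check how often the cache is actually paying off.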
Memoization works well when the set of unique arguments is fairly small and the calculation is expensive relative to the cost of looking up results. Sometimes the set of potential arguments is very large, and it looks like memoization won’t be worthwhile, but the set of actual arguments is small because some arguments are used over and over.
Related post: 20 weeks down to 20 minutes