Benchmarking C++, Python, R, etc.

The other day Travis Oliphant pointed out an interesting paper: A Comparison of Programming Languages in Economics. The paper benchmarks several programming languages on a computational problem in economics.

All the usual disclaimers about benchmarks apply: your mileage may vary, etc. See the paper for details.

Here I give my summary of their summary of their results. The authors ran separate benchmarks on Mac and Windows. The results were qualitatively the same, so I just report the Windows results here.

Times in the table below are relative to the fastest C++ run.

Language                    Time
C++                         1.00
Java                        2.10
Julia                       2.70
CPython                     155.31
Python with Numba           1.57
R                           505.09
R using compiler package    243.38
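
As a quick sanity check, the relative times in the table can be turned into speedup factors. The sketch below simply copies the Windows numbers from the table into a dictionary; `speedup` is a hypothetical helper name used for illustration, not something from the paper.

```python
# Relative times copied from the table above (Windows, fastest C++ run = 1.00).
relative_time = {
    "C++": 1.00,
    "Java": 2.10,
    "Julia": 2.70,
    "CPython": 155.31,
    "Python with Numba": 1.57,
    "R": 505.09,
    "R using compiler package": 243.38,
}


def speedup(slow, fast):
    """How many times faster `fast` ran than `slow` (hypothetical helper)."""
    return relative_time[slow] / relative_time[fast]


numba_gain = speedup("CPython", "Python with Numba")
r_compiler_gain = speedup("R", "R using compiler package")

print(f"Numba vs. plain CPython: {numba_gain:.1f}x")
print(f"R compiler package vs. plain R: {r_compiler_gain:.1f}x")
```

On these numbers, Numba comes out roughly 99 times faster than plain CPython, while the R compiler package is only about twice as fast as plain R.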

The most striking result is that the authors were able to run their Python code 100x faster, achieving performance comparable to C++, by using Numba.
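
For readers who have not seen Numba, here is a minimal sketch of the kind of change involved. It is not the paper's code: `sum_of_squares` is a made-up toy loop, and the fallback decorator is an assumption added only so the snippet still runs where Numba is not installed. `numba.njit` itself is Numba's real decorator for compiling a function to machine code.

```python
import time

try:
    from numba import njit  # Numba's JIT decorator, if Numba is installed
except ImportError:
    def njit(func):  # assumption: no-op stand-in so the sketch runs without Numba
        return func


def sum_of_squares(n):
    """Toy numeric loop (hypothetical example, not from the paper)."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total


# Same source code; compiled on its first call when Numba is present.
sum_of_squares_jit = njit(sum_of_squares)


def timed(func, n):
    """Return (result, elapsed seconds) for one call of func(n)."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start


n = 1_000_000
_, t_py = timed(sum_of_squares, n)
_, t_first = timed(sum_of_squares_jit, n)   # includes compile time under Numba
_, t_second = timed(sum_of_squares_jit, n)  # compiled code only under Numba
print(f"pure Python {t_py:.4f}s, first JIT call {t_first:.4f}s, "
      f"second JIT call {t_second:.4f}s")
```

The first and second JIT calls are timed separately because the first call pays Numba's one-time compilation cost. How much faster the compiled version runs will depend on the machine and the loop, so no timings are quoted here.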

20 thoughts on “Benchmarking C++, Python, R, etc.”

  1. They didn’t optimize for the languages, especially vectorization with Matlab and R. They argue that with vectors of only 18k items, vectorizing wouldn’t affect performance much. They are wrong with respect to R, and they are basing that on Matlab.

    Furthermore, it’s generally the case that there are many, many ways to implement this test in each language, and each language has styles of writing and features that optimize performance. Therefore, without the actual code used, the paper is next to worthless.

  2. There are many ways to conduct benchmarks. But at a minimum, these are the numbers a couple people got on their code.

    Personally, I’m not interested in the maximum performance possible to squeeze out of each language. I’m more interested in what kind of performance someone like myself is able to get without great effort. I admit that’s not very precise, but it is what I care about. This paper suggests, but certainly doesn’t prove, that I might get my Python code to run a lot faster by using Numba, for example, and that Julia is worth trying. And it matches my experience that R is a couple orders of magnitude slower than C++. That’s about all I expect from benchmarks. A more rigorous study wouldn’t be any more useful to me.

  3. It’s not my benchmark, but the authors did try PyPy. They got 45, a significant improvement over CPython, but not nearly as impressive as Numba.

  4. Ah they mention PyPy specifically, thanks. I must now endure the shame of proof that I’m too lazy to open the very obvious source link.

  5. Joe: I didn’t make it clear that I’d cut out some of their results. They also looked at Fortran, Mathematica, and Matlab.

    I left out the PyPy result because it wasn’t as dramatic as the Numba result. I left out Fortran because the result is not surprising. And I left out the Mathematica and Matlab results because they were harder to describe. My personal taste.

  6. Each of those languages has a different peak performance and different costs. So tell me how long it took to write the code, and use test cases that would generate great execution numbers for each language. Then do the analyses. The raw numbers don’t mean much.

  7. Python is the Borg of languages. A decade ago I regularly employed a dozen languages: Now I’m 95% Python.

    Awk, sed, perl, tcl? Great tools, but Python’s re module (with a few other modules) gets the job done. But once in a while my fingers still tap out a sed one-liner.

    Matlab, Mathematica, Mathcad, R, Julia? I’ve used them all, but Python/IPython/Sage can carry the load. I do occasionally sneak over to use Simulink, though I haven’t yet tried Xcos or OpenModelica/OMPython.

    Java, C/C++? Python does remarkably well, with Numba/PyPy for speed (and, back in the day, Psyco). MicroPython is available for embedded systems with only a few k of memory (even without an OS).

    And don’t get me started on GUIs: I hate ’em with a passion, yet I have to make one for every embedded system I develop. Tkinter makes it quick, relatively painless, not quite stick-in-the-eye ugly (if never beautiful), and even cross-platform as stand-alone executables (pyinstaller/cx_freeze rock). For me, Python replaced Tcl/Tk long ago: It was one of the reasons I first tried Python.

    What’s my #2 language? Bash/sh. Mainly to make embedded Linux behave.

    Spending so much time in one language applied across many applications in multiple domains and environments has synergistic benefits. Makes even a Swiss Army Knife seem limited. (Yes, I have probably succumbed to hammer-nail syndrome.)

    All that said, Python is far from perfect. I haven’t made the leap to 3.x because several packages and tools I rely upon have yet to make their own leap, but making maximal use of __future__ makes the delay somewhat tolerable. Python debugging when using the threading and multiprocessing modules is still a pain, but good design will limit thread/process issues. And Python can consume vast oceans of memory: I often have to manually control gc passes to get adequate worst-case performance, even after my code has been optimized for memory use (such as by using deques instead of simple lists or queues).

    Python profiling has improved, thanks to yappi.

    My GPU is accessible from Python, thanks to projects like PyCUDA, PyOpenCL, gnumpy, SimpleCV (!!!) and OpenCV-Python.

    Now, if only I could program an FPGA using Python: MyHDL and Migen are interesting, but you still need deep knowledge of VHDL/Verilog and vendor toolchains.

  8. Sorry John, but without code you can’t tell if this is the performance someone like yourself might get without great effort because you have no idea if they code like you.

    That said, they do provide a link to the code; I missed that. As I suspected, a quick look at the R code reveals that it’s horribly slow. I’m guessing a little work could make it 10-100x faster. And for someone who thinks of math in vectors, it would be simpler, more expressive, and easier to maintain. And more importantly, more like what I would write myself without great effort.

    Also, depending on how the code is run, shouldn’t compile time be factored in? At least report it. If you’re developing something for a targeted research project to be run once, compile and debug time should most definitely be included. I think that would favour something like Python with Numba even more.

  9. One of the dilemmas the authors faced was wanting to run the “same” code on each platform, but also not to present a platform in an artificially bad light due to not running idiomatic code. In the case of Mathematica, they ran both the naively ported code and more idiomatic Mathematica code and reported both numbers. It would have been better if they had done the same for R. But I would be shocked if the best R code ran anywhere near as fast as the others (except Python without Numba).

  10. PyPy is slow because the code calls the np.dot function, which is slower in PyPy than in CPython.

  11. The np.dot code is on lines 67 and 71; I think that’s why PyPy runs at 1/10 the speed of the Numba version.
    '''
    65 log = math.log
    66 zeros = np.zeros
    67 dot = np.dot
    68
    69 while(maxDifference > tolerance):
    70
    71     expectedValueFunction = dot(mValueFunction,mTransition.T)
    72
    73     for nProductivity in xrange(nGridProductivity):
    '''

  12. Oh, Joe Taber, I made a mistake. The main bottleneck for PyPy (CPython without Numba) is the inner loop. One reason may be that accessing a NumPy array is 2 times slower in PyPy than in CPython with Numba. Another may be that PyPy performs very well when accessing a 1-D NumPy array but poorly on multidimensional arrays. For example, summing a 100,000,000-element array is as fast as with Numba, but summing a 10,000 x 10,000 2-D array is 10 times slower than with Numba. So that’s why PyPy runs at 1/20 the speed of CPython with Numba. I hope the PyPy developers can optimize PyPy’s access performance on multidimensional ndarrays.

  13. Interesting; I didn’t know that C++ had caught up to Fortran. I wonder if that is general or a peculiarity of their benchmark, as I just read an article on Ars Technica about how C can’t catch up due to the way it deals with memory.

    Actually, I’m disappointed to see no C on there, just C++, as I understand that C is a bit faster if well written.
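
Comment 7 mentions two memory tactics: controlling garbage collection passes manually and using deques instead of plain lists. A minimal sketch of both, using only the standard library, might look like the following. `process_stream` and its parameters are hypothetical names chosen for illustration; this is not the commenter's code.

```python
import gc
from collections import deque


def process_stream(samples, window=1000):
    """Keep only the most recent `window` samples, with GC timing under manual control."""
    recent = deque(maxlen=window)  # oldest items are discarded automatically
    was_enabled = gc.isenabled()
    gc.disable()                   # pause automatic cyclic garbage collection
    try:
        for x in samples:
            recent.append(x)       # O(1) append; memory stays bounded by `window`
    finally:
        if was_enabled:
            gc.enable()            # restore the previous GC setting
        gc.collect()               # run a full collection at a moment we choose
    return recent


recent = process_stream(range(5000), window=1000)
print(len(recent), recent[0], recent[-1])
```

Note that `gc.disable()` only pauses the cyclic collector; reference counting still frees most objects immediately. Whether this helps worst-case latency in a given program is something to measure rather than assume.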
