William Scullin’s talk from PyCon 2011: Python for high performance computing.
At least in our shop [Argonne National Laboratory] we have three accepted languages for scientific computing. In this order they are C/C++, Fortran in all its dialects, and Python. You’ll notice the absolute and total lack of Ruby, Perl, Java.
If you’re interested in Python and HPC, check out SciPyTip.
7 thoughts on “Python for high performance computing”
I’ve just installed mpi4py for users of Manchester University’s new flagship supercomputer thingummy. Not used it myself yet though.
They may be the top 3, but is Python really anywhere near C/C++ and Fortran? Often what I see with Python are simply wrappers to allow the calling of routines programmed in the other two languages. This seems to me to put Python more in the league of Matlab.
R has no place in ANL.
Any time I hear someone mention ‘C/C++’ as one language, I wonder if they’ve ever seen real C code.
I also wonder if they’ve ever benchmarked Python against Ruby.
@John: So what have they got against Java? Many people formed opinions of Java in the pre-1.4 days when it was relatively slow. Nowadays, to write something a lot faster in C++ takes a fair amount of effort to inline, unfold, etc. compared to what Java does with its “Hotspot” just-in-time compiler combined with online stack profiling and unfolding.
Java’s still not nearly as tight in memory for objects as C++. But it’s much easier to distribute apps and libraries, and going back to one of your other blog entries, sometimes you don’t need the efficiency or scalability. And the standard library’s much cleaner and more portable. Though Java’s harder to link into C-based interpreters like Python and R (I don’t know what’s under the hood for Matlab).
Our commercial software Lingpipe is written in Java. It’s mainly used through web services or in web applications, but also runs standalone. For the programming time versus efficiency, it’s definitely a win for one programmer.
Sometimes customers won’t even look at it because it’s written in Java. In other cases, we’ve managed to get potential customers to evaluate our packages versus similar things coded in C, and we’ve come out looking good (much of the work’s in algorithms, not in super-duper tight coding, but I do write our Java code with heavy use of arrays in a C-like [not C++-like] style).
I’ve just moved to Columbia Uni full time and have gotten back into C++ for stats and matrices. It’s very expressive (e.g. references vs. pointers vs. call-by-value), but also very complicated and opaque compared to something like Java (especially given the extensive use of template metaprogramming in libs like Boost and the way you can override basic operations like assignment and +=).
@SteveBrookline: Indeed, Python itself is more like Matlab or R than like C++ or Java. Even if you work in numpy, which hugely speeds up vectorized operations, random access into numpy vectors is much slower than into basic Python lists. If you then work in Cython to speed up the basic Python loops (which are a bit faster than R’s, but still snail-like), you wind up losing all the nice clean Python. At that point, it starts getting more complicated than just using C++.
At least that’s what we’ve found after starting in Python.
For other apps, I love Python. It’s such a natural way to program if you don’t need speed or large structured programs.
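A rough way to see the numpy trade-off described above is to time the same sum three ways. This is only a sketch, not a careful benchmark: the array size, the repeat count, and the helper names `loop_sum` and `vectorized_sum` are made up for illustration, and the timings will vary with the machine and the Python and numpy versions.

```python
import timeit

import numpy as np

# Illustrative size only; chosen so every partial sum is exactly representable.
n = 100_000
values_list = [float(i) for i in range(n)]
values_array = np.array(values_list)


def loop_sum(seq):
    """Sum by indexing one element at a time, as a plain Python loop would."""
    total = 0.0
    for i in range(len(seq)):
        total += seq[i]
    return total


def vectorized_sum(arr):
    """Sum with a single numpy call, so the loop runs in compiled code."""
    return float(arr.sum())


# Time each approach a few times; the numbers are indicative, not definitive.
timings = {
    "list, element by element": timeit.timeit(
        lambda: loop_sum(values_list), number=5
    ),
    "numpy array, element by element": timeit.timeit(
        lambda: loop_sum(values_array), number=5
    ),
    "numpy array, vectorized": timeit.timeit(
        lambda: vectorized_sum(values_array), number=5
    ),
}

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

All three variants compute the same total; only the place where the loop runs differs, which is the point of the comparison.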
@Chris Barts: Surprisingly, many people in the stats and numerical optimization world still program in straight up ANSI C. I see it all the time. Most of the C++ I see looks like C with some objects thrown in (i.e., mostly pointer chasing rather than references/call-by-value with copy control). An example is the Sacado package for automatic differentiation in Trilinos, from Sandia National Lab.
@Bob: Perhaps someone at ANL said “We’re going to support the two traditional languages for programming to-the-metal (i.e. C and Fortran) plus one other language for higher-level programming” and then decided that the third language should be Python. Maybe the thinking was that the list of supported languages should be short so that support staff can develop depth in each one.
I think Python is the most beautiful language, and very enjoyable, because very good complementary packages are already available, such as Scipy (+Numpy), Matplotlib, Mayavi2, … that cover the needs of heavy numerical calculation, 2D visualization, and 3D visualization respectively.
Sometimes I have to code in Fortran to speed up a calculation; however, I have never been as successful with graphics in Fortran as I am in Python with the libraries mentioned above.