Calling C++ from R

This post relates my experience with calling C++ from R by writing an R module from scratch and by the inline module.

The most reliable way to speed up R code is to take it out of R. I’ve looked at various tricks for speeding up R code and none have improved the performance of code more than a few percent. Rewriting R code in C++, however, can speed it up by a couple orders of magnitude.

Until recently, my preferred approach to speeding up an R program has been to rewrite it entirely in C++. Now I’m looking at ways to rewrite only key parts of the code because I’m working with collaborators who want to work in R as much as they can. I’ve tried two approaches. The first I’ll call the bare-knuckle approach, writing an R package starting from C++ sample code from a colleague. The second approach is using the R module inline to embed C++ code in R.

I’ve tested everything on six computers: two running Windows 7, and one each running Windows XP, Ubuntu 10.04, Ubuntu 11.04, and Mac OS X 10.6.

Working with R and C++ on Windows requires installing Rtools and adding it to your path. I have not been able to get Rtools to work on my Windows XP machine. It apparently installs successfully, but everything that requires it fails. On my Windows 7 machines I’ve been able to install R packages containing C++ source but I’ve not been able to build them.

(In R terminology, “building” a package means gathering the source into a format that R wants. This does not “build” in the sense of compiling the C++ source; the C++ compilation happens on install. So, in theory, one could build an R package on one operating system and then install it on another.)

I’ve been able to build R packages on Ubuntu and Mac OS X and install them on Windows 7, albeit with warnings. The install process complains bitterly, but apparently everything works. At least once I got an error message saying that R couldn’t find the compiler, but apparently it did since the code did in fact compile. (Rtools on Windows installs a Unix-like build system, including the gcc compiler.)

Building and installing R packages on Ubuntu has worked smoothly. I’ve been able to build R packages on OS X and install them on Ubuntu, but not always the other way around. Sometimes R packages built on Ubuntu will install on OS X and sometimes not. My Ubuntu 10.04 box runs a newer version of gcc but an older version of R compared to the Mac, so the portability problems may come from tool version differences rather than operating system differences.

The bare-knuckle approach to writing R packages involves a great deal of tedious work. The R package Rcpp eliminates much of this tedium. The R package inline makes the process even simpler. Using inline you can include C++ source code in your R file and have the code compile automatically when the R code runs. This is the simplest way to call C++ from R, when it works. You can read far more about Rcpp and inline on Dirk Eddelbuettel’s website.

I find the inline approach promising. It could be a convenient way for me to give C++ code to collaborators who may not be able to write C++ but who could read it. It would also eliminate the portability problems that could arise from building a package on one platform and installing it on another.

The inline module worked smoothly on my Ubuntu 11.04 machine, but not at all on Ubuntu 10.04. I was unable to install Rcpp or inline on Ubuntu 10.04 from R itself, though I was able to install these packages through the Synaptic Package Manager. The packages load from R, but fail when I try to use them.

I was able to get inline to work from Windows 7 and from OS X, though in both cases there are numerous warnings.

It was not immediately clear to me how to use inline to call a C++ function that depended on another C++ function. If you want to call a single C++ function f() from R, you could put its source code in an R string and pass that string as an argument to cxxfunction. But what if the function f() uses a function g()?

My first thought was to define g() inside f(). C++ doesn’t let you define a function inside a function, but it does let you define a struct inside a function, so you could make g() a method on a struct to get around this language limitation.

It turns out there’s a more direct method. In addition to the body argument for source code, the function cxxfunction also has an argument includes for header files. You could put the source of g() in a header file and pass a string with a #include statement for that header as the value of includes. However, I don’t know where inline looks for headers. Apparently it does not look in the working directory of the R process that calls it. However, you could pass the content of the header file as the value of includes rather than a #include statement. After all, the first thing the C++ compiler will do with a header file reference is replace it with the contents of the file. So you could put the source for g() or any other dependencies in a string passed as the includes argument.

Calling C++ from R has been painful, but I expect will be easier in the future now that I’ve learned a few things the hard way.

4 thoughts on “Calling C++ from R

  1. Dirk Eddelbuettel

    John,

    You embedded about twenty-seven distinct questions in this single post! Just about everything you describe (different Windoze, Ubuntu, OS X flavours) works out of the box for many, many other users so may I suggest that we take this over to the rcpp-devel list and deal with it one by one, preferably with a reproducible example?

    Otherwise I obviously agree that inline is a fabulous tool for developing C++ extensions for R :)

    Cheers, Dirk

  2. Installing rtools is an astonishingly difficult feat, as most guides for it are outdated. I can’t quite remember how I got it to work, as it involved a ludicrously arcane process.

  3. That ludicriously arcane process wouldn’t be “having a windows OS” by any chance?

  4. I know nothing about R, but I do know that any time you want to wrap C or C++ code for use in a scripting language, you should take a look at SWIG.

    It’s a general-purpose tool to take a C or C++ header file and auto-generate wrapper functions for use in various scripting languages. It supports Perl, TCL, Lua, R, Ruby, Python, etc.

Comments are closed.