Here’s a little example of using Hadley Wickham’s `testthat`

package for unit testing R code. You can read more about `testthat`

here. (more…)

# Rstats

# Basics of Sweave and Pweave

Sweave is a tool for embedding R code in a LaTeX file. Pweave is an analogous tool for Python. By putting your *code* in your document rather than the *results* of running your code somewhere else, results are automatically recomputed when inputs change. This is especially useful with graphs: rather than including an *image* into your document, you include the code to *create* the image. (more…)

# R without Hadley Wickham

Tim Hopper asked on Twitter today:

#rstats programming without @hadleywickham’s libraries is like ________ without _________.

Some of the replies were:

- (skydiving, a parachute)
- (gouging your own eyes out, NULL)
- (dentistry, anesthesia)
- (shaving, a razor)
- (internet shopping, credit card)

Clearly there’s a lot of love out there for Hadley Wickham’s R packages. I’m familiar with his ggplot2 graphics package, and it’s quite impressive. I need to look at his other packages as well.

# Beta inequalities in R

Someone asked me yesterday for R code to compute the probability P(*X* > *Y* + δ) where *X* and *Y* are independent beta random variables. I’m posting the solution here in case it benefits anyone else.

For an example of why you might want to compute this probability, see A Bayesian view of Amazon resellers.

(more…)

# Installing R and Rcpp

Almost exactly a year ago, I wrote about my frustration calling C++ from R. Maybe this will become an annual event because I’m back at it.

This time my experience was more pleasant. I was able to install Rcpp on an Ubuntu virtual machine and run example 2.2 from the Rcpp FAQ once I upgraded R to the latest version. I wrote up some notes on the process of installing the latest version of R and Rcpp on Ubuntu.

I have not yet been able to run Rcpp on Windows.

**Update**: Thanks to Dirk Eddelbuettel for pointing out in the comments that you can install Rcpp from the shell rather than from inside R by running `sudo apt-get install r-cran-rcpp`

. With this approach, I was able to install Rcpp without having to upgrade R first.

# Machine Learning for Hackers

Drew Conway and John Myles White have a new book out, Machine Learning for Hackers. As the name implies, the emphasis is on exploration rather than mathematical theory. Lots of code, no equations.

If you’re looking for a hands-on introduction to machine learning, maybe as a prelude to or complement to a more theoretical text, you’ll enjoy this book. Even if you’re not all that interested in machine learning, you might enjoy the examples, such as how a computer could find patterns in senatorial voting records and twitter networks. And R users will find examples of using advanced language features to solve practical problems.

# Comparing R to smoking

Francois Pinard comparing the R programming language to smoking:

Using R is a bit akin to smoking. The beginning is difficult, one may get headaches and even gag the first few times. But in the long run, it becomes pleasurable and even addictive. Yet, deep down, for those willing to be honest, there is something not fully healthy in it.

I’ve never gotten to the point that I would call using R pleasurable.

Quote via Stats in the Wild

**Related posts**:

Reviews of R in Action and The Art of R Programming

R quirks

# Running Python and R inside Emacs

Emacs org-mode lets you manage blocks of source code inside a text file. You can execute these blocks and have the output display in your text file. Or you could export the file, say to HTML or PDF, and show the code and/or the results of executing the code.

Here I’ll show some of the most basic possibilities. For much more information, see orgmode.org. And for the use of org-mode in research, see A Multi-Language Computing Environment for Literate Programming and Reproducible Research.

# R in Action

No Starch Press sent me a copy of The Art of R Programming last Fall and I wrote a review of it here. Then a couple weeks ago, Manning sent me a copy of R in Action. Here I’ll give a quick comparison of the two books, then focus specifically on *R in Action*.

# The Art of R Programming

Here are my first impressions of The Art of R Programming. I haven’t had time to read it thoroughly, and I doubt I will any time soon. Rather than sitting on it, I wanted to get something out quickly. I may say more about the book later.

The book’s author, Norman Matloff, began his career as a statistics professor and later moved into computer science. That may explain why his book seems to be more programmer-friendly than other books I’ve seen on R.

My impression is that few people actually sit down and learn R the way they’d learn, say, Java. Most learn R in the context of learning statistics. Here’s a statistical chore, and here’s a snippet of R to carry it out. Books on R tend to follow that pattern, organized more by statistical task than by language feature. That serves statisticians well, but it’s daunting to outsiders.

Matloff’s book is organized more like a typical programming book and may be more accessible to a programmer needing to learn R. He explains some things that might require no explanation if you were learning R in the context of a statistics class.

The last four chapters would be interesting even for an experienced R programmer:

- Debugging
- Performance enhancement: memory and speed
- Interfacing R to other languages
- Parallel R

No one would be surprised to see the same chapters in a Java textbook if you replaced “R” with “Java” in the titles. But these topics are not typical in a book on R. They wouldn’t come up in a statistics class because they don’t provide any statistical functionality *per se*. As long as you don’t make mistakes, don’t care how long your code takes to run, and don’t need to interact with anything else, these chapters are unnecessary. But of course these chapters are quite necessary in practice.

As I mentioned up front, I haven’t read the book carefully. So I’m going out on a limb a little here, but I think this may be the book I’d recommend for someone wanting to learn R, especially for someone with more experience in programming than statistics.

**Related post**:

# RLangTip changing hands

I’ve decided to hand my Twitter account RLangTip over to the folks at Revolution Analytics starting next week. I thought it would be better to give the account to someone who is more enthusiastic about R than I am, and so I offered it to David Smith. If you’ve enjoyed RLangTip so far, I expect you’ll like it even better under new ownership.

If you’d like to continue to hear from me on Twitter, you can follow one of my 10 other daily tip accounts or my personal account.

Descriptions of these accounts are available here.

# New Twitter account: RLangTip

I’m starting a new daily tip Twitter account: RLangTip. This account will have one regularly scheduled tip per day, Monday through Friday, on the R language and related topics. I’ll also throw in a few unscheduled tweets now and then.

If you’re interested in R, you may also be interested in other daily tip accounts such as ProbFact, StatFact, and CompSciFact.

# Regular expressions in R

Notes on using regular expressions in R. R uses POSIX regular expression syntax by default but you can ask it to use Perl’s flavor of regular expressions.

**Related links**:

Regular expressions in C++, Mathematica, Python, R, PowerShell

R for programmers coming from other languages

R: The good parts

Daily regular expression tips