Irreproducible research on 60 Minutes

If your research cannot be reproduced, you might end up on 60 Minutes. Two days ago the news show ran a story about irreproducible research at Duke. You can find the video clip here.

I believe the 60 Minutes piece was somewhat misleading. It focused on data manipulation and implied that the controversial results followed from the manipulated data. As Keith Baggerly explains here, that is not the case. The conclusions do not follow from the (erroneous) data. The analysis itself was irreproducible. That discovery started the whole saga.

Update: Here’s some footage that 60 Minutes recorded but did not include on Sunday. “The systems we have in academia, especially with something this complicated, shield sloppy science and fraud.”

Update: Guess someone took that video down. Sorry.


Running Python and R inside Emacs

Emacs org-mode lets you manage blocks of source code inside a text file. You can execute these blocks and have the output display in your text file. Or you could export the file, say to HTML or PDF, and show the code and/or the results of executing the code.

Here I’ll show some of the most basic possibilities. For much more information, see the org-mode manual. And for the use of org-mode in research, see A Multi-Language Computing Environment for Literate Programming and Reproducible Research.

Source code blocks go between lines of the form

    #+begin_src
    #+end_src

On the #+begin_src line, specify the programming language. Here I’ll demonstrate Python and R, but org-mode currently supports C++, Java, Perl, etc. for a total of 35 languages.

Suppose we want to compute √42 using R.

    #+begin_src R
    sqrt(42)
    #+end_src

If we put the cursor somewhere in the code block and type C-c C-c, org-mode will add these lines:

    : 6.48074069840786

Now suppose we do the same with Python:

    #+begin_src python
    from math import sqrt
    sqrt(42)
    #+end_src

This time we get disappointing results:

    : None

What happened? The org-mode manual explains:

… code should be written as if it were the body of such a function. In particular, note that Python does not automatically return a value from a function unless a return statement is present, and so a ‘return’ statement will usually be required in Python.

If we change sqrt(42) to return sqrt(42) then we get the same result that we got when using R.
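The manual's point about function bodies can be seen in plain Python, outside of Emacs. The sketch below is only an analogy: the two function names are made up for illustration, and it does not reproduce how org-mode actually wraps a code block.

```python
from math import sqrt


def block_without_return():
    # The expression is evaluated, but its value is discarded,
    # so the function falls off the end and returns None.
    sqrt(42)


def block_with_return():
    # An explicit return hands the value back to the caller.
    return sqrt(42)


print(block_without_return())
print(block_with_return())
```

The first call prints None, which matches the disappointing result above; the second prints the square root.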

By default, evaluating a block of code returns a single result. If you want to see the output as if you were interactively using Python from the REPL, you can add :results output :session following the language name.

    #+begin_src python :results output :session
    print("There are %d hours in a week." % (7*24))
    2**10
    #+end_src

This produces the lines

    : There are 168 hours in a week.
    : 1024

Without the :session tag, the second line would not appear because there was no print statement.
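The difference between REPL-style and script-style evaluation can be imitated in standard Python. This is a rough sketch of the idea, not of org-mode's implementation: it assumes that compiling each statement in "single" mode is a fair stand-in for an interactive session, and the helper name run_statements is hypothetical.

```python
import contextlib
import io

# Statements modeled on the session example above.
STATEMENTS = [
    'print("There are %d hours in a week." % (7 * 24))',
    "2**10",
]


def run_statements(statements, mode):
    """Execute each statement under the given compile mode; return captured stdout."""
    captured = io.StringIO()
    namespace = {}
    with contextlib.redirect_stdout(captured):
        for statement in statements:
            # "single" echoes the value of a bare expression, as the REPL does;
            # "exec" runs it silently, as a script would.
            exec(compile(statement + "\n", "<block>", mode), namespace)
    return captured.getvalue()


interactive = run_statements(STATEMENTS, "single")
scripted = run_statements(STATEMENTS, "exec")
print(interactive, end="")
print(scripted, end="")
```

In "single" mode the bare expression is echoed, so two lines come out. In "exec" mode only the explicit print call writes anything.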

I had to do a couple things before I could get the examples above to work. First, I had to upgrade org-mode. The version of org-mode that shipped with Emacs 23.3 was quite out of date. Second, the only language you can run by default is Emacs Lisp. You have to turn on support for other languages in your .emacs file. Here’s the code to turn on support for Python and R.

    (org-babel-do-load-languages
        'org-babel-load-languages '((python . t) (R . t)))

Update: My next post shows how to call code written in one language from code written in another language.


Bad science is tolerable, résumé padding is not

The Economist posted an article online this weekend about the scandal over irreproducible cancer research by Anil Potti. My colleagues Keith Baggerly and Kevin Coombes have been crying foul about this since 2007. I first blogged about it in January 2008.

The story started getting widespread attention last summer when the Cancer Letter reported that Dr. Potti had lied on grant applications. Since then there have been articles in the popular press, and people are starting to file lawsuits.

Apparently the tipping point in the story was finding a fib on Potti’s résumé. According to The Economist:

He falsely claimed to have been a Rhodes Scholar in Australia (a curious claim in any case, since Rhodes scholars only attend Oxford University).

So what finally got people to pay attention was not accusations of incompetent or fraudulent science, but résumé padding. As Keith Baggerly commented,

I find it ironic that we have been yelling for three years about the science, which has the potential to be very damaging to patients, but that was not what has started things rolling.


Scientific results fading over time

A recent article in The New Yorker gives numerous examples of scientific results fading over time. Effects that were large when first measured become smaller in subsequent studies. Firmly established facts become doubtful. It’s as if scientific laws are being gradually repealed. This phenomenon is known as “the decline effect.” The full title of the article is The decline effect and the scientific method.

The article brings together many topics that have been discussed here: regression to the mean, publication bias, scientific fashion, etc. Here’s a little sample.

“… when I submitted these null results I had difficulty getting them published. The journals only wanted confirming data. It was too exciting an idea to disprove, at least back then.” … After a new paradigm is proposed, the peer-review process is tilted toward positive results. But then, after a few years, the academic incentives shift—the paradigm has become entrenched—so that the most notable results are now those that disprove the theory.

This excerpt happens to be talking about “fluctuating asymmetry,” the idea that animals prefer more symmetric mates because symmetry is a proxy for good genes. (I edited out references to fluctuating asymmetry from the quote to emphasize that the remarks could equally apply to any number of topics.) Fluctuating asymmetry was initially confirmed by numerous studies, but then the tide shifted and more studies failed to find the effect.

When such a shift happens, it would be reassuring to believe that the initial studies were simply wrong and that the new studies are right. But both the positive and negative results confirmed the prevailing view at the time they were published. There’s no reason to believe the latter studies are necessarily more reliable.
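Two of the mechanisms mentioned earlier, publication bias and regression to the mean, are enough to produce a decline effect in a toy simulation. The sketch below is illustrative only: the true effect, standard error, and publication threshold are invented numbers, and the function name is hypothetical. Studies are "published" only when the estimate clears the threshold, and each published study is then replicated once.

```python
import random
import statistics


def simulate_decline(true_effect=0.2, std_error=0.25, threshold=0.5,
                     num_studies=5000, seed=0):
    """Return (mean published estimate, mean replication estimate)."""
    rng = random.Random(seed)
    published = []
    replications = []
    for _ in range(num_studies):
        estimate = rng.gauss(true_effect, std_error)
        if estimate > threshold:
            # Only striking results get published ...
            published.append(estimate)
            # ... and each one is replicated, with no selection this time.
            replications.append(rng.gauss(true_effect, std_error))
    return statistics.mean(published), statistics.mean(replications)


published_mean, replication_mean = simulate_decline()
print("mean published estimate:  ", round(published_mean, 3))
print("mean replication estimate:", round(replication_mean, 3))
```

Under these assumptions the published estimates average well above the true effect, while the replications drift back toward it, even though nothing about the underlying effect has changed.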


Taking your code for a walk


When I was in college, a friend of mine told me he liked to take his code out for a walk every now and then. By that he meant recompiling and running all of his programs, say once a week. I asked him why he would want to do that. If a program compiled and ran the last time you touched it, why shouldn’t it compile and run now? He simply said I might be surprised.

Even when your source code isn’t changing, the environment around it is changing. That’s why your code can break without anyone touching it. Peter Deutsch made this observation in the context of networks in his Eight Fallacies of Distributed Computing.

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology doesn’t change.
  6. There is one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.

Kevin Kelly made the same observation in the context of data storage. Because data formats change and physical media decay, you’ve got to keep copying your data to save it. He coined the term movage to describe the active process of preserving data.

The only way to archive digital information is to keep it moving. I call this movage instead of storage. Proper movage means transferring the material to current platforms on a regular basis … anything you want moved to the future has to be given attention to keep it moving forward.

This morning I had problems running LaTeX (with Beamer) on an old presentation and thought about my friend’s advice.
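A walk like this is easy to automate. Here is a minimal sketch, assuming the programs are Python scripts sitting under one directory and that a nonzero exit code means something broke. The function name walk_code and the timeout value are made up for illustration, and other kinds of projects would need a different command.

```python
import subprocess
import sys
from pathlib import Path


def walk_code(root, timeout=60):
    """Run every *.py file under root; return a list of (path, reason) failures."""
    failures = []
    for script in sorted(Path(root).rglob("*.py")):
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            failures.append((script, "timed out"))
            continue
        if result.returncode != 0:
            failures.append((script, "exit code %d" % result.returncode))
    return failures
```

Calling walk_code on a project directory from a weekly scheduled job and mailing yourself the failures would approximate my friend's habit.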


Popular research areas produce more false results

The more active a research area is, the less reliable its results are.

John Ioannidis suggested popular areas of research publish a greater proportion of false results in his paper Why most published research findings are false. Of course popular areas produce more results, and so they will naturally produce more false results. But Ioannidis is saying that they also produce a greater proportion of false results.

Now Thomas Pfeiffer and Robert Hoffmann have produced empirical support for Ioannidis’s theory in the paper Large-Scale Assessment of the Effect of Popularity on the Reliability of Research. Pfeiffer and Hoffmann review two reasons why popular areas have more false results.

First, in highly competitive fields there might be stronger incentives to “manufacture” positive results by, for example, modifying data or statistical tests until formal statistical significance is obtained. This leads to inflated error rates for individual findings: actual error probabilities are larger than those given in the publications. … The second effect results from multiple independent testing of the same hypotheses by competing research groups. The more often a hypothesis is tested, the more likely a positive result is obtained and published even if the hypothesis is false.

In other words,

  1. In a popular area there’s more temptation to fiddle with the data or analysis until you get what you expect.
  2. The more people who test an idea, the more likely someone is going to find data in support of it by chance.

The authors produce evidence of the two effects above in the context of papers written about protein interactions in yeast. They conclude that “The second effect is about 10 times larger than the first one.”
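The second effect is easy to quantify in an idealized setting. Suppose a hypothesis is false and each of n independent groups tests it at significance level α. The chance that at least one group gets a positive result is 1 − (1 − α)^n. The sketch below just evaluates that formula; the independence assumption and the choice α = 0.05 are mine, not the paper's.

```python
def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests of a
    false hypothesis comes out significant at level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


for n in (1, 5, 10, 20):
    print(n, round(prob_at_least_one_false_positive(n), 3))
```

With ten groups the chance is already about 40%, and with twenty it is closer to two in three.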


Camtasia as a software deployment tool

Last week .NET Rocks mentioned a good idea in passing: start a screencast tool like Camtasia before you do a software install. Michael Learned told the story of a client that asked him to take screen shots of every step in the installation of Microsoft’s Team Foundation Server. Carl Franklin commented “What a great idea to throw Camtasia on there and record the whole process.”

It would be better if the installation process were scripted and not just recorded, but sometimes that’s not practical. Sometimes clicking a few buttons is absolutely necessary or at least far easier than writing a script. And even if you think your entire process is automated with a script, a screencast might be a good idea. It could record little steps you have to do in order to run your script, details that are easily forgotten.

Another way to use this idea would be to have one person do a practice install on a test server while recording the process. Then another person could document and script the process by studying the video. This would be helpful when the person who knows how to do the installation lacks either the verbal skills to explain the process or the scripting skills to automate it.


Highlights from Reproducible Ideas

Here are some of my favorite posts from the Reproducible Ideas blog.

Three reasons to distrust microarray results
Provenance in art and science
Forensic bioinformatics (continued)
Preserving (the memory of) documents
Programming is understanding
Musical chairs and reproducibility drills
Taking your code out for a walk

The most popular and most controversial was the first in the list, reasons to distrust microarray results.

The emphasis shifts from science to software development as you go down the list, though science and software are intertwined throughout the posts.

[Update: Reproducible Ideas has gone away.]

Blogging about reproducible research

I’m in the process of folding my reproducible research site into the new site. I will be giving the .org domain name to the folks now running the .net site. (See the announcement for a little more information.)

As part of this process, I’m winding down the blog that I started last July as part of the site. I plan to keep the links to my old posts valid, but I do not know whether the new site will have a new blog. I wrote about reproducible research on this blog before starting the site, and I will go back to writing about reproducible research here. (See reproducibility in the tag cloud.)

I wanted to point out an article by Steve Eddins posted this morning: Reproducible research in signal processing. His article comments on the article by Patrick Vandewalle, Jelena Kovačević, and Martin Vetterli announced recently on

Readers interested in reproducible research may also want to take a look at the Science in the open blog.


Taking your code out for a walk

I posted two articles on the Reproducible Ideas blog this morning.

Taking your code out for a walk
Just because you haven’t changed your code doesn’t mean it still works.

CiSE special issue on reproducible research
The latest issue of Computing in Science and Engineering has five articles on reproducible research.

[Update: Reproducible Ideas has gone away.]