Posts tagged as:

Reproducibility

Highlights from Reproducible Ideas

by John on May 5, 2009

Here are some of my favorite posts from the Reproducible Ideas blog.

Three reasons to distrust microarray results
Provenance in art and science
Forensic bioinformatics (continued)
Preserving (the memory of) documents
Programming is understanding
Musical chairs and reproducibility drills
Taking your code out for a walk

The most popular and most controversial was the first in the list, reasons to distrust microarray results.

The emphasis shifts from science to software development as you go down the list, though science and software are intertwined throughout the posts.

{ 0 comments }

Blogging about reproducible research

by John on May 5, 2009

I’m in the process of folding ReproducibleResearch.org into the new ReproducibleResearch.net site. I will be giving the .org domain name to the folks now running the .net site. (See the announcement for a little more information.)

As part of this process, I’m winding down the blog that I started last July as part of the ReproducibleResearch.org site. I plan to keep the links to my old posts valid, but I do not know whether the new site will have a new blog. I wrote about reproducible research on this blog before starting the ReproducibleResearch.org site, and I will go back to writing about reproducible research here. (See reproducibility in the tag cloud.)

I wanted to point out an article by Steve Eddins posted this morning: Reproducible research in signal processing. His article comments on the article by Patrick Vandewalle, Jelena Kovačević, and Martin Vetterli announced recently on ReproducibleResearch.org.

Readers interested in reproducible research may also want to take a look at the Science in the open blog.

Related posts:

Irreproducible analysis
Using Photoshop on experimental results
Highlights from Reproducible Ideas

{ 0 comments }

Taking your code out for a walk

by John on January 6, 2009

I posted two articles on the Reproducible Ideas blog this morning.

Taking your code out for a walk
Just because you haven’t changed your code doesn’t mean it still works.

CiSE special issue on reproducible research
The latest issue of Computing in Science and Engineering has five articles on reproducible research.

{ 0 comments }

Rotating programmers

by John on October 1, 2008

I just posted an article on my other blog, Reproducible Ideas, called Musical chairs and reproducibility drills. The post is about rotating programmers, in classes and in professional software development. The post ends with some thoughts on having a build master and rotating that position.

{ 0 comments }

On my other blog, Reproducible Ideas, I wrote two short posts this morning about reproducibility.

The first post is a pointer to an interview with Roger Barga about Trident, a workflow system for reproducible oceanographic research using Microsoft’s workflow framework.

The second post highlights a paragraph from the interview explaining the idea of provenance in art and scientific research.

{ 0 comments }

New blog on reproducible research

by John on July 24, 2008

Yesterday I added a blog to the ReproducibleResearch.org web site. You can visit the site here or subscribe via RSS.

I’d like a couple of people to join me in writing this blog, and I would greatly appreciate suggestions, guest posts, etc. If you’re interested, please send a note to “contribute” at the domain name.

{ 0 comments }

Using Photoshop on experimental results

by John on June 7, 2008

Greg Wilson pointed out an article in The Chronicle of Higher Education about scientists using Photoshop to manipulate the graphs of their results. The article has this to say about The Journal of Cell Biology.

So far the journal’s editors have identified 250 papers with questionable figures. Out of those, 25 were rejected because the editors determined the alterations affected the data’s interpretation.

This immediately raises suspicions of fraud, which is, of course, a possibility. However, I’m more concerned about carelessness than fraud. As Goethe once said,

…misunderstandings and neglect create more confusion in this world than trickery and malice. At any rate, the last two are certainly much less frequent. 

Even if researchers had innocent motivations for manipulating their graphs, they’ve made it impossible for someone else to reproduce their results and have cast doubts on their integrity.

{ 2 comments }

ReproducibleResearch.org

by John on May 30, 2008

I started a new web site this week, http://www.reproducibleresearch.org, to promote reproducible research.

I’d like to see this become a community site. Depending on how much interest the site stirs up, I may add a blog, a Wiki, etc. For now, if you’d like to contribute, send me articles or links and I’ll add them to the site. You can send email to “contribute” at the domain name.

{ 0 comments }

Reproducible scientific computing

by John on May 26, 2008

Greg Wilson gave a great interview on the IT Conversations podcast recently. He says the emphasis on HPC draws time and energy away from quality concerns, and may not even help scientists get their results faster. While some problems definitely require HPC, most could be solved faster by developing software in the simpler environment of a single PC and waiting longer for it to run.

I’ve written here about reproducibility problems in statistics and in general software development. Apparently there are similar problems in every area of scientific computing. For example, Wilson quotes a survey of computational economics articles that found that 70% of the results could not be reproduced a year after publication. I doubt that computational economics is worse than other fields.

Wilson says he wants to raise the reproducibility expectations in computational research closer to those common in physical research. I admire his efforts, but it’s a sad commentary that reproducibility standards are lower in computational science than in physical science.

{ 1 comment }

Barriers to good statistical software

by John on May 16, 2008

I attended a National Cancer Institute workshop yesterday entitled “Barriers to producing well-tested, user-friendly software for cutting-edge statistical methodology.” I was pleased that everyone there realized there is a huge difference between code created for personal use and reliable software that others would willingly use. Not all statisticians appreciate the magnitude of the difference.

I was also pleased that several people at the workshop were aware of the problem of irreproducible statistical analyses. Not everyone was aware of how serious or how common the problem is, but those who were aware were adamant that something needs to be done about it, such as journals requiring authors to publish the code used to analyze their data.

{ 3 comments }

Publishing correct sample code

by John on May 9, 2008

It’s infuriating to read published sample code that’s wrong. Sometimes code given in books is not even syntactically correct. I’ve wondered why publishers didn’t have a way to verify that the code at least compiles, and maybe even check that it gives the stated output.

Dave Thomas said in a recent interview that his publishing company, The Pragmatic Programmers, does just that. Authors write in a logical mark-up language and software turns that into a publishable form, compiling code samples and inserting the output. Sample code from one of their books is more likely to work the first time you type it in than code from other publishers.
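To give a feel for the kind of check described here, the following is a minimal sketch for Python samples. It is not The Pragmatic Programmers' actual toolchain, whose details are not given above; the helper name check_sample is made up for illustration. It compiles a sample, runs it in a fresh namespace, and optionally compares what it prints with the output stated in the text.

```python
import contextlib
import io


def check_sample(source, expected_output=None):
    """Compile and run one code sample; return (ok, message).

    Hypothetical helper: verifies that the sample is syntactically valid,
    runs without raising, and (if expected_output is given) prints
    exactly that text.
    """
    try:
        code = compile(source, "<sample>", "exec")
    except SyntaxError as err:
        return False, f"does not compile: {err.msg} (line {err.lineno})"

    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            exec(code, {})  # fresh namespace for every sample
    except Exception as err:
        return False, f"raises {type(err).__name__}: {err}"

    output = captured.getvalue()
    if expected_output is not None and output != expected_output:
        return False, f"expected {expected_output!r}, got {output!r}"
    return True, output


print(check_sample("print(1 + 1)", "2\n"))
print(check_sample("print(1 +"))
```

A real publishing pipeline would also need to extract the samples from the manuscript and insert the captured output back into the text; that part is omitted here.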

{ 0 comments }

Preventing an unpleasant Sweave surprise

by John on April 29, 2008

Sweave is a tool for making statistical analyses more reproducible by using literate programming in statistics. Sweave embeds R code inside LaTeX and replaces the code with the result of running the code, much like web development languages such as PHP embed code inside HTML.
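For readers who have not seen a tool of this kind, here is a toy analogue in Python. It is only a sketch of the idea, not how Sweave itself is implemented: it assumes noweb-style chunks (a line of the form <<...>>= opens a chunk and a line containing only @ closes it) and replaces each chunk with whatever the chunk prints. The function name weave is an illustrative choice.

```python
import contextlib
import io
import re

# A chunk starts with a line "<<...>>=" and ends with a line "@".
CHUNK = re.compile(r"^<<[^\n]*>>=\n(.*?)^@\n", re.DOTALL | re.MULTILINE)


def weave(document, namespace=None):
    """Replace each code chunk in document with the text the chunk prints."""
    # All chunks in one document share a namespace, so later chunks
    # can use variables defined by earlier ones.
    env = {} if namespace is None else namespace

    def run(match):
        out = io.StringIO()
        with contextlib.redirect_stdout(out):
            exec(match.group(1), env)
        return out.getvalue()

    return CHUNK.sub(run, document)


doc = "The mean is\n<<>>=\nprint(sum([1, 2, 3]) / 3)\n@\nas computed.\n"
print(weave(doc))
```

The woven text contains the computed value in place of the code, so the number in the report cannot drift away from the code that produced it.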

Sweave is often launched from an interactive R session, but this can defeat the whole purpose of the tool. When you run Sweave this way, the Sweave document inherits the session’s state. Here’s why that’s a bad idea.

Say you’re interactively tinkering with some plots to make them look like you want. As you go, you’re copying R code into an Sweave file. When you’re done, you run Sweave on your file, compile the resulting LaTeX document, and get beautiful output. You congratulate yourself on having gone to the effort to put all your R code in an Sweave file so that it will be self-contained and reproducible. You forget about your project then revisit it six months later. You run Sweave and to your chagrin it doesn’t work. What happened? What might have happened is that your Sweave file depended on a variable that wasn’t defined in the file itself but happened to be defined in your R session. When you open up R months later and run Sweave, that variable may be missing. Or worse, you happen to have a variable in your session with the right name that now has some unrelated value.
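The same failure is easy to reproduce in miniature. The sketch below uses Python rather than R, with plain dictionaries standing in for interactive sessions; the variable names are invented for the example.

```python
# The "Sweave file": it uses `scale` but never defines it.
analysis = "result = [x * scale for x in (1, 2, 3)]"

# The interactive session where the code was developed happens to
# define `scale`, so the analysis runs.
session = {"scale": 10}
exec(analysis, session)
print(session["result"])

# Months later, in a clean session, the same code fails.
clean = {}
try:
    exec(analysis, clean)
except NameError as err:
    print("fails in a clean session:", err)

# Worse: a session with an unrelated `scale` runs without complaint
# and silently gives different numbers.
stale = {"scale": 0.5}
exec(analysis, stale)
print(stale["result"])
```

The clean-session failure is the lucky case, because at least it is loud. The stale-variable case produces a plausible-looking answer with no warning.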

I recommend always running Sweave from a batch file. On Windows you can save the following two lines to a file, say sw.bat, and process a file foo.Rnw with the command sw foo.

  R.exe -e "Sweave('%1.Rnw')"
  pdflatex.exe %1.tex

This assumes R.exe and pdflatex.exe are in your path. If they are not, you could either add them to your path or put their full paths in the batch file.

Running Sweave from a clean session does not ensure that your file is self-contained. There could still be other implicit dependencies. But running from a clean session improves the chances that someone else will be able to reproduce the results.
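For those not on Windows, the two-line batch file above could be approximated by a short Python script along the following lines. This is a sketch under assumptions: R and pdflatex are on the path, and the --vanilla flag is an addition not present in the batch file. It asks R not to restore a saved workspace or read startup files, which removes one more source of hidden state. The function names are illustrative.

```python
import subprocess


def sweave_commands(basename):
    """Return the two commands to run, as argument lists."""
    return [
        # --vanilla: do not restore a saved workspace or read startup files
        ["R", "--vanilla", "-e", f"Sweave('{basename}.Rnw')"],
        ["pdflatex", f"{basename}.tex"],
    ]


def build(basename):
    """Run Sweave and then pdflatex, stopping at the first failure."""
    for command in sweave_commands(basename):
        subprocess.run(command, check=True)


# build("foo") would process foo.Rnw and then compile foo.tex.
print(sweave_commands("foo"))
```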

See Troubleshooting Sweave for some suggestions for how to prevent or recover from other possible problems with Sweave.

Update: See the links provided by Gregor Gorjanc in the first comment below for related batch files and bash scripts.

{ 1 comment }

Automated software builds

by John on April 20, 2008

My first assignment as a professional programmer was to build another person’s program. I learned right away not to assume a project will build just because the author says it will. I’ve seen the same pattern repeated everywhere I’ve worked. Despite version control systems and procedures, there’s usually some detail in the developer’s head that doesn’t get codified and only the original developer can build the project easily.

The first step in making software builds reproducible is documentation. There’s got to be a document explaining how to extract the project from version control and build it. Requiring screen shots helps since developers have to rehearse their own instructions in order to produce the shots.

The second step is verification. Documentation needs to be tested, just like software. Someone who hasn’t worked on the project needs to extract the code onto a clean machine and build the project using only written instructions — no conversation with the developer allowed. Everyone thinks their code is easy to build; experience says most people are wrong.

The verifiers need to rotate. If one person serves as build master very long, they develop the same implicit knowledge that the original programmers failed to codify.

The third step is automation. Automated instructions are explicit and testable. If automation also saves time, so much the better, but automation is worthwhile even if it does not save time. Clift Norris and I just wrote an article on CodeProject entitled Automated Extract and Build from Team System using PowerShell that helps with this third step if you’re using Visual Studio and VSTS.
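As a language-neutral illustration of the verification and automation steps (separate from the PowerShell article), the sketch below runs a checkout command and a build command in a freshly created empty directory and reports whether both succeed. The function name and the stand-in commands are hypothetical; in practice the two commands would be whatever the written build instructions say to run. An empty directory is a much weaker test than a clean machine, since installed tools and environment variables are still inherited.

```python
import subprocess
import sys
import tempfile


def build_from_scratch(checkout_cmd, build_cmd):
    """Run the checkout and build commands in an empty temporary directory.

    Returns True only if both commands exit with status 0.
    """
    with tempfile.TemporaryDirectory() as workdir:
        for command in (checkout_cmd, build_cmd):
            result = subprocess.run(command, cwd=workdir)
            if result.returncode != 0:
                return False
    return True


# Stand-ins for real commands: the "checkout" creates a source file and
# the "build" succeeds only if that file is present.
checkout = [sys.executable, "-c", "open('main.src', 'w').write('source')"]
build = [
    sys.executable,
    "-c",
    "import os, sys; sys.exit(0 if os.path.exists('main.src') else 1)",
]
print(build_from_scratch(checkout, build))
```

Because the directory is created fresh on every run, a build that quietly depends on files left over from an earlier build will fail here rather than on a colleague's machine.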

{ 0 comments }

Programming the last mile

by John on January 29, 2008

In any programming project there comes a point where the programming ends and manual processes begin. That boundary is where problems occur, particularly for reproducibility.

Before you can build a software project, there are always things you need to know in addition to having all the source code. And usually at least one of those things isn’t documented. Statistical analyses are perhaps worse. Software projects typically yield their secrets after a moderate amount of trial and error; statistical analyses may remain inscrutable forever.

The solution to reproducibility problems is to automate more of the manual steps. It is becoming more common for programmers to realize the need for one-click builds. (See Pragmatic Project Automation for a good discussion of why and how to do this. Here’s a one-page summary of the book.) Progress is slower on the statistical side, but a few people have discovered the need for reproducible analysis.

It’s all a question of how much of a problem should be solved with code. Programming has to stop at some point, but we often stop too soon. We stop when it’s easier to do the remaining steps by hand, but we’re often short-sighted in our idea of “easier”. We mean easier for me to do by hand this time. We don’t think about someone else needing to do the task, or the need for someone (maybe ourselves) to do the task repeatedly. And we don’t think of the possible debugging/reverse-engineering effort in the future.

I’ve tried to come up with a name for the discipline of including more work in the programming portion of problem solving. “Extreme programming” has already been used for something else. Maybe “turnkey programming” would do; it doesn’t have much of a ring to it, but it sorta captures the idea.

{ 0 comments }

Literate programming and statistics

by John on January 15, 2008

Sweave, mentioned in my previous post, is a tool for literate programming. Donald Knuth invented literate programming and gives this description of the technique in his book of the same name:

I believe that the time is ripe for significantly better documentation of programs, and that we can best achieve this by considering programs to be works of literature. Hence, my title: “Literate Programming.”

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

The practitioner of literate programming can be regarded as an essayist, whose main concern is with exposition and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other.

Knuth says the quality of his code went up dramatically when he started using literate programming. When he published the source code for TeX as a literate program and a book, he was so confident in the quality of the code that he offered cash rewards for bug reports, doubling the amount of the reward with each edition. In one edition, he goes so far as to say “I believe that the final bug in TeX was discovered and removed on November 27, 1985.” Even though TeX is a large program, this was not an idle boast. A few errors were discovered after 1985, but only after generations of Stanford students studied the source code carefully and multitudes of users around the world put TeX through its paces.

While literate programming is a fantastic idea, it has failed to gain a substantial following. And yet Sweave might catch on even though literate programming in general has not.

In most software development, documentation is an afterthought. When push comes to shove, developers are rewarded for putting buttons on a screen, not for writing documentation. Software documentation can be extremely valuable, but it’s most valuable to someone other than the author. And the benefit of the documentation may only be realized years after it was written.

But statisticians are rewarded for writing documents. In a statistical analysis, the document is the deliverable. The benefits of literate programming for a statistician are more personal and more immediate. Statistical analyses are often re-run, with just enough time between runs for the previous work to be completely flushed from short-term memory. Data is corrected or augmented, papers come back from review with requests for changes, etc. Statisticians have more self-interest in making their work reproducible than do programmers.

Patrick McPhee gives this analysis for why literate programming has not caught on.

Without wanting to be elitist, the thing that will prevent literate programming from becoming a mainstream method is that it requires thought and discipline. The mainstream is established by people who want fast results while using roughly the same methods that everyone else seems to be using, and literate programming is never going to have that kind of appeal. This doesn’t take away from its usefulness as an approach.

But statisticians are more free to make individual technology choices than programmers are. Programmers typically work in large teams and have to use the same tools as their colleagues. Statisticians often work alone. And since they deliver documents rather than code, statisticians are free to use Sweave without their colleagues’ knowledge or consent. I doubt whether a large portion of statisticians will ever be attracted to literate programming, but technological minorities can thrive more easily in statistics than in mainstream software development.

{ 0 comments }