In any programming project there comes a point where the programming ends and manual processes begin. That boundary is where problems occur, particularly for reproducibility.
Before you can build a software project, there are always things you need to know in addition to having all the source code. And usually at least one of those things isn’t documented. Statistical analyses are perhaps worse. Software projects typically yield their secrets after a moderate amount of trial and error; statistical analyses may remain inscrutable forever.
The solution to reproducibility problems is to automate more of the manual steps. It is becoming more common for programmers to realize the need for one-click builds. (See Pragmatic Project Automation for a good discussion of why and how to do this. Here’s a one-page summary of the book.) Progress is slower on the statistical side, but a few people have discovered the need for reproducible analysis.
It’s all a question of how much of a problem should be solved with code. Programming has to stop at some point, but we often stop too soon. We stop when it’s easier to do the remaining steps by hand, but we’re often short-sighted in our idea of “easier”. We mean “easier for me to do by hand this time”. We don’t think about someone else needing to do the task, or about someone (maybe ourselves) needing to do it repeatedly. And we don’t think about the debugging and reverse-engineering effort that may be needed in the future.
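To make the idea concrete, here is a minimal Python sketch of what pushing the programming boundary outward might look like for a small analysis. Everything in it is hypothetical: the step names, the made-up data, the seed, and the report file are assumptions for illustration, not taken from any particular project. The point is only that the steps one might otherwise do by hand (fetching data, cleaning it, summarizing it, writing up the numbers) sit behind a single entry point.

```python
import random
import statistics
import tempfile
from pathlib import Path

SEED = 20240101  # assumed fixed seed so that reruns produce identical numbers


def make_raw_data(n=100):
    """Stand-in for fetching data; a real project would read a file or database."""
    rng = random.Random(SEED)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


def clean(values):
    """A step often done by hand in a spreadsheet: drop out-of-range values."""
    return [v for v in values if 0.0 <= v <= 20.0]


def summarize(values):
    """Compute the numbers that would otherwise be copied into a report manually."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def write_report(summary, path):
    """Write the summary to disk so the report itself is a build product."""
    lines = [f"{key}: {value}" for key, value in summary.items()]
    path.write_text("\n".join(lines) + "\n")


def run_all(output_dir):
    """The single entry point: every step from raw data to report, in order."""
    report = Path(output_dir) / "report.txt"
    write_report(summarize(clean(make_raw_data())), report)
    return report


if __name__ == "__main__":
    print(run_all(tempfile.mkdtemp()).read_text())
```

Running the script twice should give the same report, because nothing depends on a step that lives only in someone’s memory. A real analysis would have more steps and real data, but the shape is the same: one command, no hand edits in between.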
I’ve tried to come up with a name for the discipline of including more work in the programming portion of problem solving. “Extreme programming” has already been used for something else. Maybe “turnkey programming” would do; it doesn’t have much of a ring to it, but it sorta captures the idea.