New spin on the cathedral and the bazaar

Eric Raymond’s famous essay The Cathedral and the Bazaar compares commercial software projects to cathedrals and open source software projects to bazaars. Cathedrals are carefully planned. Bazaars are not. The organizational structure of a bazaar emerges without deliberate coordination among its participants. The open source community has embraced the metaphor of the bazaar and the informality and spontaneity it implies.

Shmork wrote the following observation in the comments to a Coding Horror post yesterday that discussed the difficulties of using Linux software.

Almost nobody in the Western world shops at real-life bazaars either, because they are dodgy, unsafe, and unregulated. And in the Western world, we like things to be reliable, working, safe. So cathedral it is. Even our flea markets aren’t bazaars, really, they’re just knock-off cathedrals.

Good, fast, or cheap: Can you really pick two?

There’s a saying that clients can have good, fast, or cheap. Pick two, but then the third will be whatever it has to be based on the other two choices. You can have good and fast if you’re willing to spend a lot of money. You can have fast and cheap, but the quality will be poor. You might even be able to get good and cheap, if you’re willing to wait a long time.

A variation on this theme is the iron triangle. You draw a triangle with vertices labeled “features”, “time” and “resources.” If you make two of the sides longer, the third has to become longer too. Here goodness is defined as a feature set rather than quality, but the same principle applies.

There’s a problem with this line of reasoning: no matter what clients say, they want quality. They may say they want fast and cheap, and if you tell them you’ll sacrifice quality to deliver fast and cheap, you’ll be a hero — until you deliver. Then they want quality. As Howard Newton put it:

People forget how fast you did a job, but they remember how well you did it.

Sometimes you can cut features as long as you do a good job on the features that remain, but only to a point. Clients are not going to be happy unless you meet their expectations, even if those expectations are explicitly contradicted in a contract. You can tell a client you’ll cut out frills to give them something fast and cheap, and they’ll gladly agree. But they still want their frills, or they will want them. The client may be silently disappointed. Or they may be vocally disappointed, demanding excluded features for free and complaining about your work. Eventually you learn what features to insist on including, even if a client says they can live without them.

Proofs of false statements

Mark Dominus brought up an interesting question last month: have there been major screw-ups in mathematics? He defines a “major screw-up” to be a flawed proof of an incorrect statement that was accepted for a significant period of time. He excludes the case of incorrect proofs of statements that were nevertheless true.

It’s remarkable that he can even ask the question. Can you imagine someone asking with a straight face whether there have ever been major screw-ups in, say, software development? And yet it takes some hard thought to come up with examples of really big blunders in math.

No doubt there are plenty of flawed proofs of false statements in areas too obscure for anyone to care about. But in mainstream areas of math, blunders are usually uncovered very quickly. And there are examples of theorems that were essentially correct but neglected some edge case. Proofs of statements that are just plain wrong are hard to think of. But Mark Dominus came up with a few.

Yesterday he gave an example of a statement by Kurt Gödel that was flat-out wrong but accepted for over 30 years. Warning: reader discretion advised. His post is not suitable for those who get queasy at the sight of symbolic logic.

Paper doesn’t abort

My daughter asked me recently what I thought about a Rube Goldberg machine she sketched for a school project. I immediately thought about how difficult it would be to implement parts of her design. I asked her if she really had to build it or just had to sketch it. When she said she didn’t really have to build it, I told her it was great.

I thought of a friend’s comment about designing code versus actually writing code: “I’ve never seen a piece of paper abort.” The hard part about writing software is that people can tell when you’re wrong.

Programming the last mile

In any programming project there comes a point where the programming ends and manual processes begin. That boundary is where problems occur, particularly for reproducibility.

Before you can build a software project, there are always things you need to know in addition to having all the source code. And usually at least one of those things isn’t documented. Statistical analyses are perhaps worse. Software projects typically yield their secrets after a moderate amount of trial and error; statistical analyses may remain inscrutable forever.

The solution to reproducibility problems is to automate more of the manual steps. It is becoming more common for programmers to realize the need for one-click builds. (See Pragmatic Project Automation for a good discussion of why and how to do this. Here’s a one-page summary of the book.) Progress is slower on the statistical side, but a few people have discovered the need for reproducible analysis.

It’s all a question of how much of a problem should be solved with code. Programming has to stop at some point, but we often stop too soon. We stop when it’s easier to do the remaining steps by hand, but we’re often short-sighted in our idea of “easier”. We mean easier for me to do by hand this time. We don’t think about someone else needing to do the task, or the need for someone (maybe ourselves) to do the task repeatedly. And we don’t think of the possible debugging/reverse-engineering effort in the future.
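To make the idea concrete, here is a minimal sketch of what a one-click build script might look like in Python. The steps and make targets are hypothetical, not taken from any project mentioned above; the point is simply that every manual step, however trivial it feels, gets captured in code where the next person (or the next run) can find it.

```python
# A sketch of a "one-click build": each formerly-manual step lives
# in a list, runs in order, and stops the build loudly on failure.
# The make targets here are hypothetical placeholders.
import subprocess
import sys

STEPS = [
    ["make", "clean"],    # hypothetical: wipe old artifacts
    ["make", "all"],      # hypothetical: compile everything
    ["make", "test"],     # hypothetical: run the test suite
    ["make", "package"],  # hypothetical: produce the deliverable
]

def build() -> int:
    """Run every step; return 0 on success or the failing step's code."""
    for step in STEPS:
        print("running:", " ".join(step))
        result = subprocess.run(step)
        if result.returncode != 0:
            print("build failed at:", " ".join(step), file=sys.stderr)
            return result.returncode
    print("build succeeded")
    return 0

if __name__ == "__main__":
    sys.exit(build())
```

Even a script this small pays off the second time anyone has to build the project, and it doubles as documentation of the steps that would otherwise live only in someone’s head.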

I’ve tried to come up with a name for the discipline of including more work in the programming portion of problem solving. “Extreme programming” has already been used for something else. Maybe “turnkey programming” would do; it doesn’t have much of a ring to it, but it sorta captures the idea.

Empirical support for TDD

Phil Haack gives his summary of a recent study on the benefits of test-driven development (TDD). The study had two groups of students write unit tests for their programming assignments. Students assigned to the test-first group were instructed to write their unit tests before writing their production code, as required by TDD. Students assigned to the test-last group were told to write their tests after writing their production code. Students in the test-first group wrote higher quality code.

The study concluded that code quality was correlated with the number of unit tests, independent of whether the tests were written first or last. However, the test-first students wrote more tests in the same amount of time.
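For readers who haven’t seen the mechanics, the only difference between the two groups is the order of two steps: in test-first development the unit test exists, and fails, before the production code does. A minimal sketch in Python, with a function name of my own invention rather than anything from the study:

```python
# Test-first in miniature. In TDD order, the test class below is
# written first; running it against a not-yet-written median() fails,
# and the production code is then written to make it pass.
import unittest

def median(xs):
    """Production code, written only after the tests below existed."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

class TestMedian(unittest.TestCase):
    # In test-first order, this class predates median() itself.
    def test_odd_length(self):
        self.assertEqual(median([3, 1, 2]), 2)

    def test_even_length(self):
        self.assertEqual(median([4, 1, 3, 2]), 2.5)
```

Run with `python -m unittest` to see the red/green cycle. In the test-last group the same file would simply be written bottom-up, tests after implementation.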

Note that students were assigned to test-first or test-last. Most programming studies are just surveys, and survey results are always questionable because professional programmers choose their own tools. So, for example, you cannot conclude from a survey that technique X makes programmers more productive than technique Y. The survey may say more about the programmers who chose each technique than about the techniques themselves.

Programs and Proofs

Edsger Dijkstra quipped that software testing can only prove the existence of bugs, not the absence of bugs. His research focused on formal techniques for proving the correctness of software, with the implicit assumption that proofs are infallible. But proofs are written by humans, just as software is, and are also subject to error. Donald Knuth had this in mind when he said “Beware of bugs in the above code; I have only proved it correct, not tried it.” The way to make progress is to shift from thinking about the possibility of error to thinking about the probability of error.

Testing software cannot prove the impossibility of bugs, but it can increase your confidence that there are no bugs, or at least lower your estimate of the probability of running into a bug. And while proofs can contain errors, they’re generally less error-prone than source code. (See a recent discussion by Mark Dominus about how reliable proofs have been.) At any rate, people tend to make different kinds of errors when proving theorems than when writing software. If software passes tests and has a formal proof of correctness, it’s more likely to be correct. And if theoretical results are accompanied by numerical demonstrations, they’re more believable.

Leslie Lamport wrote an article entitled How to Write a Proof in which he addresses the problem of errors in proofs and recommends a pattern of writing proofs that increases the probability of a proof being valid. Interestingly, his proofs resemble programs. And while Lamport urges people to make proofs more like programs, the literate programming folks urge us to write programs that are more like prose. The two camps advocate complementary modes of validation: adding machine-like rigor to prosaic proofs, and adding prosaic explanations to machine instructions.
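To give a flavor of the structured style, here is a sketch in Lamport’s hierarchical step numbering, written in plain LaTeX with no special macros assumed. It is my own illustration of the pattern, not an excerpt from his article: every step gets an explicit label and its own proof, which makes each inference individually checkable.

```latex
% Sketch of a structured proof in Lamport's hierarchical style.
% Steps are labeled <level>number and each carries its own proof.
\noindent\textbf{Theorem.} $\sqrt{2}$ is irrational.

\medskip\noindent
$\langle 1\rangle 1.$ \textsc{Assume:} $\sqrt{2} = p/q$ with $p, q$ coprime integers.\\
$\langle 1\rangle 2.$ $p^2 = 2q^2$.\\
\hspace*{1em}\textsc{Proof:} square the assumption in $\langle 1\rangle 1$
  and clear denominators.\\
$\langle 1\rangle 3.$ $p$ is even.\\
\hspace*{1em}\textsc{Proof:} by $\langle 1\rangle 2$, $p^2$ is even,
  and the square of an odd number is odd.\\
$\langle 1\rangle 4.$ $q$ is even.\\
\hspace*{1em}\textsc{Proof:} write $p = 2r$; then $\langle 1\rangle 2$
  gives $q^2 = 2r^2$, so $q$ is even by the argument of $\langle 1\rangle 3$.\\
$\langle 1\rangle 5.$ \textsc{Q.E.D.}\\
\hspace*{1em}\textsc{Proof:} $\langle 1\rangle 3$ and $\langle 1\rangle 4$
  contradict the coprimality assumed in $\langle 1\rangle 1$.
```

The numbering is what makes the proof program-like: a reader (or a machine) can verify step $\langle 1\rangle 4$ in isolation, checking only the steps it explicitly cites.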

Related: Formal validation methods