Edsger Dijkstra quipped that software testing can only prove the presence of bugs, never their absence. His research focused on formal techniques for proving the correctness of software, with the implicit assumption that proofs are infallible. But proofs are written by humans, just as software is, and are also subject to error. Donald Knuth had this in mind when he said “Beware of bugs in the above code; I have only proved it correct, not tried it.” The way to make progress is to shift from thinking about the **possibility** of error to thinking about the **probability** of error.

Testing software cannot prove the impossibility of bugs, but it can increase your confidence that there are no bugs, or at least lower your estimate of the probability of running into a bug. And while proofs can contain errors, they’re generally less error-prone than source code. (See a recent discussion by Mark Dominus about how reliable proofs have been.) At any rate, people tend to make different kinds of errors when proving theorems than when writing software. If software passes tests and has a formal proof of correctness, it’s more likely to be correct. And if theoretical results are accompanied by numerical demonstrations, they’re more believable.
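The complementary-checks argument can be made quantitative with a back-of-the-envelope calculation. The numbers below are invented purely for illustration, and treating the two checks as independent is an assumption; in practice tests and proofs can share blind spots.

```python
# Illustrative, made-up probabilities that each check misses a given bug.
p_tests_miss = 0.10   # assumed chance the test suite fails to catch the bug
p_proof_miss = 0.05   # assumed chance the proof contains an error hiding the bug

# If the checks fail independently, the chance that BOTH miss the bug
# is the product of the individual miss probabilities.
p_both_miss = p_tests_miss * p_proof_miss

print(round(p_both_miss, 6))  # much smaller than either check alone
```

Even with generous miss rates, combining two different kinds of validation drives the residual probability of an undetected bug well below what either achieves by itself, which is the sense in which tested-and-proved software is “more likely to be correct.”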

Leslie Lamport wrote an article entitled How to Write a Proof in which he addresses the problem of errors in proofs and recommends a style of writing proofs that increases the probability of a proof being valid. Interestingly, his proofs resemble programs. And while Lamport is urging people to make proofs more like programs, the literate programming folks are urging us to write programs that are more like prose. Both are advocating **complementary modes of validation**: adding machine-like structure to prosaic proofs, and adding prosaic explanations to machine instructions.
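Lamport's hierarchically numbered proof style isn't reproduced here, but the direction he is pushing, proofs structured enough to be checked step by step, is taken to its logical end in proof assistants, where a proof literally is a program that a machine verifies. A minimal sketch in Lean 4, using the standard-library lemma `Nat.add_comm` (the theorem name here is my own):

```lean
-- A machine-checked proof: the checker rejects the file if any step is wrong,
-- much as a compiler rejects a program that doesn't type-check.
theorem add_commutes (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The point is not this trivial theorem but the workflow: like Lamport's structured proofs, every step is explicit and mechanically verifiable rather than left to the reader's goodwill.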

**Related**: Formal validation methods

Keith Devlin recently published a post on what a proof is. I have to say I have a lot more sympathy with his viewpoint than with Lamport's.

We test until we run out of money. Then we hand testing off to the users and their organizations. When they stop paying their maintenance fee or subscription, they've found enough bugs. In the meantime, we live another day and seed a few more bugs for testing.

The question of bugs isn’t if, but when.