Software sins of omission

The Book of Common Prayer contains the confession

… we have left undone those things which we ought to have done, and we have done those things which we ought not to have done.

The things left undone are called sins of omission; things which ought not to have been done are called sins of commission.

In software testing and debugging, we focus on sins of commission, code that was implemented incorrectly. But according to Robert Glass, the majority of bugs are sins of omission. In Frequently Forgotten Fundamental Facts about Software Engineering, Glass says

Roughly 35 percent of software defects emerge from missing logic paths, and another 40 percent are from the execution of a unique combination of logic paths.

If these figures are correct, three out of four software bugs are sins of omission, errors due to things left undone. These are bugs due to contingencies the developers did not think to handle. Three quarters seems like a large proportion, but it is plausible. I know I’ve written plenty of bugs that amounted to not considering enough possibilities, particularly in graphical user interface software. It’s hard to think of everything a user might do and all the ways a user might arrive at a particular place. (When I first wrote user interface applications, my reaction to a bug report would be “Why would anyone do that?!” If everyone would just use my software the way I do, everything would be OK.)

It matters whether bugs are sins of omission or sins of commission. Different kinds of bugs are caught by different means. Developers have come to appreciate the value of unit testing lately, but unit tests primarily catch sins of commission. If you didn’t think to program something in the first place, you’re not likely to think to write a test for it. Complete test coverage could find only 25% of a project’s bugs if you assume 75% of the bugs come from code that no one thought to write.
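Here is a small sketch of how full coverage can coexist with a sin of omission. The function parse_age and the sample inputs are hypothetical, invented for this illustration rather than taken from Glass. The single test executes every line of the function, yet says nothing about the inputs nobody thought to handle.

```python
def parse_age(text):
    """Convert user input to an age in whole years (hypothetical example)."""
    return int(text)


def test_parse_age():
    # This one test runs every line of parse_age, so line coverage is 100%.
    assert parse_age("42") == 42


def omitted_cases():
    """Try inputs the author never considered and record what happens."""
    results = {}
    for text in ["", "forty-two", "-5"]:
        try:
            results[text] = parse_age(text)
        except ValueError:
            # No code path was written for non-numeric input.
            results[text] = "ValueError"
    return results


test_parse_age()
print(omitted_cases())
```

The empty string and the spelled-out number raise an exception, and the negative age is silently accepted. None of these are mistakes in the code that was written; they are missing logic paths, which a reviewer with a fresh pair of eyes is more likely to notice than a coverage report.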

The best way to spot sins of omission is a fresh pair of eyes. As Glass says

Rigorous reviews are more effective, and more cost effective, than any other error-removal strategy, including testing. But they cannot and should not replace testing.

One way to combine the benefits of unit testing and code reviews would be to have different people write the unit tests and the production code.

Abstractions are never perfect

Better to have a simple system than a complex system with a simple abstraction on top.

Abstractions are never perfect. Every new layer creates failure points, interoperability hassles, and scalability problems. New tools can hide complexity, but they can’t justify it … The more complex the system, the more difficult it is to fix when something goes wrong.

From the preface to RESTful Web Services.

Camtasia as a software deployment tool

Last week .NET Rocks mentioned a good idea in passing: start a screencast tool like Camtasia before you do a software install. Michael Learned told the story of a client that asked him to take screen shots of every step in the installation of Microsoft’s Team Foundation Server. Carl Franklin commented “What a great idea to throw Camtasia on there and record the whole process.”

It would be better if the installation process were scripted and not just recorded, but sometimes that’s not practical. Sometimes clicking a few buttons is absolutely necessary or at least far easier than writing a script. And even if you think your entire process is automated with a script, a screencast might be a good idea. It could record little steps you have to do in order to run your script, details that are easily forgotten.

Another way to use this idea would be to have one person do a practice install on a test server while recording the process. Then another person could document and script the process by studying the video. This would be helpful when the person who knows how to do the installation lacks either the verbal skills to explain the process or the scripting skills to automate it.

Better tools, less productivity?

Can better tools make you less productive? Here’s a quote from Frequently Forgotten Fundamental Facts about Software Engineering by Robert Glass:

Most software tool and technique improvements account for about a 5- to 30-percent increase in productivity and quality. … Learning a new tool or technique actually lowers programmer productivity and product quality initially. You achieve the eventual benefit only after overcoming this learning curve.

If you’re always learning new tools, you may be less productive than if you stuck with your old tools a little longer, even if the new tools really are better. And especially if you’re a part-time developer, you may never reach the point where a new tool pays for itself before you throw it away and pick up a new one. Kathleen Dollard wrote an editorial to this effect in 2004 entitled Save The Hobbyist Programmer.

Miners know they have a significant problem when the canary they keep with them stops singing. Hobbyist/part-time programmers are our industry’s version of the canary, and they have stopped singing. People who program four to eight hours a week are being cut out of the picture because they can’t increase their skills as fast as technology changes. That’s a danger signal for the rest of us.

So what do you do? Learn quickly or change slowly. The first option is to commit to learning a new tool quickly, invest heavily in up-front training, and use the tool as much as you can before the next one comes along. This is the favored option for ambitious programmers who want to maximize their marketability by always using the latest tools.

The second option is to develop a leapfrog strategy, letting some new things pass you by. The less time you spend per week programming, the less often you should change tools. Change occasionally, yes, but wait for big improvements.

Calendars, Connections, and Cats

James Burke had a television series Connections in which he would create a connection between two very different things. For example, in one episode he starts with the discovery of the touchstone for testing precious metals and tells a winding tale of how the touchstone led centuries later to the development of nuclear weapons.

I had a Connections-like moment when a calendar led to some physics, which then led to Andrew Lloyd Webber’s musical Cats.

A few days ago I stumbled on Ron Doerfler’s graphical computing calendar and commented on the calendar here. When I discovered Ron Doerfler’s blog, I bookmarked his article on Oliver Heaviside to read later. (Heaviside was a pioneer in what was later called distribution theory, a way of justifying such mathematical mischief as differentiating non-differentiable functions.) As I was reading the article on Heaviside, I came to this line:

At one time the ionosphere was called the Heaviside layer …

Immediately the lyrics “Up, up, up to the Heaviside layer …” started going through my head. These words come from the song “The Journey to the Heaviside Layer” from Cats. I had never thought about “Heaviside” in that song as being related to Mr. Heaviside. I’ve never seen the lyrics in print, so I thought the words were “heavy side” and didn’t stop to think what they meant.

Andrew Lloyd Webber based Cats on Old Possum’s Book of Practical Cats by T. S. Eliot. The song “The Journey to the Heaviside Layer” in particular is based on the poem Old Deuteronomy from Eliot’s book. Webber used the Heaviside layer as a symbol for heaven, based on an allusion in one of T. S. Eliot’s letters. The symbolism is obvious in the musical, but I hadn’t thought about “Heaviside layer” as meaning “the heavens” (i.e. the upper atmosphere) as well as heaven in the theological sense.

How the central limit theorem began

The Central Limit Theorem says that if you average enough independent copies of a random variable, the result has a nearly normal (Gaussian) distribution. Of course that’s a very rough statement of the theorem. What are the precise requirements of the theorem? That question took two centuries to resolve. You can see the final answer here.

The first version of the Central Limit Theorem appeared in 1733, but necessary and sufficient conditions weren’t known until 1935. I won’t recap the entire history here. I just want to comment briefly on how the Central Limit Theorem began and how different the historical order of events was from the typical order of presentation.

A typical probability course might proceed as follows.

  1. Define the normal distribution.
  2. State and prove a special case of the Central Limit Theorem.
  3. Present the normal approximation to the binomial as a corollary.

This is the opposite of the historical order of events.

Abraham de Moivre discovered he could approximate binomial distribution probabilities using the integral of exp(-x²) and proved an early version of the Central Limit Theorem in 1733. At the time, there was no name given to his integral. Only later did anyone think of exp(-x²) as the density of a probability distribution. De Moivre certainly didn’t use the term “Gaussian” since Gauss was born 44 years after de Moivre’s initial discovery. De Moivre also didn’t call his result the “Central Limit Theorem.” George Pólya gave the theorem that name in 1920 as it was approaching its final form.
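To make de Moivre’s observation concrete, here is a minimal sketch comparing exact binomial probabilities with the bell-curve approximation. It uses the modern parameterization of the normal density (mean np and variance np(1-p)), not de Moivre’s original notation, and the function names and the choice of n = 100, p = 0.5 are assumptions made for illustration.

```python
import math


def binomial_pmf(n, k, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)


def normal_approx(n, k, p):
    """Normal density with the binomial's mean and variance, evaluated at k."""
    mean = n * p
    var = n * p * (1 - p)
    return math.exp(-(k - mean)**2 / (2 * var)) / math.sqrt(2 * math.pi * var)


n, p = 100, 0.5
for k in (40, 45, 50, 55, 60):
    exact = binomial_pmf(n, k, p)
    approx = normal_approx(n, k, p)
    print(f"k={k}: exact={exact:.5f}, approx={approx:.5f}")
```

For these values the two columns differ by well under 0.001, which is the kind of agreement that made the approximation useful long before anyone called the curve a probability density.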

For more details, see The Life and Times of the Central Limit Theorem.

The Life and Times of the Central Limit Theorem by William Adams

Related links:

For daily posts on probability, follow @ProbFact on Twitter.

Regular expressions in Mathematica

Regular expressions are fairly portable. There are two main flavors of regular expressions — POSIX and Perl — and more languages these days use the Perl flavor. There are some minor differences in what it means to be “like Perl” but for the most part languages that say they follow Perl’s lead specify regular expressions the same way. The differences lie in how you use regular expressions: how you form matches, how you replace strings, etc.

Mathematica uses Perl’s regular expression flavor. But how do you use regular expressions in Mathematica? I’ll give a few tips here and give more details in the notes Regular expressions in Mathematica.

First of all, unlike Perl, Mathematica specifies regular expressions with ordinary strings. This means that metacharacters have to be doubly escaped. For example, to represent the regular expression \d{4} you must use the string "\\d{4}".

The function StringCases returns a list of all matches of a regular expression in a string. If you simply want to know whether there was a match, you can use the function StringFreeQ. However, note that you probably want the opposite of the return value from StringFreeQ because it returns whether a string does not contain a match.

By default, the function StringReplace replaces all matches of a regular expression with a given replacement pattern. You can limit the number of replacements it makes by specifying an additional argument.
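For comparison, here is a rough sketch of the same three tasks in Python, whose re module also follows the Perl flavor. This is an analogy rather than Mathematica code: the sample text and variable names are made up, and the Python functions only approximate the roles of StringCases, StringFreeQ, and StringReplace described above.

```python
import re

text = "Founded 1733, completed 1935, named 1920."

# Written as an ordinary string, the backslash must be doubled,
# just as in the Mathematica string. It is the same pattern as r"\d{4}".
pattern = "\\d{4}"

# List every match (roughly the role of StringCases).
all_matches = re.findall(pattern, text)

# True when a match exists, the opposite sense of StringFreeQ.
has_match = re.search(pattern, text) is not None

# Replace every match by default, or cap the number with an extra argument.
replace_all = re.sub(pattern, "YEAR", text)
replace_first = re.sub(pattern, "YEAR", text, count=1)

print(all_matches)
print(has_match)
print(replace_all)
print(replace_first)
```

The pattern itself carries over unchanged between the two languages; only the surrounding function calls differ, which is the sense in which regular expressions are fairly portable.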

Related links:

For daily tips on regular expressions, follow @RegexTip on Twitter.

2010 calendar of lost mathematical art

Rod Carvalho wrote a post this morning announcing a beautiful 2010 calendar created by Ron Doerfler. Doerfler’s blog is entitled Dead Reckonings: Lost Art in the Mathematical Sciences. The calendar is an example of such lost art. It is illustrated with nomograms, ingenious ways of computing with graphs before electronic calculators were common. The illustrations are pleasant to look at even if you have no idea what they mean.

Image via Ron Doerfler.

Related posts:

Spherical trig is a lost art. Why care about spherical trig?

The Gudermannian function gd(x) is another interesting relic of an earlier time. It is closely related to the Mercator projection and shows how to relate ordinary and hyperbolic trig functions without using complex numbers.

The image above shows solutions to the equation u + v + w = uvw. Here’s a post explaining the significance of that equation.