Maybe NASA could use some buggy software

In Coders at Work, Peter Norvig quotes NASA administrator Dan Goldin as saying:

We’ve got to do the better, faster, cheaper. These space missions cost too much. It’d be better to run more missions and some of them would fail but overall we’d still get more done for the same amount of money.

NASA has extremely rigorous processes for writing software. They supposedly develop bug-free code; I doubt that’s true, though I’m sure they do have exceptionally low bug rates. But this quality comes at a high price. Rumor has it that space shuttle software costs $1,500 per line to develop. When asked about the price tag, Norvig said “I don’t know if it’s optimal. I think they might be better off with buggy software.” At some point it’s certainly not optimal. If it doubles the price of a project to increase your probability of a successful mission from 98% to 99%, it’s not worth it; you’re better off running two missions with a 98% chance of success each.
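To make that arithmetic concrete, here is a minimal sketch in Python. The budget units, the function names, and the assumption that missions succeed or fail independently are all mine, chosen for illustration; only the 98% and 99% figures and the doubling of cost come from the argument above.

```python
def expected_successes(budget, cost_per_mission, p_success):
    """Expected number of successful missions if the whole budget is spent."""
    missions = budget // cost_per_mission  # whole missions the budget can buy
    return missions * p_success


def prob_at_least_one_success(missions, p_success):
    """Chance that at least one mission succeeds, assuming independent missions."""
    return 1 - (1 - p_success) ** missions


# Hypothetical units: 1 unit buys one mission built to the 98% standard,
# and raising the standard to 99% doubles the cost to 2 units.
BUDGET = 2

cheap = expected_successes(BUDGET, cost_per_mission=1, p_success=0.98)
careful = expected_successes(BUDGET, cost_per_mission=2, p_success=0.99)

print(f"two 98% missions: {cheap:.2f} expected successes")
print(f"one 99% mission:  {careful:.2f} expected successes")
print(f"at least one of two 98% missions succeeds: "
      f"{prob_at_least_one_success(2, 0.98):.4f}")
```

Under these assumptions the same money buys roughly 1.96 expected successes instead of 0.99, and the chance of getting at least one success rises to about 99.96%. The sketch counts only successes per dollar; it deliberately ignores how costly a failure is.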

Few people understand that software quality is all about probabilities of errors. Most people think the question is whether you’d rather have bug-free software or buggy software. I’d rather have bug-free software, thank you. But bug-free is not an option. Nothing humans do is perfect. All we can do is lower the probabilities of bugs. But as the probability of bugs goes to zero, the development costs go to infinity. (Actually it’s not all about probabilities of errors. It’s also about the consequences of errors. Sending back a photo with a few incorrect pixels is not the same as crashing a probe.)

Norvig’s comment makes sense regarding unmanned missions. But what about manned missions? Since one of the possible consequences of error is loss of life, the stakes are obviously higher. But does demanding flawless software increase the probability of a safe mission? One of the consequences of demanding extremely high quality software is that some tasks are too expensive to automate and so humans have to be trained to do those tasks. But astronauts make mistakes just as programmers do. If software has a smaller probability of error than an astronaut would have for a given task, it would be safer to rely on the software.

Related post: Software in space

13 thoughts on “Maybe NASA could use some buggy software”

  1. I interviewed once (didn’t get the job) at a company that did firmware for aeronautics.

    There are legal reasons why one can’t just program normally as suggested above.

  2. Nice post.

    I’ve always thought we software people are too focused on technical aspects and forget how important economics is.

  3. Jason: There are political reasons as well, as Norvig points out after the quote above. If you’re just looking at successful missions per dollar, you might lower your quality standards. But politics isn’t measured in missions per dollar. Politically, the cost of a failed mission is much greater than the benefit of a successful mission.

  4. Yes, I can see political reasons playing a huge role. Do you think the government would still inject funding if more and more missions failed?

  5. Whenever humans and machines cooperate, the human should always be the weakest link. What that means is that the machine has to be far more reliable than the human. And it’s up to us as programmers to make that happen. Sure, programmers make mistakes too. But we have the advantage that we can simulate our programs forwards, backwards, up, down and through a model checker. We can’t do that with human beings. Not yet, and hopefully never.

  6. If it doubles the cost to go from 98% to 99%, but if that failure happens, the cost could be in the tens of billions of dollars (and human lives), then yes, it’s probably worth it.

  7. Would you like it if the electronically controlled brake system on a train failed at the wrong time? Some software must be “perfect” – where human lives are at risk, this is especially true. Guess what, if you’re the responsible party for software that maims or kills someone, you’re liable, just like if an Engineering firm builds a structure that fails.

    On the point of unmanned missions. In software, as with many other places, as soon as you save money, it’s not viewed as a surplus – it’s viewed as a place to cut the budget. So a mission that would cost $500M is done for $375M because of slightly lower quality code (they outsourced to MIT), that other $125M pretty much disappears, you didn’t spend it, you don’t keep it. And anyway, I honestly don’t think that $250M of $500M goes to the software on these missions, I don’t even think software contributes 25% of the cost of any of these missions.

    A great article on the way NASA writes software:

  8. Andrew, good points. I don’t disagree that perfect software is preferable to buggy software. But sometimes the realistic alternative to buggy software is no software, and sometimes buggy software is better than no software.

    Your point about budget surplus is right on. The savings evaporates, so you might as well spend all you can get.

  9. “as the probability of bugs goes to zero, the development costs go to infinity. ”

    If this is true, then you will be able to do one of the following:
    – beggar Knuth
    – establish your reputation as the man who found scads of bugs where Knuth found none.

  10. The space shuttle software certainly doesn’t sound bug free to me! The routines that control the physical systems may be bug free, but the system as a whole certainly sounds like it isn’t. When you have a UI that tells you things are off when they are on, but it is too cost prohibitive to change it, so you have 200 pounds of errata and manuals, that does not sound like a resounding software engineering feat to me.

  11. “It’d be better to run more missions and some of them would fail but overall we’d still get more done…”

    Like the Challenger?

    Why not be an astronaut in a capitalist society bent on maximizing profits?

  12. I agree that NASA has political reasons to be careful. First, what kind of PR nightmare is it every time something goes wrong? It takes a year or more to get permission to do missions again. So it isn’t just the one mission failing; it is a year of doing nothing while you investigate that you save.

    Next up: lives are at stake. I’m not sure how much in dollars per line of code terms but I suspect code for ABS systems in cars is very very expensive too. There is a different standard of care for things where lives are at stake. I wonder how much of the $1500 per line is real too. I suspect they had grad students in university coding away for a 4 year PhD program at ~25-35k a year and then quote the value of their time at 100k per year.

    Added to the astronauts’ lives is the risk of crashing into something. Say a piece of the shuttle knocks out the Eiffel Tower. You’d have a lot of pissed French people to explain to why you didn’t test your code properly.
