Maintenance costs

Posted on 31 March 2010 by John

No engineered structure is designed to be built and then neglected or ignored. — Henry Petroski

The quote above comes from Henry Petroski’s recent interview on Tech Nation. In the same interview, Petroski says that a common rule of thumb is that maintenance costs about 4% of construction cost per year. For a structure as old as the Golden Gate Bridge (completed in 1937), for example, that’s a lot of 4%’s.

Golden Gate Bridge

Painting the bridge has cost far more than building it. The bridge is painted continuously: as soon as the painters reach the end of the bridge, they turn around and start over. The engineers who designed the bridge knew this would happen. When you build something out of steel and put it outside, it will need to be painted. It was all part of the design.

Image credit: Wikipedia

Estimating the chances of something that hasn’t happened yet

Posted on 30 March 2010 by John

Suppose you’re proofreading a book. If you’ve read 20 pages and found 7 typos, you might reasonably estimate that the chances of a page having a typo are 7/20. But what if you’ve read 20 pages and found no typos. Are you willing to conclude that the chances of a page having a typo are 0/20, i.e. the book has absolutely no typos?

To take another example, suppose you are testing children for perfect pitch. You’ve tested 100 children so far and haven’t found any with perfect pitch. Do you conclude that children don’t have perfect pitch? You know that some do because you’ve heard of instances before. Your data suggest perfect pitch in children is at least rare. But how rare?

The rule of three gives a quick and dirty way to estimate these kinds of probabilities. It says that if you’ve tested N cases and haven’t found what you’re looking for, a reasonable estimate is that the probability is less than 3/N. So in our proofreading example, if you haven’t found any typos in 20 pages, you could estimate that the probability of a page having a typo is less than 15%. In the perfect pitch example, you could conclude that fewer than 3% of children have perfect pitch.

Note that the rule of three says that your probability estimate goes down in proportion to the number of cases you’ve studied. If you’d read 200 pages without finding a typo, your estimate would drop from 15% to 1.5%. But it doesn’t suddenly drop to zero. I imagine most people would harbor a suspicion that that there may be typos even though they haven’t seen any in the first few pages. But at some point they might say “I’ve read so many pages without finding any errors, there must not be any.” The situation is a little different with the perfect pitch example, however, because you may know before you start that the probability cannot be zero.

If the sight of math makes you squeamish, you might want to stop reading now. Just remember that if you haven’t seen something happen in N observations, a good estimate is that the chances of it happening are less than 3/N.

What makes the rule of three work? Suppose the probability of what you’re looking for is p. If we want a 95% confidence interval, we want to find the largest p so that the probability of no successes out of n trials is 0.05, i.e. we want to solve (1-p)ⁿ = 0.05 for p. Taking logs of both sides, n log(1-p) = log(0.05) ≈ -3. Since log(1-p) is approximately –p for small values of p, we have p ≈ 3/n.

The derivation above gives the frequentist perspective. I’ll now give the Bayesian derivation of the same result. Then you can say “p is probably less than 3/N” in clear conscience since Bayesians are allowed to make such statements.

Suppose you start with a uniform prior on p. The posterior distribution on p after having seen 0 successes and N failures has a beta(1, N+1) distribution. If you calculate the posterior probability of p being less than 3/N you get an expression that approaches 1 – exp(-3) as N gets large, and 1 – exp(-3) ≈ 0.95.

Update: Italian translation of this post.

The secret to understanding recursion

Posted on 30 March 2010 by John

nested Russian dolls

Recursion is the process of solving a problem in terms of smaller versions of the same problem. Since the problem gets smaller each time, the process eventually terminates in a problem (the “base case”) that can be solved directly. Be sure of three things:

The problem gets smaller each time.
You include a solution for the base case.
Each case is handled correctly.

That’s really all there is to it.

But what about visualizing how the code runs? You don’t have to. And that’s the point: sometimes a recursive solution is easier precisely because you don’t have to understand in detail how it executes. Graham explains in his Lisp book why tracing the code execution is unnecessary if not harmful.

Students learning about recursion are sometimes encouraged to trace all the invocations of a recursive function on a piece of paper. … This exercise could be misleading: a programmer defining a recursive function usually does not think explicitly about the sequence of invocations that results from calling it. … How do you visualize all those invocations? You don’t have to.

I promise I’m not trying to learn anything

Posted on 29 March 2010 by John

Medical experiments come under greater scrutiny than ordinary medical practice. There are good reasons for such precautions, but this leads to a sort of paradox. As Frederick Mosteller observed

We have a strange double standard now. As long as a physician treats a patient intending to cure, the treatment is admissible. When the object is to find out whether the treatment has value, the physician is immediately subject to many constraints.

If a physician has two treatment options, A and B, he can assign either treatment as long as he believes that one is best. But if he admits that he doesn’t know which is better and says he wants to treat some patients each way in order to get a better idea how they compare, then he has to propose a study and go through a long review processes.

I agree with Mosteller that we have a strange double standard, that a doctor is free to do what he wants as long as he doesn’t try to learn anything. On the other hand, review boards reduce the chances that patients will be asked to participate in ill-conceived experiments by looking for possible conflicts of interest, weaknesses in statistical design, etc. And such precautions are more necessary in experimental medicine than in more routine medicine. Still, there is more uncertainty in medicine than we may like to admit, and the line between “experimental” and “routine” can be fuzzy.

Confusing familiar with simple

Posted on 25 March 2010 by John

Is Spanish simpler than Chinese? Most English speakers would think so, though that may not be true. Spanish is more familiar than Chinese if you’re an English speaker, but that does not mean the language is objectively simpler. In fact, linguists have a theory that all human languages are about equally complex, though they allocate their complexity in different areas. For example, Chinese has a complex tonal system, but I’ve been told its grammar is relatively simple.

We often confuse familiar with simple. Rich Hickey makes this observation in the context of programming languages, though the principle applies much more generally.

I think programmers have become inured to incidental complexity, in particular by confusing familiar or concise with simple. And when they encounter complexity, they consider it a challenge to overcome, rather than an obstacle to remove. Overcoming complexity isn’t work, it’s waste.

In some sense the familiar is simple. Familiar things have less perceived complexity, and sometimes perceived complexity is all that counts. But perceived complexity is personal. We can forget that familiarity clouds our judgment about complexity. We may recommend something familiar but complex to someone else who finds it unfamiliar and complex. Teachers have to keep in mind what students find complex. Programmers have to keep in mind what users find complex. Doctors have to keep in mind what patients find complex.

However, familiarity and perceived complexity can be deceiving even though no one else is involved. You may find something familiar and not realize how much effort you’re devoting to fighting its complexity. It’s easy to assume that things must be as complex as they are. I didn’t realize how complex clarinet was until I learned to play saxophone. I didn’t realize how complex C++ was until I had some experience with other programming languages. I didn’t realize how complex some desktop software was until I tried online alternatives.

The complexity of the familiar may not be apparent until you look closer. Nothing could be more familiar than the experience that the sun and planets go around the earth. That is a simple explanation until you look at orbits more carefully. Then you start introducing epicycles on top of epicycles to preserve the earth-centric model. You may find that what you thought was simple was only familiar, and that what you dismissed as more complex was only less familiar.

A dozen posts on simplicity

Posted on 23 March 2010 by John

Here are a dozen posts I’ve written about simplicity.

Ford-Chevy arguments in tech

Posted on 22 March 2010 by John

If you’ve never heard a Ford-Chevy argument, you may find it hard to believe that such things exist. People actually get into arguments, sometimes violent arguments, over which trucks are better, Ford or Chevy.

More generally, a Ford-Chevy argument is an emotionally charged debate over the merits of two similar things with each side fiercely loyal to its position. These arguments look silly to outsiders but are serious to insiders. We all have our Ford-Chevy topics.

Have you ever gotten into a Mac versus PC argument? Emacs versus vi? Your favorite programming language versus some inferior language? How about your profession versus some rival profession? Your favorite sports team versus a competitor?

Thomas Gideon recently recorded a podcast on software tools. Gideon gives a good explanation for why we have technical Ford-Chevy arguments.

The time needed to gain mastery over a single deep tool usually precludes being able to learn anything else in that category. Pointing out feature differences, ones that may paint your chosen tool in an unflattering light, can make you defensive without realizing it. … how much effort you put in is being called into question, and to a degree, if only subconsciously, your intelligence or judgment may also be questioned by implication.

When you’ve made a large investment in time or money, you don’t want to hear someone question that investment. You may feel that your intelligence or judgment is being called into question. You may fear that you’ve picked the wrong tool but don’t have the time or energy to learn an alternative.

I’d like to think I’m above Ford-Chevy arguments, but I’m not. I would never get into a literal Ford-Chevy argument because I don’t care about trucks. But I could easily fall into a Ford-Chevy–type argument about something I care more about.

It’s no surprise that emotional factors influence our choice of music or clothes. But it is surprising how much emotional factors influence even highly technical decisions. For example, people often choose statistical methods for emotional reasons, though they would never admit it. Once we make a decision, we come up with rational justifications after the fact. This applies to choosing a computer or a statistical method just as much as it applies to choosing a truck or pair of shoes.

Read a few of the over 6,500 comments on the video to get a taste of a real Ford-Chevy argument.

Related post: Doing good work with bad tools

Four mechanical devices better than their newer counterparts

Posted on 18 March 2010 by John

Here are four mechanical devices I prefer to their modern counterparts.

French press. It makes better coffee than a typical coffee machine. Also, a French press work without electricity. Next time a hurricane comes through Houston and knocks out our power, I can still make my coffee.

Reel mower. I had gasoline powered lawn mowers until last year. Sometimes they’d start, sometimes they wouldn’t. My reel mower always starts. And it’s quiet.

Rake. I had a leaf blower once. It was obnoxiously loud and a nuisance to my neighbors. I much prefer raking leaves even though it takes longer.

Pencil sharpener. With four children, we sharpen a fair number of pencils. We have owned a couple electric pencil sharpeners. They were noisy, hard to use, and soon wore out. Our mechanical pencil sharpener is cheaper and far more reliable.

I’m no Luddite, but I firmly believe that newer isn’t necessarily better.

Kiss me, I might be Irish!

Posted on 17 March 2010 by John

Happy St. Patrick’s Day everyone.

They say everyone’s Irish on St. Patrick’s Day. I’ve heard that I actually am part Irish (as well as Scottish, German, Cherokee, …) In any case, it’s nice of the Irish to share their holiday with the rest of the world.

St. Patrick’s Day 2007 in Seoul, Korea. Image credit: here via Wikipedia

Emacs

Posted on 16 March 2010 by John

Emacs is a text editor with ambitions to be an operating system. I do not use Emacs, though I once did, and I still find it intriguing. I’d like to find something similar that acts more like a Windows program.

GNU Emacs began in 1984 and has been in constant development ever since. The current version is 23.1. How many applications from 1984 are still in widespread use today? The only other one that comes to mind is TeX.

I used Emacs in graduate school and for a few years after that. I was fairly fluent with Emacs, though I never customized it much. I intended to learn Emacs Lisp and all that, but it never happened.

When I started developing Windows software I used Emacs at first, but the benefits of Visual Studio soon persuaded me give up my old editor. It was much easier to go with the flow.

I’ve revisited Emacs a couple times over the years. I still have some of the keystrokes burned into my memory. I use it on Linux now and then, but I mostly work on Windows, and my experience using Emacs on Windows has been frustrating to say the least. Tasks that are trivial in any Windows application, such as printing and spell checking, are surprisingly difficult to set up in Emacs. I’m sure it is possible to resolve these problems, though I never did.

The problems with printing and spell checking are part of the larger issue that Emacs is so idiosyncratic. It behaves nothing like a typical Windows program. Some people may say that’s a good thing. But it makes life more complicated if you switch between Emacs and more conventional Windows software.

Emacs is no more a typical Mac application than it is a typical Windows application. And yet my impression is that this is less of a problem for Mac users. I’d like to understand whether this is true and if so why.

One of the things I liked about Emacs was the way you could “live” there. An expert Emacs user might work inside Emacs all day, using it as an editor, debugger, shell, file system explorer, email program, etc. Steve Yegge is such an expert. When he blogged about his move from Windows to Mac, he said the main reason for the switch was that he prefers the appearance of the fonts on a Mac. Changing operating systems was not a big deal for Yegge because he didn’t really live in Windows before, nor does he live in OS X now. He lives in Emacs. He concluded his essay by saying

So I’ll keep using my Macs. They’re all just plumbing for Emacs, anyway. And now my plumbing has nicer fonts.

Living inside Emacs comes at a price. Part of that price is writing lots of Emacs Lisp to glue things together. Another part of that price is the commitment to practicing using Emacs. As Yegge says elsewhere

… you need to make a serious, lifelong commitment to Emacs in order to master it. … So it’s not an editor for the faint of heart …

Yikes! I’m not ready to make a serious, lifelong commitment to a piece of software. To my wife? Yes. To my text editor? No.

One of the best features of Emacs is that it has custom “modes” for various kinds of files. Instead of using a separate program for editing every kind of file, Emacs users use one program with different modes. As soon as a new file type comes out, say for a new programming language, someone will post an Emacs mode for that new language.

I’d like to find an editor on Windows that is analogous to Emacs. By that I mostly have in mind a powerful, highly configurable editor with support for many file types. I’d want it to behave like a Windows application, not a foreign transplant, and integrate well with .NET.

There was a project to create such an editor, nicknamed Emacs.NET. It was announced in late 2007. It sounds like the project is still alive, but it doesn’t seem all that promising. [Update: maybe it’s dead.]

I’ve looked at a few Windows editors that claim to be highly configurable but are not well documented. So if such an editor is configurable, it’s configurable for the person who wrote it or possibly for anyone else willing to study the source code.

Any suggestions for a general purpose Windows editor? For starters, I’d be pleased to find something that’s good at editing LaTeX and HTML.

Update (2 April 2010): I’ve decided to give Emacs another try.

Related post: This post started out as an update to my earlier post One program to rule them all.

Month: March 2010

Maintenance costs

Estimating the chances of something that hasn’t happened yet

More probability articles

The secret to understanding recursion

I promise I’m not trying to learn anything

Related posts

Confusing familiar with simple

A dozen posts on simplicity

Ford-Chevy arguments in tech

Four mechanical devices better than their newer counterparts

Related posts

Kiss me, I might be Irish!

Related posts

Emacs