Broken windows theory and programming

The broken windows theory says that cracking down on petty crime reduces more serious crime. The name comes from the explanation that if a building has a few broken windows, it invites vandals to break more windows and eventually burn down the building. Turned around, this suggests that punishing vandalism could lead to a reduction in violent crime. Rudy Giuliani is perhaps the most visible proponent of the theory. His first initiative as mayor of New York was to go after turnstile jumpers and squeegee men as a way of reducing crime in the city. Crime rates dropped dramatically during his tenure.

In the book Pragmatic Thinking and Learning, Andy Hunt applies the broken windows theory to software development.

Known problems (such as bugs in code, bad process in an organization, poor interfaces, or lame management) that are uncorrected have a debilitating, viral effect that ends up causing even more damage.

I’ll add a couple of my pet peeves to Andy Hunt’s list.

The first is compiler warnings. I can’t understand why some programmers are totally comfortable with their code having dozens of compiler warnings. They’ll say “Oh yeah, I know about that. It’s not a problem.” But then when a warning shows up that is trying to tell them something important, the message gets lost in the noise. My advice: Just fix the code. In very exceptional situations, explicitly turn off the warning.

The second is similar. Many programmers blithely ignore run-time exceptions that are written to an event log. As with compiler warnings, they tell themselves that these exceptions are not really a problem. My advice: If it’s not really a problem, then don’t log it. Otherwise, fix it.

Are men better than women at chess?

The most recent 60-Second Science podcast discusses the abilities of men and women in playing chess. One can argue that men are better than women at playing chess because all world champions have been men. However, that only suggests that the best men are better than the best women. It is possible that the distribution of chess ability is identical for men and women. Since more men than women play chess, the best men are the best of a larger population.

I looked at this exact issue in an earlier post on Olympic performance. That post asks what to expect if men and women had equal ability in a sport that more men chose to compete in. The same considerations apply to country sizes. If two countries have equal ability at a sport, the larger country is likely to field a better team. The best performers from a larger group are typically better than the best performers from a smaller group. This post looks at how to quantify this observation using order statistics.

The podcast mentioned above says that the difference in male and female championship performance “can be almost entirely explained by statistics.” I assume this means that an order statistic model with identical distributions fits the data well.
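To make the order statistics point concrete, here is a minimal simulation sketch of my own (the player counts and the normal ability model are assumptions for illustration, not figures from the podcast). Both groups draw ability from the same standard normal distribution, but ten times as many men play.

```python
import numpy as np

rng = np.random.default_rng(42)
n_men, n_women = 100_000, 10_000   # assumed player counts, for illustration only
trials = 500

# Best player in each group, per simulated "world"
best_men = np.array([rng.standard_normal(n_men).max() for _ in range(trials)])
best_women = np.array([rng.standard_normal(n_women).max() for _ in range(trials)])

print("average best male ability:  ", best_men.mean())
print("average best female ability:", best_women.mean())
print("fraction of worlds where the overall best player is a man:",
      (best_men > best_women).mean())
```

With identical continuous distributions, the overall best player is equally likely to be any of the 110,000 players, so the best player is a man with probability 100,000/110,000, about 91 percent, before any difference in ability enters the picture.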

Top 10 posts of 2008

This blog started in January 2008, so the best posts of the year are also the best posts of all time!

Here are two of the most popular posts on this site in each of five categories.

Business and management

Medieval project management
Peter Drucker and abandoning projects

Creativity

Getting to the bottom of things
Simple legacy

Software development

Experienced programmers and lines of code
Programmers aren’t reading programming books

Math

Jenga mathematics
How to compute binomial coefficients

Statistics

Wine, Beer, and Statistics
Why microarray study conclusions are so often wrong

Early evidence-based medicine

In the 1840s, Ignaz Semmelweis, an assistant professor in the maternity ward of Vienna General Hospital, demonstrated that mortality rates dropped from 12 percent to 2 percent when doctors washed their hands between seeing patients.

His colleagues resisted his findings for a couple of reasons. First, they didn’t want to wash their hands so often. Second, Semmelweis had demonstrated association but did not give an underlying cause. (This was a few years before the discovery of the germ theory, and in fact helped lead to it.) He was fired, had a nervous breakdown, and died in a mental hospital at age 47. (Reference: Super Crunchers)

We now know that Semmelweis was right and his colleagues were wrong. It’s tempting to think that people in the 1840s were either ignorant or lazy and that we’re different now. But human nature hasn’t changed. If someone asked you to do something you didn’t want to do and couldn’t explain exactly why you should do it, would you listen? You would naturally be skeptical, and that’s a good thing, since most published research results are false.

One thing that has changed since 1840 is the level of sophistication in interpreting data. Today Semmelweis could argue, based on the strength of his data, that his results warrant consideration despite the lack of a causal explanation. Such an argument could be evaluated more readily now that we have widely accepted ways of measuring the strength of evidence. On the other hand, even the best statistical evidence does not necessarily cause people to change their behavior.

This New York Times editorial is a typical apologetic for evidence-based medicine. Let’s base medical decisions on evidence! But of course medicine does base decisions on evidence. The question is how medicine should use evidence, and this question is far more complex than it first appears.

Related: Adaptive clinical trial design

My favorite Christmas carol

A few years ago I noticed the words to Hark! The Herald Angels Sing as if I’d never heard the song before. Since then I’ve decided that it is my favorite carol because of its rich language and deep theology. Here are the words from the second verse that jumped out at me the first time I really listened to the carol.

Veiled in flesh the Godhead see,
Hail the incarnate Deity!
Pleased as man with man to dwell,
Jesus, our Emmanuel.

I often prefer the second and third verses of famous hymns. They may be no better than the first verses, but they are less familiar and more likely to grab my attention.

Merry Christmas everyone.

Small advantages show up in the extremes

I’ve been reading Malcolm Gladwell’s book Outliers: The Story of Success. One of the examples he gives early in his book studies the best Canadian hockey players. A disproportionate number of the top players were born in the first quarter of the year.

The eligibility cutoff for age-class hockey league assignments is January 1. Those with birthdays early in the year will be older when they are first eligible to play for a given age group. On average, these children will be larger and more skilled than those born later in the year. Being a few months older is initially an advantage, but it would seem that it should wear off over time. It doesn’t. Those who had an age advantage, coupled with talent, developed a little more confidence and received a little more attention than those who did not. The advantages of extra confidence and attention carried on after the direct advantage of age disappeared.

I wrote a post a while back that looks at this sort of situation in some mathematical detail. Suppose the abilities of two groups are normally distributed with the same variance but the mean of one group is shifted just slightly. (The post I referred to looks at male and female Olympic athletes, but we could as easily think about Canadian hockey players born in December and January.) The further you go out in the extremes, the more significant that shift becomes.
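As a rough illustration of that post’s point, here is a sketch with assumed numbers (a shift of a tenth of a standard deviation, chosen arbitrarily): compare the tail probabilities of the two distributions at increasingly extreme thresholds.

```python
from scipy.stats import norm

shift = 0.1   # assumed mean advantage, in standard deviations
for threshold in range(1, 6):
    p_advantaged = norm.sf(threshold, loc=shift)  # P(ability > threshold) with the shift
    p_baseline = norm.sf(threshold)               # same probability with no shift
    print(f"{threshold} sd cutoff: tail probability ratio = "
          f"{p_advantaged / p_baseline:.2f}")
```

The ratio is modest near the middle of the distribution and grows as the cutoff moves out into the tail, which is exactly where elite athletes live.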

For another example, think of how heights are distributed. Men are taller than women on average, but it’s not unheard of for the tallest person in a small group to be a woman. However, as the group gets larger, the odds that the tallest person in the group is male increase exponentially. As it turns out, average heights of men and women differ by about six inches. But even if average heights differed by the slightest amount, the odds in favor of the tallest person in a group being male would still increase exponentially as the group size increases.
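Here is a small simulation of that claim, again a sketch with assumed numbers rather than real height data: groups of n people, each person equally likely to be male or female, with heights normally distributed and a mean difference of only half an inch.

```python
import numpy as np

rng = np.random.default_rng(0)
mean_diff = 0.5   # assumed mean height difference, in inches
sd = 3.0          # assumed standard deviation of height for both sexes
trials = 10_000

for n in [2, 10, 100, 1000]:
    is_male = rng.random((trials, n)) < 0.5
    heights = rng.normal(0.0, sd, (trials, n)) + mean_diff * is_male
    # Sex of the tallest person in each simulated group
    tallest_is_male = is_male[np.arange(trials), heights.argmax(axis=1)]
    print(f"group size {n:4d}: tallest person is male in "
          f"{tallest_is_male.mean():.0%} of groups")
```

Even a half-inch difference nudges the odds further in favor of the tallest person being male as the group grows; with the real six-inch gap the outcome is close to certain for all but the smallest groups.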

Debasing the word "technology"

It bugs me to hear people say “technology” when they really mean “computer technology”, as if drug design, for example, isn’t technology. But now I’ve noticed some folks are even more narrow in their use of the term. They use “technology” to mean essentially blogs and podcasts.

So if you design satellites, program supercomputers, or clone sheep, but don’t read blogs and listen to podcasts, you’re just out of it. Someone should tell the Rice University nanotechnology group that they should change their name since they’re not really into technology. Unless of course they blog or podcast about their work.