Maybe you only need it because you have it

Some cities need traffic lights because they have traffic lights. If one traffic light goes out, it causes a traffic jam. But sometimes when all traffic lights go out, say due to a storm, traffic flows better than before.

Some buildings need air conditioning because they have air conditioning. Because they were designed to be air conditioned, they have no natural ventilation and would be miserable to inhabit without air conditioning.

Some people need to work because they work. A family may find that their second income is going entirely to expenses that would go away if one person stayed home.

It’s hard to tell when you’ve gotten into a situation where you need something because you have it. I had a friend who worked for a company that sold expensive software development tools. He said that one of the best perks of his job was that he could buy these tools at a deep discount. But he didn’t realize that without his job, he wouldn’t need these tools! He wasn’t using them to develop software. He was only using them so he could demonstrate and sell them.

It may be even harder for an organization to realize it has been caught in a cascade of needs. Suppose a useless project adds staff. These staff need to be managed, so they hire a manager. Then they hire people for IT, accounting, marketing, etc. Eventually they have their own building. This building needs security, maintenance, and housekeeping. No one questions the need for the security guard, but the guard would not have been necessary without the original useless project.

When something seems absolutely necessary, maybe it’s only necessary because of something else that isn’t necessary.

Related post: Defining minimalism

Military intelligence from serial numbers

During World War II, America and her allies needed to estimate the number of Panzer V tanks Germany had produced. The solution was simple: Look at the serial numbers of the captured tanks. If you assume the tanks had been sequentially numbered — as in fact they were — you could view the serial numbers of the captured tanks as random samples from the entire range. You could then use statistics to estimate the range and hence the number of tanks produced. More details available here.
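Here is a minimal sketch of the kind of calculation involved. It uses one standard textbook estimator for this problem, not necessarily the one the Allies used. It assumes serial numbers run 1, 2, ..., N with no gaps and that captured tanks are a uniform random sample without replacement. The function name `estimate_total` and the production and capture figures are hypothetical, chosen only for illustration.

```python
import random


def estimate_total(serials):
    """Estimate the largest serial number N from a sample of serials.

    Uses the estimator m + m/k - 1, where m is the largest observed
    serial and k is the sample size. Assumes serials are drawn
    uniformly without replacement from 1..N.
    """
    k = len(serials)
    m = max(serials)
    return m + m / k - 1


# Hypothetical scenario: 5,000 tanks produced, 200 captured.
rng = random.Random(42)
true_total = 5000
captured = rng.sample(range(1, true_total + 1), 200)

estimate = estimate_total(captured)
print(f"largest serial seen: {max(captured)}")
print(f"estimated total:     {estimate:.0f}")
```

The largest observed serial always underestimates the total, so the estimator pushes it up by the average gap between observed serials. With only one observation the correction is as large as the observation itself, which is one way to see why a single serial number is weak evidence.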

A few years later America tried to use the serial number trick to estimate the number of Soviet strategic bombers. This time the trick backfired.

In 1958, American military intelligence believed the USSR would soon have four hundred Bison and three hundred Bear bombers capable of striking the American heartland. Their evidence was the high serial number of a Bison that had flown at a May Day parade in Moscow. In fact, the Soviets knew the Americans were watching, and intentionally inflated that number. — Rocket Men, page 118.

The Panzer estimate was accurate because the Allies had hundreds of data points, enough to support the assumption that the tanks were sequentially numbered and to make a good estimate of the total number.

The Bison bomber was only one data point, but it was consistent with what intelligence services (wrongly) believed. At that time, the US had grossly over-estimated the military capabilities of the USSR. According to Rocket Men, Khrushchev turned down US offers to cooperate in space exploration because he feared that such cooperation would give the US a more accurate assessment of his country’s military.

Related post: Selection bias and bombers

You can be a hero with a simple idea

Yesterday I mentioned someone who published a scholarly paper in 1994 for a technique commonly taught in freshman calculus. There’s been a lot of discussion of this (the paper, not my blog post) on the web. The general take has been that this was an egregious failure in the peer review system. No one recognized a simple, centuries-old idea. No one called up a high school math teacher and asked “Hey, have you seen this before?” All that is true, but here’s a different take on the situation.

The paper reinventing the trapezoid rule has been cited 75 times. It must have filled a need. Yes, the author was ignorant of basic calculus. But apparently a lot of other doctors are just as ignorant of calculus. The author did the medical profession a service by pointing out a simple way to estimate the area under a glucose-response curve. The technique was not original, and should not have been published as original research, but it was valuable.

Surely some doctors already knew how to find the area under a glucose-response curve. But apparently many others did not, and they learned something useful from the article. The article did some good, more good than original but arcane articles that no one reads, even though it was bad scholarship.

The author made a connection that not everyone else had made. This reminds me of Picasso’s sculpture Head of a Bull.

Picasso: Head of a Bull

All Picasso did was put handlebars on top of a bicycle seat and say “Hey, that looks like a bull.” His sculpture took zero technical skill, but it was clever. Was Picasso the first human to ever have this idea? Maybe.

Sometimes you can be a hero by taking what is common as dirt in one context and applying it to a new context.

NASA did not find arsenic-based life

Headlines are saying today that NASA found microbes that use arsenic the way all other known life uses phosphorus. The NASA web site says NASA-Funded Research Discovers Life Built With Toxic Chemical. Some other headlines include “NASA finds ‘alien life’ made of arsenic,” “NASA finds arsenic-based life,” and “NASA finds arsenic-loving bacterium.” These headlines are misleading.

The phrase arsenic-based life is misleading because most people would assume this is in contrast to carbon-based life. No, the discovery involves substituting arsenic for phosphorus. So this new microbe is only arsenic-based in the sense that most life is phosphorus-based. Actually, even that is not correct. This is a phosphorus-based life form that has been tricked into using arsenic.

NASA did not find a microbe that substitutes arsenic for phosphorus. They coaxed a microbe into substituting arsenic for phosphorus. Here’s the relevant paragraph from NASA’s story:

The newly discovered microbe, strain GFAJ-1, is a member of a common group of bacteria, the Gammaproteobacteria. In the laboratory, the researchers successfully grew microbes from the lake on a diet that was very lean on phosphorus, but included generous helpings of arsenic. When researchers removed the phosphorus and replaced it with arsenic the microbes continued to grow. Subsequent analyses indicated that the arsenic was being used to produce the building blocks of new GFAJ-1 cells.

So it seems that NASA found a microbe that could use arsenic, not a microbe that naturally does use arsenic. Perhaps some are inferring that because NASA was able to make this happen in a lab, it may also have happened naturally, though no one has seen that. Maybe so.

NASA goes on to say

The key issue the researchers investigated was when the microbe was grown on arsenic did the arsenic actually became incorporated into the organisms’ vital biochemical machinery, such as DNA, proteins and the cell membranes.

This is an amazing discovery, but it’s not quite the discovery that headlines imply.

Update: More detailed criticism of the NASA announcement from Nature News. Experts challenge the claim that the microbes actually incorporate arsenic in organic compounds.

Three surprises with the trapezoid rule

The trapezoid rule is a very simple method for estimating integrals. The idea is to approximate the area under a curve by a bunch of thin trapezoids and add up the areas of the trapezoids as suggested in the image below.

This is an old idea, probably older than the formal definition of an integral. In general it gives a crude estimate of the integral. If the width of the trapezoids is h, the error in using the trapezoid rule is roughly proportional to h^2. It’s easy to do better. For example, Simpson’s rule is a minor variation on the trapezoid rule that has error proportional to h^4.
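To make the comparison concrete, here is a small sketch of the composite trapezoid and Simpson rules applied to a smooth, non-periodic integrand. The test integrand exp(x) on [0, 1] and the function names are my own choices for illustration. For the composite rules over a fixed interval, halving h should cut the trapezoid error by a factor of about 4 and the Simpson error by a factor of about 16.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


def simpson(f, a, b, n):
    """Composite Simpson's rule with n equal subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-indexed interior points get weight 4, even-indexed get 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return h * total / 3


exact = math.e - 1  # integral of exp(x) over [0, 1]

trap_errors = {}
simp_errors = {}
for n in (8, 16, 32):
    trap_errors[n] = abs(trapezoid(math.exp, 0.0, 1.0, n) - exact)
    simp_errors[n] = abs(simpson(math.exp, 0.0, 1.0, n) - exact)
    print(n, trap_errors[n], simp_errors[n])

# Ratio of errors when the step size is halved.
trap_ratio = trap_errors[8] / trap_errors[16]
simp_ratio = simp_errors[8] / simp_errors[16]
print("trapezoid ratio:", trap_ratio)
print("Simpson ratio:  ", simp_ratio)
```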

So if the trapezoid rule is old and inaccurate, why does anyone care about it? Here are the surprises.

  1. You can still get a publication out of the trapezoid rule! In 1994, a doctor published a paper reinventing the trapezoid rule. Not only did the editors not recognize this ancient algorithm, the paper has been cited many times since it was published. (Update: more about the trapezoid paper here.)
  2. Although the trapezoid rule is inefficient in general, it can be shockingly efficient for periodic functions.
  3. The trapezoid rule can also be shockingly efficient for analytic functions that go to zero quickly, so called double exponential functions.
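A quick experiment shows what the last two items look like in practice. This is only a sketch with integrands I picked because their exact integrals are known: 1/(2 + cos x) over one full period, whose integral is 2π/√3, and a Gaussian over the real line truncated to [-6, 6], whose integral is √π. The Gaussian is a rapidly decaying analytic function rather than a double exponential function in the strict sense, but it shows the same effect. The ordinary integrand exp(x) on [0, 1] is included for contrast.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


# Periodic integrand integrated over one full period.
periodic_exact = 2 * math.pi / math.sqrt(3)
periodic_err = abs(
    trapezoid(lambda x: 1 / (2 + math.cos(x)), 0.0, 2 * math.pi, 32)
    - periodic_exact
)

# Rapidly decaying analytic integrand; the tails beyond |x| = 6 are negligible.
gauss_exact = math.sqrt(math.pi)
gauss_err = abs(
    trapezoid(lambda x: math.exp(-x * x), -6.0, 6.0, 24) - gauss_exact
)

# Ordinary smooth integrand, neither periodic nor decaying, for contrast.
plain_err = abs(trapezoid(math.exp, 0.0, 1.0, 32) - (math.e - 1))

print("periodic error:", periodic_err)
print("gaussian error:", gauss_err)
print("plain error:   ", plain_err)
```

With a few dozen function evaluations, the first two errors should be near the limits of floating point precision, while the third is many orders of magnitude larger.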

The last two observations are more widely applicable than you might think at first. What if you want to integrate something that isn’t periodic and isn’t a double exponential function? You may be able to do a change of variables that makes your integrand have one of these special forms. The article Fast Numerical Integration explains an integration method based on double exponential functions and includes C++ source code.

The potential efficiency of the trapezoid rule illustrates a general principle: a crude method cleverly applied can beat a clever technique crudely applied. The simplest numerical integration technique, one commonly taught in freshman calculus, can be extraordinarily efficient when applied with skill to the right problem. Conversely, a more sophisticated integration technique such as Gauss quadrature can fail miserably when naively applied.

Static versus dynamic typing

Static versus dynamic typing is a Ford-Chevy argument among programmers. Here’s the best comment on the subject I’ve seen lately.

Very briefly put, the Haskell [strongly, statically typed] perspective emphasizes safety, while the dynamic outlook favors flexibility. If someone had already discovered one way of thinking about types that was always best, we imagine that everyone would know about it by now.

Source: Real World Haskell.

The second sentence applies equally well to all Ford-Chevy arguments: if one alternative were uniformly better than all others, word would get out. These arguments rage because they involve comparisons along multiple (often implicit) criteria and no alternative is simultaneously better by all criteria.

The relative advantages of programming languages depend on how the languages are used. Although dynamic languages place less emphasis on safety, programs written in dynamic languages may be safer in practice than this would imply. Also, in general it is easier to reason about code written in a statically typed language. However, a programmer can easily subvert strong static typing by writing stringly typed code. Code written well in a dynamically typed language will be easier to read than code written poorly in a statically typed language.
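As a rough illustration of what “stringly typed” means, here is a sketch in Python, with type annotations standing in for a static type system (they are only enforced if you run a separate checker such as mypy). The `Shape` enum and both functions are hypothetical examples, not taken from any particular code base.

```python
import math
from enum import Enum


class Shape(Enum):
    CIRCLE = "circle"
    SQUARE = "square"


def area_stringly(shape: str, size: float) -> float:
    # Any string satisfies the annotation, so a typo such as "cirlce"
    # passes a type checker and is only caught when this code runs.
    if shape == "circle":
        return math.pi * size ** 2
    if shape == "square":
        return size ** 2
    raise ValueError(f"unknown shape: {shape!r}")


def area_typed(shape: Shape, size: float) -> float:
    # Here a checker can reject area_typed("cirlce", 1.0) before the
    # program runs, because a str is not a Shape.
    if shape is Shape.CIRCLE:
        return math.pi * size ** 2
    return size ** 2


print(area_stringly("square", 2.0))
print(area_typed(Shape.SQUARE, 2.0))
```

Both functions carry type annotations, but only the second lets the type system rule out a misspelled shape.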

Comparisons of the advantages of static and dynamic typing are nearly impossible. You could try to argue about what would happen “all other things being equal,” but of course all other things are never equal.

Related post: Questioning the Hawthorne effect