God is in the details

Some say “The devil is in the details,” meaning solutions break down when you examine them closely enough. Some say “God is in the details,” meaning opportunities for discovery and creativity come from digging into the details. Both are true, but the latter is more interesting.

I posted something along these lines a few weeks ago, Six quotes on digging deep. In that post I quote Richard Feynman:

… nearly everything is really interesting if you go into it deeply enough …

I thought about this again last night when I ran across a post by Andrew Gelman entitled God is in every leaf of every tree. He has a similar quote from Feynman.

No problem is too small or too trivial if we really do something about it.

From there he links to a post where he describes what he calls the paradox of importance. Sometimes we can do our most creative work on the least important problems. The important problems often demand quick solutions, so we fall back on familiar methods.

Everything in this post applies equally well to creativity in other fields: graphic design, music composition, literature, etc. However, Gelman is talking about creativity specifically in the context of statistics, and statistics is a prime example of something that appears dull from the outside but becomes fascinating in the details. A course in statistics can be mind-numbingly dull when the emphasis is on rote application of black-box procedures. Looking inside the boxes is more interesting, and designing the boxes is most interesting.

Related post: Simple legacy

Dose-finding: why start at the lowest dose?

You’ve got a new drug and it’s time to test it on patients. How much of the drug do you give? That’s the question dose-finding trials attempt to answer.

The typical dose-finding procedure starts by selecting a small number of dose levels, say four or five. The trial begins by giving the lowest dose to the first few patients, and there is some procedure for deciding when to try higher doses. Convention says it is unethical to start at any dose other than the lowest dose. I will give several reasons to question this convention.
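
The post does not name a particular escalation rule. As one illustration only, here is a rough sketch of the traditional 3+3 design, with dose levels and toxicity probabilities made up for the example. Note that this rule only moves upward through the dose list, the one-directional search discussed later in the post.

    import random

    def three_plus_three(doses, tox_prob, seed=0):
        """Simulate a simplified 3+3 dose-escalation trial.

        doses    -- dose levels, ordered from lowest to highest
        tox_prob -- assumed probability of dose-limiting toxicity at each dose
        Returns the highest dose judged tolerable, or None if even the
        lowest dose is too toxic.
        """
        rng = random.Random(seed)
        mtd = None
        for dose, p in zip(doses, tox_prob):
            # Treat a cohort of 3 patients at the current dose.
            toxicities = sum(rng.random() < p for _ in range(3))
            if toxicities == 1:
                # Ambiguous result: treat 3 more patients at the same dose.
                toxicities += sum(rng.random() < p for _ in range(3))
            if toxicities <= 1:
                mtd = dose   # dose tolerated; escalate to the next level
            else:
                break        # too toxic; stop and keep the previous dose
        return mtd

    # Hypothetical trial using the four doses from the example below.
    print(three_plus_three([10, 20, 30, 50], [0.05, 0.10, 0.25, 0.50]))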

Suppose you want to run a clinical trial to test the following four doses of Agent X: 10 mg, 20 mg, 30 mg, 50 mg. You want to start with 20 mg. Your trial goes for statistical review and the reviewer says your trial is unethical because you are not starting at the lowest dose. You revise your protocol saying you only want to test three doses: 20 mg, 30 mg, and 50 mg. Now suddenly it is perfectly ethical to start with a dose of 20 mg because it is the lowest dose.

The more difficult but more important question is whether a dose of 20 mg of Agent X is medically reasonable. The first patient in the trial does not care whether higher or lower doses will be tested later. He only cares about the one dose he’s about to receive. So rather than asking “Why are you starting at dose 2?” reviewers should ask “How did you come up with this list of doses to test?”

A variation of the start-at-the-lowest-dose rule is the rule to always start at “dose 1”. Suppose you revise the original protocol to say dose 1 is 20 mg, dose 2 is 30 mg, and dose 3 is 50 mg. The protocol also includes a “dose -1” of 10 mg. You explain that you do not intend to give dose -1, but have included it as a fallback in case the lowest dose (i.e. 20 mg) turns out to be too toxic. Now because you call 20 mg “dose 1” it is ethical to begin with that dose. You could even begin with 30 mg if you were to label the two smaller doses “dose -2” and “dose -1.” With this reasoning, it is ethical to start at any dose, as long as you call it “dose 1.” This approach is justified only if the label “dose 1” carries the implicit endorsement of an expert that it is a medically reasonable starting dose.

Part of the justification for starting at the lowest dose is that the earliest dose-finding methods would only search in one direction. This explains why some people still speak of “dose escalation” rather than “dose-finding.” More modern dose-finding methods can explore up and down a dose range.

The primary reason for starting at the lowest dose is fear of toxicity. But when treating life-threatening diseases, one could just as easily justify starting at the highest dose for fear of undertreatment. (Some trials do just that.) Depending on the context, it could be reasonable to start at the lowest dose, the highest dose, or any dose in between.

The idea of first selecting a range of doses and then deciding where to start exploring seems backward. It makes more sense to first pick the starting dose, then decide what other doses to consider.

Related: Adaptive clinical trial design

Feasibility studies

Jeff Atwood gives a summary of Facts and Fallacies of Software Engineering by Robert Glass on his blog. I was struck by point #14:

The answer to a feasibility study is almost always “yes”.

I hadn’t thought about that before, but it certainly rings true. I can’t think of an exception.

Some say about half of all large software projects fail, and presumably many of these failures passed a feasibility study. Why can’t we predict whether a project stands a good chance of succeeding? Are committees sincerely but overly optimistic, or do they recognize doomed projects and tell the sponsor what the sponsor wants to hear?

Related post: Engineering statistics

Innovation IV

John Tukey said

efficiency = statistical efficiency × usage.

I don’t know the context of this quote, but here’s what I think Tukey meant. The usefulness of a statistical method depends not just on the method’s abstract virtues, but also on how often the method can be used and how often in fact it is used. This ties in with Michael Schrage’s comment that innovation is not what innovators do but what customers adopt.
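
To make the formula concrete, here is a toy calculation with made-up numbers; by this measure the cruder but more widely used method contributes more.

    # Hypothetical illustration of Tukey's formula. A method with higher
    # statistical efficiency can matter less overall if it is rarely used.
    fancy  = 0.95 * 0.20   # 95% efficient, used 20% of the time -> 0.19
    simple = 0.80 * 0.90   # 80% efficient, used 90% of the time -> 0.72
    print(fancy, simple)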

Linear interpolator

I added a form to my website yesterday that does linear interpolation. If you enter (x1, y1) and (x2, y2), it will predict x3 given y3, or vice versa, by fitting a straight line to those two points. It’s a simple calculation, but it comes up just often enough that it would be handy to have a page to do it.
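
The calculation behind such a form is just the two-point equation of a line. Here is a minimal sketch in Python; the function names are mine, not anything from the site.

    def interpolate_y(x1, y1, x2, y2, x3):
        """Predict y3 at x3 from the line through (x1, y1) and (x2, y2)."""
        return y1 + (x3 - x1) * (y2 - y1) / (x2 - x1)

    def interpolate_x(x1, y1, x2, y2, y3):
        """Predict x3 at y3 by inverting the same line."""
        return x1 + (y3 - y1) * (x2 - x1) / (y2 - y1)

    # Example: the line through (1, 10) and (3, 30) gives y = 20 at x = 2.
    print(interpolate_y(1, 10, 3, 30, 2))     # 20.0
    print(interpolate_x(1, 10, 3, 30, 20.0))  # 2.0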

Innovation III

In his book Diffusion of Innovations, Everett Rogers lists five factors that determine the rate of adoption of an innovation.

The first is the relative advantage of the innovation. This is not limited to objective improvements but also includes factors such as social prestige.

The second is compatibility with existing systems and values.

The third is complexity, especially perceived complexity.

The fourth is trialability, how easily someone can try out the innovation without making a commitment.

The fifth is observability, whether the advantages of the innovation are visible.

Innovators are often criticized for maintaining compatibility, for not making a larger break from the past. After Bjarne Stroustrup invented the C++ programming language, many people said he should have sacrificed compatibility with C in order to make C++ a better language. However, had he done so, C++ would not have become popular enough to gain the critics’ attention. As Stroustrup said in an interview, “There are just two kinds of languages: the ones everybody complains about and the ones nobody uses.”

Innovation II

In 1601, an English sea captain did a controlled experiment to test whether lemon juice could prevent scurvy.  He had four ships, three control and one experimental.  The experimental group got three teaspoons of lemon juice a day while the control group received none. No one in the experimental group developed scurvy while 110 out of 278 in the control group died of scurvy. Nevertheless, citrus juice was not fully adopted to prevent scurvy until 1865.

Overwhelming evidence of superiority is not sufficient to drive innovation.

Source: Diffusion of Innovations

Innovation I

Innovation is not the same as invention. According to Peter Denning,

An innovation is a transformation of practice in a community. It is not the same as the invention of a new idea or object. The real work of innovation is in the transformation of practice. … Many innovations were preceded or enabled by inventions; but many innovations occurred without a significant invention.

Michael Schrage makes a similar point.

I want to see the biographies and the sociologies of the great customers and clients of innovation. Forget for a while about the Samuel Morses, Thomas Edisons, the Robert Fultons and James Watts of industrial revolution fame. Don’t look to them to figure out what innovation is, because innovation is not what innovators do but what customers adopt.

Innovation in the sense of Denning and Schrage is harder than invention. Most inventions don’t lead to innovations.

The simplest view of the history of invention is that Morse invented the telegraph, Fulton the steamboat, etc. A sophomoric view is that men like Morse and Fulton don’t deserve so much credit because they only improved on and popularized the inventions of others. A more mature view is that Morse and Fulton do indeed deserve the credit they receive. All inventors build on the work of predecessors, and popularizing an invention (i.e. encouraging innovation) requires persistent hard work and creativity.

The simplest thing that might work

Ward Cunningham’s design advice is to try the simplest thing that might work. If that doesn’t work, try the next simplest thing that might work. Note the word “might.”

We all like simplicity in theory, and we may think we’re following Cunningham’s advice when we’re not. Instead, we try the simplest thing that we’re pretty sure will work. Solutions usually get more complex as they’re fleshed out, so we miss out on simple solutions by starting from an idea that is too complex to begin with.

Once you have a simple idea that might work, you have to protect it. Simple solutions are magnets for complexity. People immediately suggest “improvements.” As design guru Donald Norman says, “The hardest part of design … is keeping features out.”