Acupuncture and confirmation bias

Here’s another excerpt from The decline effect and the scientific method, the article I wrote about a couple of weeks ago.

Between 1966 and 1995, there were forty-seven studies of acupuncture in China, Taiwan, and Japan, and every single trial concluded that acupuncture was an effective treatment. During the same period, there were ninety-four clinical trials of acupuncture in the United States, Sweden, and the U.K., and only fifty-six per cent of these studies found any therapeutic benefits.
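A back-of-the-envelope check (mine, not from the article) shows how implausible that record is. If each trial independently had a 56% chance of a positive result, the rate seen in the US, Swedish, and UK studies, the probability of 47 positives in 47 trials would be vanishingly small. The independence assumption is unrealistic, but that's the point: something other than chance, such as confirmation or publication bias, has to explain the difference.

```python
# Probability that all 47 trials come up positive, assuming each trial
# independently had a 56% chance of a positive result (the rate seen
# in the US, Swedish, and UK studies).
p_positive = 0.56
n_trials = 47

p_all_positive = p_positive ** n_trials
print(f"P(all {n_trials} trials positive) = {p_all_positive:.2e}")
```

The answer is on the order of 10^-12, i.e. essentially impossible under the independence assumption.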


A motivational speaker with integrity

I wonder how many motivational speakers live out their own advice. Of those who do,  how many will continue to live out their advice as they raise a family? How many will continue to walk their talk into old age? Time has a way of exposing the hypocritical and the naive.

Zig Ziglar practices what he preaches. He is 82 years old and suffered a major head injury in 2007, but he continues to speak and write. His latest book explains how he lives out what he has taught for 40 years.

Pilots and pair programming

From Outliers by Malcolm Gladwell:

In commercial airlines, captains and first officers split the flying duties equally. But historically, crashes have been far more likely to happen when the captain is in the “flying seat.” At first this seems to make no sense, since the captain is almost always the pilot with the most experience. … Planes are safer when the least experienced pilot is flying, because it means the second pilot isn’t going to be afraid to speak up.

The context of this excerpt is an examination of airplane crashes in which the copilot was aware of the pilot’s errors but did not speak up assertively.

I wonder whether an analogous result holds for pair programming. Do more bugs slip into the code when the more experienced programmer has the keyboard? The German aerospace company DLR thinks so. The company pairs junior and senior programmers. The junior programmer writes all the code while the senior programmer watches.


When it works, it works really well

Stephen Stigler [1] compares least-squares methods to the iPhone:

In the United States many consumers are entranced by the magic of the new iPhone, even though they can only use it with the AT&T system, a system noted for spotty coverage — even no receivable signal at all under some conditions. But the magic available when it does work overwhelms the very real shortcomings. Just so, least-squares will remain the tool of choice unless someone concocts a robust methodology that can perform the same magic, a step that would require the suspension of the laws of mathematics.

In other words, least-squares, like the iPhone, works so well when it does work that it’s OK that it fails miserably now and then. Maybe so, but that depends on context.

Stigler argues that Americans consider the occasional missed phone call an acceptable trade-off for the features of the iPhone. Many people would agree. But if you’re on a transplant waiting list, you might prefer more reliable coverage to a nicer phone.

It’s not enough to talk about probabilities of failure without also talking about consequences of failure. For example, the consequences of missing a phone call are greater for some people than for others.

Least-squares is a mathematically convenient way to place a cost on errors: the cost is proportional to the square of the size of the error. That’s often reasonable in applications, but not always. In some applications, the cost is simply proportional to the size of the error. In others, it doesn’t matter how large an error is once it is above some threshold. Sometimes the cost of errors is asymmetric: over-estimating has a different cost than under-estimating by the same amount. Sometimes you’re more worried about the worst case than the average case. One size does not fit all.
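The loss functions described above are easy to write down. Here’s a small sketch (my own illustration, not from the post):

```python
import numpy as np

def squared_loss(err):
    # Least squares: cost proportional to the square of the error.
    return err ** 2

def absolute_loss(err):
    # Cost simply proportional to the size of the error.
    return np.abs(err)

def threshold_loss(err, cutoff=1.0):
    # Past some threshold, it doesn't matter how large the error is.
    return np.minimum(np.abs(err), cutoff)

def asymmetric_loss(err, over=2.0, under=1.0):
    # Over-estimating costs more than under-estimating by the same amount.
    return np.where(err > 0, over * err, -under * err)

errors = np.array([-2.0, -0.5, 0.5, 2.0])
for loss in (squared_loss, absolute_loss, threshold_loss, asymmetric_loss):
    print(f"{loss.__name__:16s} {loss(errors)}")
```

A standard fact worth keeping in mind here: minimizing expected squared loss leads to the mean, while minimizing expected absolute loss leads to the median, which is one reason absolute loss is more robust to outliers.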

[1] Stephen M. Stigler. The Changing History of Robustness. The American Statistician, Vol. 64, No. 4, November 2010. (Written before Verizon announced it would be supporting the iPhone.)


Python-based data/science environment from Microsoft

See Microsoft Research’s announcement of the Sho project.

Sho is an interactive environment for data analysis and scientific computing that lets you seamlessly connect scripts (in IronPython) with compiled code (in .NET) to enable fast and flexible prototyping. The environment includes powerful and efficient libraries for linear algebra as well as data visualization that can be used from any .NET language, as well as a feature-rich interactive shell for rapid development.

Maybe this is why Microsoft contracted Enthought this summer to port NumPy and SciPy to .NET.

Coming full circle

Experts often end up where they started as beginners.

If you’ve never seen the word valet, you might pronounce it like VAL-it. If you realize the word has a French origin, you would pronounce it val-A. But the preferred pronunciation is actually VAL-it.

Beginning musicians play by ear, to the extent that they can play at all. Then they learn to read music. Eventually, maybe years later, they realize that music really is about what you hear and not what you see.

Beginning computer science students think that computer science is all about programming. Then they learn that computer science is actually about computation in the abstract and not about something so vulgar as a computer. But eventually they come back down to earth and realize that 99.44% of computer science is ultimately motivated by the desire to get computers to do things.

In a beginning physics class, an instructor will ask students to assume a pulley has no mass and most students will simply comply. A few brighter students may snicker, knowing that pulleys really do have mass and that some day they’ll be able to handle problems with realistic pulleys. In a more advanced class, it’s the weaker students who snicker at massless pulleys. The better students understand a reference to a massless pulley to mean that in the current problem, the rotational inertia of the pulley can safely be ignored, simplifying the calculations without significantly changing the result. Similar remarks hold for frictionless planes and infinite capacitors as idealizations. Novices accept them uncritically, sophomores sneer at them, and experts understand their uses and limitations. (Two more physics examples.)

Here’s an example from math. Freshmen can look at a Dirac delta function δ(x) without blinking. They accept the explanation that it’s infinite at the origin, zero everywhere else, and integrates to 1. Then when they become more sophisticated, they realize this explanation is nonsense: no function can behave that way. But if they keep going, they’ll learn the theory of distributions that makes sense of things like δ(x). They’ll realize that the freshman explanation, while incomplete, is often a reasonable intuitive guide to how δ(x) behaves. They’ll also know when such intuition leads them astray.
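For the curious, here is the rigorous version from the theory of distributions: δ is not a function at all but a linear functional on smooth test functions φ, defined by exactly what the freshman “sifting” picture says it should do:

```latex
\langle \delta, \varphi \rangle = \varphi(0),
\qquad \text{written suggestively as} \qquad
\int_{-\infty}^{\infty} \delta(x)\,\varphi(x)\,dx = \varphi(0).
```

The integral on the right is purely notational, but it correctly predicts how δ(x) behaves under the operations freshmen actually perform on it.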

In each of these examples, the experts don’t exactly return to the beginning. They come to appreciate their initial ideas in a more nuanced way.

“When we travel, we travel not to see new places with new eyes; but that when we come home we see home with new eyes.” — G. K. Chesterton


More theoretical power, less real power

Suppose you’re deciding between two statistical methods. You pick the one that has more power. This increases your chances of making a correct decision in theory while possibly lowering your chances of actually concluding the truth. The subtle trap is that the meaning of “in theory” changes because you have two competing theories.

When you compare the power of two methods, you’re evaluating each method’s probability of success under its own assumptions. In other words, you’re picking the method that has the better opinion of itself. Thus the more powerful method is not necessarily the method that has the better chance of leading you to a correct conclusion.

Comparing power alone is not enough. You also need to evaluate whether a method makes realistic assumptions and whether it is robust to deviations from its assumptions.
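Here’s a toy illustration of the idea (my own sketch, not from the post), using efficiency of estimators as a stand-in for power. The sample mean beats the sample median when the data really are normal, but under a heavy-tailed distribution the mean’s own assumptions fail and the median wins:

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 100, 2000

def spread(estimator, sampler):
    """Monte Carlo estimate of an estimator's variance over samples of size n."""
    return np.var([estimator(sampler(n)) for _ in range(reps)])

normal_data = lambda size: rng.standard_normal(size)
heavy_data  = lambda size: rng.standard_t(df=2, size=size)  # heavy tails

# Under normality, the mean is the more efficient ("powerful") estimator...
print("normal:", spread(np.mean, normal_data), spread(np.median, normal_data))
# ...but under heavy tails the median is far more reliable.
print("heavy: ", spread(np.mean, heavy_data), spread(np.median, heavy_data))
```

Each method looks best when judged under its own assumptions; the question is what happens when those assumptions fail.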


Pseudo-commons and anti-commons

Here are a couple variations on the tragedy of the commons, the idea that shared resources can be exhausted by people acting in their individual best interests.

The first is a recent podcast by Thomas Gideon discussing the possibility of a tragedy of the pseudo-commons. His idea of a pseudo-commons is a creative commons with some barriers. He gives the example of open core companies.

The other is Michael Heller’s idea of the tragedy of the anti-commons. If too many people own a resource, the difficulties in coordination may keep the resource from being used effectively. Having too many owners can create problems similar to those caused by having no owners.

If you’re looking for something to blog about, it would be interesting to compare the pseudo-commons and the anti-commons in depth.


Efficiency of regular expressions

I’ve never optimized a regular expression.  I typically use regular expressions in scripts where efficiency doesn’t matter. And sometimes I do some regular expression processing as part of a larger program in which the bottleneck is somewhere else. Either way, regular expression efficiency has never been a concern for me.

Regular expression efficiency can matter. There are some regular expressions that can be astonishingly slow to match with some regular expression implementations. Russ Cox gives an example of a regular expression that takes Perl a minute to match against a string that’s only 29 characters long. Another regular expression implementation does the same match six orders of magnitude faster.
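The pathological patterns in Cox’s article combine nested quantifiers with a match that is doomed to fail, so a backtracking engine tries exponentially many ways to carve up the input. Here’s a small demonstration in the same spirit (my own example, using Python’s backtracking re module rather than Perl):

```python
import re
import time

# Nested quantifiers: when the match fails, a backtracking engine
# tries every way of splitting the run of a's between the two +'s.
pattern = re.compile(r'(a+)+$')

for n in (10, 14, 18):
    text = 'a' * n + 'b'   # the trailing 'b' guarantees failure
    start = time.perf_counter()
    assert pattern.match(text) is None
    elapsed = time.perf_counter() - start
    print(f"n = {n}: {elapsed:.4f} s")
```

Each additional ‘a’ roughly doubles the work, so pushing n into the high twenties turns milliseconds into minutes.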

The example is contrived to show the difference between the approaches. I’m not sure I’ve ever written a horribly inefficient regular expression by accident. Maybe once. I imagine people do run into efficiency issues in practical applications, though I haven’t.

I’d suggest worrying about regex efficiency if and when it becomes a problem. If your experience matches mine, that need may never come.

But if you do have a regex efficiency problem, or simply find regex implementation details interesting, I’d recommend Russ Cox’s article.

Related links:


For daily tips on regular expressions, follow @RegexTip on Twitter.


Hanlon’s razor and corporations

Hanlon’s razor says

Never attribute to malice that which is adequately explained by stupidity.

At first it seems just an amusing little aphorism, something you might read on a bumper sticker, but I believe it’s profound. It’s a guide to understanding so much of the world. Here I’ll focus on what it says about corporations.

I hear a lot of complaints that corporations are evil. Sometimes corporations in general, but more often specific corporations like Apple, Google, or Microsoft. I don’t deny that large, powerful corporations have the potential to do harm. But many accusations of malice are mis-attributed frustrations with stupidity. As Grey’s law says, any sufficiently advanced incompetence is indistinguishable from malice.

Corporations aren’t evil; they’re stupid. Not stupid in general, but in a specific way: they don’t handle edge cases well.

Organizations scale by creating procedures to replace human judgment. This is mostly a good thing. For example, electronic devices are affordable in part because companies can hire unskilled teenagers rather than electrical engineers to sell them. But if you have a question or problem that’s off the beaten path, you’re out of luck. Many complaints about evil corporations come from outliers, the 1% that corporations strategically decide to ignore. It’s not that the concerns of the outliers are illegitimate; it’s that they are not profitable to satisfy. When people say that a corporation is evil, often what they really mean is that they fall outside the company’s market.

Large organizations have similar problems internally. Policies written to handle the most common situations don’t handle edge cases well. For example, an HR department told me that my baby girl couldn’t be added to my insurance because she wasn’t born in a hospital. Fortunately I was able to argue with enough people to resolve the problem despite her falling outside the usual procedures. It’s harder to deal with corporate rigidity as an employee than as a customer because it’s harder to change jobs than to change brands.


Daily tips update

RegexTip, a Twitter account for learning regular expressions, starts over today with basics and will progress to more advanced properties over time.

SansMouse, an account for Windows keyboard shortcuts, started over with basics two weeks ago.

Both RegexTip and SansMouse are in a loop, progressing from most basic to more advanced features. (Or perhaps I should say progressing from most familiar to less familiar. Calling some features “basic” and others “advanced” isn’t quite right, especially for keyboard shortcuts.)

The other daily tip accounts don’t post in any particular sequence. I try to alternate elementary and advanced content to some extent, but other than that there’s no order.

Six weeks ago I started two new accounts: CompSciFact and StatFact. In a few days CompSciFact will be the most popular of the daily tip accounts if the current trend continues.

Here are all the accounts: SansMouse, RegexTip, TeXtip, ProbFact, StatFact, AlgebraFact, TopologyFact, AnalysisFact, and CompSciFact.

I use Hoot Suite to schedule these accounts. I use the paid version because I have too many accounts for the free version and because the paid version has an API that lets me upload files to schedule tips in bulk. (Hoot Suite has an affiliate program, so I make a little money if you sign up through this link.)

If you have suggestions for tweets, please contact me.

Scientific results fading over time

A recent article in The New Yorker gives numerous examples of scientific results fading over time. Effects that were large when first measured become smaller in subsequent studies. Firmly established facts become doubtful. It’s as if scientific laws are being gradually repealed. This phenomenon is known as “the decline effect.” The full title of the article is The decline effect and the scientific method.

The article brings together many topics that have been discussed here: regression to the mean, publication bias, scientific fashion, etc. Here’s a little sample.

“… when I submitted these null results I had difficulty getting them published. The journals only wanted confirming data. It was too exciting an idea to disprove, at least back then.” … After a new paradigm is proposed, the peer-review process is tilted toward positive results. But then, after a few years, the academic incentives shift—the paradigm has become entrenched—so that the most notable results are now those that disprove the theory.

This excerpt happens to be talking about “fluctuating asymmetry,” the idea that animals prefer more symmetric mates because symmetry is a proxy for good genes. (I edited references to fluctuating asymmetry out of the quote to emphasize that the remarks could apply equally well to any number of topics.) Fluctuating asymmetry was initially confirmed by numerous studies, but then the tide shifted and more studies failed to find the effect.

When such a shift happens, it would be reassuring to believe that the initial studies were simply wrong and that the new studies are right. But both the positive and negative results confirmed the prevailing view at the time they were published. There’s no reason to believe the latter studies are necessarily more reliable.
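Publication bias alone is enough to produce a decline effect. Here’s a toy simulation (mine, not from the article): studies of a modest true effect get “published” only when they reach statistical significance, so the published record systematically overstates the effect, and later faithful replications make the effect look like it is shrinking:

```python
import numpy as np

rng = np.random.default_rng(0)

true_effect = 0.2      # true mean effect, in standard deviation units
n = 50                 # subjects per study
n_studies = 1000

published = []
for _ in range(n_studies):
    sample = rng.normal(true_effect, 1.0, size=n)
    effect = sample.mean()
    z = effect * np.sqrt(n)      # z statistic for H0: no effect
    if z > 1.645:                # one-sided p < 0.05: gets published
        published.append(effect)

print(f"true effect:                   {true_effect}")
print(f"mean effect among published:   {np.mean(published):.2f}")
print(f"fraction of studies published: {len(published) / n_studies:.0%}")
```

The published average noticeably exceeds the true effect even though every individual study is honest; the filter does all the work.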
