From the category archives:

Science

Does gaining weight make you taller?

by John on March 12, 2010

In his autobiography, The Pleasures of Statistics, Frederick Mosteller gives an amusing example of why observational studies are no substitute for doing experiments.

We are all familiar with the idea that we can estimate height in male adults from their weight. … But not one of us believes that adding 20 pounds by eating and minimizing exercise will add an inch to our height.

The problem is not simply that the direction of causality is backward; it’s that we cannot use a static description to predict what will happen if we change something.

Although regression situations may give one the illusion of finding out what would happen if we changed something, in the absence of an experiment they merely offer guesses.

He summarizes his point by quoting George Box:

To find out what happens to a system when you interfere with it, you have to interfere with it (and not just passively observe it).

Remember this next time you hear claims such as every dollar spent on X saves so many dollars spent on Y. Or every minute spent exercising increases your life expectancy by so many minutes. Or every time you do some activity you increase or decrease your risk of cancer by so much. First of all, these kinds of statements are linear extrapolations on situations that are not linear. Second, they may be observations that do not describe what will happen when you change something. They may be no more true than the idea that gaining weight makes you taller.

Here’s an example of how observation and intervention differ. Lottery winners often go bankrupt within a couple years of receiving their prize. If you suddenly make someone a millionaire, they’re not a typical millionaire.

Related posts:

Numerator-only data
Randomized trials of parachute use

{ 3 comments }

A childhood question about heat

by John on March 10, 2010

When I was a little kid, I asked some adults the following question.

If hot things cool, and cool things warm up, could something hot cool down and warm back up?

The people I asked didn’t understand my question and just laughed. I have no idea how old I was, but I wasn’t old enough to articulate what I was thinking.

Here’s what I had in mind. I knew that hot things, like a cup of coffee, grow cold. And I knew that cold things, say a glass of milk, get warm. Well, could the coffee get so cold that it becomes a cold thing and starts to warm back up?

Could the coffee become as cold as the glass of milk? Common sense suggests that can’t happen. When we say coffee grows cold, we mean that it becomes relatively colder, closer to room temperature. And when we say the milk is getting warm, we also mean it is getting closer to room temperature. We’ve never left a hot cup of coffee on a table and come back later to find that it has cooled off so much that it is colder than room temperature. But could there be small fluctuations?

As the coffee and milk head toward room temperature, could they overshoot the target, just by a little bit? Say room temperature is 70 °F, the coffee starts out at 150 °F, and the milk starts out at 40 °F. We don’t expect the coffee to cool down to 40 °F or the milk to warm up to 150 °F. But could the coffee cool down to 69.5 °F and then go back up to 70 °F? Could the milk warm up to 70.5 °F and then cool back down to 70 °F?

I didn’t get a satisfactory answer to my childhood question until I was in college. Then I found out about Newton’s law of cooling. It says that the rate at which a warm body cools is proportional to the difference between its current temperature and the ambient temperature. This law can be written as a differential equation whose solution shows that the temperature of a warm body decreases exponentially to the ambient temperature. The temperature curve always slopes downward. It doesn’t wiggle even a little on its journey to room temperature. Cold bodies warm up the opposite way, exponentially approaching room temperature but never exceeding it.
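In symbols, Newton's law says dT/dt = -k (T - T_room), whose solution is T(t) = T_room + (T_0 - T_room) exp(-k t). Here is a minimal sketch that evaluates this solution for the coffee and milk temperatures used above. The cooling constant `k = 0.1` per minute is an assumed value for illustration, not a measured one, and real coffee and milk would have different constants.

```python
import math

def newton_temperature(t, start, ambient=70.0, k=0.1):
    """Closed-form solution of dT/dt = -k * (T - ambient), with T(0) = start."""
    return ambient + (start - ambient) * math.exp(-k * t)

times = range(0, 121, 10)   # minutes
coffee = [newton_temperature(t, 150.0) for t in times]
milk = [newton_temperature(t, 40.0) for t in times]

for t, c, m in zip(times, coffee, milk):
    print(f"{t:4d} min   coffee {c:7.2f} °F   milk {m:6.2f} °F")
```

The gap to room temperature shrinks by the same factor in every time step and never changes sign, so the coffee stays above 70 °F and the milk stays below it.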

In case this seems obvious, think about thermostats. They don’t work this way. Say the temperature in a room is 85 °F and you’d like it to be 72 °F, so you turn on the air conditioning. Will the temperature steadily lower to 72 °F? Not exactly. If you were to plot the temperature in the room over time and look at the graph from far enough away, it would look like it is steadily going down to the desired temperature. But if you look at the graph more closely, you’ll see wiggles. The AC may cool the room to a little below 72 °F, maybe to 70 °F. The AC would cut off and the temperature would rise to 72 °F. Unlike the cup of hot coffee, the AC will often overshoot its target, though not by too much. The temperature may feel constant, but it is not. It oscillates around the desired temperature.
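The thermostat behavior can be sketched with a simple on/off controller. This is a toy model, not a description of any real thermostat: the outside temperature, the heat leak rate, the AC cooling rate, and the time step are all assumed values. The AC switches off at 70 °F and back on at 72 °F, as in the example above.

```python
def simulate_thermostat(start=85.0, low=70.0, high=72.0, outside=95.0,
                        leak=0.1, cooling=6.0, dt=0.1, minutes=60.0):
    """Euler simulation of a room with an on/off air conditioner.

    The room warms toward the outside temperature at a rate proportional
    to the difference (leak), and the AC removes heat at a fixed rate
    (cooling, in degrees per minute) while it is on.
    """
    temp = start
    ac_on = start > high
    temps, switches = [temp], 0
    for _ in range(round(minutes / dt)):
        rate = leak * (outside - temp) - (cooling if ac_on else 0.0)
        temp += rate * dt
        if ac_on and temp <= low:          # cooled past the lower threshold
            ac_on = False
            switches += 1
        elif not ac_on and temp >= high:   # warmed back up to the target
            ac_on = True
            switches += 1
        temps.append(temp)
    return temps, switches

temps, switches = simulate_thermostat()
settled = temps[len(temps) // 2:]
print(f"AC switched {switches} times in an hour")
print(f"settled range: {min(settled):.2f} to {max(settled):.2f} °F")
```

Unlike the cooling coffee, the temperature here keeps reversing direction, because the controller reacts only after a threshold has been crossed.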

{ 9 comments }

Does lightning prefer metal or wood?

by John on March 5, 2010

The video below features a demonstration that lightning is as likely to strike wood as metal.

I want to focus on one line from the video. After showing simulated lightning strikes that hit a wooden rod five times and a copper rod five times, the narrator says

It’s five all, proof that metal does not attract lightning.

No, such an experiment would prove no such thing. I imagine the researchers conducted a much larger experiment and selected a representative sample. And I’m willing to accept their conclusion that metal does not attract lightning. But I would not accept such a conclusion from an experiment with 10 samples. What the experiment proves is that, under their experimental conditions, lightning will sometimes strike wood even when a metal rod is nearby.

I have two complementary criticisms of this made-for-video science.

  1. The results could easily happen if their conclusion were not true.
  2. The results could easily not have happened if their conclusion were true.

Suppose in reality, lightning will not always strike the metal rod, but will prefer the metal. Suppose in the long run, lightning will strike the metal rod 60% of the time. It would not be unusual in that case to do an experiment with 10 strikes and find that half or more of the strikes hit wood.
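To put a number on "not unusual," here is a short calculation that assumes the 10 strikes are independent and that each hits wood with probability 0.4. The `binom_pmf` helper is defined here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_wood = 0.4   # lightning prefers metal 60% of the time
half_or_more_wood = sum(binom_pmf(k, 10, p_wood) for k in range(5, 11))
print(f"P(5 or more of 10 strikes hit wood) = {half_or_more_wood:.3f}")
```

Under these assumptions the probability is about 0.37, so more than a third of such experiments would show wood doing at least as well as metal.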

Now suppose the researchers are exactly correct. In the long run, lightning has no preference for one rod or the other. What would viewers have thought if they showed a clip of 10 strikes, of which 6 hit metal and 4 hit wood? Many would have howled in protest. If lightning really had no preference for metal, the result should have been an even split, right? This is an example of the Law of Small Numbers. People underestimate the variability of small samples.

If the probability of lightning striking each rod is 50%, then in a sequence of experiments each containing 10 strikes, most will not have an exact 5-5 split. If you flip 10 fair coins, the most likely outcome is a 5-5 split, but this will happen only about 1/4 of the time. It’s more likely that you’ll get near a 5-5 split, sometimes with more heads and sometimes with more tails.
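The coin-flip figures can be checked directly by counting outcomes, assuming 10 independent fair flips:

```python
from math import comb

total = 2**10                      # equally likely sequences of 10 flips
exact_split = comb(10, 5) / total  # exactly 5 heads
near_split = sum(comb(10, k) for k in (4, 5, 6)) / total  # 4, 5, or 6 heads

print(f"P(exact 5-5 split)   = {exact_split:.3f}")
print(f"P(4-6, 5-5, or 6-4)  = {near_split:.3f}")
```

The exact split has probability 252/1024, a little under 1/4, while a split within one of even happens about two thirds of the time.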

The exact 5-5 split in the video is good showmanship, but it’s misleading science.

Related posts:

Law of small numbers
Example of the law of small numbers
Law of medium numbers

{ 2 comments }

The Law of Medium Numbers

by John on February 25, 2010

There’s a law of large numbers, a law of small numbers, and a law of medium numbers in between.

The law of large numbers is a mathematical theorem. It describes what happens as you average more and more random variables.

The law of small numbers is a semi-serious statement about how people underestimate the variability of the average of a small number of random variables.

The law of medium numbers is a term coined by Gerald Weinberg in his book An Introduction to General Systems Thinking. He states the law as follows.

For medium number systems, we can expect that large fluctuations, irregularities, and discrepancy with any theory will occur more or less regularly.

The law of medium numbers applies to systems too large to study exactly and too small to study statistically. For example, it may be easier to understand the behavior of an individual or a nation than the dynamics of a small community. Atoms are simple, and so are stars, but medium-sized things like birds are complicated. Medium-sized systems are where you see chaos.

Weinberg warns that medium-sized systems challenge science because scientific disciplines define their boundaries by the set of problems they can handle. He says, for example, that

Mechanics, then, is the study of those systems for which the approximations of mechanics work successfully.

He warns that we should not be misled by a discipline’s “success with systems of its own choosing.”

Weinberg’s book was written in 1975. Since that time there has been much more interest in the emergent properties of medium-sized systems that are not explained by more basic sciences. We may not understand these systems well, but we may appreciate the limits of our understanding better than we did a few decades ago.

Related posts:

Laws of large numbers and small numbers
Gerald Weinberg’s law of twins
Subnatural and supernatural

{ 4 comments }

The more active a research area is, the less reliable its results are.

John Ioannidis suggested popular areas of research publish a greater proportion of false results in his paper Why most published research findings are false. Of course popular areas produce more results, and so they will naturally produce more false results. But Ioannidis is saying that they also produce a greater proportion of false results.

Now Thomas Pfeiffer and Robert Hoffmann have produced empirical support for Ioannidis’s theory in the paper Large-Scale Assessment of the Effect of Popularity on the Reliability of Research. Pfeiffer and Hoffmann review two reasons why popular areas have more false results.

First, in highly competitive fields there might be stronger incentives to ‘‘manufacture’’ positive results by, for example, modifying data or statistical tests until formal statistical significance is obtained. This leads to inflated error rates for individual findings: actual error probabilities are larger than those given in the publications. … The second effect results from multiple independent testing of the same hypotheses by competing research groups. The more often a hypothesis is tested, the more likely a positive result is obtained and published even if the hypothesis is false.

In other words,

  1. In a popular area there’s more temptation to fiddle with the data or analysis until you get what you expect.
  2. The more people who test an idea, the more likely someone is going to find data in support of it by chance.
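The second point can be illustrated with a back-of-the-envelope calculation. It assumes the hypothesis is false, each group tests it independently, and each test has a 5% false positive rate. These are simplifying assumptions for illustration, not figures from the paper.

```python
def prob_some_false_positive(n_groups, alpha=0.05):
    """Chance that at least one of n independent tests of a false
    hypothesis comes out positive at significance level alpha."""
    return 1 - (1 - alpha) ** n_groups

for n in (1, 5, 10, 20, 100):
    print(f"{n:3d} groups: {prob_some_false_positive(n):.3f}")
```

With 10 groups the chance that someone gets a publishable positive result is already about 40%, and with 20 groups it is about 64%.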

The authors produce evidence of the two effects above in the context of papers written about protein interactions in yeast. They conclude that “The second effect is about 10 times larger than the first one.”

Related posts:

Why microarray conclusions are so often wrong
Using Photoshop on experimental results
Irreproducible analysis
Make up your own rules of probability

{ 3 comments }

Malaria on the prairie

by John on February 9, 2010

My family loves the Little House on the Prairie books. We read them aloud to our three oldest children and we’re in the process of reading them with our fourth child. We just read the chapter describing when the entire Ingalls family came down with malaria, or “fever ‘n’ ague” as they called it.

The family had settled near a creek that was infested with mosquitoes. All the settlers around the creek bottoms came down with malaria, though at the time (circa 1870) they did not know the disease was transmitted by mosquitoes. One of the settlers, Mrs. Scott, believed that malaria was caused by eating the watermelons that grew in the creek bottoms. She had empirical evidence: everyone who had eaten the melons contracted malaria. Charles Ingalls thought that was ridiculous. After he recovered from his attack of malaria, he went down to the creek and brought back a huge watermelon and ate it. His reasoning was that “Everybody knows that fever ‘n’ ague comes from breathing the night air.”

It’s easy to laugh at Mrs. Scott and Mr. Ingalls. What ignorant, superstitious people. But they were no more ignorant than their contemporaries, and both had good reasons for their beliefs. Mrs. Scott had observational data on her side. Ingalls was relying on the accepted wisdom of his day. (After all, “malaria” means “bad air.”)

People used to believe all kinds of things that are absurd now, particularly in regard to medicine. But they were also right about many things that are hard to enumerate now because we take them for granted. Stories of conventional wisdom being correct are not interesting, unless there was some challenge to that wisdom. The easiest examples of folk wisdom to recall may be the instances in which science initially contradicted folk wisdom but later confirmed it. For example, we have come back to believing that breast milk is best for babies and that a moderate amount of sunshine is good for you.

Related posts:

A little coffee on the prairie
Galen and clinical trials
Randomized trials of parachute use

{ 3 comments }

Breast cancer stem cells identified

by John on December 5, 2009

From the article Proverbial new “Twist” in Breast Cancer Detection:

… scientists at Johns Hopkins … have shown that a protein made by a gene called “Twist” may be the proverbial red flag that can accurately distinguish stem cells that drive aggressive, metastatic breast cancer from other breast cancer cells.

Related posts:

Detecting breast cancer from a hair sample
Visualizing cancer DNA scrambling
Killing too much of a tumor

{ 0 comments }

Subnatural and supernatural

by John on November 17, 2009

I recently ran across a discussion of quantum mechanics from C. S. Lewis.

The older scientists believed that the smallest particles of matter moved according to strict laws: in other words, that the movements of each particle were “interlocked” with the total system of Nature. Some modern scientists seem to think — if I understand them — that this is not so. They seem to think that the individual unit of matter … moves in an indeterminate or random fashion; moves, in fact, “on its own” or “of its own accord.”

He goes on to explain that the macroscopic behavior of matter appears deterministic because the average behavior of billions of particles is very regular. His explanation is remarkably cogent for a professor of medieval literature writing in the 1940’s. He then discusses the philosophical consequences of quantum mechanics.

Now it will be noticed that if this theory is true we have really admitted something other than Nature. If the movements of the individual units is “on their own,” … then those movements are not part of Nature. It would be, indeed, too great a shock to our habits to describe them as super-natural. I think we should call them sub-natural. But all our confidence that Nature has no doors, and no reality outside herself for doors to open on, would have disappeared. There is something outside her, the Subnatural. … And clearly if she thus has a back door opening on the Subnatural, it is quite on the cards that she may also have a front door opening on the Supernatural …

From Miracles by C. S. Lewis, chapter 3.

Related post:

The world looks more mathematical than it is

{ 3 comments }

Div, grad, and curl videos

by John on November 13, 2009

Open University videos on gradient, divergence, and curl

Grad:

Div:

Curl:

Related link:

Div, Grad, Curl and All That

{ 0 comments }

A third of dinosaur species never existed?

by John on October 11, 2009

According to this article from National Geographic News, some experts now believe the number of dinosaur species has been overestimated. Some specimens that were previously believed to be distinct species are now believed to be juvenile specimens of other species. (Hat tip to Eric Geiger.)

{ 2 comments }

Make up your own rules of probability

by John on September 18, 2009

Keith Baggerly and Kevin Coombes just wrote a paper about the analysis errors they commonly see in bioinformatics articles. From the abstract:

One theme that emerges is that the most common errors are simple (e.g. row or column offsets); conversely, it is our experience that the most simple errors are common.

The full title of the article by Keith Baggerly and Kevin Coombes is “Deriving chemosensitivity from cell lines: forensic bioinformatics and reproducible research in high-throughput biology.” The article will appear in the next issue of Annals of Applied Statistics and is available here. The key phrase in the title is forensic bioinformatics: reverse engineering statistical analysis of bioinformatics data. The authors give five case studies of data analyses that cannot be reproduced and infer what analysis actually was carried out.

One of the more egregious errors came from the creative application of probability. One paper uses innovative probability results such as

P(ABCD) = P(A) + P(B) + P(C) + P(D) – P(A) P(B) P(C) P(D)

and

P(AB) = max( P(A), P(B) ).

Baggerly and Coombes were remarkably understated in their criticism: “None of these rules are standard.” In less diplomatic language, the rules are wrong.
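A couple of small counterexamples show why these rules cannot hold in general. The events below are made up for illustration and have nothing to do with the data in the paper. The second check works whether P(AB) is read as the probability of both events or of either event.

```python
from fractions import Fraction

# First rule: four events, each with probability 1/2.
p = Fraction(1, 2)
rule_one = 4 * p - p**4
print(f"first rule gives {rule_one}, which exceeds 1: {rule_one > 1}")

# Second rule: one roll of a fair die.
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event (a set of faces) on a fair die."""
    return Fraction(len(event & outcomes), len(outcomes))

A, B = {1, 2}, {3, 4}
rule_two = max(prob(A), prob(B))
print(f"second rule gives {rule_two}")
print(f"P(A and B) = {prob(A & B)}, P(A or B) = {prob(A | B)}")
```

The first rule yields 31/16, which is not a probability at all. The second yields 1/3, while the true probability of both events is 0 and the true probability of either is 2/3.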

To be fair, Baggerly and Coombes point out

These rules are not explicitly stated in the methods; we inferred them either from formulae embedded in Excel files … or from exploratory data analysis …

So, the authors didn’t state false theorems; they just used them. And nobody would have noticed if Baggerly and Coombes had not tried to reproduce their results.

Related posts:

Irreproducible analysis
Highlights from Reproducible Ideas
Reproducible Ideas blog winding down

{ 6 comments }

Termites and programmers

by John on September 1, 2009

There are more termites in the world than there are elephants. Not only that, the total mass of the world’s elephants is roughly 1/1000 the total mass of the world’s termites. The big, visible animals, the ones that first come to mind, are a small fraction of the total.

Something similar is true of software projects: the big, visible projects, the ones people write about, are a small fraction of the total. Certainly there are more small projects in the world than large projects. And I imagine more programmers in total work on small projects than on large projects. I don’t have any hard numbers on this, and I doubt anyone else does. Most hard numbers come from large, visible projects! Who is going to do a census of all the little one-man projects that go unnoticed?

This post is a continuation of a comment I made as part of the discussion following my blog post on medieval software project management. My contention there was that most projects involve one developer, have no written requirements, and no external testing. That may not be correct, but I imagine it’s closer to the truth than assuming everyone works on projects with a dozen developers, formal requirements documents, and a staff of testers.

The first books on the “right” way to develop software codified the experience gained from working on enormous federally funded software projects. For example, the recommended practice was to spend a huge proportion of the total effort in up-front planning. While that made sense when coordinating the efforts of thousands of contractors in the days of punch cards, it doesn’t make as much sense now. The agile software development movement began when people realized that the world had changed and the “best practices” of a previous generation were not optimal for smaller projects and vastly superior hardware.

Agile software development has replaced the best practices of the 1960’s in many organizations. However, there is still a strong tendency to think that small projects should use the same tools and techniques as large, enterprise projects. Most books are written about medium to large projects and many developers worry unnecessarily about scaling up their projects. (“What if I get a million visitors an hour to my web site?” You should be so lucky. Worry about that after it becomes a remote possibility.) Few pundits give advice that scales down, that is, advice appropriate for small projects. I wrote about one exception in a previous post in which Rob Page suggests different methods for projects with a budget of less than $1M and projects with a larger budget.

Related posts:

Million dollar cutoff for software technique
Enterprising software
Medieval software project management

{ 0 comments }

Questioning the Hawthorne effect

by John on June 16, 2009

The Hawthorne effect is the idea that people perform better when they’re being studied. The name comes from studies conducted at Western Electric’s Hawthorne Works facility. Increased lighting improved productivity in the plant. Later, lowering the lighting also increased productivity. The Hawthorne effect says that the productivity increase wasn’t due to changes in lighting per se but to either the variety of changing something about the plant or the attention that workers got by being measured, a sort of placebo effect.

The Alternative Blog has a post this morning entitled Hawthorne effect debunked. The original Hawthorne effect was apparently due to a flaw in the study design; correcting for that flaw eliminates the effect.

The term “debunked” in the post title may imply too much. The effect in the original studies may have been debunked, but that does not necessarily mean there is no Hawthorne effect. Perhaps there are good examples of the Hawthorne effect elsewhere. On the other hand, I expect closer examination of the data could debunk other reported instances of the Hawthorne effect as well.

The Hawthorne effect makes sense. It has been ingrained in pop culture. I heard a reference to it on a podcast just this morning before reading the blog post mentioned above. Everyone knows it’s true. And maybe it is. But there is at least one example suggesting the effect is not as widespread as previously thought.

It would be interesting to track the popularity of the Hawthorne effect in scholarly literature and in pop culture. If the effect becomes less credible in scholarly circles, will it also become less credible in pop culture? And if so, how quickly will pop culture respond?

{ 4 comments }

Killing too much of a tumor

by John on May 30, 2009

The traditional approach to cancer treatment has been to try to eradicate tumors. Eliminating a tumor is better than shrinking a tumor, so this approach makes sense. But if you try to eradicate the tumor and fail, you may leave the patient worse off. If you kill 90% of a tumor with some treatment but leave 10%, the remaining 10% is resistant to that treatment. You may have made the tumor more deadly by removing the weaker portions that were suppressing its growth. This explains why cancer treatments sometimes appear to be quite successful, dramatically reducing the size of tumors, without improving survival.

Sometimes one treatment will shrink a tumor as much as possible as a prelude to another treatment, such as shrinking a tumor with chemotherapy prior to surgery. But if only one treatment is being used, the situation may be like the old saying that you don’t want to wound the king. If you’re going to try to kill the king, you’d better succeed.

In a recent interview on the Nature podcast, Robert Gatenby of Moffitt Cancer Center advocates an alternative approach, treating cancer as a chronic disease. Instead of killing as much of a tumor as possible, it may be better to kill as little of a tumor as necessary to keep it under control. Patients would continue to take anti-cancer treatments for the rest of their lives, just as patients with heart disease or diabetes take medication indefinitely.

Related post:
Repairing tumors

{ 6 comments }

Variations on a theme of Newton

by John on May 26, 2009

Isaac Newton famously said

If I have seen farther than others it is because I have stood on the shoulders of giants.

Later, mathematician R. W. Hamming added

Mathematicians stand on each other’s shoulders while computer scientists stand on each other’s toes.

Finally, computer scientist Hal Abelson quipped

If I have not seen farther, it is because giants were standing on my shoulders.

(Thanks to Mark Reid for the Hamming quote.)

{ 2 comments }