**Related post**: Take chances, make mistakes, and get messy

Suppose you have a space ship that could accelerate at 1 g for as long as you like. Inside the ship you would feel the same gravity as on earth. You could travel wherever you like by accelerating at 1 g for the first half of the flight then reversing acceleration for the second half of the flight. This approach could take you to Mars in three days.

If you could accelerate at 1 g for a year you could reach the speed of light, and travel half a light year. So you could reverse your acceleration and reach a destination a light year away in two years. But this ignores relativity. Once you’re traveling at near the speed of light, time practically stops for you, so you could keep going as far as you like without taking any more time from your perspective. So you could travel **anywhere** in the universe in two years!

Of course there are a few problems. We have no way to sustain such acceleration. Or to build a ship that could sustain an impact with a spec of dust when traveling at relativistic speed. And the calculation ignores relativity until it throws it in at the end. Still, it’s fun to think about.

**Update**: Dan Piponi gives a calculation on G+ that addresses the last of the problems I mentioned above, sticking relativity on to the end of a classical calculation. He does a proper relativistic calculation from the beginning.

]]>If you take the radius of the observable universe to be 45 billion light years, then I think you need about 12.5 g to get anywhere in it in 2 years. (Both those quantities as measured in the frame of reference of the traveler.)

If you travel at constant acceleration a for time t then the distance covered is c^2/a (cosh(a t/c) – 1) (Note that gives the usual a t^2/2 for small t.)

Which side is correct depends on what’s out there waiting to be discovered, which of course we don’t know. We can only guess. Timid research is rational if you believe there are only marginal improvements that are likely to be discovered.

Sample size increases quickly as the size of the effect you’re trying to find decreases. To establish small differences in effect, you need very large trials.

If you think there are only small improvements on the status quo available to explore, you’ll explore each of the possibilities very carefully. On the other hand, if you think there’s a miracle drug in the pipeline waiting to be discovered, you’ll be willing to risk falsely rejecting small improvements along the way in order to get to the big improvement.

Suppose there are 500 drugs waiting to be tested. All of these are only 10% effective except for one that is 100% effective. You could quickly find the winner by giving each candidate to one patient. For every drug whose patient responded, repeat the process until only one drug is left. One strike and you’re out. You’re likely to find the winner in three rounds, treating fewer than 600 patients. But if all the drugs are 10% effective except one that’s 11% effective, you’d need hundreds of trials with thousands of patients each.

The best research strategy depends on what you believe is out there to be found. People who know nothing about cancer often believe we could find a cure soon if we just spend a little more money on research. Experts are more sanguine, except when they’re asking for money.

]]>However, a more fundamental point has been lost. At the core of Ioannidis’ paper is the assertion that **the proportion of true hypotheses under investigation matters**. In terms of Bayes’ theorem, the *posterior* probability of a result being correct depends on the *prior* probability of the result being correct. This prior probability is vitally important, and it varies from field to field.

In a field where it is hard to come up with good hypotheses to investigate, most researchers will be testing false hypotheses, and most of their positive results will be coincidences. In another field where people have a good idea what ought to be true before doing an experiment, most researchers will be testing true hypotheses and most positive results will be correct.

For example, it’s very difficult to come up with a better cancer treatment. Drugs that kill cancer in a petri dish or in animal models usually don’t work in humans. One reason is that these drugs may cause too much collateral damage to healthy tissue. Another reason is that treating human tumors is more complex than treating artificially induced tumors in lab animals. Of all cancer treatments that appear to be an improvement in early trials, very few end up receiving regulatory approval and changing clinical practice.

A greater proportion of physics hypotheses are correct because physics has powerful theories to guide the selection of experiments. Experimental physics often succeeds because it has good support from theoretical physics. Cancer research is more empirical because there is little reliable predictive theory. This means that a published result in physics is more likely to be true than a published result in oncology.

Whether “most” published results are false depends on context. The proportion of false results varies across fields. It is high in some areas and low in others.

]]>I’m not sure whether I agree with Brenner’s quote, but I find it interesting. You could argue that techniques are most important because they have the most leverage. A new technique may lead to many new discoveries and new ideas.

]]>

]]>“Oh, the intellectual freedom of academia” he thought while filling out a time sheet which checks that he does not work on non-grant science.

When Coleridge, the most famous poet of the day, wrote his tract on scientific method in 1817 it was not considered an oddity; by 1833, the time of the third meeting of the British Association for the Advancement of Science, it was already remarkable, and in the years that followed it was almost inconceivable.

**Related post**: How the term “scientist” came to be

To me, the subject of “information theory” is badly named. That discipline is devoted to finding ideal compression schemes for messages to be sent quickly and accurately across a noisy channel. It deliberately does not pay any attention to what the messages mean. To my mind this should be called compression theory or redundancy theory. Information is inherently meaningful—that is its purpose—any theory that is unconcerned with the meaning is not really studying information per se. The people who decide on speed limits for roads and highways may care about human health, but a study limited to deciding ideal speed limits should not be called “human health theory”.

Despite what was said above, Information theory has been extremely important in a diverse array of fields, including computer science but also in neuroscience and physics. I’m not trying to denigrate the field; I am only frustrated with its name.

From David Spivak, footnotes 13 and 14 here.

]]>I was surprised by the articles on the bombing of Hiroshima and Nagasaki. New York Times reporter William Lawrence was allowed to go on the mission over Nagasaki. He was not on the plane that dropped the bomb, but was in one of the other B-29 Superfortresses that were part of the mission. Lawrence’s story was published September 9, 1945, exactly one month later. Lawrence was also allowed to tour the ruins of Hiroshima. His article on the experience was published September 5, 1945. I was surprised how candid these articles were and how quickly they were published. Apparently military secrecy evaporated rapidly once WWII was over.

Another thing that surprised me was that some stories were newsworthy more recently than I would have thought. I suppose I underestimated how long it took to work out the consequences of a major discovery. I think we’re also biased to think that whatever we learned as children must have been known for generations, even though the dust may have only settled shortly before we were born.

]]>When you see something that is technically sweet, you go ahead and do it and argue about what to do about it only after you’ve had your technical success. That is the way it was with the atomic bomb.

]]>

Like all the books in the series, The Drug Book is a collection of alternating one-page articles and full page color photographs, arranged chronologically. These books make great coffee table books because they’re colorful and easy to dip in and out of. The other books in the series are The Space Book, The Physics Book, and The Medical Book.

The book’s definition of “drug” is a little broad. In addition to medicines, it also includes related chemicals such as recreational drugs and poisons. It also includes articles on drug-related reference works and legislation.

]]>

In other words, integers are not inputs of the theory, as Bohr thought. They are outputs. The integers are an example of what physicists call an emergent quantity. In this view, the term “quantum mechanics” is a misnomer. Deep down, the theory is not quantum. In systems such as the hydrogen atom, the processes described by the theory mold discreteness from underlying continuity. … The building blocks of our theories are not particles but fields: continuous, fluid-like objects spread throughout space. … The objects we call fundamental particles are not fundamental. Instead they are ripples of continuous fields.

Source: The Unquantum Quantum, Scientific American, December 2012.

]]>]]>Pure mathematics and physics are becoming ever more closely connected, though their methods remain different. One may describe the situation by saying that the mathematician plays a game in which he himself invents the rules while the physicist plays a game in which the rules are provided by Nature, but as time goes on it becomes increasingly evident that the rules which the mathematician finds interesting are the same as those which Nature has chosen.

Here’s something I learned while skimming through the book: Asteroids can have moons. (That’s the title of the article on page 414.) This has been known since the early 1990’s, but it’s news to me.

The first example discovered was a satellite now named Dactyl orbiting the asteroid 243 Ida. The Space Book says Dactyl was discovered in 1992. Wikipedia says Dactyl was photographed by the Galileo spacecraft in 1993 and discovered by examining the photos in February of 1994. Since that time, “more than 220 minor planet moons have been found.”

]]>]]>… applied science, purposeful and determined, and pure science, playful and freely curious, continuously support and stimulate each other. The great nation of the future will be the one which protects the freedom of pure science as much as it encourages applied science.

If universities simply paid their faculty a salary rather than giving them a hunting license for grants, the faculty could spend 80% of their time on research rather than 40%. Of course the numbers wouldn’t actually work out so simply. But it is safe to say that if you remove something that takes 40% of their time, researchers could spend more time doing research. (Researchers working in the private sector are often paid by grants too, so to some extent this applies to them as well.)

Universities depend on grant money to pay faculty. But if the money allocated for research were given to universities instead of individuals, universities could afford to pay their faculty.

Not only that, universities could reduce the enormous bureaucracies created to manage grants. This isn’t purely hypothetical. When Hillsdale College decided to refuse all federal grant money, they found that the loss wasn’t nearly as large as it seemed because so much of the grant money had been going to administering grants.

]]>In addition to presenting the advanced physics, which mathematicians find so easy, I also want to explore the workings of elementary physics, and mysterious maneuvers — which physicists seem to find so natural — by which one reduces a complicated physical problem to a simple mathematical question, which I have always found so hard to fathom.

That’s exactly how I feel about physics. I’m comfortable with differential equations and manifolds. It’s blocks and pulleys that kick my butt.

]]>The subtitle may be a little misleading. There is a fair amount of math in the book, but the ratio of history to math is pretty high. You might say the book is more about the role of mathematicians than the role of mathematics. As Roger Penrose says on the back cover, the book has “illuminating descriptions and minimal technicality.”

Someone interested in weather prediction but without a strong math background would enjoy reading the book, though someone who knows more math will recognize some familiar names and theorems and will better appreciate how they fit into the narrative.

**Related posts**:

All medicine is personalized. If you are in an emergency room with a broken leg and the person next to you is lapsing into a diabetic coma, the two of you will be treated differently.

The aim of personalized medicine is to increase the *degree* of personalization, not to introduce personalization. In particular, there is the popular notion that it will become routine to sequence your DNA any time you receive medical attention, and that this sequence data will enable treatment uniquely customized for you. All we have to do is collect a lot of data and let computers sift through it. There are numerous reasons why this is incredibly naive. Here are three to start with.

- Maybe the information relevant to treating your malady is in how DNA is expressed, not in the DNA per se, in which case a sequence of your genome would be useless. Or maybe the most important information is not genetic at all. The data may not contain the answer.

- Maybe the information a doctor needs is not in one gene but in the interaction of 50 genes or 100 genes. Unless a small number of genes are involved, there is no way to explore the combinations by brute force. For example, the number of ways to select 5 genes out of 20,000 is 26,653,335,666,500,004,000. The number of ways to select 32 genes is over a googol, and there isn’t a googol of anything in the universe. Moore’s law will not get us around this impasse.
- Most clinical trials use no biomarker information at all. It is exceptional to incorporate information from one biomarker. Investigating a handful of biomarkers in a single trial is statistically dubious. Blindly exploring tens of thousands of biomarkers is out of the question, at least with current approaches.

Genetic technology has the potential to incrementally increase the degree of personalization in medicine. But these discoveries will require new insight, and not simply more data and more computing power.

**Related posts**:

- Acute myeloid leukemia and myelodysplastic syndrome (AML and MDS)
- Chronic lymphocytic leukemia (CLL)
- Lung cancer
- Melanoma
- Prostate cancer
- Triple negative breast and ovarian cancer

These special areas of research are being called “moon shots” by analogy with John F. Kennedy’s challenge to put a man on the moon. This isn’t a new idea. In fact, a few months after the first moon landing, there was a full-page ad in the Washington Post that began “Mr. Nixon: You can cure cancer.” The thinking was the familiar refrain “If we can put a man on the moon, we can …” President Nixon and other politicians were excited about the idea and announced a “war on cancer.” Scientists, however, were more skeptical. Sol Spiegelman said at the time

An all-out effort at this time would be like trying to land a man on the moon without knowing Newton’s laws of gravity.

The new moon shots are not a national attempt to “cure cancer” in the abstract. They are six initiatives at one institution to focus research on specific kinds of cancer. And while we do not yet know the analog of Newton’s laws for cancer, we do know far more about the basic biology of cancer than we did in the 1970’s.

There are results that suggest that there is some unity beyond the diversity of cancer, that ultimately there are a few common biological pathways involved in all cancers. Maybe some day we will be able to treat cancer in general, but for now it looks like the road forward is specialization. Perhaps specialized research programs will uncover some of these common patters in all cancer.

**Related links**:

JDC: If some scientists were more candid, they’d say “I don’t care whether my results aretrue, I care whether they’republishable. So I need myp-value less than 0.05. Make as strong assumptions as you have to.”

JMW: My sense of statistical education in the sciences is basically Upton Sinclair’s view of the Gilded Age: “It is difficult to get a man to understand something when his salary depends upon his not understanding it.”

Perhaps I should have said that scientists *know* that their conclusions are true. They just need the statistics to confirm what they know.

Brian Nosek talks about this theme on the EconTalk podcast. He discusses the conflict of interest between creating publishable results and trying to find out what is actually true. However, he doesn’t just grouse about the problem; he offers specific suggestions for how to improve scientific publishing.

**Related post**: More theoretical power, less real power

To verify this figure, we’ll do a **very rough** calculation. Accelerating at 1 g for time t covers a distance is g *t*^{2}/2. Let *d* be the distance to Mars in meters, *T* the total of the trip in seconds, and g = 9.8 m/s^{2}. In half the trip you cover half the distance, so 9.8 (*T*/2)^{2}/2 = *d*/2. So *T* = 0.64 √*d*.

The hard part is picking a value for *d*. To keep things simple, assume you head straight to Mars, or rather straight toward where Mars will be by the time you get there. (In practice, you’d take more of a curved path.) Next, what do you want to use as your straight-line distance? The distance between Earth and Mars varies between about 55 million km and 400 million km. That gives you a time *T* between 1.7 and 4.7 days.

We don’t have the technology to accelerate for a day at 1 g. As Richard Campbell points out, spacecraft typically accelerate for maybe 20 minutes and coast for most of their journey. They may also pick up speed by slinging around a planet, but there are no planets between here and Mars.

]]>Nahin’s latest book is The Logician and the Engineer: How George Boole and Claude Shannon Created the Information Age. The title may be a little misleading. The book includes brief biographies of Boole and Shannon, but it is more about the *ideas* of Boole and Shannon (and others) than about the lives of these men. It discusses logic and information theory, and contains a fair amount of history, but it is not a rigorous historical account. Nahin uses Boole and Shannon a device for writing his book, something like the way Douglas Hofstadter uses Gödel, Escher, and Bach in Gödel, Escher, Bach.

The Logician and the Engineer dives into logic and probability from the perspective of an electrical engineer. The book moves seamlessly between abstract mathematics and electronic circuits. You don’t need to know much about electronics before reading the book, but you will see how logic concepts correspond directly to hardware. This is the heart of the book, and it is well done.

The last chapter of the book quickly discusses thermodynamics, and quantum computing. You could say The Logician and the Engineer is a book about basic electrical engineering, sandwiched between a historical introduction and a view of the future.

**Other posts** about Nahin’s books:

Einstein on radio

Richard Feynman and Captain Picard try to prove Fermat’s Last Theorem

“Denier” is an ugly word. It implies that someone has no rational basis for his beliefs. He’s either an apologist for evil, as in a Holocaust denier, or mentally disturbed, as in someone in psychological denial. The term “denier” is inflammatory and has no place in scientific discussion.

]]>In point of fact, their growth is strictly allotted; at the appropriate day and hour they approach in greater volume or less according as they are attracted by the lunar orb, at whose sway the ocean wells up.

Seneca doesn’t just mention an association between lunar and tidal cycles, but he says tides are *attracted* by the moon. That sounds awfully Newtonian for someone writing 16 centuries before Newton. The ancients may have understood that gravity wasn’t limited to the pull of the earth, that at least the moon also had a gravitational pull. That’s news to me.

The other two are about how complex systems break.

]]>Offhand I can only think of a couple things on which there seems to be near unanimous agreement: smoking is bad for you, and moderate exercise is good for you.

Here are a couple suggestions for evaluating health studies.

Be suspicious of linear extrapolation. It does not follow that because moderate exercise is good for you, extreme exercise is extremely good for you. Nor does it follow that because extreme alcohol consumption is harmful, moderate alcohol consumption is moderately harmful.

Start from a default assumption that something natural or traditional is probably OK. This should not be dogmatic, only a starting point. In statistical terms, it’s a prior distribution informed by historical experience. The more a claim is at odds with nature and tradition, the more evidence it requires. If someone says fresh fruit is bad for you, for example, they need to present more evidence than someone who says an newly synthesized chemical compound is harmful. Extraordinary claims require extraordinary evidence.

**Related post**:

The Hubble classification of elliptical galaxies uses a scale E0 through E7 where the number following ‘E’ is 10(1 – *b*/*a*) where *a* is the major semi-axis and *b* is the minor semi-axis. An E0 galaxy is essentially spherical. The most common classification is near E3. The limit is believed to be around E7.

The image above is a photo of Messier 49, an E4 galaxy, taken by the Hubble telescope.

For an E3 galaxy, the minor and major axes are around 7 and 10 in some unit. The average of these is 8.5. A sphere with the same volume would have radius 8.88 and a sphere with the same surface area would have radius 8.98, about 5.7% larger than the average of the axes.

For an E7 galaxy, the minor and major axes would have a ratio of 3 to 10. This gives an average of 6.5. Matching volumes gives a radius of 6.69 and matching surface area gives a ratio of 7.67, about 18% larger than the average of the axes.

]]>This post will discuss the more general problem of finding the radius when approximating any ellipsoid by a sphere. We will give the answer for Earth in particular, and we’ll show how to carry out the calculations. Most of the calculations are easy, but some involve elliptic integrals and we show how to compute these in Python.

First of all, what is an ellipsoid? It is a surface whose (*x*, *y*, *z*) coordinates satisfy

Earth is an oblate spheroid, which means *a* = *b* > *c*. Specifically, *a* = *b* = 6,378,137 meters, and *c* = 6,356,752 meters.

If you wanted to approximate an ellipsoid by a sphere, you could use

*r* = (*a* + *b* + *c*)/3.

Why? Because the knee-jerk reaction whenever you need to reduce a set of numbers to one number is to average them.

We could do a little better, depending on what property of the ellipsoid we’d like to preserve in our approximation. For example, we might want to create a sphere with the **same volume** as the ellipsoid. In that case we’d use the geometric mean

*r* = (*abc*)^{1/3}.

This is because the volume of an ellipsoid is 4π*abc*/3 and the volume of a sphere is 4π*r*^{3}/3.

For the particular case of the earth, we’d use

(*a*^{2}*c*)^{1/3} = 6371000.7

For some applications we might want a sphere with the **same surface area** as the ellipsoid rather than the same volume.

The surface area of an ellipsoid is considerably more complicated than the volume. For the special case of an oblate spheroid, like earth, the area is given by

where

The surface area of a sphere is 4 π*r*^{2} and so the following code computes *r*.

from math import sqrt, atanh e = sqrt(1 - (c/a)**2) r = a*sqrt(1 + (1-e**2)*atanh(e)/e) / sqrt(2)

This gives *r* = 6371007.1 for the earth, about 6.4 meters more than the number we got matching volume rather than area.

For a general ellipsoid, the surface area is given by

where

and

Here *F* is the “incomplete elliptic integral of the first kind” and *E* is the “incomplete elliptic integral of the second kind.” The names are historical artifacts, but the “elliptic” part of name comes from the fact that these functions were discovered in the context of arc lengths with ellipses, so it shouldn’t be too surprising to see them here.

In SciPy, *F*(φ, *k*) is given by `ellipkinc`

and *E*(φ, *k*) is given by `ellipeinc`

. Both function names start with `ellip`

because they are elliptic functions, and end in `inc`

because they are “incomplete.” In the middle, `ellipeinc`

has an “e” because it computes the mathematical function *E*(φ, *k*).

But why does `ellipkinc`

have a “k” in the middle? The “complete” elliptic integral of the first kind is *K*(*k*) = F(π/2, *k*). The “k” in the function name is a reminder that we’re computing the incomplete version of the *K* function.

Here’s the code for computing the surface area of a general ellipsoid:

from math import sin, cos, acos, sqrt, pi from scipy.special import ellipkinc, ellipeinc def area(a, b, c): phi = acos(c/a) k = a*sqrt(b**2 - c**2)/(b*sqrt(a**2 - c**2)) E = ellipeinc(phi, k) F = ellipkinc(phi, k) elliptic = E*sin(phi)**2 + F*cos(phi)**2 return 2.0*pi*c**2 + 2*pi*a*b*elliptic/sin(phi)

The differences between the various approximation radii are small for Earth. See my next post on elliptical galaxies where the differences are much larger.

**Related posts**: