The study of the planet Mercury provides two examples of the bandwagon effect. In her new book Worlds Fantastic, Worlds Familiar, planetary astronomer Bonnie Buratti writes

The study of Mercury … illustrates one of the most confounding bugaboos of the scientific method: the bandwagon effect. Scientists are only human, and they impose their own prejudices and foregone conclusions on their experiments.

Around 1800, Johann Schroeter determined that Mercury had a rotational period of 24 hours. This view held for eight decades.

In the 1880s, Giovanni Schiaparelli determined that Mercury was tidally locked, making one rotation on its axis for every orbit around the sun. This view also held for eight decades.

In 1965, radar measurements of Mercury showed that Mercury completes 3 rotations in every 2 orbits around the sun.

Studying Mercury is difficult since it is only visible near the horizon and around sunrise and sunset, i.e. when the sun’s light interferes. And it is understandable that someone would confuse a 3:2 resonance with tidal locking. Still, for two periods of eight decades each, astronomers looked at Mercury and concluded what they expected.

The difficulty of seeing Mercury objectively was compounded by two incorrect but satisfying metaphors. First that Mercury was like Earth, rotating every 24 hours, then that Mercury was like the moon, orbiting the sun the same way the moon orbits Earth.

Buratti mentions the famous Millikan oil drop experiment as another example of the bandwagon effect.

… Millikan’s value for the electron’s charge was slightly in error—he had used a wrong value for the viscosity of air. But future experimenters all seemed to get Millikan’s number. Having done the experiment myself I can see that they just picked those values that agreed with previous results.

Buratti explains that Millikan’s experiment is hard to do and “it is impossible to successfully do it without abandoning most data.” This is what I like to call acceptance-rejection modeling.

Acceptance-rejection modeling: Throw out data that don’t fit with your model, and what’s left will.

— Data Science Fact (@DataSciFact) July 2, 2015

The name comes from the acceptance-rejection method of random number generation. For example, the obvious way to generate truncated normal random values is to generate (unrestricted) normal random values and simply throw out the ones that lie outside the interval we’d like to truncate to. This is inefficient if we’re truncating to a small interval, but it always works. We’re conforming our samples to a pre-determined distribution, which is OK when we do it intentionally. The problem comes when we do it unintentionally.
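Here is a minimal sketch of the intentional version, generating truncated normal samples by rejection. The function name `truncated_normal`, the interval, and the seed are illustrative choices, not anything from Buratti's book.

```python
import random

def truncated_normal(a, b, mu=0.0, sigma=1.0, rng=random):
    """Sample a normal(mu, sigma) value conditioned to lie in [a, b]
    by acceptance-rejection: draw, and redraw until the value fits."""
    if a >= b:
        raise ValueError("need a < b")
    while True:
        x = rng.gauss(mu, sigma)   # unrestricted normal draw
        if a <= x <= b:            # accept only values inside the interval
            return x
        # otherwise reject the draw and try again

rng = random.Random(20150702)      # arbitrary seed, only for repeatability
samples = [truncated_normal(-1.0, 2.0, rng=rng) for _ in range(1000)]
print(min(samples), max(samples))
```

The loop can run for a long time when the interval carries little probability, which is the inefficiency mentioned above.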

Photo of Mercury above via NASA

In his paper Mindless statistics, Gerd Gigerenzer uses a Freudian analogy to describe the mental conflict researchers experience over statistical hypothesis testing. He says that the “statistical ritual” of NHST (null hypothesis significance testing) “is a form of conflict resolution, like compulsive hand washing.”

In Gigerenzer’s analogy, the **id** represents Bayesian analysis. Deep down, a researcher wants to know the probabilities of hypotheses being true. This is something that Bayesian statistics makes possible, but more conventional frequentist statistics does not.

The **ego** represents R. A. Fisher’s significance testing: specify a null hypothesis only, not an alternative, and report a *p*-value. Significance is calculated after collecting the data. This makes it easy to publish papers. The researcher never clearly states his hypothesis, and yet takes credit for having established it after rejecting the null. This leads to feelings of guilt and shame.

The **superego** represents the Neyman-Pearson version of hypothesis testing: pre-specified alternative hypotheses, power and sample size calculations, etc. Neyman and Pearson insist that hypothesis testing is about what to *do*, not what to *believe*. [1]

* * *

I assume Gigerenzer doesn’t take this analogy too seriously. In context, it’s a humorous interlude in his polemic against rote statistical ritual.

But there really is a conflict in hypothesis testing. Researchers naturally think in Bayesian terms, and interpret frequentist results as if they were Bayesian. They really do want probabilities associated with hypotheses, and will imagine they have them even though frequentist theory explicitly forbids this. The rest of the analogy, comparing the ego and superego to Fisher and Neyman-Pearson respectively, seems weaker to me. But I suppose you could imagine Neyman and Pearson playing the role of your conscience, making you feel guilty about the pragmatic but unprincipled use of *p*-values.

* * *

[1] “No test based upon a theory of probability can by itself provide any valuable evidence of the truth or falsehood of a hypothesis. But we may look at the purpose of tests from another viewpoint. Without hoping to know whether each separate hypothesis is true or false, we may search for rules to govern behaviour in regard to them, in following which we insure that, in the long run of experience, we shall not often be wrong.”

Neyman J, Pearson E. On the problem of the most efficient tests of statistical hypotheses. *Philos Trans Roy Soc A*, 1933;231:289–337.

First imagine a thin wire running through the coil of the shell. In cylindrical coordinates, this wire follows the parameterization

*r* = *e*^{kθ}

*z* = *Tr*

If *T* = 0 this is a logarithmic spiral in the (*r*, θ) plane. For positive *T*, the spiral is stretched so that its vertical position is proportional to its radius.

Next we build a shell by putting a tube around this imaginary wire. The radius *R* of the tube at each point is proportional to the *r* coordinate: *R = Dr.*
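The construction can be sketched in a few lines of Python. This is not Hammer's software. The function name `shell_point` is made up for illustration, the height is taken proportional to the radius as described above, and the tube cross-section is assumed to be a circle in the vertical plane through the axis, which is a simplification.

```python
import math

def shell_point(theta, phi, k=0.1, T=2.791, D=0.8845):
    """Point on the tube surface: theta runs along the wire,
    phi runs around the tube cross-section."""
    r = math.exp(k * theta)   # wire radius, r = e^{k theta}
    z = T * r                 # wire height, proportional to the radius
    R = D * r                 # tube radius, R = D r
    # Simplifying assumption: the cross-section is a circle in the
    # vertical plane through the z-axis, not perpendicular to the wire.
    rho = r + R * math.cos(phi)
    return (rho * math.cos(theta),
            rho * math.sin(theta),
            z + R * math.sin(phi))

print(shell_point(0.0, 0.0))
```

Evaluating `shell_point` over a grid of `theta` and `phi` values gives a mesh that a 3D plotting library could render.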

The image above was created with *k* = 0.1, *T* = 2.791, and *D* = 0.8845 using Øyvind Hammer’s seashell generating software. You can download Hammer’s software for Windows and experiment with your own shell simulations by adjusting the parameters.

See also Hammer’s book and YouTube video:

“A bird may love a fish but where would they build a home together?” — Fiddler on the Roof


The above quote makes me think of a connection Fourier made between triangles and thermodynamics.

Trigonometric functions were first studied because they relate angles in a right triangle to ratios of the lengths of the triangle’s sides. For the most basic applications of trigonometry, it only makes sense to consider positive angles smaller than a right angle. Then somewhere along the way someone discovered that it’s convenient to define trig functions for any angle.

Once you define trig functions for any angle, you begin to think of these functions as being associated with *circles* rather than triangles. More advanced math books refer to trig functions as *circular* functions. The triangles fade into the background. They’re still there, but they’re drawn inside a circle. (Hyperbolic functions are associated with hyperbolas the same way circular functions are associated with circles.)

Now we have functions that historically arose from studying triangles, but they’re defined on the whole real line. And we ask the kinds of questions about them that we ask about other functions. How fast do they change from point to point? How fast does their rate of change change? And here we find something remarkable. The rate of change of a sine function is proportional to a cosine function and vice versa. And if we look at the rate of change of the rate of change (the second derivative or acceleration), sine functions yield more sine functions and cosine functions yield more cosine functions. In more sophisticated language, sines and cosines are eigenfunctions of the second derivative operator.

Here’s where thermodynamics comes in. You can use basic physics to derive an equation for describing how heat in some object varies over time and location. This equation is called, surprisingly enough, the heat equation. It relates second derivatives of heat in space with first derivatives in time.

Fourier noticed that the heat equation would be easy to solve if only he could work with functions that behave very nicely with regard to second derivatives, i.e. sines and cosines! If only everything were sines and cosines. For example, the temperature in a thin rod over time is easy to determine if the initial temperature distribution is given by a sine wave. Interesting, but not practical.
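As a worked instance of why sines are so convenient, take the heat equation in one space dimension with units chosen so the diffusion constant is 1 (an assumption made only to keep the formula short). If the initial temperature is a single sine wave, the solution is the same sine wave with an exponentially decaying amplitude:

```latex
u_t = u_{xx}, \qquad u(x, 0) = \sin(nx)
\quad\Longrightarrow\quad
u(x, t) = e^{-n^2 t} \sin(nx)
```

Differentiating confirms it: the time derivative and the second space derivative of this *u* both equal −*n*^{2} exp(−*n*^{2}*t*) sin(*nx*).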

However, the initial distribution doesn’t have to be a sine, or a cosine. We can still solve the heat equation if the initial distribution is a sum of sines. And if the initial distribution is approximately a sum of sines and cosines, then we can compute an approximate solution to the heat equation. So what functions are approximately a sum of sines and cosines? All of them!

Well, not quite all functions. But lots of functions. More functions than people originally thought. Pinning down exactly what functions can be approximated arbitrarily well by sums of sines and cosines (i.e. which functions have convergent Fourier series) was a major focus of 19th century mathematics.

So if someone asks what use they’ll ever have for trig identities, tell them they’re important if you want to solve the heat equation. That’s where I first used some of these trig identities often enough to remember them, and that’s a fairly common experience for people in math and engineering. Solving the heat equation reviews everything you learn in trigonometry, even though there are not necessarily any triangles or circles in sight.

- *N*_{5}, the 95th percentile of loudness, measured in sone (which is confusingly called the 5th percentile)
- ω_{S}, a function of sharpness in acum and of loudness
- ω_{FR}, a function of fluctuation strength (in vacil), roughness (in asper), and loudness

Specifically, Zwicker calculates *PA*, psychoacoustic annoyance, by

*PA* = *N*_{5} (1 + (ω_{S}^{2} + ω_{FR}^{2})^{1/2})

A geometric visualization of the formula is given below.

Here’s an example of computing roughness using two sound files from previous posts, a leaf blower and a simulated kettledrum. I calibrated both to have sound pressure level 80 dB. But because of the different composition of the sounds, i.e. more high frequency components in the leaf blower, the leaf blower is much louder than the kettledrum (39 sone vs 15 sone) at the same sound pressure level. The annoyance of the leaf blower works out to about 56 while the kettledrum was only about 19.
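The final combination step is simple enough to sketch. The code below assumes the commonly cited form of Zwicker's formula, *PA* = *N*_{5}(1 + √(ω_{S}² + ω_{FR}²)), and takes the two ω terms as already computed. The function name and the sample inputs are hypothetical, not the values behind the leaf blower or kettledrum numbers above.

```python
import math

def psychoacoustic_annoyance(n5, w_s, w_fr):
    """Combine percentile loudness n5 (in sone) with the sharpness term
    w_s and the fluctuation/roughness term w_fr.

    Assumes the commonly cited form of Zwicker's formula:
        PA = N5 * (1 + sqrt(w_S**2 + w_FR**2))
    """
    return n5 * (1.0 + math.hypot(w_s, w_fr))

# Hypothetical inputs chosen only to make the arithmetic easy to follow
print(psychoacoustic_annoyance(10.0, 0.3, 0.4))
```

The square root of a sum of squares is why the formula has a geometric picture: the two ω terms act like the legs of a right triangle.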


The most recent episode of 99% Invisible tells the story of the Corps of Engineers’ enormous physical model of the Mississippi basin, a basin that covers nearly half of the area of the continental US. Spanning over 200 acres, the model was built during WWII and was shut down in 1993.

Here are some of my favorite lines from the show:

The reason engineers continue to rely on [physical models] is because today, in 2016, we still do not have the computers or the science to do all the things that physical models can do. …

Hydraulic engineering gets into some of the most complicated math there is. Allegedly when Albert Einstein’s son Hans said he wanted to study how sediment moves underwater, Einstein asked him why he wanted to work on something so complicated.

The physics involved happen on such a small scale that we still haven’t built equations complex enough to capture them. And so Stanford Gibson, a world-class numerical modeler, is one of the most ardent supporters of physical models.

But then I have a quibble. The show goes on to say “a physical model doesn’t require equations at all.” That’s not true. When you build a small thing to study how a big thing works, you’ve got to have some theory relating the behavior of the two. If the real thing is 1000 times bigger than the model, does that mean you can simply multiply measurements from the model by 1000 to predict measurements of reality? Sometimes. **Some effects scale linearly, but some do not**.

It would have been more accurate to say a physical model doesn’t require *as many* equations as an entirely mathematical model. **The physical model is a partial mathematical model** because of the mathematics necessary to extrapolate from the model to the thing being modeled.
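Even the simplest case shows why some theory is needed. The sketch below covers only geometric similarity, where lengths, areas, and volumes scale by different powers of the size ratio. The function name is made up, and a real hydraulic model needs further similarity laws that are not attempted here.

```python
def scale_up(length, area, volume, ratio):
    """Extrapolate model measurements to full size under pure geometric
    similarity: lengths scale by ratio, areas by ratio**2,
    volumes by ratio**3."""
    return length * ratio, area * ratio**2, volume * ratio**3

# A model 1000 times smaller than the real thing
length, area, volume = scale_up(1.0, 1.0, 1.0, 1000)
print(length, area, volume)
```

Only the first of the three measurements can be extrapolated by multiplying by 1000.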

Another line from the show that I liked was a quote from Stanford Gibson, introduced above.

The idea that science demystifies the world, I just don’t understand that. I feel that the deeper down the scientific rabbit hole I go, the bigger and grander and more magical the world seems.

**Related post**: Toy problems

In a previous post I explained the rationale behind using names of colors to refer to different kinds of noise. The basis is an analogy between the spectra of sounds and the spectra of light. Red noise is biased toward the low end of the audio spectrum just as red light is toward the low end of the visible spectrum. Blue noise is biased toward the high end, just as blue light is toward the high end of the visible spectrum.

Green noise is based on a slightly different analogy with light as described here:

Blue, green and other noise colours seem not to be rigorously defined although the word “colour” is used a lot in describing noise. Some define the 7 rainbow colours to correspond to a width of about three critical bands in the Bark frequency scale such that green lies in the corresponding point of greatest sensitivity … [just as green light has] the greatest sensitivity for the eye. This identifies green noise as the most troublesome for speech systems.

This is different than the usual definition of red noise etc. in that it speaks of colors **limited** to a particular frequency range rather than being **weighted** toward that range. Usually red noise contains a broad spectrum of frequencies, but with power weighted like 1/*f*^{2}, so the spectrum decreases fairly quickly as frequency increases.

So what is this Bark frequency scale? First of all, the Bark scale was named in honor of acoustician Heinrich Barkhausen. On this scale, the audible spectrum runs from 0 to 24, each Bark being a sort of psychologically equal division. Lots of things in psychoacoustics work on the Bark scale rather than the scale of Hertz.

There are multiple ways to convert from Hz to Bark and back, each slightly different but approximately equivalent. A convenient form is

*z* = 6 arcsinh(*f*/600)

where *f* is frequency in Hertz and *z* is frequency in Bark. One reason this form is convenient is that it’s easy to invert:

*f* = 600 sinh(*z*/6)

A frequency of 24 Bark corresponds to around 16 kHz, so the audible spectrum doesn’t quite end at 24, at least for most young people, but applications are most concerned with the range of 0–24 Bark.
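The two conversion formulas translate directly into Python. The function names are my own.

```python
import math

def hz_to_bark(f):
    """Frequency in Hz to Bark, using z = 6 arcsinh(f/600)."""
    return 6.0 * math.asinh(f / 600.0)

def bark_to_hz(z):
    """Bark to frequency in Hz, using f = 600 sinh(z/6)."""
    return 600.0 * math.sinh(z / 6.0)

print(bark_to_hz(24))      # upper end of the Bark scale, in Hz
print(hz_to_bark(1000))    # a 1 kHz tone, in Bark
```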

**Update**: Here’s an online calculator to convert between Hz and Bark.

The quoted paragraph above is a little vague about where the color boundaries should be. When it says there are seven intervals, each “a width of about three critical bands,” I assume it means to divide the range of 0–24 Bark into seven equal pieces, making each 24/7 or about 3.43 Barks wide. If we do this, red would run from 0–3.43 Barks, orange from 3.43–6.86, yellow from 6.86–10.29, green from 10.29–13.71, etc.

This would put green noise in the range of 1612 to 2919 Hz. Human hearing is most sensitive around 2000 Hz, near the middle of this interval.
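The band edges under this reading can be tabulated as follows. The equal seven-way split is the assumption described above, and the color names past green are filled in from the usual rainbow order rather than taken from the quoted source.

```python
import math

def bark_to_hz(z):
    """Bark to frequency in Hz, using f = 600 sinh(z/6)."""
    return 600.0 * math.sinh(z / 6.0)

# Assumed: seven equal bands over 0-24 Bark, in rainbow order
COLORS = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]
WIDTH = 24 / 7

bands = {}
for i, color in enumerate(COLORS):
    low, high = i * WIDTH, (i + 1) * WIDTH   # band edges in Bark
    bands[color] = (bark_to_hz(low), bark_to_hz(high))

for color, (low_hz, high_hz) in bands.items():
    print(f"{color:7s} {low_hz:8.0f} to {high_hz:8.0f} Hz")
```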

In musical notation, the frequency range of green noise runs from G6 to F#7. See this post for an explanation of the pitch notation and Python code for computing it from frequency.

**Update**: See the next post for how to create an audio file of green noise in Python. Here’s a spectral plot from that post showing that the frequencies in the noise are in the expected range.


In fact, there’s no single set of genes that all living things need in order to exist. When scientists first began searching for such a thing 20 years ago, they hoped that simply comparing the genome sequences from a bunch of different species would reveal an essential core shared by all species. But as the number of genome sequences blossomed, that essential core disappeared. In 2010, David Ussery, a biologist at Oak Ridge National Laboratory in Tennessee, and his collaborators compared 1,000 genomes. They found that not a single gene is shared across all of life.

This post will look at roughness in a different setting, amplitude modulation. Several psychoacoustics researchers have suggested that perceived roughness *R* increases as a power of modulation depth *m*, up to a maximum. That is,

*R* ∝ *m*^{p}

where the signal is

(1 + *m* sin(2π*f*_{m}*t*)) sin(2π*f*_{c}*t*)

Some have suggested, based on empirical studies, that *p* = 2, while others have suggested that *p* varies as a function of the frequency *f*_{c} of the carrier wave.

Here is an audio (.wav) file where the modulation depth varies as a function of time, *m* = 0.1*t* where *t* is time in seconds.

In this example the carrier frequency *f*_{c} is 1000 Hz and the modulation frequency *f*_{m} is 60 Hz.
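A signal like this could be generated along the following lines. This is a sketch, not the code used to make the file: it assumes the standard amplitude modulation form (1 + *m* sin(2π*f*_{m}*t*)) sin(2π*f*_{c}*t*), and the sample rate and duration are arbitrary choices.

```python
import math

def am_samples(duration=10.0, rate=44100, fc=1000.0, fm=60.0, depth_rate=0.1):
    """Amplitude-modulated tone whose modulation depth grows linearly,
    m = depth_rate * t.

    Assumes the standard AM form (1 + m sin(2 pi fm t)) sin(2 pi fc t).
    """
    samples = []
    for n in range(int(duration * rate)):
        t = n / rate                       # time in seconds
        m = depth_rate * t                 # modulation depth at time t
        envelope = 1 + m * math.sin(2 * math.pi * fm * t)
        samples.append(envelope * math.sin(2 * math.pi * fc * t))
    return samples

samples = am_samples(duration=1.0)         # one second, to keep the demo quick
print(len(samples), max(abs(s) for s in samples))
```

The envelope can reach 1 + *m*, so the samples need rescaling before being written to a file to avoid clipping.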

Reference: Psychoacoustical Roughness: Implementation of an Optimized Model. P. Daniel and R. Weber. Acustica 83 (1997) 113–123

It’s not hard to measure the physical intensity of a sound, but loudness is the *perceived* intensity of a sound. It is not a physical phenomenon but a psychological phenomenon.

Loudness is subjective, but not entirely so. There is general consensus regarding what it means for two sounds to be equally loud, and even for ratios, such as saying when one sound is twice as loud as the other. Loudness is quantifiable, but not easily so.

Loudness depends on several properties of a sound, such as its frequency, bandwidth, and duration. Loudness must depend on frequency because sounds that are too low or too high have no loudness at all because we simply cannot hear them. But even within the range of audible frequencies, loudness varies quite a bit by pitch. The graph below, via Wikipedia, shows equal loudness contours. The blue lines are from work by Fletcher and Munson in 1933. The red lines are the revised curves per the ISO 226:2003 standard.

The horizontal axis is frequency in Hz and the vertical axis is sound pressure level in decibels. The contour lines represent combinations of frequency and sound pressure level that are perceived to be equally loud. If a tuba and a flute sound equally loud, the sound pressure level coming from the tuba is much higher.

Notice that the curves are not parallel. They’re much closer together for low frequencies than for midrange frequencies, though they are roughly parallel for high frequencies. This means that if you recorded a piano, for example, playing each of its keys at equal loudness, the pitches wouldn’t sound equally loud unless you played the recording back at its original volume.

As complicated as this is, it’s still a simplification. It is based on pure tones, simple sine waves. A single musical instrument, much less an orchestra or a jackhammer, is more complicated. Loudness is highly nonlinear, and so you cannot say that the loudness of two sounds is the sum of their individual loudnesses. A-weighting is a relatively simple way to convert sound pressure levels to loudness, but is only accurate for pure tones at fairly low loudness levels.

To simplify things further, consider a single pure tone, a sine wave at 1 kHz. (This is almost two octaves above middle C. See details here.) Loudness level in phons is defined to match sound pressure level in decibels for a 1 kHz pure tone. So a sound has a loudness level of 40 phons, for example, if it is perceived to be as loud as a pure 1 kHz tone at 40 dB.

At 1 kHz, loudness increases by a factor of 2 for every 10 dB increase in sound pressure level. But because nothing is simple in psychoacoustics, even this is a simplification. It only holds for sounds with loudness level 40 phons or greater. A quiet room is around 40 phons, so the added complications below 40 phons may not be relevant in many applications.

A pure tone at 1 kHz and 20 dB sounds more than four times softer than the same tone at 40 dB. The definition of loudness level in phons still holds below 40 phons. An oboe has a loudness level of 20 phons if it has the same loudness as a sine wave with frequency 1 kHz and sound pressure level 20 dB. But an oboe at 30 phons will sound more than twice as loud as one at 20 phons.
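The doubling rule for 40 phons and above can be written as a pair of functions. The sketch relies on the standard convention that 40 phons corresponds to 1 sone, which is not stated above, and it deliberately refuses inputs below 40 phons rather than guess at the behavior there. The function names are my own.

```python
import math

def phon_to_sone(phon):
    """Loudness in sones from loudness level in phons.
    Valid only at 40 phons and above, where loudness doubles
    for every 10 phon increase."""
    if phon < 40:
        raise ValueError("the doubling rule only holds at 40 phons and above")
    return 2.0 ** ((phon - 40.0) / 10.0)

def sone_to_phon(sone):
    """Inverse of phon_to_sone, valid for 1 sone and above."""
    if sone < 1:
        raise ValueError("the doubling rule only holds at 1 sone and above")
    return 40.0 + 10.0 * math.log2(sone)

print(phon_to_sone(60))   # 20 phons above 40: two doublings
```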

**Update**: New blog post comparing guitar samples at the same sound pressure level but with differing loudness and sharpness.

So where are we as far as calculating loudness? We’ve said a lot about what you *can’t* do, what complications have to be considered. But we’ve concluded this much: for a pure 1 kHz tone, the loudness in phons equals (by definition) the sound pressure level in decibels. And we’ve said how in principle you could define the loudness of other sounds: compare them to a 1 kHz tone that’s just as loud. We haven’t said how to compute this, only how you could determine it empirically.

In future posts I may write about how you do this using the ISO 532B standard or the newer ANSI S3.4-2007 standard.

“Reproducible” and “randomized” don’t seem to go together. If something was unpredictable the first time, shouldn’t it be unpredictable if you start over and run it again? As is often the case, we want incompatible things.

But the combination of reproducible and random can be reconciled. Why would we want a randomized controlled trial (RCT) to be random, and why would we want it to be reproducible?

**One of the purposes** in randomized experiments is the hope of scattering complicating factors evenly between two groups. For example, one way to test two drugs on 1000 people would be to gather 1000 people and give the first drug to all the men and the second to all the women. But maybe a person’s sex has something to do with how the drug acts. If we randomize between two groups, it’s likely that about the same number of men and women will be in each group.

The example of sex as a factor is oversimplified because there’s reason to suspect *a priori* that sex might make a difference in how a drug performs. The bigger problem is that factors we can’t anticipate or control may matter, and we’d like them scattered evenly between the two treatment groups. If we knew what the factors were, we could assure that they’re evenly split between the groups. The hope is that randomization will do that for us with things we’re unaware of. For this purpose we don’t need a process that is “truly random,” whatever that means, but a process that matches our expectations of how randomness should behave. So a pseudorandom number generator (PRNG) is fine. No need, for example, to randomize using some physical source of randomness like radioactive decay.

**Another purpose** in randomization is for the assignments to be unpredictable. We want a physician, for example, to enroll patients on a clinical trial without knowing what treatment they will receive. Otherwise there could be a bias, presumably unconscious, against assigning patients with poor prognosis if the physicians know the next treatment will be the one they hope or believe is better. Note here that the randomization only has to be unpredictable from the perspective of the people participating in and conducting the trial. The assignments could be predictable, in principle, by someone *not* involved in the study.

And why would you want randomization assignments to be **reproducible**? One reason would be to test whether randomization software is working correctly. Another might be to satisfy a regulatory agency or some other oversight group. Still another reason might be to defend your randomization in a lawsuit. A physical random number generator, such as using the time down to the millisecond at which the randomization is conducted, would achieve random assignments and unpredictability, but not reproducibility.

Computer algorithms for generating random numbers (technically pseudo-random numbers) can achieve reproducibility, practically random allocation, and unpredictability. The randomization outcomes are predictable, and hence reproducible, to someone with access to the random number generator and its state, but unpredictable in practice to those involved in the trial. The internal state of the random number generator has to be saved between assignments and passed back into the randomization software each time.

Random number generators such as the Mersenne Twister have good statistical properties, but they also carry a large amount of state. The random number generator described here has very small state, 64 bits, and so storing and returning the state is simple. If you needed to generate a trillion random samples, Mersenne Twister would be preferable, but since RCTs usually have fewer than a trillion subjects, the RNG in the article is perfectly fine. I have run the Dieharder random number generator quality tests on this generator and it performs quite well.
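The save-and-restore pattern looks like this in outline. The sketch uses Python's built-in generator, which is a Mersenne Twister and so carries exactly the bulky state mentioned above. It illustrates the state-passing idea only, not the small-state generator, and the function names are hypothetical.

```python
import random

def first_state(seed):
    """Initial generator state for a trial, derived from a recorded seed."""
    return random.Random(seed).getstate()

def assign(state):
    """Randomize one patient. Takes the saved state, returns the
    treatment arm and the state to save for the next patient."""
    rng = random.Random()
    rng.setstate(state)                  # resume exactly where we left off
    arm = "A" if rng.random() < 0.5 else "B"
    return arm, rng.getstate()

def allocate(n, seed):
    """Replay the first n assignments of a trial from its seed."""
    state = first_state(seed)
    arms = []
    for _ in range(n):
        arm, state = assign(state)
        arms.append(arm)
    return arms

print(allocate(10, seed=42))
```

Anyone holding the seed can replay the allocation with `allocate`, while someone who sees only the assignments so far has no practical way to predict the next one.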

Image by Ilmicrofono Oggiono, licensed under Creative Commons

H. and B. S. Jeffreys, Methods of Mathematical Physics, 2nd ed., Cambridge University Press, 1950, p. 8.

**Related post**: Just an approximation

The planets have elliptical orbits with the sun at one focus, but these ellipses are nearly circles centered at the sun. We’ll assume the orbits are perfectly circular and lie in the same plane. (Now that Pluto is not classified as a planet, we can say without qualification that the planets have nearly circular orbits. Pluto’s orbit is much more elliptical than that of any of the planets.)

We can work in astronomical units (AUs) so that the distance from the Earth to the sun is 1. We can also work in units of years so that the period is also 1. Then we could describe the position of the Earth at time *t* as exp(2π*it*).

Mars has a larger orbit and a longer period. By Kepler’s third law, the size of the orbit and the period are related: the square of the period is proportional to the cube of the radius. Because we’re working in AUs and years, the proportionality constant is 1. If we denote the radius of Mars’ orbit by *r*, then its orbit can be described by

*r* exp(2π*i* (*r*^{-3/2} *t* ))

Here we pick our initial time so that at *t* = 0 the two planets are aligned.

The distance between the planets is just the absolute value of the difference between their positions:

| exp(2π*it*) – *r* exp(2π*i* (*r*^{-3/2} *t*)) |

The following code computes and plots the distance from Earth to Mars over time.

```python
import numpy as np
import matplotlib.pyplot as plt

def earth(t):
    # Earth: radius 1 AU, period 1 year
    return np.exp(2*np.pi*1j*t)

def mars(t):
    r = 1.524  # semi-major axis of Mars orbit in AU
    # Angular frequency r**-1.5 follows from Kepler's third law
    return r*np.exp(2*np.pi*1j*(r**-1.5*t))

def distance(t):
    return np.absolute(earth(t) - mars(t))

x = np.linspace(0, 20, 1000)
plt.plot(x, distance(x))
plt.xlabel("Time in years")
plt.ylabel("Distance in AU")
plt.ylim(0, 3)
plt.show()
```

And the output looks like this:

Notice that the distance varies from about 0.5 to about 2.5. That’s because the radius of Mars’ orbit is about 1.5 AU. So when the planets are exactly in phase, they are 0.5 AU apart and when they’re exactly out of phase they are 2.5 AU apart. In other words the distance ranges from 1.5 – 1 to 1.5 + 1.

The distance function seems to be periodic with period about 2 years. We can do a little calculation by hand to show that is the case and find the period exactly.

The distance squared is the distance times its complex conjugate. If we let ω = *r *^{-3/2} then the distance squared is

*d*^{2}(*t*) = (exp(2π*it*) – *r* exp(2π*i*ω*t*)) (exp(-2π*it*) – *r* exp(-2π*i*ω*t*))

which simplifies to

1 + *r*^{2} – 2*r* cos(2π(1 – ω)*t*)

and so the (squared) distance is periodic with period 1/(1 – ω) ≈ 2.13 years.
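A quick numerical check of the simplification and the period, using only the standard library. The test time 0.7 is an arbitrary choice.

```python
import cmath
import math

r = 1.524                 # radius of Mars' orbit in AU
omega = r ** -1.5         # Mars' angular frequency relative to Earth's
period = 1 / (1 - omega)  # period of the distance function, in years

def distance_squared(t):
    """Simplified closed form for the squared distance."""
    return 1 + r**2 - 2*r*math.cos(2*math.pi*(1 - omega)*t)

def distance_direct(t):
    """Distance computed straight from the two complex positions."""
    earth = cmath.exp(2j*math.pi*t)
    mars = r*cmath.exp(2j*math.pi*omega*t)
    return abs(earth - mars)

t = 0.7
print(period)
print(distance_direct(t)**2, distance_squared(t))
print(distance_squared(t + period))
```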

Notice that the plot of distance looks more angular at the minima and more rounded near the maxima. Said another way, the distance changes more rapidly when the planets leave their nearest approach than their furthest approach. You can prove this by taking the square root of *d*^{2}(*t*) and computing its derivative.

Let *f*(*t*) = 1 + *r*^{2} – 2*r* cos(2π(1 – ω)*t*). By the chain rule, the derivative of the square root of *f*(*t*) is 1/2 *f*(*t*)^{-1/2} *f*′(*t*). Near a maximum or a minimum, *f*′(*t*) takes on the same values. But the term *f*(*t*)^{-1/2} is largest when *f*(*t*) is smallest and vice versa because of the negative exponent.


Or maybe not. A new study of three contemporary hunter-gatherer tribes found that they stay awake long after dark and sleep an average of 6.5 hours a night. They also don’t nap much [1]. This suggests the way we sleep may not be that different from our ancient forebears.

Historian A. Roger Ekirch suggested that before electric lighting it was common to sleep in two four-hour segments with an hour or so of wakefulness in between. His theory was based primarily on medieval English texts that refer to “first sleep” and “second sleep” and has other literary support as well. A small study found that subjects settled into the sleep pattern Ekirch predicted when they were in a dark room for 14 hours each night for a month. But the hunter-gatherers don’t sleep this way.

Maybe latitude is an important factor. The hunter-gatherers mentioned above live between 2 and 20 degrees south of the equator whereas England is 52 degrees north of the equator. Maybe two-phase sleep was more common at high latitudes with long winter nights. Of course there are many differences between modern/ancient [2] hunter-gatherers and medieval Western Europeans besides latitude.

Two studies have found two patterns of how people sleep without electric lights. Maybe electric lights don’t have as much impact on how people sleep as other factors.

**Related post**: Paleolithic nonsense

* * *

[1] The study participants were given something like a Fitbit to wear. The article said that naps less than 15 minutes would be below the resolution of the monitors, so we don’t know how often the participants took cat naps. We only know that they rarely took longer naps.

[2] There is an implicit assumption that the contemporary hunter-gatherers live and, in particular, sleep like their ancient ancestors. This seems reasonable, though we can’t be certain. There is also the bigger assumption that the tribesmen represent not only *their* ancestors but all paleolithic humans. Maybe they do, and we don’t have much else to go on, but we don’t know. I suspect there was more diversity in the paleolithic era than we assume.

After the alphabet and the tables of multiplication, nothing has proved quite so useful in my professional life as these six little expressions.

The six expressions he refers to are nicknamed the *vergeet-me-nietjes* in Dutch, which translates to forget-me-nots in English. They are also known as Dr. Myosotis’s equations because myosotis is the genus for forget-me-nots. The equations give the angular and linear deflections of a cantilever beam.

Imagine a beam anchored at one end and free at the other, subject to one of three kinds of load: a bending moment *M* at the free end, a point force *P* at the free end, or a force *w* per unit length distributed over the length of the beam. The equations below give the rotation (angular deflection) and displacement (linear deflection) of the free end of the beam.

| | Rotation | Displacement |
|---|---|---|
| Bending moment | ML/EI | ML^{2}/2EI |
| Point load | PL^{2}/2EI | PL^{3}/3EI |
| Distributed load | wL^{3}/6EI | wL^{4}/8EI |

Here *E* is the modulus of elasticity, *L* is the length of the beam, and *I* is the area moment of inertia.
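The six expressions are short enough to code directly. The function names and the sample beam below are hypothetical, with *w* taken as force per unit length and all quantities in consistent units.

```python
def end_moment(M, L, E, I):
    """Free-end (rotation, displacement) for a bending moment M."""
    return M*L/(E*I), M*L**2/(2*E*I)

def point_load(P, L, E, I):
    """Free-end (rotation, displacement) for a point force P at the free end."""
    return P*L**2/(2*E*I), P*L**3/(3*E*I)

def distributed_load(w, L, E, I):
    """Free-end (rotation, displacement) for a uniform load w per unit length."""
    return w*L**3/(6*E*I), w*L**4/(8*E*I)

# Hypothetical beam: 2 m long, E = 200 GPa, I = 8e-6 m^4, 1000 N at the tip
rotation, displacement = point_load(1000.0, 2.0, 200e9, 8e-6)
print(rotation, displacement)   # radians, meters
```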

… I said that if science could come up with something like the Jump it could surely solve a problem like that. Severin seized hold of that word, “science.” Science, he said, is not some mysterious larger-than-life force, it’s just the name we give to bright ideas that individual guys have when they’re lying in bed at night, and that if the fuel thing bothered me so much, there was nothing stopping me from having a bright idea to solve it …

This is a thumbnail version of a large, high-resolution image by Ulysse Carion. Thanks to Aleksey Shipilëv (@shipilev) for pointing it out.

It’s hard to see in the thumbnail, but the map gives the change in velocity needed at each branch point. You can find the full 2239 x 2725 pixel image here or click on the thumbnail above.

It looks like the story is a matter of fraud rather than sloppiness. This is unfortunate because sloppiness is much more pervasive than fraud, and this could have made a great case study of bad analysis. However, one could look at it as a case study in how *good* analysis (by the folks at MD Anderson) can uncover fraud.

Now there’s a new development in the Potti saga. The latest issue of The Cancer Letter contains letters by whistle-blower Bradford Perez who warned officials at Duke about problems with Potti’s research.

Eroom’s law (that’s Moore’s law backward) observes that the number of new drugs approved per billion dollars spent on R&D has halved every nine years since 1950.

**Update**: Here’s an article from Nature that gives more details. The trend is close to a straight line on a log scale, i.e. efficiency is declining exponentially.
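Halving every nine years is exponential decay, which is why the trend plots as a line on a log scale. Here is a minimal sketch of the arithmetic; the function name is made up, and the output is efficiency relative to 1950, not an actual count of drugs per billion dollars.

```python
import math


def eroom_relative_efficiency(year, base_year=1950, halving_years=9.0):
    """R&D efficiency relative to base_year, assuming a halving every nine years."""
    return 0.5 ** ((year - base_year) / halving_years)


for year in (1950, 1959, 1986, 2013):
    rel = eroom_relative_efficiency(year)
    # log2 of the efficiency drops by exactly 1 every nine years
    print(year, f"{rel:.5f}", f"log2 = {math.log2(rel):.1f}")
```

Seven halvings separate 1950 from 2013, so under this law a research dollar would yield 1/128 as many approved drugs.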

* * *

**Related post**: Take chances, make mistakes, and get messy

Suppose you have a space ship that could accelerate at 1 g for as long as you like. Inside the ship you would feel the same gravity as on earth. You could travel wherever you like by accelerating at 1 g for the first half of the flight then reversing acceleration for the second half of the flight. This approach could take you to Mars in three days.
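The three-day figure can be checked with the classical formula. To cover distance d by accelerating for the first half and decelerating for the second, each half satisfies d/2 = a t²/2, so the total time is 2√(d/a). The sketch below assumes straight-line travel and a few illustrative Earth-Mars distances (the actual separation varies from roughly 0.4 to 2.7 AU); the function name is made up.

```python
import math

G = 9.80665            # standard gravity, m/s^2
AU = 1.495978707e11    # astronomical unit, m
DAY = 86400.0          # seconds per day


def flip_and_burn_time(distance_m, accel=G):
    """Classical travel time: accelerate over the first half of the
    distance, then decelerate at the same rate over the second half."""
    return 2.0 * math.sqrt(distance_m / accel)


# Illustrative distances only, chosen to span the range of Earth-Mars separations.
for d_au in (0.5, 1.0, 1.5, 2.5):
    days = flip_and_burn_time(d_au * AU) / DAY
    print(f"{d_au:.1f} AU: {days:.1f} days")
```

Travel time grows only like the square root of distance, so even the longest trip in this range takes less than twice as long as a typical one.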

If you could accelerate at 1 g for a year you could reach the speed of light, and travel half a light year. So you could reverse your acceleration and reach a destination a light year away in two years. But this ignores relativity. Once you’re traveling at near the speed of light, time practically stops for you, so you could keep going as far as you like without taking any more time from your perspective. So you could travel **anywhere** in the universe in two years!

Of course there are a few problems. We have no way to sustain such acceleration, or to build a ship that could survive an impact with a speck of dust when traveling at relativistic speed. And the calculation ignores relativity until it throws it in at the end. Still, it’s fun to think about.

**Update**: Dan Piponi gives a calculation on G+ that addresses the last of the problems I mentioned above, sticking relativity on to the end of a classical calculation. He does a proper relativistic calculation from the beginning.

If you take the radius of the observable universe to be 45 billion light years, then I think you need about 12.5 g to get anywhere in it in 2 years. (Both those quantities as measured in the frame of reference of the traveler.)

If you travel at constant acceleration a for time t then the distance covered is (c^2/a) (cosh(a t/c) – 1). (Note that this gives the usual a t^2/2 for small t.)
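Here is a small sketch of that formula next to the classical one, taking t to be time on the traveler's clock. The function names are made up, and the code uses the identity cosh(x) − 1 = 2 sinh²(x/2) to avoid losing precision when a t/c is small.

```python
import math

C = 299_792_458.0          # speed of light, m/s
G = 9.80665                # standard gravity, m/s^2
YEAR = 365.25 * 86400.0    # Julian year, s
LIGHT_YEAR = C * YEAR      # m


def distance_relativistic(accel, t):
    """(c^2/a)(cosh(a t/c) - 1) for constant acceleration a over time t."""
    x = accel * t / C
    # cosh(x) - 1 == 2*sinh(x/2)**2, which is accurate for small x
    return (C**2 / accel) * 2.0 * math.sinh(x / 2.0) ** 2


def distance_classical(accel, t):
    """The usual a t^2 / 2."""
    return 0.5 * accel * t**2


for label, t in (("one hour", 3600.0), ("one year", YEAR)):
    rel = distance_relativistic(G, t) / LIGHT_YEAR
    cls = distance_classical(G, t) / LIGHT_YEAR
    print(f"1 g for {label}: relativistic {rel:.3e} ly, classical {cls:.3e} ly")
```

The two formulas agree for short times. Over longer times the hyperbolic cosine grows exponentially, which is what lets a traveler cover enormous distances in a few years of ship time.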

Which side is correct depends on what’s out there waiting to be discovered, which of course we don’t know. We can only guess. Timid research is rational if you believe there are only marginal improvements that are likely to be discovered.

Sample size increases quickly as the size of the effect you’re trying to find decreases. To establish small differences in effect, you need very large trials.
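To put rough numbers on this, here is a sketch using the standard normal-approximation formula for comparing two response rates. The 5% two-sided significance level and 80% power are conventional choices assumed for illustration, and the function name is made up.

```python
import math
from statistics import NormalDist


def patients_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per arm to distinguish response rates p1 and p2
    with a two-sided test, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Shrinking the difference in response rate from 20 points to 1 point
for p2 in (0.30, 0.20, 0.15, 0.11):
    print(f"10% vs {p2:.0%}: {patients_per_arm(0.10, p2)} patients per arm")
```

The required sample size scales like one over the square of the difference, so cutting the effect size in half roughly quadruples the trial.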

If you think there are only small improvements on the status quo available to explore, you’ll explore each of the possibilities very carefully. On the other hand, if you think there’s a miracle drug in the pipeline waiting to be discovered, you’ll be willing to risk falsely rejecting small improvements along the way in order to get to the big improvement.

Suppose there are 500 drugs waiting to be tested. All of these are only 10% effective except for one that is 100% effective. You could quickly find the winner by giving each candidate to one patient. For every drug whose patient responded, repeat the process until only one drug is left. One strike and you’re out. You’re likely to find the winner in three rounds, treating fewer than 600 patients. But if all the drugs are 10% effective except one that’s 11% effective, you’d need hundreds of trials with thousands of patients each.
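The one-strike scheme is easy to simulate. The sketch below assumes each dud's patient responds independently with probability 10% and the winner's patient always responds; the function name, seed, and number of simulated runs are arbitrary choices for illustration.

```python
import random


def one_strike_screen(n_drugs=500, dud_rate=0.10, rng=random):
    """Simulate one-strike screening with one 100%-effective drug among duds.

    Returns (rounds, patients): rounds run until only the winner survives,
    and the total number of patients treated.
    """
    duds = n_drugs - 1   # the winner always responds, so it is never eliminated
    rounds = patients = 0
    while duds > 0:
        rounds += 1
        patients += duds + 1   # one patient per surviving drug, winner included
        # a dud survives the round only if its patient happens to respond
        duds = sum(rng.random() < dud_rate for _ in range(duds))
    return rounds, patients


rng = random.Random(0)   # fixed seed so the run is repeatable
runs = [one_strike_screen(rng=rng) for _ in range(2000)]
done_in_three = sum(r <= 3 for r, _ in runs) / len(runs)
mean_patients = sum(p for _, p in runs) / len(runs)
print(f"finished within three rounds: {done_in_three:.0%}")
print(f"average patients treated: {mean_patients:.0f}")
```

The chance that all 499 duds are gone after three rounds is (1 − 0.1³)^499, a bit over 60%, which is the sense in which you are "likely" to be done in three rounds.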

The best research strategy depends on what you believe is out there to be found. People who know nothing about cancer often believe we could find a cure soon if we just spend a little more money on research. Experts are less sanguine, except when they’re asking for money.

However, a more fundamental point has been lost. At the core of Ioannidis’ paper is the assertion that **the proportion of true hypotheses under investigation matters**. In terms of Bayes’ theorem, the *posterior* probability of a result being correct depends on the *prior* probability of the result being correct. This prior probability is vitally important, and it varies from field to field.

In a field where it is hard to come up with good hypotheses to investigate, most researchers will be testing false hypotheses, and most of their positive results will be coincidences. In another field where people have a good idea what ought to be true before doing an experiment, most researchers will be testing true hypotheses and most positive results will be correct.
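A quick Bayes' theorem calculation shows how strongly the prior matters. This is a simplified sketch: the 80% power and 5% false positive rate are conventional values assumed for illustration, the priors are made up, and the calculation ignores bias and multiple testing.

```python
def prob_positive_is_true(prior, power=0.80, alpha=0.05):
    """Posterior probability that a positive result reflects a true hypothesis.

    prior: proportion of tested hypotheses that are actually true
    power: probability a true hypothesis yields a positive result
    alpha: probability a false hypothesis yields a positive result
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical priors, from a field with few good hypotheses to one with many
for prior in (0.01, 0.10, 0.50, 0.90):
    print(f"prior {prior:.0%}: positive result true with probability "
          f"{prob_positive_is_true(prior):.0%}")
```

With the same power and significance level, a positive result can be probably false in one field and almost certainly true in another.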

For example, it’s very difficult to come up with a better cancer treatment. Drugs that kill cancer in a petri dish or in animal models usually don’t work in humans. One reason is that these drugs may cause too much collateral damage to healthy tissue. Another reason is that treating human tumors is more complex than treating artificially induced tumors in lab animals. Of all cancer treatments that appear to be an improvement in early trials, very few end up receiving regulatory approval and changing clinical practice.

A greater proportion of physics hypotheses are correct because physics has powerful theories to guide the selection of experiments. Experimental physics often succeeds because it has good support from theoretical physics. Cancer research is more empirical because there is little reliable predictive theory. This means that a published result in physics is more likely to be true than a published result in oncology.

Whether “most” published results are false depends on context. The proportion of false results varies across fields. It is high in some areas and low in others.


I’m not sure whether I agree with Brenner’s quote, but I find it interesting. You could argue that techniques are most important because they have the most leverage. A new technique may lead to many new discoveries and new ideas.


“Oh, the intellectual freedom of academia” he thought while filling out a time sheet which checks that he does not work on non-grant science.

When Coleridge, the most famous poet of the day, wrote his tract on scientific method in 1817 it was not considered an oddity; by 1833, the time of the third meeting of the British Association for the Advancement of Science, it was already remarkable, and in the years that followed it was almost inconceivable.

**Related post**: How the term “scientist” came to be

To me, the subject of “information theory” is badly named. That discipline is devoted to finding ideal compression schemes for messages to be sent quickly and accurately across a noisy channel. It deliberately does not pay any attention to what the messages mean. To my mind this should be called compression theory or redundancy theory. Information is inherently meaningful—that is its purpose—any theory that is unconcerned with the meaning is not really studying information per se. The people who decide on speed limits for roads and highways may care about human health, but a study limited to deciding ideal speed limits should not be called “human health theory”.

Despite what was said above, Information theory has been extremely important in a diverse array of fields, including computer science but also in neuroscience and physics. I’m not trying to denigrate the field; I am only frustrated with its name.

From David Spivak, footnotes 13 and 14 here.

I was surprised by the articles on the bombing of Hiroshima and Nagasaki. New York Times reporter William Lawrence was allowed to go on the mission over Nagasaki. He was not on the plane that dropped the bomb, but was in one of the other B-29 Superfortresses that were part of the mission. Lawrence’s story was published September 9, 1945, exactly one month later. Lawrence was also allowed to tour the ruins of Hiroshima. His article on the experience was published September 5, 1945. I was surprised how candid these articles were and how quickly they were published. Apparently military secrecy evaporated rapidly once WWII was over.

Another thing that surprised me was that some stories were newsworthy more recently than I would have thought. I suppose I underestimated how long it took to work out the consequences of a major discovery. I think we’re also biased to think that whatever we learned as children must have been known for generations, even though the dust may have only settled shortly before we were born.
