Coulomb’s constant

Richard Feynman said nearly everything is really interesting if you go into it deeply enough. In that spirit I’m going to dig into the units on Coulomb’s constant. This turns out to be an interesting rabbit trail.

Coulomb’s law says that the force between two charged particles is proportional to the product of their charges and inversely proportional to the square of the distance between them. In symbols,

F = k_e \frac{q_1\, q_2}{r^2}

The proportionality constant, the k_e term, is known as Coulomb’s constant.
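To make the formula concrete, here’s a quick Python sketch that evaluates Coulomb’s law for a pair of point charges. The value of k_e, about 8.99 × 10⁹ N·m²/C², is the standard SI value; the example charges and distance are arbitrary.

    k_e = 8.988e9  # Coulomb's constant in N·m²/C²

    def coulomb_force(q1, q2, r):
        # Force in newtons between charges q1 and q2 (in coulombs) separated by r meters
        return k_e * q1 * q2 / r**2

    # Example: two 1 μC charges 10 cm apart
    print(coulomb_force(1e-6, 1e-6, 0.1))  # about 0.9 N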

Herd immunity countdown

A few weeks ago I wrote a post giving a back-of-the-envelope calculation regarding when the US would reach herd immunity to SARS-COV-2. As I pointed out repeatedly, this is only a rough estimate because it makes numerous simplifying assumptions and is based on numbers that have a lot of uncertainty around them. See that post for details.

That post was based on the assumption that 26 million Americans had been infected with the virus. I’ve heard other estimates of 50 million or 100 million.

Update: The CDC estimates that 83 million Americans were infected in 2020 alone. I don’t see that they’ve issued any updates to this figure, but everyone who has been infected in 2021 brings us closer to herd immunity.

The post was also based on the assumption that we’re vaccinating 1.3 million per day. A more recent estimate is 1.8 million per day. (Update: We’re at 2.7 million per day as of March 30, 2021.) So maybe my estimate was pessimistic. On the other hand, the estimate for the number of people with pre-existing immunity that I used may have been optimistic.

Because there is so much we don’t know, and because numbers are frequently being updated, I’ve written a little Python code to make all the assumptions explicit and easy to update. According to this calculation, we’re 45 days from herd immunity. (Update: We could be at herd immunity any time now, depending on how many people had pre-existing immunity.)

As I pointed out before, herd immunity is not a magical cutoff with an agreed-upon definition. I’m using a definition that was suggested a year ago. Viruses never [1] completely go away, so any cutoff is arbitrary.

Here’s the code. It’s Python, but it would be trivial to port to any programming language. Just remove the underscores used as thousands separators if your language doesn’t support them, and change the comment marker if necessary.

US_population         = 330_000_000
num_vaccinated        =  50_500_000 # As of March 30, 2021
num_infected          =  83_100_000 # As of January 1, 2021
vaccine_efficacy      = 0.9
herd_immunity_portion = 0.70

# Some portion of the population had immunity to SARS-COV-2
# before the pandemic. I've seen estimates from 10% up to 60%.
portion_pre_immune = 0.30
num_pre_immune = portion_pre_immune*US_population

# Adjust for vaccines given to people who are already immune.
portion_at_risk = 1.0 - (num_pre_immune + num_infected)/US_population

num_new_vaccine_immune = num_vaccinated*vaccine_efficacy*portion_at_risk

# Number immune at present
num_immune = num_pre_immune + num_infected + num_new_vaccine_immune
herd_immunity_target = herd_immunity_portion*US_population

num_needed = herd_immunity_target - num_immune

num_vaccines_per_day = 2_700_000 # As of March 30, 2021
num_new_immune_per_day = num_vaccines_per_day*portion_at_risk*vaccine_efficacy

days_to_herd_immunity = num_needed / num_new_immune_per_day

print(days_to_herd_immunity)

[1] One human virus has been eliminated. Smallpox was eradicated two centuries after the first modern vaccine.

Solving for neck length

A few days ago I wrote about my experiment with a wine bottle and a beer bottle. I blew across the empty bottles and measured the resulting pitch, then compared the result to the pitch you would get in theory if the bottle were a Helmholtz resonator. See the previous post for details.

Tonight I repeated my experiment with an empty water bottle. But I ran into a difficulty immediately: where would you say the neck ends?

water bottle

An ideal Helmholtz resonator is a cylinder on top of a larger sphere. My water bottle is basically a cone on top of a cylinder.

So instead of measuring the neck length L and seeing what pitch was predicted with the formula from the earlier post

f = \frac{v}{2\pi} \sqrt{\frac{A}{LV}}

I decided to solve for L and see what neck measurement would be consistent with the Helmholtz resonator approximation. The pitch f was 172 Hz, the neck of the bottle is one inch wide, and the volume is half a liter. This implies L is 10 cm, which is a little less than the height of the conical part of the bottle.
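Here’s that calculation in a few lines of Python. Solving the formula above for L gives L = v²A/(4π²f²V); the speed of sound (343 m/s at room temperature) and the unit conversions are my own assumptions.

    from math import pi

    v = 343            # speed of sound in m/s, roughly room temperature
    f = 172            # measured pitch in Hz
    d = 0.0254         # neck diameter: one inch in meters
    A = pi * (d/2)**2  # cross-sectional area of the neck in m²
    V = 0.5e-3         # volume: half a liter in m³

    # Solve f = (v / 2π) sqrt(A / (L V)) for the neck length L
    L = v**2 * A / (4 * pi**2 * f**2 * V)
    print(100 * L)     # about 10 cm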

Herd immunity on the back of an envelope

This post presents a back-of-the-envelope calculation regarding COVID herd immunity in the US. Every input to the calculation is only roughly known, and I’m going to make simplifying assumptions left and right. So take this all with a grain of salt.

According to a recent article, about 26 million Americans have been vaccinated against COVID, about 26 million Americans have been infected, and 1.34 million a day are being vaccinated, all as of February 1, 2021.

Somewhere around half the US population was immune to SARS-COV-2 before the pandemic began, due to immunity acquired from previous coronavirus exposure. The proportion isn’t known accurately, but has been estimated as somewhere between 40 and 60 percent.

Let’s say that as of February 1, 184 million Americans had immunity, either through pre-existing immunity, infection, or vaccination: 40% of 330 million is 132 million, plus 26 million infected and 26 million vaccinated. There is some overlap between the three categories, but we’re taking the lowest estimate of pre-existing immunity, so maybe it sorta balances out.

The vaccines are said to be 90% effective. That’s probably optimistic—treatments often don’t perform as well in the wild as they do in clinical trials—but let’s assume 90% anyway. Furthermore, let’s assume that half the people being vaccinated already have immunity, due to pre-existing immunity or infection.

Then the number of people gaining immunity each day is 0.5*0.9*1,340,000, which is about 600,000 per day. This assumes nobody develops immunity through infection from here on out, though of course some will.

There’s no consensus on how much of the population needs to have immunity before you have herd immunity, but I’ve seen numbers like 70% tossed around, so let’s say 70%.

We assumed we had 184 M with immunity on February 1, and we need 231 M (70% of a US population of 330M) to have herd immunity, so we need 47 M more people. If we’re gaining 600,000 per day through vaccination, this would take 78 days from February 1, which would be April 20.
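If you want to check the arithmetic, here are the same few steps in Python; the numbers are the ones quoted above.

    US_population = 330_000_000
    num_immune    = 184_000_000              # estimated immune as of February 1, 2021
    target        = 0.70 * US_population     # herd immunity threshold, about 231 million
    per_day       = 0.5 * 0.9 * 1_340_000    # about 600,000 newly immune per day

    days = (target - num_immune) / per_day
    print(days)  # about 78 days after February 1, which lands around April 20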

So, the bottom line of this very crude calculation is that we should have herd immunity by the end of April.

I’ve pointed out several caveats. There are more, but I’ll only mention one, and that is that herd immunity is not an objective state. Viruses never completely go away; only one human virus—smallpox—has ever been eradicated, and that took two centuries after the development of a vaccine.

Every number in this post is arguable, and so the result should be taken with a grain of salt, as I said from the beginning. Certainly you shouldn’t put April 20 on your calendar as the day the pandemic is over. But this calculation does suggest that we should see a substantial drop in infections long before most of the population has been vaccinated.

Update: A few things have changed since this was written. For one thing, we’re vaccinating more people per day. See an update post with code you can update (or just carry out by hand) as numbers change.


Good news from Pfizer and Moderna

Both Pfizer and Moderna have announced recently that their SARS-COV-2 vaccine candidates reduce the rate of infection by over 90% in the active group compared to the control (placebo) group.

That’s great news. The vaccines may turn out to be less than 90% effective when all is said and done, but even so they’re likely to be far more effective than expected.

But there’s other good news that might be overlooked: the subjects in the control groups did well too, though not as well as in the active groups.

The infection rate was around 0.4% in the Pfizer control group and around 0.6% in the Moderna control group.

There were 11 severe cases of COVID in the Moderna trial, out of 30,000 subjects, all in the control group.

There were 0 severe cases of COVID in the Pfizer trial in either group, out of 43,000 subjects.
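For what it’s worth, the efficacy figure comes from comparing attack rates in the two arms: efficacy = 1 − (rate in the vaccinated group)/(rate in the control group). Here’s a minimal sketch with made-up case counts, just to show the calculation; these are not the actual trial numbers.

    def vaccine_efficacy(cases_active, n_active, cases_control, n_control):
        # Efficacy = 1 - (attack rate in active arm) / (attack rate in control arm)
        return 1 - (cases_active / n_active) / (cases_control / n_control)

    # Hypothetical counts for illustration only
    print(vaccine_efficacy(8, 20_000, 90, 20_000))  # about 0.91, i.e. 91% efficacy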

 

Why a little knowledge is a dangerous thing

Alexander Pope famously said

A little learning is a dangerous thing;
Drink deep, or taste not the Pierian spring:
There shallow draughts intoxicate the brain,
And drinking largely sobers us again.

I’ve been thinking lately about why a little knowledge is often a dangerous thing, and here’s what I’ve come to.

Any complex system has many causes acting on it. Some of these are going to be more legible than others. Here I’m using “legible” in a way similar to how James Scott uses the term. As Venkatesh Rao summarizes it,

A system is legible if it is comprehensible to a calculative-rational observer looking to optimize the system from the point of view of narrow utilitarian concerns and eliminate other phenomenology. It is illegible if it serves many functions and purposes in complex ways, such that no single participant can easily comprehend the whole. The terms were coined by James Scott in Seeing Like a State.

People who have a little knowledge of a subject are only aware of some of the major causes that are acting, and probably they are aware of the most legible causes. They have an unbalanced view because they are aware of the forces pushing in one direction but not aware of other forces pushing in other directions.

A naive view may be unaware of a pair of causes in tension, and may thus have a somewhat balanced perspective. And an expert may be aware of both causes. But someone who knows about one cause but not yet about the other is unbalanced.

Examples

When I first started working at MD Anderson Cancer Center, I read a book on cancer called One Renegade Cell. After reading the first few chapters, I wondered why we’re not all dead. It’s easy to see how cancer can develop from one bad cell division and kill you a few weeks later. It’s not as easy to understand why that doesn’t usually happen. The spreading of cancer is more legible than natural defenses against cancer.

I was recently on the phone with a client who had learned enough about data deidentification to become worried. I explained that there were also reasons to not be as worried, but that they’re more complicated, less legible.

What to do

Theories are naturally biased toward causes that are amenable to theory, toward legible causes. Practical experience and empirical data tend to balance out theory by providing some insight into less legible causes.

A little knowledge is dangerous not so much because it is partial but because it is biased; it’s often partial in a particular way, such as theory lacking experience. If you spiral in on knowledge in a more balanced manner, with a combination of theory and experience, you might not be as dangerous along the way.

When theory and reality differ, the fault lies in the theory. More on that in my next post. Theory necessarily leaves out complications, and that’s what makes it useful. The art is knowing which complications can be safely ignored under which circumstances.


Time spent on the moon

Lunar module and lunar rover, photo via NASA

This post will illustrate two things: the amount of time astronauts have spent on the moon, and how to process dates and times in Python.

I was curious how long each Apollo mission spent on the lunar surface, so I looked up the timelines for each mission from NASA. Here’s the timeline for Apollo 11, and you can find the timelines for the other missions by making the obvious change to the URL.

Here are the data on when each Apollo lunar module touched down and when it ascended.

    data = [
        ("Apollo 11", "1969-07-20 20:17:39", "1969-07-21 17:54:00"),
        ("Apollo 12", "1969-11-19 06:54:36", "1969-11-20 14:25:47"),
        ("Apollo 14", "1971-02-05 09:18:13", "1971-02-06 18:48:42"),
        ("Apollo 15", "1971-07-30 22:16:31", "1971-08-02 17:11:23"),
        ("Apollo 16", "1972-04-21 02:23:35", "1972-04-24 01:25:47"),
        ("Apollo 17", "1972-12-11 19:54:58", "1972-12-14 22:54:37"),
    ]

Here’s a first pass at a program to parse the dates and times above and report their differences.

    from datetime import datetime, timedelta

    def str_to_datetime(string):
        return datetime.strptime(string, "%Y-%m-%d %H:%M:%S")

    def diff(str1, str2):
        return str_to_datetime(str1) - str_to_datetime(str2)

    for (mission, touchdown, liftoff) in data:
        print(f"{mission} {diff(liftoff, touchdown)}")

This works, but the formatting is unsatisfying.

    Apollo 11 21:36:21
    Apollo 12 1 day, 7:31:11
    Apollo 14 1 day, 9:30:29
    Apollo 15 2 days, 18:54:52
    Apollo 16 2 days, 23:02:12
    Apollo 17 3 days, 2:59:39

It would be easier to scan the output if it were all in hours. So we rewrite our diff function as follows.

    def diff(str1, str2):
        delta = str_to_datetime(str1) - str_to_datetime(str2)
        hours = delta.total_seconds() / 3600
        return round(hours, 2)

Now the output is easier to read.

    Apollo 11 21.61
    Apollo 12 31.52
    Apollo 14 33.51
    Apollo 15 66.91
    Apollo 16 71.04
    Apollo 17 74.99

These durations fall into three clusters, corresponding to the Apollo mission types G, H, and J. Apollo 11 was the only G-type mission. Apollo 12, 13, and 14 were H-type, intended to demonstrate a precise landing and explore the lunar surface. (Apollo 13 had to loop around the moon without landing.) The J-type missions were more extensive scientific missions. These missions included a lunar rover (“moon buggy”) to let the astronauts travel further from the landing site. There were no I-type missions; the objectives of the original I-type missions were merged into the J-type missions.

Incidentally, UNIX systems store times as seconds since 1970-01-01 00:00:00. That means the first two lunar landings were at negative times and the last four were at positive times. More on UNIX time here.
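Here’s a quick check of that claim, continuing with the data and str_to_datetime defined above. It treats the times as UTC, which is an assumption, but close enough to settle the sign.

    # Seconds from the UNIX epoch to each touchdown, treating the times as UTC
    epoch = datetime(1970, 1, 1)
    for (mission, touchdown, liftoff) in data:
        seconds = (str_to_datetime(touchdown) - epoch).total_seconds()
        print(mission, seconds)  # negative for Apollo 11 and 12, positive afterward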


Sample size calculation

If you’re going to run a test on rabbits, you have to decide how many rabbits you’ll use. This is your sample size. A lot of what statisticians do in practice is calculate sample sizes.

A researcher comes to talk to a statistician. The statistician asks what effect size the researcher wants to detect. Do you think the new thing will be 10% better than the old thing? If so, you’ll need to design an experiment with enough subjects to stand a good chance of detecting a 10% improvement. Roughly speaking, sample size is inversely proportional to the square of effect size. So if you want to detect a 5% improvement, you’ll need 4 times as many subjects as if you want to detect a 10% improvement.
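Here’s a rough sketch of that scaling using the standard two-sample normal approximation, n ≈ 2(z₁₋α/₂ + z_power)²σ²/δ² per group. Treating the “10% improvement” as an effect of 0.10 standard deviations, and the choices of σ = 1, α = 0.05, and 80% power, are my assumptions for illustration.

    from scipy.stats import norm

    def n_per_group(effect, sd=1.0, alpha=0.05, power=0.8):
        # Two-sample normal approximation: n = 2 (z_{1-a/2} + z_power)^2 sd^2 / effect^2
        z_alpha = norm.ppf(1 - alpha / 2)
        z_power = norm.ppf(power)
        return 2 * (z_alpha + z_power)**2 * sd**2 / effect**2

    print(n_per_group(0.10))  # detect a 10% improvement
    print(n_per_group(0.05))  # detect a 5% improvement: about 4 times as many subjects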

You’re never guaranteed to detect an improvement [1]. The race is not always to the swift, nor the battle to the strong. So it’s not enough to think about what kind of effect size you want to detect; you also have to think about how likely you want to be to detect it.

Here’s what often happens in practice. The researcher makes an arbitrary guess at what effect size she expects to see. Then initial optimism may waver and she decides it would be better to design the experiment to detect a more modest effect size. When asked how high she’d like her chances to be of detecting the effect, she thinks 100% but says 95% since it’s necessary to tolerate some chance of failure.

The statistician comes back and says the researcher will need a gargantuan sample size. The researcher says this is far outside her budget. The statistician asks what the budget is, and what the cost per subject is, and then the real work begins.

The sample size the negotiation will converge on is the budget divided by the cost per sample. The statistician will fiddle with the effect size and probability of detecting it until the inevitable sample size is reached. This sample size, calculated to 10 decimal places and rounded up to the next integer, is solemnly reported with a post hoc justification containing no mention of budgets.

Sample size is always implicitly an economic decision. If you’re willing to make it explicitly an economic decision, you can compute the expected value of an experiment by placing a value on the possible outcomes. You make some assumptions—you always have to make assumptions—and calculate the probability under various scenarios of reaching each conclusion for various sample sizes, and select the sample size that leads to the best expected value.
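Here’s a sketch of what such an expected value calculation might look like. Every number in it (the value of a correct detection, the cost per subject, the assumed effect size) is hypothetical, and the power function uses the same normal approximation as above.

    from scipy.stats import norm

    effect = 0.10        # assumed true improvement, in standard deviation units
    sd     = 1.0         # outcome standard deviation
    alpha  = 0.05
    value  = 1_000_000   # hypothetical value of correctly detecting the improvement
    cost   = 100         # hypothetical cost per subject

    def power(n):
        # Power of a two-sample z-test with n subjects per group
        z_alpha = norm.ppf(1 - alpha / 2)
        return norm.cdf(effect / (sd * (2 / n)**0.5) - z_alpha)

    def expected_value(n):
        # Expected payoff minus the cost of enrolling 2n subjects
        return power(n) * value - 2 * n * cost

    best_n = max(range(10, 5001), key=expected_value)
    print(best_n, expected_value(best_n))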


[1] There are three ways an A/B test can turn out: A wins, B wins, or there isn’t a clear winner. There’s a tendency to not think enough about the third possibility. Interim analysis often shuts down an experiment not because there’s a clear winner, but because it’s becoming clear there is unlikely to be a winner.

Hohmann transfer orbit

How does a spacecraft orbiting a planet move from one circular orbit to another? It can’t just change lanes like a car going around a racetrack because speed and altitude cannot be changed independently.

The most energy-efficient way to move between circular orbits is the Hohmann transfer orbit [1]. The Hohmann orbit is an idealization, but it approximates maneuvers actually done in practice.

The Hohmann transfer requires applying thrust twice: once to leave the first circular orbit into the elliptical orbit, and once again to leave the elliptical orbit for the new circular orbit.

Hohmann transfer orbit

Suppose we’re in the orbit represented by the inner blue circle above and we want to move to the outer green circle. We apply our first instantaneous burst of thrust, indicated by the inner ×, and that puts us into the orange elliptical orbit.

(We can’t move faster in our current orbit without continually applying thrust because velocity determines altitude. The new orbit will pass through the point at which we applied the thrust, and so our new orbit cannot be a circle because distinct concentric circles don’t intersect.)

The point at which we first apply thrust will be the point of the new orbit closest to the planet, the point with maximum kinetic energy. The point furthest from the planet, the point with maximum potential energy, will occur 180° later on the opposite side. The first burst of thrust is calculated so that the maximum altitude of the resulting elliptical orbit is the desired altitude of the new circular orbit.

Once the elliptical orbit is at its maximum distance from the planet, marked by the outer ×, we apply the second thrust.  The amount of thrust is whatever it needs to be in order to maintain a circular orbit at the new altitude. The second half of the elliptical orbit, indicated by the dashed orange curve, is not taken; it’s only drawn to show the orbit we would stay on if we didn’t apply the second thrust.

So in summary, we use one burst of thrust to enter an elliptic orbit, and one more burst of thrust to leave that elliptical orbit for the new circular orbit. There are ways to move between circular orbits more quickly, but they require more fuel.
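To put rough numbers on the two burns, here’s a Python sketch based on the vis-viva equation. The gravitational parameter is Earth’s, and the two radii (roughly low Earth orbit to geostationary orbit) are example values I chose, not anything specific to the post.

    from math import sqrt, pi

    mu = 3.986e14   # Earth's gravitational parameter, m^3/s^2
    r1 = 6.778e6    # radius of initial circular orbit (roughly LEO), m
    r2 = 4.2164e7   # radius of final circular orbit (roughly GEO), m

    # First burn: from the circular orbit at r1 onto the transfer ellipse
    dv1 = sqrt(mu / r1) * (sqrt(2 * r2 / (r1 + r2)) - 1)
    # Second burn: from the transfer ellipse onto the circular orbit at r2
    dv2 = sqrt(mu / r2) * (1 - sqrt(2 * r1 / (r1 + r2)))
    # Time of flight: half the period of the transfer ellipse
    tof = pi * sqrt((r1 + r2)**3 / (8 * mu))

    print(dv1 + dv2)   # total delta-v in m/s, roughly 3.9 km/s for these radii
    print(tof / 3600)  # transfer time in hours, roughly 5 hours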

The same principles work in reverse, so you could also use a Hohmann transfer to descend from a higher orbit to a lower one. You would apply your thrust in the direction opposite to your motion.

There are several idealizations in the Hohmann transfer orbit. The model assumes that the initial and final orbits lie in the same plane, that both are circular, and that the two burns are each instantaneous.

The Hohmann transfer also assumes that the mass of the spacecraft is negligible compared to the planet. This would apply, for example, to a typical communication satellite, but perhaps not to a Death Star.


[1] If you’re moving from one orbit to another with more than about 12 times the radius, then a bi-elliptic transfer would use less fuel. Instead of taking half of an elliptical orbit to make the transfer, it fires the thrusters three times, using half of each of two different elliptical orbits to reach the desired circular orbit.

Nuclear Tornadoes

Tornado with lightning

I’ve recently started reading One Giant Leap, a book about the Apollo missions. In a section on nuclear testing, the book describes what was believed to be a connection between weapon tests and tornadoes.

In 1953 and 1954 there were all-time record outbreaks of tornadoes across the US, accompanied by fierce thunderstorms. The belief that the tornadoes were caused by the atomic testing in Nevada was so common among the public that Gallup conducted a poll on the topic, which showed that 29% of Americans thought that A-bombs were causing tornadoes many states away; 20% couldn’t say whether they were or not. Members of Congress required the military to provide data on any connection.

It wasn’t unreasonable at the time to speculate that there could be a connection between nuclear testing and tornadoes. No one knew much about the effects of nuclear explosions or about the causes of tornadoes in 1954. But here’s the part that was preventable:

In fact it turned out there weren’t really more tornadoes in those years; it was just that tornado reporting and tracking had gotten much more effective and thorough.

Once the public got the impression that tornadoes were becoming more frequent, confirmation bias would firm up that impression with each tornado sighting.

I imagine someone knew this in 1954. Someone must have looked at the data and said “Hey, wait a minute.” This person was probably ridiculed by the few who even listened.

Update: I found a newspaper article cited in One Giant Leap: “A-Bomb Link to Tornadoes is Discounted,” Washington Post. November 18, 1954. The article quoted D. Lee Harris of the United States Weather Bureau saying “Improvement in tornado reporting is largely responsible for the great increase in total number of the storms reported in recent years.”

The Gallup poll was prior to the article cited. It would be interesting if there had been a follow-up poll to see how much impact Harris’ statement had on public opinion.