Polish serenity

Posted on 3 February 2026 by John

Yesterday I ran across the following mashup by Amy Swearer of a Polish proverb and the Serenity Prayer.

Lord, grant me the serenity to accept when it’s no longer my circus,
the courage to control the monkeys that are still mine,
and the wisdom to know the difference.

The proverb is “Nie mój cyrk, nie moje małpy,” literally “Not my circus, not my monkeys”.

Memorizing chemical element symbols

Posted on 7 January 2026 by John

Here’s something I’ve wondered about before: are there good mnemonics for chemical element symbols?

Some element symbols are based on Latin or German names and seem arbitrary to English speakers, such as K (kalium) for potassium or Fe (ferrum) for iron. However, these elements are very common and so their names and symbols are familiar.

When you take out the elements whose symbols are mnemonic in another language, every element symbol begins with the first letter of the element name. The tricky part is the second letter. For example, does Ra stand for radon or radium?

The following rule of thumb usually holds whenever there is a chemical symbol what corresponds to the first letters of two different elements:

The lightest/longest-known element wins.

Scientists didn’t wait until the periodic table was complete before assigning symbols, and the easiest names were handed out first. Calcium (20) was assigned Ca, for example, before cadmium (48) and californium (98) were known.

The elements were discovered roughly in order of atomic weight. For example, beryllium (4) was discovered before berkelium (97) and neon (10) was discovered before neptunium (93). So sometimes you can substitute knowledge of chemistry for knowledge of history. [1]

There are instances where the heavier element got to claim the first-two-letter symbol. Usually the heavier element was discovered first. That’s why Ra stands for radium (88) and not radon (86). One glaring exception to this rule is that palladium (Pd) was discovered a century before protactinium (Pa).

Often the element that was discovered first is more familiar, and so you could almost say that when there’s a conflict, the more familiar element wins. For example, Li stands for lithium and not livermorium. This revises our rule of thumb above:

The lightest/longest-known/most familiar element wins.

To return to the question at the top of the post, I’m not aware of a satisfying set of mnemonics for chemical element symbols. But there are some heuristics. Generally the elements that are the lightest, most familiar, and have been known the longest get the simpler names. Maybe you can remember, for example, that berkelium must be Bk because B, Be, and Br were already taken by the time berkelium was discovered.

After using this heuristic, you could apply more brute-force mnemonic techniques for whenever the heuristic doesn’t work. (Whenever it doesn’t work for you: mnemonics are very personal.) For example, you might imagine a registered nurse (an RN) spraying the insecticide Raid on a fish, fish being a Major system encoding of the number 86, the atomic number of radon.

[1] Chemical elements named after scientists, planets, and laboratories appear toward the end of the table and are recent discoveries.

When was Newton born?

Posted on 23 December 2025 by John

A young Isaac Newton unwrapping and apple as a Christmas present.

Newton’s birthday was on Christmas when he was born, but now his birthday is not.

When Newton was born, England was still using the Julian calendar, and would continue to use the Julian calendar until 25 years after his death.

On the day of Newton’s birth, his parents would have said the date was December 25, 1642. We would now describe the date as January 4, 1643.

You’ll sometimes see Newton’s birthday written as December 25, 1642 O.S. The “O.S.” stands for “Old Style,” i.e. Julian calendar. Of course the Newton family would not have written O.S. because there was no old style until the new style (i.e. Gregorian calendar) was adopted, just as nobody living in the years before Christ would have written a date as B.C.

In a nutshell, the Julian year was too long, which made it drift out of sync with the astronomical calendar. The Julian year was 365 ¹/₄ days, whereas the Gregorian calendar has 365 ⁹⁷/₄₀₀ days, which more closely matches the time it takes Earth to orbit the sun. Removing three Leap Days (in centuries not divisible by 400) put the calendar back in sync. When countries adopted the Gregorian calendar, they had to retroactively remove excess Leap Days. That’s why Newton’s birthday got moved up 10 days.

You can read more on the Julian and Gregorian calendars here.

The winter solstice in the northern hemisphere was two days ago: December 21, 2025. And in 1642, using the Gregorian calendar, the solstice was also on December 21. But in England, in 1642, people would have said the solstice occurred on December 11, because the civil calendar was 10 days ahead of the astronomical calendar.

Rolling n-sided dice to get at least n

Posted on 10 December 2025 by John

Dungeons and Dragons dice

Say you have a common 6-sided die and need to roll it until the sum of your rolls is at least 6. How many times would you need to roll?

If you had a 20-sided die and you need to roll for a sum of at least 20, would that take more rolls or fewer rolls on average?

According to [1], the expected number of rolls of an n-sided dice for the sum of the rolls to be n or more equals

$\left(1 + \frac{1}{n}\right)^{n-1}$

So for a 6-sided die, the expected number of rolls is (7/6)⁵ = 2.1614.

For a 20-sided die, the expected number of rolls is (21/20)¹⁹ = 2.5270.

The expected number of rolls is an increasing function of n, and it converges to e.

Here’s a little simulation script for the result above.

from numpy.random import randint

def game(n):
    s = 0
    i = 0
    while s < n:
        s += randint(1, n+1)
        i += 1
    return i

N = 1_000_000
s = 0
n = 20
for _ in range(N):
    s += game(n)
print(s / N)

This produced 2.5273.

[1] Enrique Treviño. Expected Number of Dice Rolls for the Sum to Reach n. American Mathematical Monthly, Vol 127, No. 3 (March 2020), p. 257.

TV tuned to a dead channel

Posted on 24 November 2025 by John

The opening line of William Gibson’s novel Neuromancer is famous:

The sky above the port was the color of a television, tuned to a dead channel.

When I read this line, I knew immediately what he meant, and thought it was a brilliant line. Later I learned that younger readers didn’t know what he was saying.

TV tuned to a dead channel circa 1960

My mind went to an old black-and-white television, one that received analog broadcasts, and that displayed “snow” when tuned to a channel that had no broadcast signal. Someone whose earliest memories of television are based on digital color broadcast might imagine the sky above the port was solid blue rather than crackly gray.

Gibson discusses how his book has aged in a preface to a recent edition. He says that science fiction that is too prescient would be received poorly.

Imagine a novel from the sixties whose author had somehow fully envisioned cellular telephony circa 2004, and had worked it, exactly as we know it today, into the fabric of her imaginary future. Such a book would have seemed highly peculiar in the sixties … in ways that would quickly overwhelm the narrative.

He then goes on to say

I suspect that Neuromancer owes much of its shelf life to my almost perfect ignorance of the technology I was extrapolating from. … Where I made things up from whole cloth, the colors remain bright.

I find it odd that many judge a work of science fiction by what it “got right.” I don’t read science fiction as a forecast; read it to enjoy a story. I don’t need a book to be prescient, but until reading Gibson’s remarks it hadn’t occurred to me that fiction that is too prescient might not be enjoyable fiction, at least for its first readers.

Four generalizations of the Pythagorean theorem

Posted on 13 November 2025 by John

Here are four theorems that generalize the Pythagorean theorem. Follow the links for more details regarding each equation.

1. Theorem by Apollonius for general triangles.

2. Edsgar Dijkstra’s extension of the Pythagorean theorem for general triangles.

$\text{sgn}(\alpha + \beta - \gamma) = \text{sgn}(a^2 + b^2 - c^2)$

3. A generalization of the Pythagorean theorem to tetrahedra.

$V_0^2 = \sum_{i=1}^n V_i^2$

4. A unified Pythagorean theorem that covers spherical, plane, and hyperbolic geometry.

$A(c) = A(a) + A(b) - \kappa \frac{A(a) \, A(b)}{2\pi}$

Weighting an average to minimize variance

Posted on 12 November 2025 by John

Suppose you have $100 to invest in two independent assets, A and B, and you want to minimize volatility. Suppose A is more volatile than B. Then putting all your money on A would be the worst thing to do, but putting all your money on B would not be the best thing to do.

The optimal allocation would be some mix of A and B, with more (but not all) going to B. We will formalize this problem and determine the optimal allocation, then generalize the problem to more assets.

Two variables

Let X and Y be two independent random variables with finite variance and assume at least one of X and Y is not constant. We want to find t that minimizes

$\text{Var}[tX + (1-t)Y]$

subject to the constraint 0 ≤ t ≤ 1. Because X and Y are independent,

$\text{Var}[tX + (1-t)Y] = t^2 \text{Var}[X] + (1-t)^2 \text{Var}[Y]$

Taking the derivative with respect to t and setting it to zero shows that

$t = \frac{\text{Var}[Y]}{\text{Var}[X] + \text{Var}[Y]}$

So the smaller the variance on Y, the less we allocate to X. If Y is constant, we allocate nothing to X and go all in on Y. If X and Y have equal variance, we allocate an equal amount to each. If X has twice the variance of Y, we allocate 1/3 to X and 2/3 to Y.

Multiple variables

Now suppose we have n independent random variables X_i for i running from 1 to n, and at least one of the variables is not constant. Then we want to minimize

$\text{Var}\left[ \sum_{i=1}^n t_i X_i \right] = \sum_{i=1}^n t_i^2 \text{Var}[X_i]$

subject to the constraint

$\sum_{i=1}^n t_i = 1$

and all t_i non-negative. We can solve this optimization problem with Lagrange multipliers and find that

$t_i \text{Var}[X_i] = t_j \text{Var}[X_j]$

for all 1 ≤ i, j ≤ n. These (n − 1) equations along with the constraint that all the t_i sum to 1 give us a system of equations whose solution is

$t_i = \frac{\prod_{j \ne i} \text{Var}[X_j]}{\sum_{i = 1}^n \prod_{j \ne i} \text{Var}[X_j]}$

Incidentally, the denominator has a name: the (n − 1)st elementary symmetric polynomial in n variables. More on this in the next post.

How much is a gigawatt?

Posted on 7 November 2025 by John

There’s increasing talk of gigawatt data centers. Currently the largest data center, Switch’s Citadel Campus in Nevada, uses 850 megawatts of power. OpenAI’s Stargate data center, under construction, is supposed to use 1.2 gigawatts.

Gigawatt

An average French nuclear reactor produces about a gigawatt of power. If the US were allowed build nuclear reactors, we could simply build one reactor for every gigantic data center. Unfortunately, the Nuclear Regulatory Commission essentially prohibits the construction of profitable nuclear reactors.

An American home uses about 1200 watts of power, so a gigawatt of electricity could power 800,000 homes. So roughly, a gigawatt is a megahome.

Gigawatt-year

A gigawatt is a unit of power, not energy. Energy is power over some time period.

A gigawatt-year is about 3 × 10¹⁶ joules, or 30 petajoules.

A SpaceX Starship launch releases 50 terajoules of energy, so a gigawatt-year is 60 Starship launches.

A couple months ago I wrote about illustrating cryptographic strength in terms of the amount of energy needed to break it, and how much water that much energy would boil. Let’s do something similar for a gigawatt-year.

It takes about 300 kilojoules of energy to boil a liter of water [1], so 30 petajoules would boil 100 billion liters of water. So a gigawatt-year of energy would be enough to boil Coniston Water, the third largest lake in England.

If you could convert a kilogram of matter to energy according to E = mc², this would release 90 petajoules. So a gigawatt-year is the energy in about 300 grams of matter.

[1] In detail, boiling a liter of water is defined as increases the temperature from 20° C to 100° C at sea level.

Impedance and Triangular Numbers

Posted on 2 November 2025 by John

A few days ago I wrote two posts about how to create a Smith chart, a graphical device used for impedance calculations. Then someone emailed me to point out the connection between the Smith chart and triangular numbers.

The Smith chart is the image of a rectangular grid in the right half-plane under the function

f(z) = (z − 1)/(z + 1).

If you subtract the values of f at consecutive integers, you get the reciprocal of a triangular number.

f(n) − f(n − 1) = 2/(n(n + 1)) = 1 / T_n

Or to put it another way,

f(n) − f(n − 1) = 1 / (1 + 2 + 3 + … + n).

In the first post on the Smith chart we showed that the function f maps vertical lines

in the z plane to circles in the w plane all touching at w = 1.

The circles are symmetric about the real axis and the diameter runs from f(n) to 1. The separation between the circles on the left side is thus

f(n) − f(n − 1) = 1 / T_n.

Number the circles starting from the outermost as 0, 1, 2, …. Then the maximum distance between circle n and circle n − 1 is 1 / T_n. You can see in the graph above that the distance between circle 0 and circle 1 is 1. It’s a little harder to see that the distance between circle 1 and circle 2 is 1/3. It looks like the distance between circles 2 and 3 is about half of that between circles 1 and 2, so it would be 1/6.

Cross ratio

Posted on 1 November 2025 by John

The cross ratio of four points A, B, C, D is defined by

$(A, B; C, D) = \frac{AC \cdot BD}{BC \cdot AD}$

where XY denotes the length of the line segment from X to Y.

The idea of a cross ratio goes back at least as far as Pappus of Alexandria (c. 290 – c. 350 AD). Numerous theorems from geometry are stated in terms of the cross ratio. For example, the cross ratio of four points is unchanged under a projective transformation.

Complex numbers

The cross ratio of four (extended [1]) complex numbers is defined by

$(z_1, z_2; z_3, z_4) = \frac{(z_3 - z_1)(z_4 - z_2)}{(z_3 - z_2)(z_4 - z_1)}$

The absolute value of the complex cross ratio is the cross ratio of the four numbers as points in a plane.

The cross ratio is invariant under Möbius transformations, i.e. if T is any Möbius transformation, then

This is connected to the invariance of the cross ratio in geometry: Möbius transformations are projective transformations on a complex projective line. (More on that here.)

If we fix the first three arguments but leave the last argument variable, then

$T(z) = (z_1, z_2; z_3, z) = \frac{(z_3 - z_1)(z - z_2)}{(z_3 - z_2)(z - z_1)}$

is the unique Möbius transformation mapping z₁, z₂, and z₃ to ∞, 0, and 1 respectively.

The anharmonic group

Suppose (a, b; c, d) = λ ≠ 1. Then there are 4! = 24 permutations of the arguments and 6 corresponding cross ratios:

$\lambda, \frac{1}{\lambda}, 1 - \lambda, \frac{1}{1 - \lambda}, \frac{\lambda - 1}{\lambda}, \frac{\lambda}{\lambda - 1}$

Viewed as functions of λ, these six functions form a group, generated by

$\begin{align*} f(\lambda) &= \frac{1}{\lambda} \\ g(\lambda) &= 1 - \lambda \end{align*}$

This group is called the anharmonic group. Four numbers are said to be in harmonic relation if their cross ratio is 1, so the requirement that λ ≠ 1 says that the four numbers are anharmonic.

The six elements of the group can be written as

$\begin{align*} f(\lambda) &= \frac{1}{\lambda} \\ g(\lambda) &= 1 - \lambda \\ f(f(\lambda)) &= g(g(\lambda) = z \\ f(g(\lambda)) &= \frac{1}{\lambda - 1} \\ g(f(\lambda)) &= \frac{\lambda - 1}{\lambda} \\ f(g(f(\lambda))) &= g(f(g(\lambda))) = \frac{\lambda}{\lambda - 1} \end{align*}$

Hypergeometric transformations

When I was looking at the six possible cross ratios for permutations of the arguments, I thought about where I’d seen them before: the linear transformation formulas for hypergeometric functions. These are, for example, equations 15.3.3 through 15.3.9 in A&S. They relate the hypergeometric function F(a, b; c; z) to similar functions where the argument z is replaced with one of the elements of the anharmonic group.

I’ve written about these transformations before here. For example,

$F(a, b; c; z) = (1-z)^{-a} F\left(a, c-b; c; \frac{z}{z-1} \right)$

There are deep relationships between hypergeometric functions and projective geometry, so I assume there’s an elegant explanation for the similarity between the transformation formulas and the anharmonic group, though I can’t say right now what it is.

[1] For completeness we need to include a point at infinity. If one of the z equals ∞ then the terms involving ∞ are dropped from the definition of the cross ratio.

Uncategorized

Polish serenity

Memorizing chemical element symbols

Related posts

When was Newton born?

Rolling n-sided dice to get at least n

TV tuned to a dead channel

Four generalizations of the Pythagorean theorem

Weighting an average to minimize variance

Two variables

Multiple variables

Related posts

How much is a gigawatt?

Gigawatt

Gigawatt-year

Impedance and Triangular Numbers

Related posts

Cross ratio

Complex numbers

The anharmonic group

Hypergeometric transformations

Related posts