Harmonic e

Douglas Hofstadter discovered that the 8th harmonic number equals e.

1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6}+ \frac{1}{7} + \frac{1}{8} = e

OK, not really. The equation above cannot possibly be true, because the left side is rational and the right side is irrational.

However, Hofstadter showed that the equation does hold if you carry all calculations out to three decimal places.

    1.000
    0.500
    0.333
    0.250
    0.200
    0.167
    0.143
    0.125
    -----
    2.718

The following Python code verifies the result using four-place decimal arithmetic.

    >>> from decimal import *
    >>> getcontext().prec = 4
    >>> sum(Decimal(1)/Decimal(k) for k in range(1, 9))
    Decimal('2.718')
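To see just how close the coincidence is, we can redo the sum in exact rational arithmetic. Here's a quick sketch using Python's fractions module (H8 is my name for the 8th harmonic number):

```python
from fractions import Fraction
from math import e

# The 8th harmonic number, computed exactly
H8 = sum(Fraction(1, k) for k in range(1, 9))

print(H8)         # 761/280
print(float(H8))  # about 2.71786
print(e)          # about 2.71828
```

The two numbers agree to three decimal places only because both round to 2.718; they part ways in the fourth decimal.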


Voiced and unvoiced consonants and digits

The latest episode of The History of English Podcast discusses the history of pronunciation changes in the Elizabethan period. The episode has a lot to say about the connections between voiced and unvoiced pairs of consonants, and the circumstances under which a consonant might change from voiced to unvoiced and vice versa.

The major mnemonic system encodes digits as consonant sounds in order to make words out of numbers. The system works in terms of sounds, not spellings, and so some of the symbols below are not English letters but rather IPA symbols [1]. More on the system here.

0: S Z, 1: T D ð θ, 2: N ŋ, 3: M, 4: R, 5: L, 6: ʤ ʧ ʃ ʒ, 7: K G, 8: F V, 9: P B

If you’re not familiar with the concept of voiced and unvoiced vocal sounds it may seem arbitrary that, for example, the F and V sounds both decode to 8, or that the S and SH sounds map to different numbers, 0 and 6 respectively.

The allocation of sounds may seem inefficient at first. Some numbers get more sounds than others because some sounds belong to clusters of related sounds and some do not. For example, there's no such thing as an unvoiced L sound, so 5 gets L and no other sound. But 9 gets both P and B because these are the unvoiced and voiced variants of the same sound.

The allocation is more uniform than it seems at first when you realize each digit gets one consonant group.
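The grouping by consonant cluster is easy to see if you write the table above down in code. Here's a minimal sketch, with the IPA symbols stored as Python strings:

```python
# Digit -> consonant sounds in the major mnemonic system.
# Single characters outside ASCII are IPA symbols for single sounds.
major = {
    0: ["s", "z"],
    1: ["t", "d", "ð", "θ"],
    2: ["n", "ŋ"],
    3: ["m"],
    4: ["r"],
    5: ["l"],
    6: ["ʤ", "ʧ", "ʃ", "ʒ"],
    7: ["k", "g"],
    8: ["f", "v"],
    9: ["p", "b"],
}

# Inverse map for decoding: sound -> digit
sound_to_digit = {s: d for d, sounds in major.items() for s in sounds}

print(sound_to_digit["f"], sound_to_digit["v"])  # 8 8
print(sound_to_digit["s"], sound_to_digit["ʃ"])  # 0 6
```

The voiced/unvoiced pairing shows up directly: F and V share a digit, while S and SH do not.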


[1] Here are the IPA symbols above that do not correspond to English letters.

|-----+---------|
| IPA | Example |
|-----+---------|
| ð   | THis    |
| θ   | THistle |
| ŋ   | siNG    |
| ʤ   | Jar     |
| ʧ   | CHurCH  |
| ʃ   | SHoe    |
| ʒ   | corsaGe |
|-----+---------|

For more on IPA, see the Wikipedia IPA help page.

Year share

This post will be about psychology as much as math, looking at a number of algorithms for mentally calculating the same function.

The most difficult part of mentally computing days of the week is computing

⌊5y/4⌋ % 7

where y is the last two digits of a year. This quantity is called the year share because it is the year’s contribution to the day-finding algorithm.

The most obvious way to compute this function is to take the whole number part of y/4, add it to y, and take the remainder when dividing by 7. In Python, this would be

    def method0(y): return (y + y//4) % 7

The only reason there’s more to say is that the ease of description is not the same as ease of execution. There are several other methods that some people find easier to use.

Outline

In [1] the author gives several methods for computing the year share. Each of these methods has some advantage over the most direct approach. They generally work with smaller numbers and take advantage of operations that are mentally trivial such as splitting a number into its tens and ones digits.

I’ll describe one of these methods in words, the so-called “odd + 11” method, then implement all the methods in Python. Each method looks more complicated than the direct method if you just glance at the code. But if you mentally execute the code you can see why each method could be easier for a human to carry out.

You can find algebraic proofs of why each method works in [1]. Here I'll use brute force: the methods only have to work for integer values of y from 0 to 99, so I'll let Python verify that the methods all produce the same result. This brute force approach uncovered an error in the paper, which I correct in the code below.

Odd + 11 method

Here’s the “odd + 11” method:

  1. If y is odd, add 11 to y.
  2. Divide y by 2.
  3. If y is odd after the step above, add 11.
  4. Set y to the remainder when dividing y by 7.
  5. Return 7 − y.

This method requires adding no more than 11, and it only requires one mental “register.” That is, all operations are done in place. With the direct method, you have to hold y and ⌊y/4⌋ in your head.

Python implementations

Here are Python implementations of 10 methods of computing a value congruent to ⌊5y/4⌋ mod 7. That is, the methods may produce different values, but the values differ by multiples of 7.

Note that the last three methods only require working with single-digit numbers.

Method #6 may be the easiest method to use. YMMV.

    from math import floor
    
    def method0(y):
        return y + y//4
    
    def method1(y):
        if y % 2 == 1:
            y += 11
        y //= 2  # y is even at this point, so the division is exact
        if y % 2 == 1:
            y += 11
        y %= 7
        return 7 - y

    def method2(y):
        parity1 = y % 2
        if parity1:
            y -= 3
        y //= 2  # y is even at this point, so the division is exact
        parity2 = y % 2
        if parity1 != parity2:
            y -= 3
        return -y
    
    def method3(y):
        t = y % 12
        return y // 12 + t + t//4
    
    def method4(y):
        r = y % 4
        y -= r
        return r - y//2
    
    def method5(y):
        # placeholder to maintain the numbering in the paper
        return method0(y)
    
    def method6(y):
        r = y % 4
        y -= r # i.e. latest leap year
        t = y // 10
        u = y % 10
        return 2*t - u//2 + r
    
    def method7(y):
        t = y // 10
        u = y % 10
        x = (2*t + u) // 4
        x += u
        # The paper says to return 2t - x but it should be the opposite.
        return x - 2*t
    
    def method8(y):
        t = y // 10
        u = y % 10
        p = t % 2
        return (2*t + 10*p + u + (2*p + u)//4) % 7
    
    def method9(y):
        t = y // 10
        u = y % 10    
        return u - t + floor(u/4 - t/2)
    
    # verify all results are equal mod 7
    methods = [method0, method1, method2, method3, method4,
               method5, method6, method7, method8, method9]
    for y in range(100):
        for f in methods:
            assert (method0(y) - f(y)) % 7 == 0
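As a quick sanity check on the direct method, here's a hypothetical year_share helper (the name is mine) applied to y = 22, along with a brute-force check of the identity ⌊5y/4⌋ = y + ⌊y/4⌋ that it relies on:

```python
def year_share(y):
    # direct method: (y + floor(y/4)) mod 7, which equals floor(5y/4) mod 7
    return (y + y//4) % 7

# The identity holds for all two-digit years
assert all((y + y//4) % 7 == (5*y//4) % 7 for y in range(100))

# For y = 22: floor(5*22/4) = 27, and 27 % 7 = 6
print(year_share(22))  # 6
```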


[1] S. Kamal Abdali. Finding the year’s share in day-of-week calculations. Recreational Mathematics Magazine, Number 6, December 2016.

Decoding a grid square

I saw a reference last night to the grid square EL29fx and wanted to figure out where that is. There are many programs that will do this for you, but I wanted to do it by hand.

I wrote about how grid squares work a year ago, but I was rusty on the details, so this morning I went back and reread my earlier post.

Parsing a locator

The system used to create grid squares in amateur radio is the Maidenhead Locator System. Last year I wrote about why the system works the way it does. Here I will recap how the system works but refer you to the earlier post for the motivation behind the design choices.

First of all, the Maidenhead system interleaves the longitude and latitude information. So in the square EL29fx, the longitude information is (E, 2, f) and the latitude information is (L, 9, x). The first two characters, EL, give a rough location, a “field.” Then the next two characters, 29, refine the location to a “square.” The last two, fx, narrow the square down to a “subsquare.”[1]

Finding the field

The first character in the grid square specifies a range of longitude east of the International Date Line [2]. Each letter represents a range of 20° longitude: A for 0 to 20° longitude, B for 20° to 40°, etc. So E represents 80° to 100° longitude east of the antemeridian or 260° to 280° east of the prime meridian.

The second character in the grid square represents latitude north of the south pole in increments of 10°. So L represents 110° to 120° north of the south pole, or 20° to 30° north of the equator.

Together, EL tells us we’re looking at some place 260° to 280° E and 20° to 30° N, i.e. somewhere in Mexico or the US Gulf Coast.

Finding the square

The two digits in the middle of a grid square divide the field into 10 parts in each direction. Since a field is 20° wide and 10° tall, the first digit represents multiples of 2° longitude and the second digit represents multiples of 1° latitude.

So if we look at EL29, we know our location is between 264° and 266° E and between 29° and 30° N, which is somewhere around Houston, Texas.

Finding the subsquare

A subsquare divides a square into 24 parts in each direction. So a subsquare is 5 minutes of longitude wide and 2.5 minutes of latitude high.

The f in EL29fx refines the longitude to 264° 25′ E (95° 35′ W), and the x refines the latitude to 29° 57.5′ N. This narrows the location to somewhere in northwest Houston.

Northwest Houston
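The whole decoding procedure above fits in a few lines of code. Here's a hypothetical decode function, a sketch rather than a full implementation of the Maidenhead system: it handles exactly six characters and returns the southwest corner of the subsquare.

```python
def decode(locator):
    """Return (longitude, latitude) of a subsquare's southwest corner,
    in degrees east of the prime meridian and north of the equator."""
    lon = (ord(locator[0].upper()) - ord('A')) * 20   # field: 20 degrees of longitude
    lat = (ord(locator[1].upper()) - ord('A')) * 10   # field: 10 degrees of latitude
    lon += int(locator[2]) * 2                        # square: 2 degrees of longitude
    lat += int(locator[3])                            # square: 1 degree of latitude
    lon += (ord(locator[4].lower()) - ord('a')) * 5 / 60    # subsquare: 5 minutes
    lat += (ord(locator[5].lower()) - ord('a')) * 2.5 / 60  # subsquare: 2.5 minutes
    return lon - 180, lat - 90  # shift from antemeridian/south pole origin

print(decode("EL29fx"))  # roughly (-95.58, 29.96), i.e. northwest Houston
```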

[1] There’s no significance to the case of the letters other than as a reminder that the first and second pair of letters are interpreted differently.

[2] Technically, degrees east of 180th meridian, a.k.a. the antemeridian, which mostly coincides with the International Date Line. The latter zigzags a little to avoid splitting island groups.

Exponential of a line

Draw a line in the complex plane. What is the image of that line when you apply the exponential function?

A line through w with direction z is the set of points w + tz where w and z are complex and t ranges over the real numbers. The image of this line is

exp(w + tz) = exp(w) exp(tz)

and we can see from this that the character of the image depends mostly on z. We can first plot exp(tz), then rotate and stretch or shrink the result by multiplying by exp(w).

If z is real, so the line is horizontal, then the image of the line is a straight line.

If z is purely imaginary, so the line is vertical, then the image of the line is a circle.

But for a general value of z, corresponding to neither a perfectly horizontal nor perfectly vertical line, the image is a spiral.

Since the imaginary part of z is not zero, the image curves. And since the real part of z is not zero, the image spirals into the origin, either as t goes to infinity or to negative infinity, depending on the sign of the real part of z.

Here’s an example, the image of the line through (3, 0) with slope 2.

exponential spiral plot
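Here's a short numpy sketch that computes points on this image curve. The modulus of each point is exp(Re(w + tz)), which is why the curve spirals toward the origin as t → −∞; feeding the real and imaginary parts of the points to a plotting library reproduces a spiral like the one above.

```python
import numpy as np

w = 3.0      # line passes through (3, 0)
z = 1 + 2j   # direction with slope 2

t = np.linspace(-8, 2, 1001)
image = np.exp(w + t * z)

# The modulus depends only on the real part of w + tz
assert np.allclose(np.abs(image), np.exp(w + t * np.real(z)))
```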

Discrete derivatives

If you’ve taken calculus, and someone asks you what the derivative of x⁵ is, you can say without hesitation that it’s 5x⁴.

Now suppose they come back and say, “I’m sorry. I forgot to give you any context. Here x⁵ is a polynomial in the field of 343 elements.”

It turns out that this additional information does not change your answer, but it greatly changes how things are defined.

Derivatives are defined in terms of limits, but you don’t have limits over a finite field. And not only that, polynomials aren’t even functions when you’re working over a finite field [1].

You can define the derivative of a polynomial over any field to be just what you’d get if you were working over the real numbers, if you interpret things correctly. That works, and it’s useful. You can use this derivative in a purely algebraic setting to do some of the same sorts of things you’d use a derivative for in analysis, such as determining whether a polynomial has a multiple root.

You can define the derivative of a term

axⁿ

to be

naxⁿ⁻¹

and define the derivative of a polynomial to be the sum of the derivatives of each term.

Here a is an element of the base field and n is a positive integer. The expression na indicates the sum of the field element a with itself n times. (You can’t just multiply an integer by an element of an arbitrary field.)
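Here's a sketch of that definition in code, representing a polynomial as a list of coefficients with the constant term first. For simplicity it works over the prime field GF(p); the field of 343 elements is GF(7³), but its characteristic is 7, so the coefficient arithmetic below illustrates the same phenomenon. Over GF(7) the derivative of x⁵ is 5x⁴, as promised, while the derivative of x⁷ collapses to zero because 7 = 0 in the field.

```python
def formal_derivative(coeffs, p):
    """Formal derivative of a polynomial over GF(p).
    coeffs[n] is the coefficient of x^n."""
    deriv = [(n * a) % p for n, a in enumerate(coeffs)]
    return deriv[1:] if len(coeffs) > 1 else [0]

# x^5 over GF(7): derivative is 5x^4
print(formal_derivative([0, 0, 0, 0, 0, 1], 7))  # [0, 0, 0, 0, 5]

# x^7 over GF(7): derivative is identically zero
print(formal_derivative([0, 0, 0, 0, 0, 0, 0, 1], 7))  # [0, 0, 0, 0, 0, 0, 0]
```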

This post was motivated by a tweet from Sam Walters this morning.

[1] When you’re working with polynomials over a finite field, a polynomial is not a function. You could say that a polynomial is never a function, but the distinction is usually pedantic. When you’re working over an infinite field, you won’t get in trouble if you think of polynomials as functions. Even over a finite field you often won’t get into trouble, though sometimes you will.

 

Chemical element abbreviation patterns

I’ve wondered occasionally about the patterns in how chemical elements are abbreviated. If you don’t know the abbreviation for an element, is there a simple algorithm that would let you narrow the range of possibilities or improve your odds at guessing?

Here’s a survey of how the elements are abbreviated.

Latin and German

The elements that have been known the longest often have abbreviations that are mnemonic in Latin.

  • Iron (Fe)
  • Sodium (Na)
  • Silver (Ag)
  • Tin (Sn)
  • Antimony (Sb)
  • Tungsten (W)
  • Gold (Au)
  • Mercury (Hg)
  • Lead (Pb)
  • Potassium (K)
  • Copper (Cu)

I included tungsten in this section because its abbreviation is also mnemonic in another language, in this case German: W stands for Wolfram.

Initial letter

The easiest abbreviations to remember are simply the first letters of the element names (in English).

  • Boron (B)
  • Carbon (C)
  • Fluorine (F)
  • Hydrogen (H)
  • Iodine (I)
  • Nitrogen (N)
  • Oxygen (O)
  • Phosphorus (P)
  • Sulfur (S)
  • Uranium (U)
  • Vanadium (V)
  • Yttrium (Y)

First two letters

The largest group of elements consists of those abbreviated by the first two letters of their name. When in doubt, guess the first two letters.

  • Actinium (Ac)
  • Aluminum (Al)
  • Americium (Am)
  • Argon (Ar)
  • Barium (Ba)
  • Beryllium (Be)
  • Bismuth (Bi)
  • Bromine (Br)
  • Calcium (Ca)
  • Cerium (Ce)
  • Chlorine (Cl)
  • Cobalt (Co)
  • Dysprosium (Dy)
  • Erbium (Er)
  • Europium (Eu)
  • Flerovium (Fl)
  • Francium (Fr)
  • Gallium (Ga)
  • Germanium (Ge)
  • Helium (He)
  • Holmium (Ho)
  • Indium (In)
  • Iridium (Ir)
  • Krypton (Kr)
  • Lanthanum (La)
  • Lithium (Li)
  • Lutetium (Lu)
  • Molybdenum (Mo)
  • Neon (Ne)
  • Nickel (Ni)
  • Nobelium (No)
  • Oganesson (Og)
  • Osmium (Os)
  • Polonium (Po)
  • Praseodymium (Pr)
  • Radium (Ra)
  • Rhodium (Rh)
  • Ruthenium (Ru)
  • Scandium (Sc)
  • Selenium (Se)
  • Silicon (Si)
  • Tantalum (Ta)
  • Tellurium (Te)
  • Thorium (Th)
  • Titanium (Ti)
  • Xenon (Xe)

Many of these elements use the first two letters to avoid a conflict with the first letter. For example, helium uses He because hydrogen already took H.

For some initial letters, no element gets the single-letter abbreviation even though several elements share that letter. For example, actinium, aluminum, americium, and argon all start with A, but no element is abbreviated simply A.

Xenon could have been X, or dysprosium could have been just D, but that’s not how it was done.

First letter and next consonant

The next largest group of elements are abbreviated by their first letter and the next consonant, skipping over a vowel.

  • Bohrium (Bh)
  • Cadmium (Cd)
  • Cesium (Cs)
  • Dubnium (Db)
  • Gadolinium (Gd)
  • Hafnium (Hf)
  • Hassium (Hs)
  • Livermorium (Lv)
  • Magnesium (Mg)
  • Manganese (Mn)
  • Meitnerium (Mt)
  • Neodymium (Nd)
  • Neptunium (Np)
  • Nihonium (Nh)
  • Niobium (Nb)
  • Rubidium (Rb)
  • Samarium (Sm)
  • Technetium (Tc)
  • Zinc (Zn)
  • Zirconium (Zr)

Many of these elements would cause a conflict if they had been abbreviated using one of the above rules. For example, cadmium could not be C because that’s carbon, and it could not be Ca because that’s calcium.

Initials of first two syllables

  • Astatine (At)
  • Berkelium (Bk)
  • Darmstadtium (Ds)
  • Einsteinium (Es)
  • Fermium (Fm)
  • Lawrencium (Lr)
  • Mendelevium (Md)
  • Moscovium (Mc)
  • Platinum (Pt)
  • Promethium (Pm)
  • Roentgenium (Rg)
  • Terbium (Tb)
  • Thallium (Tl)

Initials of first and third syllable

  • Californium (Cf)
  • Copernicium (Cn)
  • Palladium (Pd)
  • Rutherfordium (Rf)
  • Seaborgium (Sg)
  • Tennessine (Ts)
  • Ytterbium (Yb)

First and last letter

  • Curium (Cm)
  • Radon (Rn)
  • Thulium (Tm)

Miscellaneous

  • Arsenic (As)
  • Chromium (Cr)
  • Plutonium (Pu)
  • Protactinium (Pa)
  • Rhenium (Re)
  • Strontium (Sr)
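The patterns above suggest a guessing heuristic. Here's a hypothetical sketch (the function name and rule order are my own): try the first letter, then the first two letters, then the first letter plus the next consonant, skipping any candidate already taken.

```python
def guess_symbol(name, taken):
    """Guess an element's symbol from its English name, avoiding
    symbols in the set `taken`. Tries, in order: first letter,
    first two letters, first letter + next consonant."""
    name = name.lower()
    candidates = [name[0], name[:2]]
    consonants = [c for c in name[1:] if c not in "aeiou"]
    if consonants:
        candidates.append(name[0] + consonants[0])
    for c in candidates:
        symbol = c.capitalize()
        if symbol not in taken:
            return symbol
    return None  # fall through to the rarer patterns

print(guess_symbol("boron", set()))           # B
print(guess_symbol("helium", {"H"}))          # He
print(guess_symbol("cadmium", {"C", "Ca"}))   # Cd
```

This cascade covers the three largest groups above, though it would still miss the syllable-based and miscellaneous abbreviations.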


Mental hash function

A few years ago I wrote about Manuel Blum’s proposed method for mentally computing a secure hash function. He proposed using this method as a password manager, using the hash of a web site’s name as the password for the site.

I first wrote about Blum’s method on the Heidelberg Laureate Forum blog, then wrote some footnotes to the post here.

NB: This method is insecure for reasons Rob Shearer describes in the comments.

Algorithm

Blum’s algorithm requires creating and memorizing two things: a random map from letters to digits, and a random permutation of digits.

Let f be a function that maps the set {A, B, C, …, Z} to the set {0, 1, 2, …, 9} created by a random number generator, and let g be a random permutation of {0, 1, 2, …, 9}.

Start with the text you want to hash. Then use f to convert the text to a sequence of digits, replacing each character c with f(c). Label this sequence of digits

a₀ a₁ a₂ … aₙ.

To create the first digit of your hash value, add the first and last digits of your sequence of digits, take the last digit of the sum, and apply the permutation g to it. In symbols, the first digit of the output, b₀, is given by

b₀ = g( (a₀ + aₙ) % 10 )

where a % 10 is the remainder when a is divided by 10.

For the rest of the digits, add the previous output to the current input, take the last digit, and apply the permutation. That is, for k > 0,

bₖ = g( (bₖ₋₁ + aₖ) % 10 ).

A few years ago Blum recommended using at least 12 characters.

Python code

Here’s some Python code to implement the method above and make all the details explicit.

    from numpy.random import default_rng
    
    rng = default_rng(20220516)
    
    fdata = rng.integers(10, size=26)
    def f(ch):
        ch = ch.lower()
        return fdata[ord(ch) - ord('a')]
    
    gdata = rng.permutation(10)
    def g(n):
        i = list(gdata).index(n)
        return gdata[(i+1) % 10]
    def hash(text):
        digits = [f(c) for c in text]
        N = len(text)
        out = [0] * N
        out[0] = g((digits[0] + digits[-1]) % 10)
        for i in range(1, N):
            out[i] = g((out[i-1] + digits[i]) % 10)
        return "".join(str(d) for d in out)
    
    print(hash("hello"))

This prints 26752.

Is mental cryptography hopeless?

It’s been eight years since I talked to Manuel Blum. Perhaps he realized years ago that this system is flawed. But I still find his initial question interesting: are there encryption algorithms that humans can execute mentally that cannot be easily broken by computers?

It would be interesting if such an algorithm could at least put up a good fight. Cryptographic algorithms are considered broken if they can be compromised given a few weeks of computing. Could a mental hashing algorithm at least withstand an hour of computer attack?

Sampling with replacement until you’ve seen everything

Suppose you have a standard deck of 52 cards. You pull out a card, put it back in the deck, shuffle, and pull out another card. How long would you expect to do this until you’ve seen every card?

Here’s a variation on the same problem. Suppose you’re a park ranger keeping data on tagged wolves in a national park. Each day you catch a wolf at random, collect some data, and release it. If there are 100 wolves in the park, how long until you’ve seen every wolf at least once?

We’ll solve this problem via simulation, then exactly.

Simulation

The Python code to solve the problem via simulation is straightforward. Here we’ll simulate our ranger catching and releasing one of 100 wolves. We’ll repeat the simulation five times to get an idea how much the results vary.

    import numpy as np
    from numpy.random import default_rng

    rng = default_rng()
    num_runs = 5
    N = 100 # number of unique items

    for run in range(num_runs):
        tally = np.zeros(N,dtype=int)
        trials = 0
        while min(tally) == 0:
            tally[rng.integers(N)] += 1
            trials += 1
        print(trials)

When I ran this, I got 846, 842, 676, 398, and 420.

Exact solution

Suppose we have a deck of N cards, and we sample the deck with replacement M times. We can select the first card N ways. Ditto for the second, third, and all the rest of the cards. So there are N^M possible sequences of card draws.

How many of these sequences include each card at least once? To answer the question, we look at Richard Stanley’s twelvefold way. We’re looking for the number of ways to allocate M balls into N urns such that each urn contains at least one ball. The answer is

N! S(M, N)

where S(M, N) is the Stirling number of the second kind with parameters M and N [1].

So the probability of having seen each card at least once after selecting M cards randomly with replacement is

N! S(M, N) / N^M.

Computing Stirling numbers

We’ve reduced our problem to the problem of computing Stirling numbers (of the second kind), so how do we do that?

We can compute Stirling numbers in terms of binomial coefficients as follows.

S(n,k) = \frac{1}{k!}\sum_{i = 0}^k (-1)^i {k \choose i} (k-i)^n

Now we have a practical problem if we want to use this formula for larger values of n and k. If we work with floating point numbers, we’re likely to run into overflow. This is a perennial problem with probability calculations. You may need to compute a moderate-sized number as the ratio of two enormous numbers. Even though the final result is between 0 and 1, the intermediate results may be too large to store in a floating point number. And even if we don’t overflow, we may lose precision because we’re working with an alternating sum.

The usual way to deal with overflow is to work on a log scale. But that won’t work for Stirling numbers because we’re adding terms together, not multiplying them.

One way to solve our problem—not the most efficient way but a way that will work—is to work with integers. This is easy in Python because integers by default can be arbitrarily large.

There are two ways to compute binomial coefficients in Python: scipy.special.binom and math.comb. The former uses floating point operations and the latter uses integers, so we need to use the latter.

We can compute the numerator of our probability with

    from math import comb
    def k_factorial_stirling(n, k):
        return sum((-1)**i * comb(k, i)*(k-i)**n for i in range(k+1))    

Then our probability is

    k_factorial_stirling(M, N) / N**M

If we compute the probability that our ranger has seen all 100 wolves after catching and releasing 500 times, we get 0.5116. So the median number of catch and release cycles is somewhere around 500, which is consistent with our simulation above.
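Putting the pieces together, here's a self-contained version of that calculation (prob_all_seen is my name for the combined function):

```python
from math import comb

def k_factorial_stirling(n, k):
    # k! * S(n, k): the number of surjections from an n-set onto a k-set,
    # computed by inclusion-exclusion in exact integer arithmetic
    return sum((-1)**i * comb(k, i) * (k - i)**n for i in range(k + 1))

def prob_all_seen(M, N):
    # probability that M draws with replacement from N items see every item
    return k_factorial_stirling(M, N) / N**M

print(round(prob_all_seen(500, 100), 4))  # 0.5116
```

The final division of two huge integers is fine in Python: true division of big integers produces a correctly rounded float as long as the quotient itself is in range.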

Note that the numerator and denominator in our calculation are both on the order of 10^1000, while the largest floating point number is on the order of 10^308, so we would have overflowed if we’d used binom instead of comb.


[1] It’s unfortunate that these were called the “second kind” because they’re the kind that come up most often in applications.

Image: “Fastening a Collar” by USFWS Mountain Prairie is licensed under CC BY 2.0 .

Tool recursion

“Literature about Lisp rarely resists that narcissistic pleasure of describing Lisp in Lisp.” — Christian Queinnec, Lisp in Small Pieces

 

Applying software development tools to themselves has a dark side and a light side.

There’s a danger of becoming obsessed with one’s tools and never getting around to using them. If it’s your job to cut down a tree, there’s some benefit to sharpening your ax, but you can’t only sharpen your ax. At some point you hit diminishing returns and it’s time to start chopping.

But there are benefits to self-referential systems, such as macros that use Lisp to generate Lisp, or writing a C compiler in C, or using Emacs to tweak Emacs. There’s a kind of consistency that results, and there can be a compound return on effort. But as with writing a recursive procedure, there has to be a base case, a point at which you stop recursing. Otherwise you go down the black hole of becoming absorbed in your tool and never using it for work.

Even though I’ve used Emacs for a long time, I’ve never understood the recursive fascination some people have with it. For example, part of the elevator pitch for Emacs is that it’s self-documenting. You can pull up help on Emacs from inside Emacs. But you can also type your questions into a search engine without having to learn an arcane help API. What’s the advantage to the former?

For one thing, using Emacs help inside Emacs works without a network connection. For another, you avoid the risk of being distracted by something you see when you’re using your web browser. But the most subtle benefit is the compound effect of self-reference. You can use the same navigation commands in the help system that you can when editing text, you can execute code snippets in place, etc.

When I hear “Isn’t it cool that you can do X in X?” my first thought is “Yeah, that sounds cool,” but my second thought is “Why would you want to do that? Sounds like it could be really hard.” I’m starting to appreciate that there are sometimes long-term benefits to this sort of recursive tool use, even if it’s not optimal in the short run.