
Morse code and the limits of human perception

Musician Adam Neely made a video asking What is the fastest music humanly possible? He emphasizes that he means the fastest music possible to hear, not the fastest to perform.

Screenshot of Bolton paper

The video cites a psychology article [1] from 1894 that found that most people can reliably distinguish an inter-onset interval (time between notes) of 100 ms [2]. It also gives examples of music faster than this, such as a performance of Vivaldi with an inter-onset interval of 83 ms [3]. The limit seems to be greater than 50 ms because a pulse train with an inter-onset interval of 50 ms starts to sound like a 20 Hz pitch.

People are able to receive Morse code faster than this implies is possible. We will explain how this works, but first we need to convert words per minute to inter-onset interval length.

Morse code timing

Morse code is made of dots and dashes, but it is also made of spaces of three different lengths: the space between the dots and dashes representing a single letter, the space between letters, and the space between words.

According to an International Telecommunication Union standard:

  • A dash is equal to three dots.
  • The space between the signals forming the same letter is equal to one dot.
  • The space between two letters is equal to three dots.
  • The space between two words is equal to seven dots.

The same timing is referred to as standard in a US Army manual from 1957.

Notice that all the numbers above are odd. Since a dot or dash is always followed by a space, the duration of a dot or dash and its trailing space is always an even multiple of the duration of a dot.

If we think of a dot as a sixteenth note, Morse code is made of notes that are either sixteenth notes or three sixteenth notes tied together. Rests are equal to one, three, or seven sixteenths, and notes and rests must alternate. All notes start on an eighth note boundary, i.e. either on a down beat or an up beat but not in between.

Words per minute

Morse code speed is measured in words per minute. But what exactly is a “word”? Words have a variable number of letters, and even words with the same number of letters can have very different durations in Morse code.

The most common standard is to use PARIS as the prototypical word. Ten words per minute, for example, means that dots and dashes are coming at you as fast as if someone were sending the word PARIS ten times per minute. Here’s a visualization of the code for PARIS with each square representing the duration of a dot.

■□■■■□■■■□■□□□■□■■■□□□■□■■■□■□□□■□■□□□■□■□■□□□□□□□□

This has the duration of 50 dots.
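
Here’s a short Python sketch (my own check, not from the video or the standards above) that renders a word according to the ITU timing rules and confirms the 50-unit figure.

code = {"P": ".--.", "A": ".-", "R": ".-.", "I": "..", "S": "..."}

def render(word):
    # dot = 1 unit on, dash = 3 units on; 1 unit off between dots and
    # dashes, 3 units off between letters, 7 units off after the word
    out = []
    for letter in word:
        for symbol in code[letter]:
            out.append("■" * (1 if symbol == "." else 3))
            out.append("□")       # space within a letter
        out[-1] = "□" * 3         # replace with the space between letters
    out[-1] = "□" * 7             # replace with the space between words
    return "".join(out)

s = render("PARIS")
print(s)
print(len(s))  # 50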

How does this relate to inter-onset interval? If each duration of a dot is an interval, then n words per minute corresponds to 50n intervals per minute, or 60/(50n) = 1.2/n seconds per interval.

This would imply that 12 wpm would correspond to an inter-onset interval of around 100 ms, pushing the limit of perception. But 12 wpm is a beginner speed. It’s not uncommon for people to receive Morse code at 30 wpm. The world record, set by Theodore Roosevelt McElroy in 1939, is 75.2 wpm.
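
As a sanity check on these numbers, here’s the conversion in code, using the 50-unit PARIS word from above.

def inter_onset_ms(wpm):
    # 50 dot units per word => 1.2/wpm seconds per dot
    return 1200 / wpm

for wpm in [12, 30, 75.2]:
    print(wpm, inter_onset_ms(wpm))  # 100 ms, 40 ms, about 16 ms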

What’s going on?

In the preceding section I assumed a dot is an interval when calculating inter-onset interval length. In musical terms, this is saying a sixteenth note is an interval. But maybe we should count eighth notes as intervals. As noted before, every “note” (dot or dash) starts on a down beat or up beat. Still, that would say 20 wpm is pushing the limit of perception, and we know people can listen faster than that.

You don’t have to hear with the precision of the diagram above in order to recognize letters. And you have context clues. Maybe you can’t hear the difference between “E E E” and “O”, but in ordinary prose the latter is far more likely.

E E E: ■□□□■□□□■    vs.    O: ■■■□■■■□■■■

At some skill level people quit hearing individual letters and start hearing words, much like experienced readers see words rather than letters. I’ve heard that this transition happens somewhere between 20 wpm and 30 wpm. That would be consistent with the estimate above that 20 wpm is pushing the limit of perception letter by letter. But 30 words per minute is doable. It’s impressive, but not unheard of.

What I find hard to believe is that there were intelligence officers, such as Terry Jackson, who could copy encrypted text at 50 wpm. Here a “word” is a five-letter code group. There are millions of possible code groups [4], all equally likely, and so it would be impossible to become familiar with particular code groups the way one can become familiar with common words. Maybe they learned to hear pairs or triples of letters.


[1] Thaddeus L. Bolton. Rhythm. The American Journal of Psychology, Vol. VI, No. 2, January 1894.

[2] Interonset is not commonly hyphenated, but I’ve hyphenated it here for clarity.

[3] The movement “Summer” from Vivaldi’s The Four Seasons, performed at 180 bpm, which corresponds to 720 sixteenth notes per minute, each 83 ms long.

[4] If a code group consists entirely of English letters, there are 26⁵ = 11,881,376 possible groups. If a code group can contain digits as well, there are 36⁵ = 60,466,176 possible groups.

Estimating satellite altitude from apparent motion

If you see a light moving in the sky, how could you tell whether it’s an airplane or a satellite? If it is a satellite, could you tell how high of an orbit it’s in?

For an object in a circular orbit,

v² = μ/r

where r is the orbit radius and μ is the standard gravitational parameter. For a satellite orbiting the earth,

μ = 5.1658 × 10¹² km³/hr².

The orbit radius r is the earth’s radius R plus the altitude a above the earth’s surface.

r = R + a

where the mean radius of the earth is R = 6,371 km.

Angular velocity is

ω = v/r = √(μ/r³)

This velocity is relative to an observer hovering far above the earth. An observer on the surface of the earth would need to subtract the earth’s angular velocity to get the apparent velocity [1].

from numpy import pi, sqrt

def angular_v(altitude): # in radians/hour
    R = 6371 # mean earth radius in km
    r = altitude + R
    mu = 5.1658e12 # km^3/hr^2
    return sqrt(mu/r**3)

def apparent_angular_v(altitude):
    # earth's angular velocity: one revolution per sidereal day
    # (23 h 56 min 4 s), in radians/hour
    earth_angular_v = 2*pi/(23 + 56/60 + 4/3600)
    return angular_v(altitude) - earth_angular_v

Here’s a plot of apparent angular velocity for altitudes between 350 km (initial orbit of a Starlink satellite) and geostationary orbit (35,786 km).

Radians per hour is a little hard to conceptualize; degrees per minute would be easier. But fortunately, the two are nearly the same. One radian per hour is 3/π = 0.955 degrees per minute.

The apparent size of the moon is about 0.5°, so you could estimate angular velocity by seeing how long it takes a satellite to cross the moon (or a region of space the width of the moon). It would only take an object in low earth orbit (LEO) a few seconds to cross the moon.
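
To make that concrete, here’s a rough sketch built on the functions above. It stays in the same frame of reference, measuring angles from the center of the earth, so treat the results as order-of-magnitude estimates.

from numpy import radians

def moon_crossing_seconds(altitude):
    # time for the apparent motion to sweep half a degree
    return radians(0.5) / apparent_angular_v(altitude) * 3600

print(moon_crossing_seconds(550))    # LEO: several seconds
print(moon_crossing_seconds(20200))  # MEO (GPS): on the order of minutes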

Can you see objects in medium earth orbit (MEO) or high earth orbit (HEO)? Yes. Although most satellites are in LEO, objects in higher orbits may be larger, and if they’re reflective enough they may be visible.

Airplanes move much slower than LEO satellites. An airliner at altitude 10 km moving 1000 km/hr would have an apparent angular velocity of 0.16 radians per hour. An object appearing to move that slowly is either an airplane or an object in HEO, and very likely the former.
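
For what it’s worth, that airplane figure is consistent with the same frame of reference, treating the plane’s speed as an angular velocity about the center of the earth:

v = 1000         # km/hr
r = 6371 + 10    # km
print(v/r)       # about 0.16 radians per hour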

[1] The earth completes a full rotation (2π radians) in 23 hours 56 minutes and 4 seconds, so its angular velocity is 0.2625 radians per hour. Why not 24 hours? That’s the length of a rotation of the earth relative to the sun. Since the earth moves in its orbit relative to the sun while it rotates, it has to rotate a little longer (i.e. for about 4 more minutes) to reach the same position relative to the sun.

Gluons, quarks, letters, and envelopes

Yesterday I wrote a couple of posts about a combinatorics question that led to OEIS sequence A000255. That page has this intriguing line:

This comment derives from a family of recurrences found by Malin Sjodahl for a combinatorial problem for certain quark and gluon diagrams (Feb 27 2010)

I love how pulling on a thread can lead you to unexpected places. What in the world does my little permutation problem have to do with particle physics‽

I found Malin Sjödahl’s website, and apparently the citation above refers to this paper from 2009. The author’s site doesn’t list any papers from 2010. Maybe the paper was published electronically in 2009 and in print in 2010.

Counting tensors

The paper is mostly opaque to me since I don’t know particle physics, but at one point Sjödahl says

The problem of finding all such topologies is equivalent to the number of ways of mapping N elements to each other without mapping a single one to itself …

and says that the solution is

N! \sum_{i=0}^N \frac{(-1)^i}{i!} \to \frac{N!}{e}

Sjödahl is not counting physical permutations but the number of possible tensors associated with gluon and quark interaction diagrams.

The right-hand side above is essentially the same as the asymptotic estimate for the function Q(n) in the previous post.

I didn’t find the recurrence that the OEIS comment alluded to. Perhaps I’m not looking at the same paper. Perhaps I’m not looking hard enough because I’m skimming a paper whose contents I don’t understand.

The Montmort letter problem

In more mathematical terminology, Sjödahl is counting the number of permutations of N objects with no fixed points, known as the number of derangements of a set of N objects.

If you divide by the number of possible permutations N!, you have the probability that a random permutation moves every object.

This was historically known as the Montmort letter problem, named after Pierre Rémond de Montmort, who asked the following question: If N letters are randomly assigned to N envelopes, what is the probability that no letter ends up in the correct envelope?

The probability converges to 1/e as N approaches infinity. It approaches 1/e quickly, and so this is a good approximation even for moderate N. More on the rate of convergence in the next post.
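
Here’s a minimal sketch of the count and the convergence, using nothing beyond the formula above.

from math import factorial

def derangements(N):
    # D(N) = N! * sum_{i=0}^{N} (-1)^i / i!
    return round(factorial(N) * sum((-1)**i / factorial(i) for i in range(N + 1)))

for N in range(2, 10):
    print(N, derangements(N), derangements(N) / factorial(N))  # ratio -> 1/e = 0.3679...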


Computing the nth digit of π directly

The following equation, known as the BBP formula [1], will let you compute the nth digit of π directly without having to compute the previous digits.

 \pi = \sum_{n=0}^\infty \frac{1}{16^n} \left( \frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6}\right)

I’ve seen this claim many times, but I’ve never seen it explained in much detail how this formula lets you extract digits of π.

First of all, this formula lets you calculate the digits of π, not in base 10, but in base 16, i.e. in hexadecimal.

It looks like the BBP formula might let you extract hexadecimal digits. After all, the hexadecimal expansion of π is the sequence of coefficients aₙ such that

 \pi = \sum_{n=0}^\infty \frac{a_n}{16^n}

where each aₙ is an integer between 0 and 15. But the quantity multiplying 16⁻ⁿ in the BBP formula is not an integer, so it’s not that easy. Fortunately, it’s not that hard either.

Warmup

As a warmup, let’s think about how we would find the nth digit of π in base 10. We’d multiply by 10ⁿ, throw away the fractional part, and look at the last digit. That is, we would calculate

\left\lfloor 10^n \pi \right\rfloor \bmod 10

Another approach would be to multiply by 10ⁿ⁻¹, shifting the digit we’re after to the first position after the decimal point, then take the integer part of 10 times the fractional part.

\left\lfloor 10 \{10^{n-1} \pi\} \right\rfloor

Here {x} denotes the fractional part of x. This is a little more complicated, but now all the calculation is inside the floor operator.

For example, suppose we wanted to find the 4th digit of π. Multiplying by 10³ gives 3141.592… with fractional part 0.592…. Multiplying by 10 gives 5.92… and taking the floor gives us 5.
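
In code, the warmup calculation might look like the following sketch. It only works for small n, since floating point π carries only about 16 significant figures.

from math import floor, pi

def decimal_digit(n):
    # nth digit of pi after the decimal point
    return floor(10 * (10**(n-1) * pi % 1))

print(decimal_digit(4))  # 5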

100th hex digit

Now to find the nth hexadecimal digit we’d do the analogous procedure, replacing 10 with 16. To make things concrete, let’s calculate the 100th hexadecimal digit. We need to calculate

\left\lfloor 16 \left\{\sum_{n=0}^\infty 16^{99-n} \left( \frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6}\right) \right\} \right\rfloor

We can replace the infinite sum with a sum up to 99 because the remaining terms sum to an amount too small to change our answer. Note that we’re being sloppy here, but this step is justified in this particular example.

Here’s the trick that makes this computation practical: when calculating the fractional part, we can reduce the power of 16 in the first term mod (8n + 1), in the second term mod (8n + 4), and so on. We can use fast modular exponentiation, the same trick that makes many encryption calculations practical.

Here’s code that evaluates the Nth hexadecimal digit of π by evaluating the expression above.

from math import floor

def hexdigit(N):
    # Nth hexadecimal digit of pi via the BBP formula
    s = 0
    for n in range(N):
        s += 4*pow(16, N-n-1, 8*n + 1) / (8*n + 1)
        s -= 2*pow(16, N-n-1, 8*n + 4) / (8*n + 4)
        s -=   pow(16, N-n-1, 8*n + 5) / (8*n + 5)
        s -=   pow(16, N-n-1, 8*n + 6) / (8*n + 6)
    frac = s - floor(s)
    return floor(16*frac)

Here the three-argument version of pow carries out modular exponentiation efficiently. That is, pow(b, x, m) calculates bˣ mod m.

This code correctly calculates that the 100th hex digit of π is C. Note that we did not need 100 hex digits (400 bits) of precision to calculate the 100th hex digit of π. We used standard double precision, which carries about 15 to 16 significant decimal figures.

Improved code

We can improve the code above by adding an estimate of the infinite series we ignored.

A more subtle improvement is reducing the sum accumulator variable s mod 1. We only need the fractional part of s, and so by routinely cutting off the integer part we keep s from getting large. This improves accuracy by devoting all the bits in the machine representation of s to the fractional part.

import numpy as np

epsilon = np.finfo(float).eps  # machine epsilon for double precision

def term(N, n):
    # nth term of the BBP series, scaled so the digit we want
    # is the first one after the (hexadecimal) point
    return 16**(N-1-n) * (4/(8*n + 1) - 2/(8*n + 4) - 1/(8*n + 5) - 1/(8*n + 6))

def hexdigit(N):
    s = 0
    for n in range(N):
        s += 4*pow(16, N-n-1, 8*n + 1) / (8*n + 1)
        s -= 2*pow(16, N-n-1, 8*n + 4) / (8*n + 4)
        s -=   pow(16, N-n-1, 8*n + 5) / (8*n + 5)
        s -=   pow(16, N-n-1, 8*n + 6) / (8*n + 6)
        s = s%1

    n = N  # the loop above covered n = 0 through N-1
    while True:
        t = term(N, n)
        s += t
        n += 1
        if abs(t) < epsilon:
            break
        
    frac = s - floor(s)
    return floor(16*frac)

I checked the output of this code with [1] for N = 10⁶, 10⁷, 10⁸, and 10⁹. The code will fail due to rounding error for some very large N. We could refine the code to push this value of N a little further out, but eventually machine precision will be the limiting factor.

[1] Named after the initials of the authors. Bailey, David H.; Borwein, Peter B.; Plouffe, Simon (1997). On the Rapid Computation of Various Polylogarithmic Constants. Mathematics of Computation 66 (218), 903–913.

Unicode, Tolkien, and Privacy

When I’m in the mood to write, you can often follow a chain of thought in my posts. Recently, a post on LLM tokenization led to a post on how Unicode characters are tokenized, which led to a post on Unicode surrogates. The latter ended by touching on Unicode’s PUA (Private Use Area), which of course leads to J. R. R. Tolkien and privacy, as we shall see.

To back up a bit, Unicode started as an attempt to create one alphabet to rule them all, a coding system large enough to contain all the world’s writing systems. Initially it was thought that 2¹⁶ = 65,536 symbols would be enough, but that didn’t last long. The Unicode standard now contains nearly 100,000 Chinese characters alone [1], along with myriad other symbols such as graphical drawing elements, mathematical symbols, and emoji.

Private Use Area

Although Unicode is vast, it doesn’t have room to include every symbol anyone might use. Unicode anticipated that contingency and reserved some areas for private use. The most well-known example is probably Apple’s use of U+F8FF for their logo. The private use area is also being used for ancient and medieval scripts, as well as for fictional writing systems such as Tolkien’s Tengwar and Cirth.

Because anyone can use the private use area as they wish, there could be conflicts. For example, Apple intends U+F8FF to display their logo, but the code point is also used for a Klingon symbol. We’ll see another example of a conflict at the end of this post.

Privacy

Here’s the chain of thought that leads to privacy. When I started thinking about this post, I thought about creating an image of Tengwar writing, and that made me think about font fingerprinting. Browsers let web servers know what fonts you have installed, which serves a benign purpose: a site may want to send you text formatted in a particular font if it’s available, but it will fall back to a more common font if necessary.

However, the collection of fonts installed on a particular computer may be unique, or at least used in combination with other browser information to uniquely identify a user. Before installing a Tengwar font I thought about how it would be sure to increase the uniqueness of my font fingerprint.

Tolkien’s Tengwar

J. R. R. Tolkien’s Tengwar writing system has been mapped to the Unicode range E000 to E07F.
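
If you’d like to see what your own system does with this block, here’s a one-liner that prints the first few code points of the range. What renders depends entirely on which fonts you have installed.

print("".join(chr(cp) for cp in range(0xE000, 0xE010)))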

Here’s a sample. No telling what it looks like in your browser:

      

When I view the same text in the Tecendil online transcriber I see this:

Not all who wander are lost

But when I open the text in Emacs I see this:

Both are legitimate renderings because nobody owns the private use areas. The characters that Tecendil uses for Tengwar, some font on my laptop apparently uses to display Chinese characters.

I’m especially curious about the last character, U+E000. Tecendil interprets it as the Tengwar symbol tinco but something on my laptop interprets it as the Mozilla mascot.

Mozilla T-rex mascot

[1] It wasn’t so naive to think 16 bits would do, even including Chinese. There may be 100,000 Chinese characters, but a few thousand characters account for nearly all Chinese writing. More on that here. But if you’re going to break the 16-bit barrier anyway, you might as well try to include even the most rare characters.

Spreading out words in space

A common technique for memorizing numbers is to associate numbers with words. The Major mnemonic system does this by associating consonant sounds with each digit. You form words by inserting vowels as you please.

There are many possible encodings of numbers, but sometimes you want to pick a canonical word for each number, what’s commonly called a peg. Choosing pegs for the numbers 1 through 10, or even 1 through 100, is not difficult. Choosing pegs for a larger set of numbers becomes difficult for a couple of reasons. First, it’s hard to think of words to fit some three-digit numbers. Second, you want your pegs to be dissimilar in order to avoid confusion.

Say for example you’ve chosen “syrup” for 049 and you need a peg for 350. You could use “molasses,” but that’s conceptually similar to “syrup.” If you use “Miles Davis” for 350 then there’s no confusion [1].

You could quantify how similar words are using cosine similarity between word embeddings.  A vector embedding associates a high-dimensional vector with each word in such a way that the geometry corresponds roughly with meaning. The famous example is that you might have, at least approximately,

queen = king − man + woman.

This gives you a way to define angles between words that ideally corresponds to conceptual similarity. Similar words would have a small angle between their vectors, while dissimilar words would have larger angles.

If you wanted to write a program to discover pegs for you, say using a pronunciation dictionary in ARPAbet notation, you could have it choose alternatives that spread the words out conceptually. It’s debatable how practical this is, but it’s interesting nonetheless.

The angles you get would depend on the embedding you use. Here I’ll use the gensim code I used in an earlier post.

The angle between “syrup” and “molasses” is 69° but the angle between “syrup” and “miles” is 84°. The former is larger than I would have expected, but still significantly smaller than the latter. If you were using cosine similarity to suggest mnemonic pegs, hopefully the results would be directionally useful, choosing alternatives that minimize conceptual overlap.
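
For reference, here’s roughly how such angles can be computed with gensim. The pretrained model named below is my choice for illustration, not necessarily the one behind the numbers above.

import gensim.downloader as api
from numpy import arccos, clip, degrees

model = api.load("glove-wiki-gigaword-100")  # any word2vec-style model works

def angle(w1, w2):
    cos = model.similarity(w1, w2)             # cosine similarity
    return degrees(arccos(clip(cos, -1, 1)))   # angle in degrees

print(angle("syrup", "molasses"))
print(angle("syrup", "miles"))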

As I said earlier, it’s debatable how useful this is. Mnemonics are very personal. A musician might be fine with using “trumpet” for 143 and “flugelhorn” for 857 because in his mind they’re completely different instruments, but someone else might think they’re too similar. And you might not want to use “Miles Davis” and “trumpet” as separate pegs, even though software will tell you that “miles” and “trumpet” are nearly orthogonal.


[1] Here we’re following the convention that only the first three consonants in a word count. This makes it easier to think of pegs.

Martian Leap Years

The previous post looked at one challenge with designing a calendar for Mars, namely how to vary the number of days per month so that each month corresponds to a change of 30 degrees with respect to the sun. This is a bigger problem on Mars than here on Earth.

That post assumed that a Martian year consisted of 669 sols (Martian days). But a Martian year is not an integer number of sols, just as an Earth year is not an integer number of days. In fact, a Martian year is 668.5991 sols.

One way to handle this would be for 3 out of every 5 years to have 669 sols, with the remaining years having 668 sols. So maybe if the year mod 5 equals 0, 1 or 2 the year would be long, 669 sols, and otherwise it would be short, 668 sols. (This alternation of 3 and 2 reminds me of Dave Brubeck’s song Take Five.)

This scheme, which approximates 668.5991 by 668.6, is about as accurate as our Gregorian calendar, which approximates 365.2422 by 365.2425. For more accuracy, Martians would need something like our leap seconds.
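
Here’s the arithmetic in a small sketch, with the mod-5 rule above made explicit.

def sols(year):
    # 3 long years (669 sols) and 2 short years (668) out of every 5
    return 669 if year % 5 in (0, 1, 2) else 668

average = sum(sols(y) for y in range(5)) / 5
print(average)               # 668.6
print(average - 668.5991)    # error: about 0.0009 sols per Martian year
print(365.2425 - 365.2422)   # Gregorian error: 0.0003 days per year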

Settlers versus Hipsters

When my children were little, I read the Little House on the Prairie books aloud to them and I naturally saw the books through the eyes of a child. Last night I started reading the books by myself for the first time and saw them very differently.

Laura Ingalls Wilder wrote the Little House books later in life, looking back at her childhood and early adult years. The events in the books took place in the last quarter of the nineteenth century.

At first glance the books tell the story of the Ingalls family living off the land, which to a large extent they did. The first chapter of the first book describes the family salting and smoking meat in order to have food for the winter when it would be hard to hunt game. Without adequate preparation they would starve, which they nearly do in one of the later books.

The initial chapter also describes the father greasing his traps. He didn’t smelt iron to make his traps; he had bought the traps somewhere and brought them with him. There’s no sense in the books that the family was trying to avoid contemporary technology. They gladly used the technology available to them, such as it was.

In addition to hardware such as bear traps, the family also had consumables they could not produce themselves. Where did they get the salt to preserve their meat? They didn’t drive their SUV down to Costco, but neither did they mine salt. And as I wrote about years ago, the books mention coffee, something that doesn’t grow in the continental United States.

Obtaining supplies was difficult, not something they would do lightly or frequently, but there’s no sense that they saw buying supplies as a failing. They were trying to settle new land, but they weren’t trying to get away from contemporary amenities. They did without amenities out of necessity, not out of conviction.


Golden convergence

The golden ratio φ satisfies the following equation.

\varphi = \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}}

The proof most commonly given is to let x equal the right-hand side of the equation, then observe that x² = 1 + x, the quadratic equation for the golden ratio. The quadratic has two roots: φ and −1/φ. Since x > 1, x = φ.

This proof tacitly assumes that the expression above is meaningfully defined, and then manipulates it algebraically. But is it meaningfully defined? What exactly does the infinitely nested sequence of radicals mean?

You could interpret the nested radicals to be the limit of the iteration

x \mapsto \sqrt{1 + x}

and show that the limit exists.

What should the starting value of our iteration be? It seems natural to choose x = 1, but other values will do. More on that shortly. We could compute φ by implementing the iteration in Python as follows.

    # iterate x -> sqrt(1 + x) until successive values agree to 15 decimals
    x_old, x_new = 0, 1
    while (abs(x_old - x_new) > 1e-15):
        x_new, x_old = (1 + x_new)**0.5, x_new
    print(x_new)

The program terminates, so the iteration must converge, right? Probably so, but that’s not a proof. To be more rigorous, define

f(x) = \sqrt{1 + x}

and so

f^\prime(x) = \frac{1}{2\sqrt{1 + x}}

This shows 0 < f ′(x) < ½ for any positive x and so the function f(x) is a contraction mapping. This means the iterations converge. We could start with any x > −¾ and the derivative would still be less than 1, so the iterations would converge.

We can visualize the process of convergence starting with x = 0 using the following cobweb plot.
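
The cobweb plot isn’t reproduced here, but a few lines of matplotlib will generate one. This is a minimal sketch, not the code behind the original figure.

import numpy as np
import matplotlib.pyplot as plt

f = lambda x: np.sqrt(1 + x)
xs = np.linspace(0, 2, 200)
plt.plot(xs, f(xs), label="y = f(x)")
plt.plot(xs, xs, "--", label="y = x")

x = 0.0
for _ in range(6):
    y = f(x)
    plt.plot([x, x], [x, y], "k", linewidth=0.8)  # rise to the curve
    plt.plot([x, y], [y, y], "k", linewidth=0.8)  # move over to y = x
    x = y

plt.legend()
plt.show()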


Mnemonic images with Grok 3

The Major mnemonic system makes numbers easier to memorize by encoding them as words. Each digit corresponds to one or more consonant sounds, and you can fill in vowels as you wish.

In August 2022 I tried creating a few images using DALL-E 2. The results were disappointing and sometimes disturbing.

To illustrate the use of the Major system, I gave the example of memorizing a list of the US presidents by creating mental images associating each president with the number of their term as president. For example, Franklin Delano Roosevelt was the 32nd POTUS. You can encode 32 as “moon”, so you might imagine FDR looking up at the moon.

At the time, Grover Cleveland was the only US President to serve two non-consecutive terms, being both the 22nd and 24th president. I asked DALL-E to create an image of Grover the Muppet holding an onion (22) and a wiener dog (24). This was too much for DALL-E at the time. The image below was as close as I could get.

Blue dog holding cucumber dog?

When I asked Grok 3 for a similar image it did a much better job. To be fair, it initially put a different dog breed in Grover’s hand, but I asked it to change the dog to a dachshund and it performed admirably.

Grover holding an onion and a wiener dog
