Dividing an octave into 14 pieces

Keenan Pepper left a comment on my previous post saying that the DTMF tones used by touch tone phones “are actually quite close to 14 equal divisions of the octave (rather than the usual 12).”

Let’s show that this is right using a little Python.

    import numpy as np

    freq = np.array([697, 770, 852, 941, 1209, 1336, 1477])
    lg = np.log2(freq)
    spacing = lg[1:] - lg[:-1]
    print(spacing)
    print([2/14, 2/14, 2/14, 5/14, 2/14, 2/14])

If the tones are evenly spaced on a 14-note scale, we’d expect the logarithms base 2 of the notes to differ by multiples of 1/14.

This produces the following:

    [0.144 0.146 0.143 0.362 0.144 0.145]
    [0.143 0.143 0.143 0.357 0.143 0.143]

By dividing the octave into 14 points the DTMF system largely avoids overlap between the set of tones and the set of their harmonics. However, the first overtone of the first tone (1394 Hz) is kinda close to the fundamental of the 6th tone (1336 Hz).
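Here is a rough sketch of how one might check the overlap claim numerically. It takes the overtones at 2f and 3f of each tone (my reading of "first two harmonics," which is an assumption) and reports how far each one is, in cents, from the nearest DTMF fundamental. The helper name `cents` and the list `closest` are just illustrative.

```python
import numpy as np

freq = np.array([697, 770, 852, 941, 1209, 1336, 1477])

def cents(f1, f2):
    """Signed interval from f2 up to f1 in cents (1200 cents per octave)."""
    return 1200 * np.log2(f1 / f2)

# For each overtone (2f and 3f), find the nearest DTMF fundamental.
closest = []
for multiple in (2, 3):
    for f in freq:
        overtone = multiple * f
        gaps = np.abs(cents(overtone, freq))
        i = int(np.argmin(gaps))
        closest.append((float(gaps[i]), int(f), multiple, int(freq[i])))

# Smallest gaps first: these are the nearest misses.
closest.sort()
for gap, f, multiple, nearest in closest[:3]:
    print(f"{multiple} x {f} Hz is {gap:.0f} cents from {nearest} Hz")
```

Measuring the gap in cents rather than in Hz is a design choice; the two can rank near misses slightly differently.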

Plotting DTMF tones and first two harmonics

Phone tones in musical notation

The sounds produced by a telephone keypad are a combination of two tones: one for the column and one for the row. This system is known as DTMF (dual tone multiple frequency).

I’ve long wondered what these tones would be in musical terms and I finally went to the effort to figure it out when I ran across DTMF in [1].

The three column frequencies are 1209, 1336, and 1477 Hz. These do not correspond exactly to standard musical pitches. The first frequency, 1209 Hz, is exactly between a D and a D#, two octaves above middle C. The second frequency, 1336 Hz, is 23 cents [2] higher than an E. The third frequency, 1477 Hz, lands on an F#.

In approximate musical notation, these pitches are two octaves above the ones written below.

Notice that the symbol in front of the D is a half sharp, one half of the symbol in front of the F.

Similarly, the four row frequencies, starting from the top, are 697, 770, 852, and 941 Hz. In musical terms, these notes are F, G (31 cents flat), A (56 cents flat), and B flat (16 cents sharp).
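The pitch names and cent offsets can be estimated with a short script. This is a minimal sketch, assuming 12-tone equal temperament with A4 = 440 Hz; the function name `nearest_note` and the sharp-only note spellings are my own choices. Because it rounds to the nearest semitone, a tone more than 50 cents below A is reported relative to G# rather than as a flat A, and a tone almost exactly between two notes could be reported against either one.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4=440.0):
    """Return (note name with octave, offset in cents) for the nearest 12-TET pitch."""
    # Distance from A4 in semitones, as a real number.
    semitones = 12 * math.log2(freq_hz / a4)
    nearest = round(semitones)
    cents = 100 * (semitones - nearest)
    # MIDI numbering: A4 is 69 and middle C (C4) is 60.
    midi = 69 + nearest
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents

for f in [697, 770, 852, 941, 1209, 1336, 1477]:
    name, cents = nearest_note(f)
    print(f"{f} Hz: {name} {cents:+.0f} cents")
```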

The backward flat symbol in front of the A is a half flat. As with the column frequencies, the row frequencies are two octaves higher than written.

These tones are deliberately not in traditional harmony because harmonic notes (in the musical sense) are harmonically related (in the Fourier analysis sense). The phone company wants tones that are easy to pull apart analytically.

Finally, here are the chords that correspond to each button on the phone keypad.

Update: Dial tone and busy signal

[1] Electric Circuits by Nilsson and Riedel, 10th edition, page 548.
[2] A cent is 1/100 of a semitone.

Morse code in musical notation

Maybe this has been done before, but I haven’t seen it: Morse code in musical notation.

Here’s the Morse code alphabet, one letter per measure; in practice there would be less space between letters [1]. A dash is supposed to be three times as long as a dot, so a dot is a sixteenth note and a dash is a dotted eighth note.

Morse code is often at a frequency between 600 and 800 Hz. I picked the E above middle C (660 Hz) because it’s in that range.

Rhythm

Officially a dash is three times as long as a dot. But there’s also a space equal to the length of a dot between parts of a letter. So the sheet music above would be more accurate if you imagined all the sixteenth notes are staccato and the dotted eighth notes are really eighth notes followed by a sixteenth rest.

This doesn’t make much difference because individual operators have varying “fists,” styles of sending Morse code, and won’t exactly follow the official length and spacing rules.

You could rewrite the music above as follows, but it’s all an approximation.

Tempo

According to Wikipedia, “the dit length at 20 words per minute is 50 milliseconds.” So if a sixteenth note has a duration of 50 milliseconds, this would mean five quarter notes per second, or 300 beats per minute. But according to this video, the shortest duration people can distinguish is about 50 milliseconds.
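The tempo arithmetic can be written out as a tiny sketch. It assumes, as in the score above, that one dit is notated as a sixteenth note; the function name is hypothetical.

```python
def quarter_note_bpm(dit_ms):
    """Tempo in quarter notes per minute when one dit is a sixteenth note."""
    quarter_ms = 4 * dit_ms        # four sixteenth notes per quarter note
    return 60_000 / quarter_ms     # milliseconds per minute / ms per beat

print(quarter_note_bpm(50))  # 300.0
```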

That would imply that copying Morse code at 20 wpm is pushing the limits of human hearing. But copying at 20 wpm is common. Some people can copy Morse code at 50 words per minute or more, but at that speed they’re not hearing individual dits and dahs. An H, for example, four dits in a row, sounds like a single rough sound. In fact, they’re not really hearing letters at all but recognizing the shape of words.

How the image was made

I made the image above with LaTeX and Lilypond.

Adding the letters above each measure was kind of a hack. I used rehearsal markings to label the measures, but there was one problem: the software skips from letter H to letter J. That meant that the labels I and all subsequent letters were one ahead of what they should be, and the final letter Z was labeled AA. I tried several tricks, and Lilypond steadfastly refused to label a measure with ‘I’ even though I’ve seen such a label in the documentation.

My way around this was to make it label two consecutive measures with H, then in image editing software I turned the second H into an I. No doubt there’s a better way, but this worked.

I may play around with this and try to improve it a bit. If you have any suggestions, particularly related to Lilypond, please let me know.

[1] You could think of the musical score above as a sort of transcription of the Farnsworth method of teaching Morse code. Students learn the letters at full speed, but with extra space between the letters at first. The faster speed discourages consciously counting the dits and dahs, forcing the student to listen to the overall rhythm of the letters.

O Come, O Come, Emmanuel: condensing seven hymns into one

The Christmas hymn “O Come, O Come, Emmanuel” is a summary of the seven “O Antiphons,” sung on December 17 through 23, dating back to the 8th century [1]. The seven antiphons are

  1. O Sapientia
  2. O Adonai
  3. O Radix Jesse
  4. O Clavis David
  5. O Oriens
  6. O Rex Gentium
  7. O Emmanuel

The corresponding verses of “O Come, O Come, Emmanuel” begin with the lines below.

  1. O come, Thou Wisdom from on high
  2. O come, O come, Thou Lord of might
  3. O come, Thou Branch of Jesse’s stem
  4. O come, Thou Key of David, come
  5. O come, Thou Bright and Morning Star
  6. O come, Desire of nations, bind
  7. O come, O come, Emmanuel

[1] Mars Hill Audio Conversations O Come, O Come, Emmanuel: Malcolm Guite and J. A. C. Redford on the Advent O Antiphons

Complexity below the surface

The other day I ran across a Rick Beato video entitled “The most complex pop song of all time.”

I thought the song would be something by a cerebral group like Rush, but instead it’s “Never Gonna Let You Go” by Sérgio Mendes. The song made it to #4 on the weekly pop charts in 1983 and came in at #16 for the year. I knew the song, but I would not have thought it was especially simple or complex.

The song works because the melody is simple enough but the harmony is complex. The complexity isn’t gratuitous, but serves the song. Millions of people thought the song was enjoyable, not impressive; you have to listen closely to be impressed.

I was reminded, as I often am, of the line from Feynman that nearly everything is really interesting if you look into it deeply enough. I wonder what other pop songs I’ve dismissed that have a lot going on if you listen more closely. And I wonder more generally what else around me is more interesting than I realize.

Perfect fifths, octaves, and an ergodic map

In music, a perfect fifth is the interval between two notes whose frequencies are in 3:2 ratio. For example, the interval between an A at 440 Hz and an E at 660 Hz is a perfect fifth.

Going up by 12 perfect fifths is very nearly the same as going up 7 octaves. That is,

(3/2)^{12} \approx 2^7

or in other words,

2^{7/12} \approx 3/2.

This is why equal temperament tuning works. More on that here.

53-note scale

Going back to our first approximation, we can say that (12, 7) is an approximate solution to the equation

(3/2)^x = 2^y.

One could naturally ask whether there are better approximate solutions. One such solution, dating back to 40 BC, came from the Chinese scholar King Fang [1]. He discovered that (53, 31) is a substantially more accurate solution than (12, 7), between 6 and 7 times better.

    >>> 1.5**12/2**7
    1.0136432647705078
    >>> 1.5**53/2**31
    1.0020903140410862

In other words, going up 53 perfect fifths is nearly the same as going up 31 octaves. You could divide the octave into 53 parts using perfect fifths; we will demonstrate this visually below. However, we’re moving more into the realm of number theory than practical music theory at this point.

Powers of 2 and 3

We could change our problem slightly, making it mathematically simpler though less obviously related to music, by putting all our powers of 2 on one side and looking for powers of 3 that approximately equal powers of two. That is, we could look at approximate solutions to

3^x = 2^y

where now x and y have different meanings; our new y is our old y plus our old x. No power of 3 will ever equal a power of 2, but you can find powers of 3 close to powers of 2, making the ratio as close to 1 as you’d like.

Ergodic map

Taking the logarithm of both sides in base 2, we recast the problem as looking for integers n such that

\log_2(3) \, n

is approximately an integer.

If we define k = log₂(3), we are looking for n such that kn mod 1, the fractional part of kn, is near 0.

The map

n \mapsto kn \bmod 1

is ergodic because k is irrational, and so its image in the unit interval is dense as n goes to infinity. This means we can find solutions as close as we’d like to 0. It also means we can find solutions as close as we’d like to any other number in the interval.

Here’s a plot of our map for n up to 12.

Note that the range of the map falls very nearly on the 12 evenly-spaced horizontal lines. This corresponds to the circle of fifths filling in the 12 notes of the chromatic scale.

Now let’s keep going for n up to 53.

Look closely at the bottom of the plot. The graph gets close to 0 at 12, and it gets even closer to 0 at 53. The value of kn mod 1 doesn’t get smaller than the value at n = 53 until n = 359 where the value is about twice as small.
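A quick way to see where the map sets new lows is to scan n and record each time kn mod 1 drops below every earlier value. This is a minimal sketch in ordinary floating point, which should be accurate enough for n in the hundreds; the function name `record_minima` is just illustrative.

```python
import math

k = math.log2(3)

def record_minima(n_max):
    """Return the n <= n_max at which k*n mod 1 reaches a new minimum."""
    records = []
    best = 1.0
    for n in range(1, n_max + 1):
        frac = (k * n) % 1.0   # fractional part of k*n
        if frac < best:
            best = frac
            records.append(n)
    return records

print(record_minima(400))  # [1, 2, 7, 12, 53, 359]
```

The values 12, 53, and 359 show up as successive records, in line with the plot.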

When we go up by 12 perfect fifths, we don’t end up exactly on the note we started on; 12 fifths is a little more than 7 octaves. If we keep going up in fifths we fill in notes a little higher than the original chromatic scale. Here’s a plot for n up to 24, with the values sorted so we can focus on the range rather than the ups and downs of filling in the range.

Every note in the original chromatic scale now has a counterpart that’s somewhere around 25 cents sharp.

If we keep going to n = 53, we fill in the octave more evenly. Here’s a plot of the sorted values.

[1] A. L. Leigh Silver. Some Musico-Mathematical Curiosities. The Mathematical Gazette, Vol. 48, No. 363 (Feb., 1964), pp. 1-17.

Saxophone ranges

Saxophone quartet

I stumbled on a recording of a contrabass saxophone last night and wondered just how low it was [1], so I decided to write this post giving the ranges of each of the saxophones.

The four most common saxophones are baritone, tenor, alto, and soprano. These correspond to the instruments in the image above. There are saxophones below the baritone and above the soprano, but they’re rare.

Saxophones have roughly the same range as the human vocal parts with the corresponding names, as shown in the following table.

\begin{center}
\begin{tabular}{lllrr}
\hline
Part & Sax SPN & Human SPN & Sax Hz & Human Hz\\
\hline
Soprano & \(A \flat_3\) -- \(E \flat_6\) & \(C_4\) -- \(C_6\) & 208--1245 & 262--1047\\
Alto & \(D \flat_3\) -- \(A \flat_5\) & \(F_3\) -- \(F_5\) & 139--831 & 175--698\\
Tenor & \(A \flat_2\) -- \(E \flat_5\) & \(C_3\) -- \(C_5\) & 104--622 & 131--523\\
Baritone & \(D \flat_2\) -- \(A \flat_4\) & \(G_2\) -- \(G_4\) & 69--415 & 98--392\\
Bass & \(A \flat_1\) -- \(E \flat_4\) & \(E_2\) -- \(E_4\) & 52--311 & 82--330\\
\hline
\end{tabular}
\end{center}

SPN stands for scientific pitch notation, explained here. Hz stands for Hertz, vibrations per second.
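For anyone who wants to reproduce the Hz columns, here is a minimal sketch that converts scientific pitch notation to frequency. It assumes 12-tone equal temperament with A4 = 440 Hz, spells flats with "b" and sharps with "#", and only handles single-digit octaves. The function name `spn_to_hz` is my own.

```python
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def spn_to_hz(note, a4=440.0):
    """Frequency of a note such as 'Ab3' or 'C#4' in 12-tone equal temperament."""
    # Split into letter, accidentals, and a single-digit octave number.
    letter, accidental, octave = note[0], note[1:-1], int(note[-1])
    offset = SEMITONES_FROM_C[letter] + accidental.count("#") - accidental.count("b")
    midi = 12 * (octave + 1) + offset   # MIDI note number; A4 is 69
    return a4 * 2 ** ((midi - 69) / 12)

sax_ranges = [("Ab3", "Eb6"), ("Db3", "Ab5"), ("Ab2", "Eb5"), ("Db2", "Ab4"), ("Ab1", "Eb4")]
for low, high in sax_ranges:
    print(f"{low}-{high}: {spn_to_hz(low):.0f}-{spn_to_hz(high):.0f} Hz")
```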

The human ranges are convenient two-octave ranges. Of course different singers have different ranges. (Different saxophone players have different ranges too if you include the altissimo range.)

If you include the rare saxophones, the saxophone family has almost the same range as a piano. The lowest note on a subcontrabass saxophone is a half step lower than the lowest note on a piano, and the highest note on the sopranissimo saxophone is a few notes shy of the highest note on a piano.

\begin{center}
\begin{tabular}{llr}
\hline
Part & Sax SPN & Sax Hz\\
\hline
Sopranissimo & \(A \flat_4\) -- \(E \flat_7\) & 415--2489\\
Sopranino & \(D \flat_4\) -- \(A \flat_6\) & 277--1661\\
Soprano & \(A \flat_3\) -- \(E \flat_6\) & 208--1245\\
Alto & \(D \flat_3\) -- \(A \flat_5\) & 139--831\\
Tenor & \(A \flat_2\) -- \(E \flat_5\) & 104--622\\
Baritone & \(D \flat_2\) -- \(A \flat_4\) & 69--415\\
Bass & \(A \flat_1\) -- \(E \flat_4\) & 52--311\\
Contrabass & \(D \flat_1\) -- \(A \flat_3\) & 35--208\\
Subcontrabass & \(A \flat_0\) -- \(E \flat_3\) & 26--156\\
\hline
\end{tabular}
\end{center}

Update

My intent when I wrote this post was to add some visualization. One thought was to display the data above on a piano keyboard. That would be a nice illustration, but it would be a lot of work to create. Then it occurred to me that putting things on a piano is really just a way of displaying the data on a log scale. So I plotted the frequency data on a log scale, which was much easier.

Saxophone and human voice ranges

[1] I could figure it out in terms of musical notation—you can see what a regular pattern the various saxes have in the table above—but I think more in terms of frequencies these days, so I wanted to work everything out in terms of Hz. Also, I’d always assumed that tenor saxes and tenor voices have about the same range etc., but I hadn’t actually verified this before.

Pitch of a big wine bottle

Yesterday my daughter came by and dropped off a huge blue wine bottle (empty).

Trader Joe's Incanto Chardonnay Pinot Grigio

She had started removing the label, but as you can see she hadn’t gotten very far. It’s an Incanto Chardonnay Pinot Grigio from Trader Joe’s.

I blew across the top of the bottle to hear what sound it makes, and it makes a nice deep rumble.

I tried to identify the pitch using a spectrum analyzer app on my phone, and it says 63 Hz.

audio spectrum analyzer screen shot

Next I tried to figure out what pitch I should expect theoretically based on physics. Wine bottles are Helmholtz resonators, and there’s a formula for the fundamental frequency of Helmholtz resonators:

f = \frac{v}{2\pi} \sqrt{\frac{A}{LV}}

The variables in this equation are:

  • f, frequency in Hz
  • v, velocity of sound
  • A, area of the opening
  • L, length of the neck
  • V, volume

I measured the opening to be 3/4 of an inch across, and the neck to be about 7 inches. The volume is 1.5 liters. The speed of sound at sea level and room temperature is 343 meters per second. After a few unit conversions [1] I got a result of 56.4 Hz, about 10% lower than what the spectrum analyzer measured.
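The calculation is easy to reproduce. This is a minimal sketch of the formula above with the unit conversions spelled out; the constant and function names are mine, and it assumes a circular opening so that the area follows from the measured diameter.

```python
import math

INCH = 0.0254    # meters per inch
LITER = 1e-3     # cubic meters per liter

def helmholtz_hz(diameter_m, neck_m, volume_m3, v_sound=343.0):
    """Fundamental frequency f = v/(2 pi) * sqrt(A / (L V)) in Hz."""
    area = math.pi * (diameter_m / 2) ** 2   # area of a circular opening
    return v_sound / (2 * math.pi) * math.sqrt(area / (neck_m * volume_m3))

# The big wine bottle: 3/4 inch opening, 7 inch neck, 1.5 liter volume.
f = helmholtz_hz(0.75 * INCH, 7 * INCH, 1.5 * LITER)
print(f"{f:.1f} Hz")  # 56.4 Hz
```

The same function can be reused for other bottles by changing the three measurements.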

An ideal Helmholtz resonator has a cylindrical neck attached to a spherical body. This bottle is far from spherical. The base is an ellipse with a major axis about twice as long as the minor axis. And from there it tapers off more like a cone than a sphere [2]. And yet the frequency predicted by Helmholtz’ formula comes fairly close to what I measured empirically.

I suspect I got lucky to some extent. I didn’t measure the bottle that accurately; it’s hard to even say when the neck of the bottle stops. But apparently Helmholtz’ formula is robust to changes in shape.

Update: Pitch of a beer bottle

I repeated my experiment with a beer bottle, specifically a Black Venom Imperial Stout.

Black Venom Imperial Stout

The opening diameter is about 3/4″, as with the wine bottle above, and the neck is about 3″ long. The volume is 12 fluid ounces. Helmholtz’ formula predicts a pitch of 177 Hz. My spectrum analyzer measured 191 Hz, the G below middle C. So this time theory was about 7% lower than the observed value.

Spectral analysis of blowing across beer bottle

The beer bottle is closer to the shape of a Helmholtz resonator than the wine bottle was. It’s at least radially symmetric, but the body is a cylinder rather than a sphere.

Update 2: Typical wine bottle

When I tested a typical 750 ml wine bottle, I got a pitch of 114 Hz. With a 3.5 inch neck and a 0.75 in diameter opening, the calculated pitch was 113 Hz. There’s some element of luck that theory and measurement agree so well, especially since the punt at the bottom means its shape is even further from spherical than that of a beer bottle.

Audio spectrum of a 750 ml pinot noir bottle

[1] Thanks to a reader who provided this write-up of the calculation:

calculation with dimensions

[2] What we usually call a cone is more specifically a right circular cone. But more generally a cone can have any base, not just a circle, and this bottle is approximately an elliptical cone.

The Crown and The Planets

I first heard the hymn “I Vow to Thee, My Country” while watching the first season of The Crown [1]. I assume the hymn is familiar in the UK, but it is not in America as far as I know.

When I say I first heard the hymn, I mean that I first heard it as a hymn with words. I thought the tune sounded familiar, and that it reminded me of The Planets by Holst.

I’ve started watching the latest season of The Crown and once again I heard the hymn [2], so this time I looked into it more. The tune does indeed come from The Planets, specifically from the middle of the Jupiter movement.

With a little searching I found the sheet music to the hymn.

Sheet music for the tune Thaxted

This brought up more things I’ve long meant to look into. As a child I remember cryptic notations around hymns, such as “Thaxted 13.13.13 D” above, and never knew what they meant.

Tunes have names independent of the hymns they appear in, but these tune names were, and still are, completely unfamiliar to me. For example, the hymn “Amazing Grace” has the tune “McIntosh,” though I don’t imagine many people know that.

In the example here, “Thaxted” is the name of the melody from Jupiter when it is used as a hymn. The name comes from the English town of Thaxted where Holst lived. Perhaps there are other hymns that use the same tune.

Now what about the mysterious numbers 13.13.13? They mean that the hymn is built out of groups of three lines, each with 13 syllables. The hymn Once in Royal David’s City is marked 87 87 77, meaning the hymn has three phrases, the first two alternating lines of 8 and 7 syllables, and the last having two lines of 7 syllables each.

From what I’ve read, the “D” in “13.13.13 D” stands for double meter, which I would take to mean 2/4, but the tune is clearly in 3/4, so I’m not sure what the D means.

Update: The D means the entire pattern is doubled, not that the meter is double time. Thaxted has six lines, in two groups of three. Thanks to Michael Lugo for letting me know via Twitter.

***

[1] S1E1 10:00

[2] S4E3 40:30

Lee distance: codes and music

The Hamming distance between two sequences of symbols is the number of places in which they differ. For example, the Hamming distance between the words “hamming” and “farming” is 2, because the two words differ in their first and third letters.

Hamming distance is natural when comparing sequences of bits because bits are either the same or different. But when the sequence of symbols comes from a larger alphabet, Hamming distance may not be the most appropriate metric.

Here “alphabet” is usually used figuratively to mean the set of available symbols, but it could be a literal alphabet. As English words, “hamming” seems closer to “hanning” than to “farming” because m is closer to n, both in the alphabet and phonetically, than it is to f or r. [1]

The Lee distance between two sequences \(x_1 x_2 \cdots x_n\) and \(y_1 y_2 \cdots y_n\) of symbols from an alphabet of size q is defined as

\sum_{i=1}^n \min\{ |x_i - y_i|, q - |x_i - y_i| \}

So if we use distance in the English alphabet, the words “hamming” and “hanning” are a Lee distance of 1 + 1 = 2 apart, while “hamming” and “farming” are a Lee distance of 2 + 5 = 7 apart.
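The two distances are simple to compute. Here is a minimal sketch, assuming lowercase English words mapped to the integers 0 through 25; the function names are illustrative rather than taken from any library.

```python
def hamming_distance(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(x, y))

def lee_distance(x, y, q):
    """Sum over positions of min(|x_i - y_i|, q - |x_i - y_i|)."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    total = 0
    for a, b in zip(x, y):
        d = abs(a - b) % q
        total += min(d, q - d)
    return total

def letters(word):
    """Map a lowercase word to alphabet positions, with a = 0 and z = 25."""
    return [ord(c) - ord("a") for c in word.lower()]

print(hamming_distance("hamming", "farming"))                 # 2
print(lee_distance(letters("hamming"), letters("hanning"), 26))  # 2
print(lee_distance(letters("hamming"), letters("farming"), 26))  # 7
```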

Coding theory uses both Hamming distance and Lee distance. In some contexts, it only matters whether symbols are different, and in other contexts it matters how different they are. If q = 2 or 3, Hamming distance and Lee distance coincide. If you’re working over an alphabet of size q > 3 and symbols are more likely to be corrupted into nearby symbols, Lee distance is the appropriate metric. If all corruptions are equally likely, then Hamming distance is more appropriate.

Application to music

Lee distance is natural in music since notes are like integers mod 12. Hence the circle of fifths.

My wife and I were discussing recently which of two songs was in a higher key. My wife is an alto and I’m a baritone, so we prefer lower keys. But if you transpose a song up so much that it’s comfortable to sing an octave lower, that’s good too.

If you’re comfortable singing in the key of C, then the key of D is two half-steps higher. But what about the key of A? You could think of it as 9 half-steps higher, or 3 half-steps lower. In the definition of Lee distance, measured in half-steps, the distance from C to D is

min{2, 12 − 2} = 2,

i.e. you could either go up two half-steps or down 10. Similarly the distance between C and A is

min{9, 12 − 9} = 3.

So you could think of the left side of the minimum in the definition of Lee distance as going up from x to y and the right side as going down from x to y.

Using Lee distance, the largest interval is the tritone, the interval from C to F#. It’s called the tritone because it is three whole steps. If C is your most comfortable key, F# would be your least comfortable key: the notes are as far away from your range as possible. Any higher up and they’d be closer because you could drop down an octave.
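The same idea in code, as a small sketch: treat the twelve keys as integers mod 12 and take the shorter way around the circle. The dictionary of sharp-only key names and the function name `key_distance` are assumptions made for illustration.

```python
PITCH_CLASS = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
               "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def key_distance(key1, key2):
    """Lee distance between two keys, in half-steps, with q = 12."""
    up = (PITCH_CLASS[key2] - PITCH_CLASS[key1]) % 12   # half-steps going up
    return min(up, 12 - up)                             # or going down instead

print(key_distance("C", "D"))    # 2
print(key_distance("C", "A"))    # 3
print(key_distance("C", "F#"))   # 6, the tritone

# No key is farther from C than the tritone.
print(max(key_distance("C", key) for key in PITCH_CLASS))  # 6
```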

The tritone is like the hands of a clock at 6:00. The hour and minute hands are as far apart as possible. Just before 6:00 the hands are closer together on the left side of the clock and just after they are closer on the right side of the clock.

[1] I bring up “Hanning” because Hamming and Hanning are often confused. In signal processing there is both a Hamming window and a Hanning window. The former is named after Richard Hamming and the latter after Julius von Hann. The name “Hanning window” rather than “Hann window” probably comes from the similarity with the Hamming window.