Lee distance: codes and music

The Hamming distance between two sequences of symbols is the number of places in which they differ. For example, the Hamming distance between the words “hamming” and “farming” is 2, because the two worlds differ in their first and third letters.

Hamming distance is natural when comparing sequences of bits because bits are either the same or different. But when the sequence of symbols comes from a larger alphabet, Hamming distance may not be the most appropriate metric.

Here “alphabet” is usually used figuratively to mean the set of available symbols, but it could be a literal alphabet. As English words, “hamming” seems closer to “hanning” than to “farming” because m is closer to n, both in the alphabet and phonetically, than it is to f or r. [1]

The Lee distance between two sequences x1x2xn and y1y2yn of symbols from an alphabet of size q is defined as

\sum_{i=1}^n \min\{ |x_i - y_i|, q - |x_i - y_i| \}

So if we use distance in the English alphabet, the words “hamming” and “hanning” are a Lee distance of 1 + 1 = 2 apart, while “hamming” and “farming” are a Lee distance of 2 + 5 = 7 apart.

Coding theory uses both Hamming distance and Lee distance. In some contexts, it only matters whether symbols are different, and in other contexts it matters how different they are. If q = 2 or 3, Hamming distance and Lee distance coincide. If you’re working over an alphabet of size q > 3 and symbols are more likely to be corrupted into nearby symbols, Lee distance is the appropriate metric. If all corruptions are equally likely, then Hamming distance is more appropriate.

Application to music

Lee distance is natural in music since notes are like integers mod 12. Hence the circle of fifths.

My wife and I were discussing recently which of two songs was in a higher key. My wife is an alto and I’m a baritone, so we prefer lower keys. But if you transpose a song up so much that it’s comfortable to sing an octave lower, that’s good too.

If you’re comfortable singing in the key of C, then the key of D is two half-steps higher. But what about they key of A? You could think of it as 9 half-steps higher, or 3 half-steps lower. In the definition of Lee distance, measured in half-steps, the distance from C to D is

min{2, 12 – 2} = 2,

i.e. you could either go up two half-steps or down 10. Similarly the distance between C and A is

min{9, 12-9} = 3.

So you could think of the left side of the minimum in the definition of Lee distance as going up from x to y and the right side as going down from x to y.

Using Lee distance, the largest interval is the tritone, the interval from C to F#. It’s called the tritone because it is three whole steps. If C is your most comfortable key, F# would be your least comfortable key: the notes are as far away from your range as possible. Any higher up and they’d be closer because you could drop down an octave.

The tritone is like the hands of a clock at 6:00. The hour and minute hands are as far apart as possible. Just before 6:00 the hands are closer together on the left side of the clock and just after they are closer on the right side of the clock.

Related posts

[1] I bring up “Hanning” because Hamming and Hanning are often confused. In signal processing there is both a Hamming window and a Hanning window. The former is named after Richard Hamming and the latter after Julius von Hann. The name “Hanning window” rather than “Hann window” probably comes from the similarity with the Hamming window.