Given a set of dots and dashes, how many ways can they be partitioned into a set of Morse code letters?
There is at least one way, since you could take each dot to be an E and each dash to be a T.
If you have a sequence of n dots and dashes, there no more than 2n−1 ways to partition the symbols: at each of the n − 1 spaces between symbols, you either start a new partition or you don’t. This is an over-estimate for large n since a Morse code letter has at most 4 dots or dashes, and not all combinations of four dots and dashes corresponds to a letter.
Last year I wrote about the song YYZ and how it was inspired by the sound of “YYZ” in Morse code, YYZ being the designation of the Toronto airport. Here’s the song’s opening theme:
The C code given here enumerates partitions of dots and dashes, and it shows that there are 1324 ways to partition -.---.----..
into the Morse code for letters [1]. This number 1324 is closer to our upper estimate of 211 = 2048 than our lower estimate of 1.
Define a function M(n) as follows. Express n in binary, convert the 0s to dots and the 1s to dashes, and let M(n) be the number of ways this sequence of dots and dashes can be partitioned into letters. The n corresponding to the Morse code for YYZ is 101110111100two = 3004ten, so M(3004) = 1324.
I looked to see whether M(n) were in OEIS and it doesn’t appear to be, though there are several sequences in OEIS that include Morse code in their definition.
It’s easy to see that 1 ≤ M(n) ≤ n. Exercise for the reader: find sharper upper and lower bounds for M(n). For example, show that every group of three bits can be partitioned four ways, and so M(n) has a lower bound something like n2/3.
Related posts
- How efficient is Morse code?
- Smooth ramps and CW clicks
- Q codes in Seveneves
- Estimating integer partitions
[1] The code returns 1490 possibilities, but some of these contain one or more asterisks indicating combinations of dots and dashes that do not correspond to letters.
For the OEIS, it might be better to use bijective base-2, so that we can encode sequences with leading dots.
Some eight-bit Morse code practice software I played with ages ago encoded the codes with a leading 1 in the value to mark the start of the sequence. Or maybe it was a trailing 1 for the end; I don’t remember which way the code shifted the bits out.