Memorizing four-digit numbers

The Major mnemonic system is a method of converting numbers to words that can be more easily memorized. The basics of the system can be written on an index card, but there are practical details that are seldom written down.

Presentations of the Major system can be misleading, intentionally or unintentionally, by implying that it is easy to find single words that encode numbers with three or four digits. Books and articles can leave a wrong impression simply by being brief, but I can think of one book that I thought was intentionally misleading, opening with an example that was obviously reverse-engineered from its mnemonic. It was something like “Isn’t it easier to remember ‘constitution’ than 7201162?” Indeed it is, but I made up that example by starting with “constitution,” not by starting with 7201162.

The Major system maps digits to consonant sounds. Spelling doesn’t matter, only pronunciation, and you can insert any vowels (or semivowels) you like. I list the mapping here in ARPAbet notation and here in IPA notation. The former is less precise but easier for most people to understand, so I’ll repeat it here.

0: S or Z
1: D, DH, T, or TH
2: N or NG
3: M
4: R
5: L
6: CH, JH, SH, or ZH
7: G or K
8: F or V
9: P or B
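
As a minimal sketch of how the table can be applied mechanically, the Python below converts an ARPAbet pronunciation into digits. The names `MAJOR_DIGITS` and `major_encode` are made up for illustration, and the input is assumed to be a space-separated phoneme string with CMU-style stress digits on the vowels. Any symbol not in the table is skipped, which covers the vowels and semivowels. How to treat the r-colored vowel ER is a choice the table does not settle, and this sketch skips it.

```python
import re

# Consonant phonemes from the table above; every other ARPAbet symbol is skipped.
MAJOR_DIGITS = {
    "S": "0", "Z": "0",
    "D": "1", "DH": "1", "T": "1", "TH": "1",
    "N": "2", "NG": "2",
    "M": "3",
    "R": "4",
    "L": "5",
    "CH": "6", "JH": "6", "SH": "6", "ZH": "6",
    "G": "7", "K": "7",
    "F": "8", "V": "8",
    "P": "9", "B": "9",
}


def major_encode(pronunciation):
    """Convert a space-separated ARPAbet pronunciation to Major-system digits."""
    digits = []
    for phoneme in pronunciation.split():
        # Vowels carry a stress digit (e.g. AH0); strip it before the lookup.
        symbol = re.sub(r"\d", "", phoneme).upper()
        # Vowels, semivowels, and anything else outside the table add nothing.
        digits.append(MAJOR_DIGITS.get(symbol, ""))
    return "".join(digits)


# A CMU-style pronunciation of "constitution"
print(major_encode("K AA2 N S T AH0 T UW1 SH AH0 N"))  # 7201162
```

Going from word to number is a table lookup. Going from number to word is the hard direction, which is why the rest of this post is about coverage.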

It is easy to find words that encode single digits. It’s a little harder to find words that encode some two-digit numbers, but it’s certainly doable. But if you want to encode all three-digit numbers as single words, you have to make some compromises. I estimate there’s about a 52% chance of being able to encode a four-digit number as a single word, for reasons I’ll explain below.

The CMU Pronouncing Dictionary lists 134,373 words along with their ARPAbet pronunciation. In this post I describe how I mapped the words to numbers, creating a file cmu_major.txt.

Not every three-digit number is in this file. The command

    grep -P -o ' \d{3}$' cmu_major.txt | sort -u | wc -l

shows that there are 958 unique three-digit numbers in the file, i.e. 42 three-digit numbers cannot be encoded as single words from the CMU dictionary. By changing the ‘3’ to a ‘4’ in the one-liner above we see there are 5,191 unique four-digit numbers in the file, i.e. about 52% of all 10,000 possible four-digit numbers.
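
The same count can be sketched in Python for any number of digits. This assumes, as the grep pattern suggests, that each line of the file ends with a space followed by the encoded number. The function name `unique_numbers` and the sample lines are hypothetical, not taken from the actual file.

```python
import re


def unique_numbers(lines, n):
    """Return the distinct n-digit numbers that end a line, preceded by a space."""
    pattern = re.compile(r" (\d{%d})$" % n)
    found = set()
    for line in lines:
        match = pattern.search(line.rstrip("\n"))
        if match:
            found.add(match.group(1))
    return found


# Hypothetical sample in the assumed "WORD NUMBER" layout
sample = ["CAT 71", "TOMATO 131", "TIMID 131", "DYNAMITE 1231", "CABINET 7921"]

for n in (3, 4):
    count = len(unique_numbers(sample, n))
    print(n, count, count / 10**n)  # digits, unique codes, fraction of all n-digit numbers
```

To run this against the real file, pass an open file handle in place of `sample`, e.g. `unique_numbers(open("cmu_major.txt"), 3)`.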

Since it is very often not possible to encode numbers with four or more digits as single words, a common approach is to not try. Instead, just pay attention to the first three digits that a word would encode. The advantage of this is that it opens up more possibilities for encoding three-digit numbers. The downside is that you give up the possibility of encoding four-digit numbers in a single word, but this isn’t giving up much since there’s roughly a 48% chance you’d fail anyway.
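
To illustrate how truncation opens up more possibilities, here is a small sketch that compares the three-digit numbers hit exactly with those reachable when only the first three digits of a longer code count. The function name `three_digit_keys` and the sample codes are invented for the example.

```python
def three_digit_keys(codes):
    """Return the three-digit numbers reachable when only the first three digits count."""
    return {code[:3] for code in codes if len(code) >= 3}


# Hypothetical codes, e.g. the numbers for a handful of words
codes = ["71", "131", "1231", "7201162"]

exact = {code for code in codes if len(code) == 3}
print(sorted(exact))                    # ['131']
print(sorted(three_digit_keys(codes)))  # ['123', '131', '720']
```

Every word with three or more consonant sounds becomes a candidate, so the set of covered three-digit numbers can only grow.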

So if you want to memorize a four-digit number, you could memorize a pair of two-digit numbers. Some people like to draw these two numbers from different sets, such as using the name of a person for the first two digits and an action for the second two digits. I’ll explore this more in my next post on the PAO system.