Recovering a permuted seed phrase

As mentioned in the previous post, crypto wallets often turn a list of words, known as a seed phrase, into private keys. These words come from a list of 2¹¹ = 2048 words chosen to be distinct and memorable. You can find the list here. Typically a seed phrase will contain 12 words. For example:

prize, differ, year, cloth, require, wild, clean, kit, cart, knock, biology, above

Each word carries 11 bits of information, so in total this represents 132 bits, which is 128 bits of content and 4 bits of checksum.

Now suppose you memorized your list of 12 words, but then later you don’t remember them in the correct order. Then you have about half a billion permutations to try because

12! = 479,001,600.

However, not all of these permutations are equally likely to be correct. You are far more likely to transpose a couple words, for example, than to completely scramble the list. So you would start with the list of words in the remembered order, then explore further and further afield from initial sequence. In mathematical terms, you’ll try the permutations of your remembered sequence in order of Cayley distance.

There are 66 permutations of a list of 12 things that are a transposition of two elements. So if the sequence you started with was

abcdefghijkl

then you’d try things like bacdefghijkl or ebcdafhghijkl. There are 66 possibilities because there are 66 ways to choose 2 items from a set of 12; you’re choosing the two items to swap.

If the above didn’t work, you could next explore permutations that are at a Cayley distance of 2 from what you remember. Now there are 1,925 possibilities. At a Cayley distance of 3 there are 32,670 possibilities.

In general, the number of arrangements at a Cayley distance k from the original sequence of n items equals

|S₁(n, n − k)|

where S₁ denotes the Stirling number of the first kind.

There are other approaches to measuring how far apart two permutations are. Another possibility is Kendall distance which counts the number of adjacent transpositions needed to turn one sequence into another. Kendall distance is sometimes called bubble sort distance by analogy with the bubble sort algorithm. Kendall distance may be more appropriate than Cayley distance because people are more likely to accidentally transpose adjacent elements of a sequence.

Neither Cayley distance nor Kendall distance exactly correspond to the corruption of human memories, but both would guide you in correcting likely changes before exploring unlikely changes.

Related posts