Golay codes

Photo of Jupiter taken by Voyager

Suppose you want to sent pictures from Jupiter back to Earth. A lot could happen as a bit travels across the solar system, and so you need some way of correcting errors, or at least detecting errors.

The simplest thing to do would be to transmit photos twice. If a bit is received the same way both times, it’s likely to be correct. If a bit was received differently each time, you know one of them is wrong, but you don’t know which one. So sending the messages twice doesn’t let you correct any errors, but it does let you detect some errors.

There’s a much better solution, one that the Voyager probes actually used, and that is to use a Golay code. Take the bits of your photo in groups of 12, think of each group as a row vector, and multiply it by the following matrix, called the generator matrix:


(You might see different forms of the generator matrix in different publications.)

That’s the (extended binary) Golay code in a nutshell. It takes groups of 12 bits and returns groups of 24 bits. It doubles the size of your transmission, just like transmitting every image twice, but you get more bang for your bits. You’ll be able to correct up to 3 corrupted bits per block of 12 and detect more.

Matrix multiplication here is done over the field with two elements. This means multiplication and addition of 0’s and 1’s works as in the integers, except 1 + 1 = 0.

Each pair of rows in the matrix above differ in 8 positions. The messages produced by multiplication are linear combinations of these rows, and they each differ in at least 8 positions.

When you receive a block of 24 bits, you know that there has been some corruption if the bits you receive are not in the row space of the matrix. If three or fewer bits have been corrupted, you can correct for the errors by replacing the received bits by the closest vector in the row space of the matrix.

If four bits have been corrupted, the received bits may be equally close to two different possible corrections. In that case you’ve detected the error, but you can’t correct it with certainty. If five bits are corrupted, you’ll be able to detect that, but if you attempt to correct the errors, you could mistakenly interpret the received bits to be a corruption of three bits in another vector.

Going deeper

In one sense, Golay codes are very simple: just multiply by a binary matrix. But there’s a lot going on beneath the surface. Where did the magic matrix above come from? It’s rows differ in eight positions, but how do you know that all linear combinations for the rows also differ in at least eight positions? Also, how would you implement it in practice? There are more efficient approaches than to directly carry out a matrix multiplication.

To give just a hint of the hidden structure in the generator matrix, split the matrix half, giving an identity matrix on the left and a matrix M on the right.


The 1’s along the last row and last column have to do with why this is technically an “extended” Golay code: the “perfect” Golay code has length 23, but an extra check sum bit was added. Here “perfect” means optimal in a sense that I may get into in another post. Let’s strip off the last row and last column.


The first row corresponds to quadratic residues mod 11. That is, if you number the columns starting from 0, the zeros are in columns 1, 3, 4, 5, and 9. These are the non-zero squares mod 11. Also, the subsequent rows are rotations of the first row.

Golay codes are practical for error correction—they were used to transmit the photo at the top of the post back to Earth—but they also have deep connections to other parts of math, including sphere packing and sporadic groups.

Related posts

4 thoughts on “Golay codes

  1. As an aside to “Golay codes are very simple”: Many other codes have generator matrices that are used in the same way. They are called linear block codes because they use linear algebra in their construction. Efficient maximum-likelihood decoders for them are an interesting topic; as the number of check boys grows, finding the most similar valid message either needs a special structure in the code or needs exponentially growing time.

    (I am sure Dr. Cook is familiar with the field, but other readers might not be.)

  2. You could multiply by a projection matrix, projecting the received vector onto the space of code vectors. There’s probably a clever way to do this without directly multiplying by a matrix.

  3. There’s an excellent book that covers this (and more): Information Theory, Inference, and Learning Algorithms, by David J.C. MacKay. (It’s possible I may have learned about it on this web site.)

Leave a Reply

Your email address will not be published. Required fields are marked *