Sometimes you have a poor quality source of randomness and you want to refine it into a better source. You might, for example, want to generate cryptographic keys from a hardware source that is more likely to produce 1’s than 0’s. Or maybe your source of bits is dependent, more likely to produce a 1 when it has just produced a 1.

## Von Neumann’s method

If the bits are biased but independent, a simple algorithm by John von Neumann will produce unbiased bits. Assume each bit of input is a 1 with probability *p* and a 0 otherwise. As long as 0 < *p* < 1 and the bits are independent, it doesn’t matter what *p* is [1].

Von Neumann’s algorithm works on consecutive pairs bits. If the pair is 00 or 11, don’t output anything. When you see 10 output a 1, and when you see 01 output a 0. This will output a 0 or a 1 with equal probability.

## Manuel Blum’s method

Manuel Blum built on von Neumann’s algorithm to extract independent bits from dependent sources. So von Neumann’s algorithm can remove bias, but not dependence. Blum’s algorithm can remove bias and dependence, if the dependence takes the form of a Markov chain.

Blum’s assumes each state in the Markov chain can be exited two ways, labeled 0 and 1. You could focus on a single state and use von Neumann’s algorithm, returning 1 when the state exits by the 1 path the first time and 0 the second time, and return 0 when it does the opposite. But if there are many states, this method is inefficient since it produces nothing when the chain is in all the other states.

Blum basically applies the same method to *all* states, not just one state. But there’s a catch! His paper shows that the naive implementation of this idea is wrong. He gives a pseudo-theorem and a pseudo-proof that the naive algorithm is correct before showing that it is not. Then he shows that the correct algorithm really is correct.

This is excellent exposition. It would have been easier to say nothing of the naive algorithm, or simply mention in passing that the most obvious approach is flawed. But instead he went to the effort of fully developing the wrong algorithm (with a warning up-front that it will be shown to be wrong).

## Related posts

[1] It matters for efficiency. The probability that a pair of input bits will produce an output bit is 2*p*(1 – *p*), so the probability is maximized when *p* = 0.5. If *p* = 0.999, for example, the method will still work correctly, but you’ll have to generate 500 pairs of input on average to produce each output bit.

For independent but biased bits, you could improve the von Neumann method by looking at more than 2 bits at once.

Let’s say I have a source of random independent bits that generates a 1 about 10% of the time. With the von Neumann method it takes about 20 bits to generate a single unbiased bit. But let’s say I look at 17 bits at once. (Why 17? I took a look at Pascal’s triangle and saw lots of numbers that are a bit more than a power of 2.)

If I get zero 1’s, I’m out of luck. But there are 17 equally likely ways to get one 1, and I can assign each of them to a sequence of 4 bits, except for one that has to remain unassigned. I can do the same with 128 of the 136 ways to get two 1’s (obtaining 7 bits), and 512 of the 680 ways to get three 1’s (obtaining 9 bits), and so on. On average this would produce about 4.8 unbiased bits per 17 input bits, or 0.28 unbiased bits per input bit – pretty close to the information-theoretic maximum of 0.3.

I don’t know whether or not this idea could be extended to Blum’s method if the transition probabilities are highly biased.

Salil Vadhanâ€™s monograph on pseudorandomness has a good chapter on randomness extraction

https://people.seas.harvard.edu/~salil/pseudorandomness/extractors.pdf

Thanks. Looks like a good book.

Many thanks. Would you please explain Von Newman algorithm. If p=0.99999 for ‘1’s then we get long sequences of ‘1’s and in between a single ‘0’ (or short sequence of ‘0’s.) Is the result of the algorithm always a ‘1’ immediately followed by a ‘0’?

Dan: The result of the algorithm is an independent stream of bits. So after seeing a 1, there’s a 50% chance you’ll get another 1.