Around this time last year I wrote about the entropy extractor used in μRNG. It takes three biased random bit streams and returns an unbiased bit stream, provided each input stream as has least 1/3 of a bit of min-entropy.
I’ve had in the back of my mind that I should go back and run the output of the extractor through a standard test suite. I’ve been doing more RNG testing lately, and so while the software is fresh on my mind I wanted to go back and test the entropy extractor.
To create a biased bit stream, I first created an unbiased bit stream using the PCG random number generator, then took the bitwise OR of consecutive samples. In C notation, I created two 32-bit unsigned integers u
and v
and used u|v
. The resulting bits are 1’s three out of four times since a bit is 0 only if both corresponding bits are 0’s. The min-entropy of the resulting stream is
-log2 max(0.25, 0.75) = 0.415
which is larger than 1/3, so we should be able to apply the μRNG entropy extractor. To do this, we create three bit streams. Let a, b, and c be 8-bit bytes, each from a different stream. Then we combine these into
a×b + c
and use that as one byte of output. So it takes three bytes of input to produce one byte of output, which is to be expected since we’re starting with sources that may contain only 1/3 of a bit of min-entropy per bit of output.
The multiplication and addition above are carried out in the Galois field GF(28). This means that multiplication may be like nothing you expect, and addition is XOR, i.e. bitwise exclusive OR. The multiplication is the same at that used in AES encryption.
NIST Statistical Test Suite
There are several test suites we could use—DIEHARDER, PractRand, TestU01, etc.—and expect I’ll write more about those before long, but for this post we’ll focus on the NIST Statistical Test Suite or STS. (Update: the extractor fails hard on PractRand.) The STS suite includes the following 15 tests.
- Frequency (monobit)
- Frequency test within a block
- Runs test
- Test for the longest run of ones in a block
- Binary matrix rank test
- Discrete Fourier Transform (spectral) test
- Non-overlapping template matching test
- Overlapping template matching test
- Maurer’s universal statistical test
- Linear complexity test
- Serial test
- Approximate entropy test
- Cumulative sums (cumsum) test
- Random excursions test
- Random excursions variant test
See NIST Special Publication 800-22: A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications.
When we run the input of PCG alone through STS, it passes with flying colors. If we run the output of PCG after OR’ing pairs of unsigned integers together, the test fails spectacularly, spewing underflow warnings. But if we take three streams of biased bits that would each fail and keep the extracted output, it passes all the tests, just as with the original output of PCG.
In each case I tested one million 32-bit unsigned integers. In the biased case I sampled three million integers
Note that in our example we know the amount of bias because we deliberately created the bias. You might not know the bias, or equivalently the min-entropy, in general. As long as the min-entropy is greater than 1/3, the entropy extractor works.