Testing the PCG random number generator

M. E. O’Neill’s PCG family of random number generators looks very promising. It appears to have excellent statistical and cryptographic properties. And it takes remarkably little code to implement. (PCG stands for Permuted Congruential Generator.)

The journal article announcing PCG gives the results of testing it with the TestU01 test suite. I wanted to try it out by testing it with the DIEHARDER test suite (Robert G. Brown’s extension of George Marsaglia’s DIEHARD test suite) and the NIST Statistical Test Suite. I used what the generator’s website calls the “minimal C implementation.”

The preprint of the journal article is dated 2015 but apparently hasn’t been published yet.

Update: See the very informative note by the author of PCG in the comments below.

For the NIST test suite, I generated 10,000,000 bits and divided them into 10 streams.

For the DIEHARDER test suite, I generated 800,000,000 unsigned 32-bit integers. (DIEHARDER requires a lot of random numbers as input.)

For both test suites I used the seed (state) 20170707105851 and sequence constant (inc) 42.

The PCG generator did well on all the NIST tests. For every test, at least least 9 out of 10 streams passed. The test authors say you should expect at least 8 out of 10 streams to pass.

Here’s an excerpt from the results. You can find the full results here.

--------------------------------------------------------
 C1  C2  C3 ...  C10  P-VALUE  PROPORTION  STATISTICAL TEST
--------------------------------------------------------
  2   0   2        0  0.213309     10/10   Frequency
  0   0   1        3  0.534146     10/10   BlockFrequency
  3   0   0        0  0.350485     10/10   CumulativeSums
  1   1   0        2  0.350485     10/10   CumulativeSums
  0   2   2        1  0.911413     10/10   Runs
  0   0   1        1  0.534146     10/10   LongestRun
  0   1   2        0  0.739918     10/10   Rank
  0   4   0        0  0.122325     10/10   FFT
  1   0   0        1  0.000439     10/10   NonOverlappingTemplate
  ...              
  2   1   0        0  0.350485      9/10   NonOverlappingTemplate
  0   2   1        0  0.739918     10/10   OverlappingTemplate
  1   1   0        2  0.911413     10/10   Universal
  1   1   0        0  0.017912     10/10   ApproximateEntropy
  1   0   1        1     ----       3/4    RandomExcursions
  ...              
  0   0   0        1     ----       4/4    RandomExcursions
  2   0   0        0     ----       4/4    RandomExcursionsVariant
  ...              
  0   0   3        0     ----       4/4    RandomExcursionsVariant
  1   2   3        0  0.350485      9/10   Serial
  1   1   1        0  0.739918     10/10   Serial
  1   2   0        0  0.911413     10/10   LinearComplexity

...

The DIEHARDER suite has 31 kinds tests, some of which are run many times, making a total of 114 tests. Out of the 114 tests, two returned a weak pass for the PCG input and all the rest passed. A few weak passes are to be expected from running so many tests and so this isn’t a strike against the generator. In fact, it might be suspicious if no tests returned a weak pass.

Here’s an edited version of the results. The full results are here.

#=============================================================================#
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
#=============================================================================#
   diehard_birthdays|   0|       100|     100|0.46682782|  PASSED
      diehard_operm5|   0|   1000000|     100|0.83602120|  PASSED
  diehard_rank_32x32|   0|     40000|     100|0.11092547|  PASSED
    diehard_rank_6x8|   0|    100000|     100|0.78938803|  PASSED
   diehard_bitstream|   0|   2097152|     100|0.81624396|  PASSED
        diehard_opso|   0|   2097152|     100|0.95589325|  PASSED
        diehard_oqso|   0|   2097152|     100|0.86171368|  PASSED
         diehard_dna|   0|   2097152|     100|0.24812341|  PASSED
diehard_count_1s_str|   0|    256000|     100|0.75417270|  PASSED
diehard_count_1s_byt|   0|    256000|     100|0.25725000|  PASSED
 diehard_parking_lot|   0|     12000|     100|0.59288414|  PASSED
    diehard_2dsphere|   2|      8000|     100|0.79652706|  PASSED
    diehard_3dsphere|   3|      4000|     100|0.14978100|  PASSED
     diehard_squeeze|   0|    100000|     100|0.35356584|  PASSED
        diehard_sums|   0|       100|     100|0.04522121|  PASSED
        diehard_runs|   0|    100000|     100|0.39739835|  PASSED
        diehard_runs|   0|    100000|     100|0.99128296|  PASSED
       diehard_craps|   0|    200000|     100|0.64934221|  PASSED
       diehard_craps|   0|    200000|     100|0.27352733|  PASSED
 marsaglia_tsang_gcd|   0|  10000000|     100|0.10570816|  PASSED
 marsaglia_tsang_gcd|   0|  10000000|     100|0.00267789|   WEAK
         sts_monobit|   1|    100000|     100|0.98166534|  PASSED
            sts_runs|   2|    100000|     100|0.05017630|  PASSED
          sts_serial|   1|    100000|     100|0.95153782|  PASSED
...
          sts_serial|  16|    100000|     100|0.59342390|  PASSED
         rgb_bitdist|   1|    100000|     100|0.50763759|  PASSED
...
         rgb_bitdist|  12|    100000|     100|0.98576422|  PASSED
rgb_minimum_distance|   2|     10000|    1000|0.23378443|  PASSED
...
rgb_minimum_distance|   5|     10000|    1000|0.13215367|  PASSED
    rgb_permutations|   2|    100000|     100|0.54142546|  PASSED
...
    rgb_permutations|   5|    100000|     100|0.96040216|  PASSED
      rgb_lagged_sum|   0|   1000000|     100|0.66587166|  PASSED
...
      rgb_lagged_sum|  31|   1000000|     100|0.00183752|   WEAK
      rgb_lagged_sum|  32|   1000000|     100|0.13582393|  PASSED
     rgb_kstest_test|   0|     10000|    1000|0.74708548|  PASSED
     dab_bytedistrib|   0|  51200000|       1|0.30789191|  PASSED
             dab_dct| 256|     50000|       1|0.89665788|  PASSED
        dab_filltree|  32|  15000000|       1|0.67278231|  PASSED
        dab_filltree|  32|  15000000|       1|0.35348003|  PASSED
       dab_filltree2|   0|   5000000|       1|0.18749029|  PASSED
       dab_filltree2|   1|   5000000|       1|0.92600020|  PASSED

8 thoughts on “Testing the PCG random number generator

  1. If you need cryptographic properties, it’s better to take a modern block cipher like Speck and run it in CTR mode. PCG is not cryptographically strong. Speck is also quite small and very fast.

  2. John – do you know if, based on these results, PCG would be considered a cryptographically-secure pseudo random number generator (CSPRNG)? I seem to remember reading the PCG creator’s introduction to the RNG, saying it wasn’t meant for use in cryptography (e.g. key generation, nonces, challenge-response, etc.) but I don’t remember anymore.

    Probably not a fair question, you’ll probably answer, “Sorry I’m not a cryptographer” (and that’s fair), but I guess I don’t see the big difference between, say, PCG and Blum Blum Shub. If you know the seed (state), you know the output; but if you don’t know the seed, is the output from PCG not really discernable from a random bit stream? The test results seem to indicate “maybe so”.

  3. Dan, I see how PCG would be better for crypto than, say, Mersenne Twister, but I can’t comment on whether it’s good enough. You might look at the PCG web site or contact author.

  4. David Blackman

    The 64 bit versions of PCG can’t possibly be cryptographically secure by modern standards because that size is breakable by brute force. Random number quality of a single stream of numbers from PCG looks good for non-cryptographic use according to my testing so far. PCG claims to provide multiple independent streams, but last time i looked no advice was given on how to do this. I made a few guesses for how to do this, but for 4 streams or more my attempts had detectable correlations between streams. Multiple independent streams are important for high-end non-cryptographic use, but perhaps not for more casual applications.

  5. Hi John, author of PCG here. I stumbled on your post a week or so ago. Thanks for writing it. It’s good to know the DIEHARDER results. I was doing some testing at around the same time, with PractRand.

    FWIW, the DIEHARDER tests aren’t the most stringent ones out there. I think the two best test suites are PractRand and TestU01. TestU01 is older and has been considered the gold standard for a long time, but it isn’t quite as easy to use (it’s old and was designed to test smaller generators, so it’s only sensitive to the top 31-bits; as a result it’s easy to misunderstand how to properly test a 64-bit generator and come to entirely the wrong conclusions—I’ve seen it happen!). PractRand is excellent, much easier to use, gives you answers quicker, and catches almost all the problems TestU01 does, and some additional ones it doesn’t. I’ve come to like it quite a lot.

    I’ve got several new blog posts relevant to PractRand testing, which you should feel free to take a look at (click on blog in the top bar of the PCG site!). One post has PractRand results for PCG, but also one I just posted showing how to do testing with PractRand. You could confirm my PCG results if you like, but you might want to try testing some other well known generators and reporting the results. If I do it, people will think it’s all self promotion, rather than just trying to get the word out because some people might not know.

    John, in your post, you said that PCG has “excellent statistical and cryptographic properties”. I’ve learn it best never to say “cryptographic security” or “cryptographic properties” when trying to place something on a spectrum of prediction difficulty. Too many misunderstandings. Saying “prediction difficulty” still causes some crossed wires, but it’s about as good as we can go.

    Dan & Dmitriy, just to be 100% clear, I HAVE NEVER RECOMMENDED PCG FOR CRYPTOGRAPHY. I do however, care about prediction difficulty, and I don’t like trivially predictable generators. Because my viewpoint is often misunderstood, I have some more blog posts in the pipeline about these issues, but the key thing is that general-purpose PRNGs get used for almost anything, from using the low-order bit to toss a coin, to supporting randomized algorithms. If someone can predict your generator, they can mount an algorithmic complexity attack on your randomized algorithm, tanking its performance. If predicting the generator is more costly than the algorithmic complexity attack itself, people will try their attacks on easier targets. But if the generator spends to much time being trying hard to be unpredictable, we also tank our performance, thus we have to try to strike a balance in a different place than we do for traditional cryptographic applications. We already live in a world where hash table implementations in scripting languages need to be hardened because of algorithmic complexity attacks; this is the next logical step.

    All that said, there are members of the pcg family (not pcg32 and especially not pcg32_fast!) that I personally think would be really challenging to predict. I also find it frustrating that people don’t compare like with like. If you want to compare the prediction difficulty of PCG against a cryptographically secure PRNG, you need to compare a PCG variant at least broadly similar in size. Say, for example, we wanted to contrast PCG against the ChaCha PRNG from Orson Peters (perhaps with four rounds rather than the full 20), that PRNG is 104 bytes in size, so it’s fairest to compare it to pcg64_c8, which is 80 bytes, or pcg64_c16, which is 144 bytes.

    Regarding prediction difficulty, I’m very well aware of Bruce Schneier’s law, “Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can’t break. It’s not even hard. What is hard is creating an algorithm that no one else can break, even after years of analysis. And the only way to prove that is to subject the algorithm to years of analysis by the best cryptographers around.”, and his subsequent elaboration “Anyone can invent a security system that he himself cannot break. I’ve said this so often that Cory Doctorow has named it “Schneier’s Law”: When someone hands you a security system and says, “I believe this is secure,” the first thing you have to ask is, “Who the hell are you?” Show me what you’ve broken to demonstrate that your assertion of the system’s security means something.” Thus my personal thoughts on how difficult it is mean *NOTHING* in a cryptography context. I also can’t expect cryptographers to spend their time on my education, but if someone out does know a simple and efficient algorithm that can reliably break pcg64_c8, I really would love to see how. (Also, hey Bruce, gender-neutral language, it’s a thing.)

    I can do at least one thing that adds a tiny tiny tiny bit of credibility in the eyes of folks like Bruce Schneier , I can show other PRNGs I have broken that might have seemed hard to predict to a casual observer. Mostly that won’t actually help though, because a cryptographer would say “Ha! That’s toddler level stuff!” and a mathematician might say “I can’t understand why you care about prediction at all”. Sometimes I feel there should be more people trying to occupy the middle ground. Meh.

    But, let’s be clear, DO NOT USE PCG FOR CRYPTOGRAPHY. DO NOT USE PCG FOR CRYPTOGRAPHY. DO NOT USE PCG FOR CRYPTOGRAPHY. Clear?

    David, it’s good to know you’ve tested PCG as well. If you ever find anything interesting or surprising, I’d be delighted to get email from you. I appreciate your comment. I’ve written a long blog post about streams that cites your comment. I hope it illuminates things at least a little more clearly; feel free to check it out.

    I’ve been testing your work, too, David. I didn’t have an email for you, so I just reached out to Sebastiano. I don’t know if he forwarded the message on to you. If you’ve been immersed in doing random-number testing to the extent of testing my work, I’m sure you must have retested your own, too (and Sebastiano’s earlier work as well) and my findings won’t be any kind of surprise. FWIW, I’d rather live in a world where there are several good options for random number generation, so I’m always much happier saying “that one’s good too!”; I take no pleasure when the results are less positive.

  6. I recommend the following switches/syntax for testing Practrand (under 64-bit build on Windows):
    myPRNG.exe | rng_test.exe stdin -tf 2 -te 1 -tlmin 1KB -multithreaded
    or, for files (3x faster than using a ‘type filename’ pipe)
    rng_test.exe stdin -tf 2 -te 1 -tlmin 1KB -multithreaded 1E-5 before FAIL to document the PRNG max range (and tolerate the occasional BCFN_FF:FRQ at p > 1E-8, per the authors recommendation).

    Errata: overclocking my XEON W3690 to x29 just to get the testing done, since it takes days.

  7. Looks like much of my post got mangled and also lost switch usage rationale.
    I’m not going to replicate, so ignore ‘1E-5’ and after.

Leave a Reply

Your email address will not be published. Required fields are marked *