Testing Cliff RNG with DIEHARDER

My previous post introduced the Cliff random number generator. The post showed how to find starting seeds where the generator will start out by producing approximately equal numbers. Despite this flaw, the generator works well by some criteria.

I produced a file of s billion 32-bit integers by multiplying the output values, which were floating point numbers between 0 and 1, by 232 and truncating to integer. Then I ran the DIEHARDER random number generator test suite.

The results were interesting. Before running the tests, I thought the tests would nearly all pass or nearly all fail, more likely the latter. But what happened was that many tests passed and some failed hard [1].

Here’s a list of the tests that passed:

  • diehard_birthdays
  • diehard_rank_32x32
  • diehard_rank_6x8
  • diehard_bitstream
  • diehard_oqso
  • diehard_dna
  • diehard_count_1s_str
  • diehard_count_1s_byt
  • diehard_runs
  • sts_monobit
  • sts_serial
  • rgb_bitdist
  • rgb_kstest_test
  • dab_dct
  • dab_filltree2
  • dab_monobit2

The tests that failed were:

  • diehard_parking_lot
  • diehard_2sphere
  • diehard_3sphere
  • diehard_squeeze
  • diehard_craps
  • marsaglia_tsang_gcd
  • rgb_lagged_sum
  • dab_bytedistrib

I’ve left out a few test results that were ambiguous as well as tests that were described as “Suspect” and “Do not use” on the DIEHARDER website.

The site I mentioned in the previous post where I ran across this generator said that it passed a spherical generation test. I assume the implementation of that test was less demanding that the version included in DIEHARD. But the generator does well by other tests.

The lagged sum test tests for autocorrelation. There’s a good reason for this: the generator has a short period, as I discuss here. The lagged sum test fails because the output has perfect autocorrelation at a lag equal to the period.

This generator demonstrates how passing a few RNG tests can be misleading. Or to look at it a different way, it shows how a generator can be useful for some tasks and terrible for other tasks.

More on testing RNGs

[1] By “failed hard” I mean the test return a p-value of zero. The p-value couldn’t actually be zero, but it was close enough that it the displayed value was exactly zero.