Last year I wrote a chapter for O’Reilly’s book Beautiful Testing. The publisher gave each of us permission to post our chapters electronically, and so here is Chapter 10: How to test a random number generator.
How to test a random number generator
Posted in Software development, Statistics
16 comments on “How to test a random number generator”
6 Pings/Trackbacks for "How to test a random number generator"

[...] This post was mentioned on Twitter by jebyrnes, James Church, Magnus Lie Hetland, ChanHo Kim, Probability Fact and others. Probability Fact said: RT @CompSciFact: How to test a random number generator http://bit.ly/e2CPxL [...]

[...] How to test a random number generator — The Endeavour [...]

[...] Happy Thanksgiving, everyone. Earlier today, I found an interesting post from Bo Allen on pseudorandom vs random numbers, where the author uses a simple bitmap (heat map) to show that the rand function in PHP has a systematic pattern and compares these to truly random numbers obtained from random.org. The post’s results suggest that pseudorandomness in PHP is faulty and, in general, should not be underestimated in practice. Of course, the findings should not be too surprising, as there is a large body of literature on the subtleties, philosophies, and implications of the pseudo aspect of the most common approaches to random number generation. However, it is silly that PHP‘s random number generator (RNG) displays such an obvious pattern nowadays because there are several decent, wellstudied pseudoRNG algorithms available as well as numerous tests for randomness. For a good introduction to RNG, I recommend John D. Cook’s discussion on testing a random number generator. [...]

[...] interesting things about Mersenne primes How to test a random number generator Twin prime conjecture and the Pentium division [...]

[...] The KS test is discussed in John Cook’s chapter on testing a random number generator in Beautiful Testing. It is freely readable here. [...]

[...] How to test a random number generator. [...]
hi john,
thank you for the chapter.
i’ve been following your blog this year. just wondering if you were planning to write a numerical computing book. or if there are books you would recommend.
Wen, Thanks. I don’t have any plans to write a numerical computing book.
One of my favorite numerical books is an old one, Numerical Methods that Work. It’s not a cookbook. It’s a book on how to think about numerical computing.
Thanks for that, great read. Any suggestions for a programming recipe book that covers the same or similar tests?
Hansi: You can read more about some of the tests in Knuth’s Seminumerical Algorithms, TAOCP volume 2. You might also check Numerical Recipes.
Great, I’m pretty sure there is a copy of Knuth’s TAOCP at work. I’ll have a look tomorrow.
Thanks again.
Acton’s book is an entertaining read, but how well has it stood up? Computing power and software have advanced so much in the 40 years since he wrote it, it’s not clear that the cautionary tales he tells are still as relevant. He might tell you a story of how 20 minutes of thought could have saved an investigator 20 hours of computing time. Now however, that same computer program, or a more robust offtheshelf one, might run in a couple of seconds. So why think?
You could argue that the complexity of problems has grown as computing power has grown, so that thought is as necessary as ever. However, I might counter that more complex problems are difficult to tackle in the way the problems in Acton’s book can be tackled. Don’t the problems in Acton’s book seem rather… quaint?
I think if I were going to recommend a book for learning practical, handson numerical computing, I would recommend Numerical Recipes.
I think the principles in Acton’s book are still valuable, though the specifics are outdated. For example, most computing is now done in double precision rather than single precision. Loss of accuracy still happens when you’re not careful; you see the same problems but for a smaller range of inputs.
Acton doesn’t deal with sorting, but bubble sort is still slower than heap sort, even though both are instantaneous for small data sets. The idea of what is “small” has changed over time, but the principle remains.
However, there is one important area that Acton doesn’t address at all if I remember correctly, and that is random methods: random number generation, Monte Carlo integration, etc. These methods have become far more common in practice (at least in my practice :)).
I think the most outofdate aspect of Acton’s book is that it has an entire chapter on Fourier Series, but doesn’t mention the FFT.
I liked the chapter, in particular the line “The point is not only to test the quality of the library itself, but also to test the user’s understanding of the library.”
I’m surprised you didn’t recommend plotting the data and taking a look at it – always seems valuable.
I like this one for testing RNG’s.
http://portal.acm.org/citation.cfm?doid=1268776.1268777
There is a nice battery of tests ready to go in C++. If the RNG makes it through that in good shape, great.
One kind of disturbing aspect of the chapter is that its existence implies software developers should be going off and writing their own RNGs. Why bother when there are good ones out there that have been through very rigorous scrutiny?
The problem with plotting random data, especially uniform data, is that we can’t visually detect any meaningful flaws in the data.
Jason: I don’t want to encourage anyone to develop their own uniform random number generation algorithm. But some people will need to implement existing algorithms in new languages. And even more people will need to transform a uniform random sequence into a sequence with another distribution, maybe a distribution unique to their problem.
John,
Thanks for the reply. I hope the readership of the book takes your perspective as well!
Jason
Hi John,
Even though, this post is no longer a current subject of discussion. I write to appreciate your efforts in putting together a collection of numerical gems in this blog. I have benefitted quite a lot from this blog. Remain blessed.
From Singapore.
Hi John,
I share your love of Acton’s book and Steve’s recognition that it is, in some respects, outdated. I like Acton’s mindset of thinking about what you are trying to calculate and being willing to find different algorithms when you are up against the potential for or reality of catastrophic loss of precision. The original book had the title embossed into the red cover, with the letters highlighted in silver. Also embossed, but not highlighted, was the word “usually”. Lovely.
Just came across this link from the CompSciFact twitter account. Are you aware of the typo in the first footnote on page 131? The word “lest” should be “least”, I believe. I know this is a few years old, so maybe it’s not helpful now.
I’m reminded that sometimes, even though you may THINK you want random, it may not be what you want. From this tweet from earlier today, what the tweeter really wants is “random, or even a cheap approximation, unless the current entry has come up TOO recently:” “iPad, I appreciate your shuffle. I do. And you know I like Enya “The Longships.” But not as every third song. #shuffleparalysis”
And I hope I’m not too much of an insufferable pedant in pointing out that this is about testing pseudorandom number generators (PRNG) – a “true” random number generator will (almost?!) always pass such tests.