Randomization to protect against bias

When implemented well, randomization can increase objectivity by protecting us from our own biases. Random simulations can make it practical to solve problems that would be impractical with any other approach. And yet randomization can fail in ways that are not obvious, reducing objectivity and giving unwarranted confidence in incorrect results.

Defensible randomization

You may need to defend your randomization procedures from criticism, maybe to a scientific journal or maybe in a court of law. Under such scrutiny, it’s not enough to say “we randomized.” A journal editor or a judge may want to know the details of your randomization procedure. In that case you’ll want a randomization procedure that can be justified and explained in detail.

Randomization can be counterintuitive

Randomization is subtle. Systems built on an incomplete understanding of randomization can appear to work until flaws are revealed much later, sometimes with expensive consequences. On the other hand, because randomness can be hard to understand, randomized systems can appear not to work even though they are working well. Sometimes I help companies fix flaws in randomization. At least as often I help companies understand that what may seem like a flaw is actually expected behavior in a randomized system.

Random number generation

For randomization to work well, one needs a quality source of randomness, i.e. a good random number generator, and one needs to apply that randomness appropriately to the problem at hand. I help companies with both random number generation and applications of randomization, such as Monte Carlo simulation and random testing, whether this means testing human subjects, physical devices, or software.

Random number generators contain deterministic algorithms designed to produce output that simulates non-deterministic behavior. It’s amazing that there are algorithms that do this well enough for many applications. But unless used carefully, random number generators can misbehave in mysterious ways.

Random number generators typically have two components: a source of uniform random numbers, and a procedure to transform these numbers to have a desired distribution. The source of uniform random numbers is typically a pseudo-random number generator. The difference between pseudo-random and truly random numbers often does not matter, though this depends on the application.

Most software developers should not develop their own uniform random number generators; there are too many subtle traps that are easy to fall into but hard to detect. However, developers often need to transform a trusted source of uniform random numbers into a source of random numbers with some other distribution. The most common challenges in such a transformation are simply knowing how to implement the transformation, efficiency, and testing.
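As a rough illustration of such a transformation, here is a minimal sketch of inverse transform sampling, one common technique among several. It assumes Python's standard `random.Random` as the trusted uniform source and an exponential distribution as the target. The function names, the rate, and the seed are hypothetical choices made for the example.

```python
import math
import random


def exponential_from_uniform(u, rate):
    """Map a uniform value u in [0, 1) to an exponential(rate) sample."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Inverse CDF of the exponential distribution: -ln(1 - u) / rate.
    # Working with 1 - u avoids log(0), since u may equal 0 but never 1.
    return -math.log1p(-u) / rate


def sample_exponential(n, rate, seed):
    """Draw n exponential samples from a seeded standard-library generator."""
    # Assumption: the standard library generator is the trusted uniform source.
    # The seed is passed explicitly so that a run can be reproduced.
    rng = random.Random(seed)
    return [exponential_from_uniform(rng.random(), rate) for _ in range(n)]


# Hypothetical parameters, chosen only for illustration.
samples = sample_exponential(n=100_000, rate=2.0, seed=12345)
sample_mean = sum(samples) / len(samples)  # should be near 1 / rate
```

Comparing `sample_mean` to the theoretical mean `1 / rate` is only a quick sanity check. It would not catch every implementation error, which is why testing is listed above as a challenge of its own.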

Testing random number generators is unlike testing other software: in general the tests must be statistical, not deterministic. Such tests will inevitably fail occasionally. Indeed such tests should fail occasionally, if the source is behaving randomly! The challenge is to understand how often the tests should fail. This makes it possible to infer whether a given failure rate is more likely caused by an error or by foreseeable fluctuation.
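To make the idea of an expected failure rate concrete, here is a small sketch under stated assumptions: each run of a statistical test is independent and fails with a known false-alarm probability when the generator is working correctly. The number of failures then follows a binomial distribution. The values of `ALPHA` and `RUNS` are hypothetical.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha)."""
    return sum(
        comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


ALPHA = 0.01  # assumed false-alarm rate of a single test run
RUNS = 1000   # assumed number of independent test runs

# Average number of failures expected from a correct generator.
expected_failures = RUNS * ALPHA

# Chance that a correct generator fails 20 or more times out of RUNS.
p_20_or_more = prob_at_least(20, RUNS, ALPHA)

# Chance that a correct generator never fails at all.
p_zero = (1 - ALPHA) ** RUNS
```

Under these assumptions about 10 failures are expected, and a run of 1000 tests with no failures at all would itself be suspicious, since a correct generator would produce that outcome less than once in ten thousand tries.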

To read more, see my chapter “How to Test a Random Number Generator” in the book Beautiful Testing.

Monte Carlo simulation

Monte Carlo simulations may be the largest consumers of random numbers. Simulations can provide a practical way to understand a system too complex to understand with more traditional analytical methods. However, subtle things can go wrong. And because simulation is usually used in situations too complex to study by other methods, errors can be hard to catch.
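The following toy sketch shows the basic shape of a Monte Carlo estimate. It estimates pi, a problem chosen precisely because the answer is known and the result can be checked, which is not the case for most real simulations. The function name, sample size, and seed are assumptions made for illustration.

```python
import math
import random


def estimate_pi(n, seed):
    """Estimate pi from n random points in the unit square.

    Returns the estimate together with its standard error.
    """
    rng = random.Random(seed)  # seeded so the run can be reproduced
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        # Count points that fall inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            hits += 1
    p_hat = hits / n  # estimated area of the quarter circle, pi / 4
    estimate = 4.0 * p_hat
    # Standard error of a binomial proportion, scaled by 4.
    std_error = 4.0 * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return estimate, std_error


# Hypothetical sample size and seed.
estimate, std_error = estimate_pi(n=200_000, seed=2024)
```

Reporting a standard error alongside the estimate is one modest safeguard: an estimate many standard errors away from an independently known value points to a bug rather than to chance.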

Randomization consulting

Get expert help with your randomization procedures. Be confident that they will work well and stand up to scrutiny.


Trusted consultants to some of the world’s leading companies

Amazon, Facebook, Google, US Army Corps of Engineers, Amgen, Microsoft, Hitachi Data Systems