# Probability distributions and object oriented programming

This post looks at applying object-oriented programming ideas to probability distribution software. It explains the Liskov Substitution Principle and shows how it can keep you from falling into a subtle trap.

One of the big ideas in object-oriented programming is to organize software into units of data and related functions that represent things in your problem domain. These units are called classes. Particular instances of classes are called objects. For example, a business application could have a customer class. Particular customers are represented by customer objects.

The C++ numerical library that we developed at MDACC has classes that represent probability distributions. A probability distribution class contains methods (functions) such as `PDF`, `CDF`, `Mean`, and `Mode`. For example, the `NormalDistribution` class represents normal distributions. Each `NormalDistribution` object has its own mean and variance parameters.

Another big idea of object-oriented programming is inheritance, a way to organize classes into a hierarchy. This is where things get more interesting.

Inheritance is commonly described as an “is a” relationship. That explanation is often helpful, but sometimes it can get you into trouble. (Listen to this interview with Robert Martin for an explanation.) Probability distributions illustrate when “is a” should be represented by inheritance and when it should not be.

A beta distribution is a continuous distribution. So is a normal distribution. The `BetaDistribution` and `NormalDistribution` classes representing these probability distributions both derive from a `ContinuousDistribution` class. This makes it possible to write generic code that operates on continuous distributions. Later we can pass in a particular type of continuous distribution rather than having to write special code for every kind of continuous distribution.
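As a rough illustration of that kind of generic code, here is a minimal sketch. It is not the library's actual interface: the base class below has only three of the methods mentioned above, `ProbabilityBetween` is a hypothetical helper, and a hypothetical `ExponentialDistribution` stands in for the beta distribution only because its CDF has a simple closed form.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical base class: the interface shared by all continuous distributions.
class ContinuousDistribution {
public:
    virtual ~ContinuousDistribution() = default;
    virtual double PDF(double x) const = 0;
    virtual double CDF(double x) const = 0;
    virtual double Mean() const = 0;
};

// Normal distribution parameterized by mean and variance.
class NormalDistribution : public ContinuousDistribution {
public:
    NormalDistribution(double mean, double variance)
        : mean_(mean), stddev_(std::sqrt(variance)) {
        assert(variance > 0.0);
    }
    double PDF(double x) const override {
        const double kPi = 3.14159265358979323846;
        const double z = (x - mean_) / stddev_;
        return std::exp(-0.5 * z * z) / (stddev_ * std::sqrt(2.0 * kPi));
    }
    double CDF(double x) const override {
        // Standard identity: Phi(z) = erfc(-z / sqrt(2)) / 2.
        return 0.5 * std::erfc(-(x - mean_) / (stddev_ * std::sqrt(2.0)));
    }
    double Mean() const override { return mean_; }
private:
    double mean_;
    double stddev_;
};

// Hypothetical second distribution, chosen for its closed-form CDF.
class ExponentialDistribution : public ContinuousDistribution {
public:
    explicit ExponentialDistribution(double mean) : mean_(mean) {
        assert(mean > 0.0);
    }
    double PDF(double x) const override {
        return x < 0.0 ? 0.0 : std::exp(-x / mean_) / mean_;
    }
    double CDF(double x) const override {
        return x < 0.0 ? 0.0 : 1.0 - std::exp(-x / mean_);
    }
    double Mean() const override { return mean_; }
private:
    double mean_;
};

// Generic code: written once against the base class, it works for any
// derived distribution because it only uses the shared interface.
double ProbabilityBetween(const ContinuousDistribution& dist, double a, double b) {
    return dist.CDF(b) - dist.CDF(a);
}
```

Calling `ProbabilityBetween` with a `NormalDistribution` or an `ExponentialDistribution` requires no special cases, which is the substitutability the rest of this post is about.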

Now think about a chi square distribution. A chi square distribution with ν degrees of freedom is a gamma distribution with shape ν/2 and scale 2. So in a mathematical sense, a chi square distribution “is a” gamma distribution. But should a class representing a chi square distribution inherit from a class representing a gamma distribution? The surprising answer is “no.” A rule called the “Liskov Substitution Principle” (LSP) says this is a bad idea.

When a class X inherits from a class Y, we say X is the derived class and Y is the base class. The LSP says code should work without surprises when an instance of a derived class is passed into a function written to receive instances of the base class. Deriving a `BetaDistribution` class from a `ContinuousDistribution` class should not lead to any surprises. A function that handles continuous distributions in general should work just fine when you give it a specific distribution such as a beta or normal distribution.

Now suppose we derive our `ChiSquareDistribution` class from the `GammaDistribution` class. Suppose also we have a function that expects a `GammaDistribution` object. What happens if we pass it a `ChiSquareDistribution`? Maybe the function works with no surprises. If the function calls methods like `PDF` or `CDF` there’s no problem. But what if the function calls a `SetParameters` method that specifies the shape and scale of the distribution? Now we have a problem. You can’t set the shape and scale independently for a chi square distribution, because its scale is fixed at 2.
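To make the problem concrete, here is a sketch of the ill-advised design. Everything in it is an assumption for illustration: the `SetParameters` signature, the `Shape` and `Scale` accessors, the hypothetical `RescaleToMean` function, and above all the choice to have the derived class silently force the scale back to 2, which is only one of several equally defensible patches.

```cpp
#include <cassert>
#include <cmath>

class GammaDistribution {
public:
    GammaDistribution(double shape, double scale) : shape_(shape), scale_(scale) {
        assert(shape > 0.0 && scale > 0.0);
    }
    virtual ~GammaDistribution() = default;

    // Contract implied by the base class: both parameters take the given values.
    virtual void SetParameters(double shape, double scale) {
        shape_ = shape;
        scale_ = scale;
    }
    double Shape() const { return shape_; }
    double Scale() const { return scale_; }
    double Mean() const { return shape_ * scale_; }

private:
    double shape_;
    double scale_;
};

// The design the LSP warns against.
class ChiSquareDistribution : public GammaDistribution {
public:
    explicit ChiSquareDistribution(double degrees_of_freedom)
        : GammaDistribution(degrees_of_freedom / 2.0, 2.0) {}

    // One possible patch (an assumption of this sketch): keep the requested
    // shape but ignore the requested scale so the object stays chi square.
    void SetParameters(double shape, double /*scale*/) override {
        GammaDistribution::SetParameters(shape, 2.0);
    }
};

// Hypothetical client code written against GammaDistribution. It keeps the
// shape and picks the scale that gives the requested mean, then reports
// whether the mean actually ended up where it was supposed to.
bool RescaleToMean(GammaDistribution& g, double target_mean) {
    g.SetParameters(g.Shape(), target_mean / g.Shape());
    return std::fabs(g.Mean() - target_mean) < 1e-12;
}
```

With a plain `GammaDistribution`, `RescaleToMean` does what its author intended. With a `ChiSquareDistribution` passed through the same reference, the scale request is dropped and the function reports failure, even though neither class contains an obvious bug when read on its own.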

If you try to make this work, you’re going to dig yourself into a hole. The code can’t be intuitive: two people could have reasonable but different expectations for how the code should behave. And attempts to patch the situation are only going to make things worse, introducing awkward dependencies and generally entangling the code. The LSP says don’t go there. From an object-oriented programming viewpoint, the gamma and chi square distributions are simply unrelated. Neither derives from the other.

The canonical explanation of the LSP uses squares and rectangles. Geometrically, a square is a special type of rectangle. But should a `Square` class derive from a `Rectangle` class? The LSP says no. You can’t set the length and width of a square independently. What should a `Square` class do when someone tries to set its length and width? Ignore one of them? Which one? Suppose you just use the length input and set the width equal to the length. Now you’ve got a surprise: setting the length changes the width, not something you’d expect of rectangles. Robert Martin does a good job of explaining this example in the interview mentioned above. On the other hand, if a `Square` class and a `Rectangle` class both derive from a `Shape` class, code written to act on `Shape` objects will work just fine when passed either `Square` objects or `Rectangle` objects.
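The surprise described above can be reproduced in a few lines. This is a sketch with hypothetical method names, and it uses the "set both sides" patch discussed in the text; `DoubleLengthKeepsWidth` is a made-up client function that encodes what a caller would reasonably expect of a rectangle.

```cpp
#include <cassert>

class Rectangle {
public:
    Rectangle(double length, double width) : length_(length), width_(width) {
        assert(length > 0.0 && width > 0.0);
    }
    virtual ~Rectangle() = default;
    virtual void SetLength(double length) { length_ = length; }
    virtual void SetWidth(double width) { width_ = width; }
    double Length() const { return length_; }
    double Width() const { return width_; }
    double Area() const { return length_ * width_; }

private:
    double length_;
    double width_;
};

// The questionable design: a Square that keeps itself square by changing
// both sides whenever either one is set.
class Square : public Rectangle {
public:
    explicit Square(double side) : Rectangle(side, side) {}
    void SetLength(double length) override {
        Rectangle::SetLength(length);
        Rectangle::SetWidth(length);
    }
    void SetWidth(double width) override {
        Rectangle::SetLength(width);
        Rectangle::SetWidth(width);
    }
};

// Client code written for rectangles: doubling the length should leave the
// width alone. Returns true if that expectation held.
bool DoubleLengthKeepsWidth(Rectangle& r) {
    const double width_before = r.Width();
    r.SetLength(2.0 * r.Length());
    return r.Width() == width_before;
}
```

For a 2 by 3 `Rectangle` the function returns true and the area becomes 12. For a `Square` of side 2 it returns false and the area quadruples to 16 instead of doubling.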

Programmers will argue till they’re blue in the face over whether a `Square` “is a” `Rectangle`, or vice versa, or neither. The resolution to the argument is that inheritance does not mean “is a.” The idea of “is a” is often useful when thinking about inheritance, but not always. Of course a square is a rectangle, but that does not mean it’s wise to derive a `Square` class from a `Rectangle` class. Inheritance actually has to do with interface contracts. The pioneers of object-oriented programming did not use the term “is a” for inheritance. That terminology came later.

So although a chi square distribution is a gamma distribution, a `ChiSquareDistribution` class should not inherit from a `GammaDistribution` class, just as a square is a rectangle but a `Square` class should not inherit from a `Rectangle` class. On the other hand, chi square and gamma distributions are both continuous distributions, and it’s fine for `ChiSquareDistribution` and `GammaDistribution` classes to inherit from a `ContinuousDistribution` class, just as it’s fine for `Square` and `Rectangle` classes to derive from a `Shape` class. The difference is a matter of software interface functionality and not philosophical classification.
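One way the recommended arrangement might look is sketched below. The interface is deliberately cut down to `PDF` and `Mean`, and `SetDegreesOfFreedom` is a hypothetical name. Having the chi square class reuse a gamma object internally is an implementation choice of this sketch, not something the design requires; the point is that the two classes are siblings, so no caller holding a `GammaDistribution` can ever be handed a chi square by mistake.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal interface shared by all continuous distributions.
class ContinuousDistribution {
public:
    virtual ~ContinuousDistribution() = default;
    virtual double PDF(double x) const = 0;
    virtual double Mean() const = 0;
};

class GammaDistribution : public ContinuousDistribution {
public:
    GammaDistribution(double shape, double scale) { SetParameters(shape, scale); }

    // Shape and scale are independent here, with no exceptions.
    void SetParameters(double shape, double scale) {
        assert(shape > 0.0 && scale > 0.0);
        shape_ = shape;
        scale_ = scale;
    }
    double PDF(double x) const override {
        // Simplification: the boundary point x = 0 is treated as density 0.
        if (x <= 0.0) return 0.0;
        // Work on the log scale to avoid overflow in the gamma function.
        const double log_pdf = (shape_ - 1.0) * std::log(x) - x / scale_
                               - std::lgamma(shape_) - shape_ * std::log(scale_);
        return std::exp(log_pdf);
    }
    double Mean() const override { return shape_ * scale_; }

private:
    double shape_ = 1.0;
    double scale_ = 1.0;
};

// A sibling of GammaDistribution, not a child. It exposes only the one
// parameter a chi square distribution really has.
class ChiSquareDistribution : public ContinuousDistribution {
public:
    explicit ChiSquareDistribution(double degrees_of_freedom)
        : gamma_(degrees_of_freedom / 2.0, 2.0) {}

    void SetDegreesOfFreedom(double degrees_of_freedom) {
        gamma_.SetParameters(degrees_of_freedom / 2.0, 2.0);
    }
    double PDF(double x) const override { return gamma_.PDF(x); }
    double Mean() const override { return gamma_.Mean(); }

private:
    GammaDistribution gamma_;  // internal detail, invisible to callers
};
```

Code written against `ContinuousDistribution` accepts either class. Code that needs to set a shape and scale must ask for a `GammaDistribution`, and the compiler will refuse a `ChiSquareDistribution` there instead of letting the mismatch surface at run time.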

## 5 thoughts on “Probability distributions and object oriented programming”

1. The Friar

Is there a reason not to have ChiSquareDistribution and GammaDistribution each inherit from some class like BareBonesGammaDistribution, with the parameter-setting methods written individually into the former pair but with methods for getting values written once for all in the latter? That is, isn’t it just a matter of needing another level of inheritance hierarchy?

2. What do a gamma and a chi squared distribution have in common? They share any behavior common to all continuous distributions, so it would make sense for both classes to derive from a class like `ContinuousDistribution`. But is there anything more specific they share? Not much. Both distributions have support on [0, ∞), but so does the log-normal distribution, for example. Any behavior related to parameters is going to be a problem, and behavior that doesn’t depend on parameters may also not depend on distribution family at all.

Obviously a `ChiSquareDistribution` class has a lot in common internally with a `GammaDistribution` class. The former could be a thin wrapper around an instance of the latter. But in terms of interfaces, the two classes don’t have much in common except what they also share with a lot of other distributions.

3. John,

This is a great post. Do you make the C++ numerical library that you mentioned available? It would be great to play with the code. I understand if the codes are proprietary. 🙂

Thanks!