This post looks at applying object-oriented programming ideas to probability distribution software. It explains the Liskov Substitution Principle and shows how it can keep you from falling into a subtle trap.
One of the big ideas in object-oriented programming is to organize software into units of data and related functions that represent things in your problem domain. These units are called classes. Particular instances of classes are called objects. For example, a business application could have a customer class. Particular customers are represented by customer objects.
The C++ numerical library that we developed at MDACC has classes that represent probability distributions. A probability distribution class contains methods (functions) such as PDF, CDF, Mean, Mode, etc. For example, the NormalDistribution class represents normal distributions. Particular NormalDistribution objects each have their own mean and variance parameters.
Another big idea of object-oriented programming is inheritance, a way to organize classes into a hierarchy. This is where things get more interesting.
Inheritance is commonly described as an “is a” relationship. That explanation is often helpful, but sometimes it can get you into trouble. (Listen to this interview with Robert Martin for an explanation.) Probability distributions illustrate when “is a” should be represented by inheritance and when it should not be.
A beta distribution is a continuous distribution. So is a normal distribution. The BetaDistribution and NormalDistribution classes representing these probability distributions both derive from a ContinuousDistribution class. This makes it possible to write generic code that operates on continuous distributions. Later we could pass in a particular type of continuous distribution rather than having to write special code for every kind of continuous distribution.
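As a minimal sketch of what such generic code might look like (not the actual MDACC library, whose interface is not shown here): the base class below exposes only PDF and CDF, ExponentialDistribution and ProbabilityBetween are hypothetical names introduced for illustration, and the constructor signatures are assumptions.

```cpp
#include <cassert>
#include <cmath>

// Assumed, pared-down interface; the real library's base class may differ.
class ContinuousDistribution {
public:
    virtual ~ContinuousDistribution() = default;
    virtual double PDF(double x) const = 0;
    virtual double CDF(double x) const = 0;
};

class NormalDistribution : public ContinuousDistribution {
public:
    // Parameterized by mean and variance, as described in the text.
    NormalDistribution(double mean, double variance)
        : mean_(mean), sd_(std::sqrt(variance)) {
        assert(variance > 0.0);
    }
    double PDF(double x) const override {
        const double kPi = 3.14159265358979323846;
        const double z = (x - mean_) / sd_;
        return std::exp(-0.5 * z * z) / (sd_ * std::sqrt(2.0 * kPi));
    }
    double CDF(double x) const override {
        // Standard identity: Phi(z) = erfc(-z / sqrt(2)) / 2.
        return 0.5 * std::erfc(-(x - mean_) / (sd_ * std::sqrt(2.0)));
    }
private:
    double mean_, sd_;
};

// Hypothetical second distribution, included only to show substitution.
class ExponentialDistribution : public ContinuousDistribution {
public:
    explicit ExponentialDistribution(double rate) : rate_(rate) {
        assert(rate > 0.0);
    }
    double PDF(double x) const override {
        return x < 0.0 ? 0.0 : rate_ * std::exp(-rate_ * x);
    }
    double CDF(double x) const override {
        return x < 0.0 ? 0.0 : 1.0 - std::exp(-rate_ * x);
    }
private:
    double rate_;
};

// Generic code: written once against the base class, usable with any
// continuous distribution. Computes P(a < X <= b).
double ProbabilityBetween(const ContinuousDistribution& d, double a, double b) {
    return d.CDF(b) - d.CDF(a);
}
```

ProbabilityBetween never needs to know which distribution it was given. It relies only on the contract that CDF behaves like a cumulative distribution function.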
Now think about a chi square distribution. A chi square distribution with ν degrees of freedom is a gamma distribution with shape ν/2 and scale 2. So in a mathematical sense, a chi square distribution “is a” gamma distribution. But should a class representing a chi square distribution inherit from a class representing a gamma distribution? The surprising answer is “no.” A rule called the “Liskov Substitution Principle” (LSP) says this is a bad idea.
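For reference, the mathematical relationship can be checked directly from the densities. Writing k for the shape and θ for the scale (notation chosen here for illustration), the gamma density and its specialization to k = ν/2, θ = 2 are:

```latex
\begin{align*}
f_{\mathrm{gamma}}(x;\, k, \theta) &= \frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\,\theta^{k}}, \qquad x > 0, \\
f_{\mathrm{gamma}}(x;\, \nu/2, 2) &= \frac{x^{\nu/2-1} e^{-x/2}}{2^{\nu/2}\,\Gamma(\nu/2)} = f_{\chi^2}(x;\, \nu).
\end{align*}
```

The second line is the standard chi square density with ν degrees of freedom.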
When a class X inherits from a class Y, we say X is the derived class and Y is the base class. The LSP says code should work without surprises when an instance of a derived class is passed into a function written to receive instances of the base class. Deriving a BetaDistribution class from a ContinuousDistribution class should not lead to any surprises. A function that handles continuous distributions in general should work just fine when you give it a specific distribution such as a beta or normal distribution.
Now suppose we derive our ChiSquareDistribution class from the GammaDistribution class. Suppose also we have a function that expects a GammaDistribution object. What happens if we pass it a ChiSquareDistribution? Maybe the function works with no surprises. If the function calls methods like PDF or CDF there’s no problem. But what if the function calls a SetParameters method that specifies the shape and scale of the distribution? Now we have a problem. You can’t set the shape and scale independently for a chi square distribution.
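The sketch below illustrates the kind of surprise involved. It is a deliberately ill-advised design under stated assumptions: the class bodies, the Shape and Scale accessors, and the DoubleScaleDoublesMean function are hypothetical, and the override shown (keep the shape, silently force the scale to 2) is only one of several equally defensible choices.

```cpp
#include <cassert>

// Hypothetical, simplified gamma class with just enough for the example.
class GammaDistribution {
public:
    GammaDistribution(double shape, double scale) : shape_(shape), scale_(scale) {
        assert(shape > 0.0 && scale > 0.0);
    }
    virtual ~GammaDistribution() = default;
    virtual void SetParameters(double shape, double scale) {
        assert(shape > 0.0 && scale > 0.0);
        shape_ = shape;
        scale_ = scale;
    }
    double Shape() const { return shape_; }
    double Scale() const { return scale_; }
    double Mean() const { return shape_ * scale_; }  // gamma mean = shape * scale
protected:
    double shape_, scale_;
};

// The design the LSP warns against: chi square derived from gamma.
class ChiSquareDistribution : public GammaDistribution {
public:
    explicit ChiSquareDistribution(double nu) : GammaDistribution(nu / 2.0, 2.0) {}
    // A chi square distribution must keep scale == 2, so the requested
    // scale is ignored. A caller written against GammaDistribution has
    // no way to know this.
    void SetParameters(double shape, double /*scale*/) override {
        assert(shape > 0.0);
        shape_ = shape;
        scale_ = 2.0;
    }
};

// Written against the base class. For a gamma distribution, doubling the
// scale doubles the mean, so this reasonably expects to return true.
bool DoubleScaleDoublesMean(GammaDistribution& g) {
    const double before = g.Mean();
    g.SetParameters(g.Shape(), 2.0 * g.Scale());
    return g.Mean() == 2.0 * before;
}
```

Passing a GammaDistribution(3, 2) makes the function return true. Passing a ChiSquareDistribution(6), which has the same shape and scale, makes it return false, because the derived class quietly discarded the new scale.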
If you try to make this work, you’re going to dig yourself into a hole. The code can’t be intuitive: two people could have reasonable but different expectations for how the code should behave. And attempts to patch the situation are only going to make things worse, introducing awkward dependencies and generally entangling the code. The LSP says don’t go there. From an object-oriented programming viewpoint, the gamma and chi square distributions are simply unrelated. Neither derives from the other.
The canonical explanation of the LSP uses squares and rectangles. Geometrically, a square is a special type of rectangle. But should a Square class derive from a Rectangle class? The LSP says no. You can’t set the length and width of a square independently. What should a Square class do when someone tries to set its length and width? Ignore one of them? Which one? Suppose you just use the length input and set the width equal to the length. Now you’ve got a surprise: setting the length changes the width, not something you’d expect of rectangles. Robert Martin does a good job of explaining this example in the interview mentioned above. On the other hand, if a Square class and a Rectangle class both derive from a Shape class, code written to act on Shape objects will work just fine when passed either Square objects or Rectangle objects.
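A minimal sketch of the sibling arrangement follows. The method names (Area, SetDimensions, SetSide) and the TotalArea function are illustrative assumptions. The point is that each class exposes only the setters that make sense for it, while Shape carries the contract both can honor.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    virtual double Area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(double length, double width) : length_(length), width_(width) {
        assert(length > 0.0 && width > 0.0);
    }
    // Length and width vary independently, as callers expect of rectangles.
    void SetDimensions(double length, double width) {
        assert(length > 0.0 && width > 0.0);
        length_ = length;
        width_ = width;
    }
    double Area() const override { return length_ * width_; }
private:
    double length_, width_;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { assert(side > 0.0); }
    // A square has a single degree of freedom, so it gets a single setter.
    void SetSide(double side) {
        assert(side > 0.0);
        side_ = side;
    }
    double Area() const override { return side_ * side_; }
private:
    double side_;
};

// Code written against Shape works for either class without surprises.
double TotalArea(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double total = 0.0;
    for (const auto& s : shapes) {
        total += s->Area();
    }
    return total;
}
```

Because Square is not a Rectangle here, no function can ask a Square to change its length and width separately. The ambiguous case cannot be expressed.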
Programmers will argue till they’re blue in the face over whether a Square “is a” Rectangle, or vice versa, or neither. The resolution to the argument is that inheritance does not mean “is a.” The idea of “is a” is often useful when thinking about inheritance, but not always. Of course a square is a rectangle, but that does not mean it’s wise to derive a Square class from a Rectangle class. Inheritance actually has to do with interface contracts. The pioneers of object-oriented programming did not use the term “is a” for inheritance. That terminology came later.
So although a chi square distribution is a gamma distribution, a ChiSquareDistribution class should not inherit from a GammaDistribution class, just as a square is a rectangle but a Square class should not inherit from a Rectangle class. On the other hand, chi square and gamma distributions are continuous distributions, and it’s fine for ChiSquareDistribution and GammaDistribution classes to inherit from a ContinuousDistribution class, just as it’s fine for Square and Rectangle classes to derive from a Shape class. The difference is a matter of software interface functionality and not philosophical classification.
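One way this arrangement could look in code is sketched below, with several assumptions. The base class is pared down to PDF and Mean, and SetDegreesOfFreedom is a hypothetical method name. Having ChiSquareDistribution hold a private GammaDistribution is an implementation choice made for this example, not something the library is known to do.

```cpp
#include <cassert>
#include <cmath>

// Pared-down base class for this sketch only.
class ContinuousDistribution {
public:
    virtual ~ContinuousDistribution() = default;
    virtual double PDF(double x) const = 0;
    virtual double Mean() const = 0;
};

class GammaDistribution : public ContinuousDistribution {
public:
    GammaDistribution(double shape, double scale) : shape_(shape), scale_(scale) {
        assert(shape > 0.0 && scale > 0.0);
    }
    void SetParameters(double shape, double scale) {
        assert(shape > 0.0 && scale > 0.0);
        shape_ = shape;
        scale_ = scale;
    }
    double PDF(double x) const override {
        // Simplification: the boundary x == 0 is treated as density 0.
        if (x <= 0.0) return 0.0;
        // Work on the log scale to avoid overflow in Gamma(shape).
        const double logDensity = (shape_ - 1.0) * std::log(x) - x / scale_
                                - std::lgamma(shape_) - shape_ * std::log(scale_);
        return std::exp(logDensity);
    }
    double Mean() const override { return shape_ * scale_; }
private:
    double shape_, scale_;
};

// Sibling of GammaDistribution, not a child. The gamma object is a private
// implementation detail, so no caller can reach its SetParameters.
class ChiSquareDistribution : public ContinuousDistribution {
public:
    explicit ChiSquareDistribution(double nu) : gamma_(nu / 2.0, 2.0) {}
    void SetDegreesOfFreedom(double nu) { gamma_.SetParameters(nu / 2.0, 2.0); }
    double PDF(double x) const override { return gamma_.PDF(x); }
    double Mean() const override { return gamma_.Mean(); }
private:
    GammaDistribution gamma_;
};
```

The mathematical relationship is reused internally, but the public interfaces stay separate: GammaDistribution offers shape and scale, and ChiSquareDistribution offers only degrees of freedom.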
Is there a reason not to have ChiSquareDistribution and GammaDistribution each inherit from some class like BareBonesGammaDistribution, with the parameter-setting methods written individually into the former pair but with methods for getting values written once for all in the latter? That is, isn’t it just a matter of needing another level of inheritance hierarchy?
What do a gamma and a chi squared distribution have in common? They share any behavior common to all continuous distributions, so it would make sense for both classes to derive from a class like ContinuousDistribution. But is there anything more specific they share? Not much. Both distributions have support on [0, ∞), but so does the log-normal distribution, for example. Any behavior related to parameters is going to be a problem, and behavior that doesn’t depend on parameters may also not depend on distribution family at all.
Obviously a ChiSquareDistribution class has a lot in common internally with a GammaDistribution class. The former could be a thin wrapper around an instance of the latter. But in terms of interfaces, the two classes don’t have much in common except what they also share with a lot of other distributions.
John,
This is a great post. Do you make the C++ numerical library that you mentioned available? It would be great to play with the code. I understand if the code is proprietary. :-)
Thanks!