Covariance and contravariance in math and CS

I heard the terms “covariance” and “contravariance” used in math long before I heard them used in object oriented programming. I was curious whether there was any connection between the two. To my surprise, they’re very similar. In fact, you could formalize the OOP use of the terms so that they’re not just analogous but actually special cases of the mathematical terms.

When I started writing this post, I intended to explain covariance and contravariance. However, the post became longer and more technical than I like to write here. Instead, I’ll just announce that a connection exists and give references for those who want to read further.

Chris Burrows describes covariance and contravariance in object oriented programming in his article New C# Features in the .NET Framework 4.

The terms covariant and contravariant were defined in category theory before computer scientists applied the terms to object oriented programming. Wikipedia has a short, readable introduction to category theory, including covariant and contravariant functors. See also A Categorical Manifesto (PostScript file).

Computer scientists have been interested in category theory for some time, so it’s not too surprising that category theory terms would filter down into practical programming. The real surprise was hearing category terminology used outside of math. It was like the feeling you get when you run into a coworker at a family reunion or a neighbor at a restaurant in another city.

Update (3 Jan 2011): See also Liskov Substitution Principle is Contravariance

Update (28 Feb 2013): See also how covariance and contravariance are used in the opposite sense with vector fields.


Posted in Math, Software development
5 comments on “Covariance and contravariance in math and CS”
  1. mrkkrj says:

    Sorry, I cannot see the equivalence between category-theoretical and OO terms here. Maybe because I do not appreciate category theory? In math it seems to be a mapping replacing the domain and range of a functor; in OO it extends the range. Am I mistaken?

    Regards,
    Marek

  2. John says:

    Marek: In OO, the mapping between base types induces a map between the corresponding parameterized types. In Chris Burrows’ article, his example is that a manager is a type of employee. This induces a relationship between iterators over managers and iterators over employees. The key observation is that the “arrows” in Burrows’ diagram go in the same direction.

    Manager → Employee

    IEnumerable<Manager> → IEnumerable<Employee>

    This is covariance. And in his example of the IComparer interfaces, the arrows go the other way. This is contravariance.
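Here is a minimal sketch of the two arrow directions, written in Python with type hints rather than C#. The names `Employee`, `Manager`, `total_salary`, `highest_paid`, and `by_salary` are made up for illustration and do not come from Burrows' article. The variance rules are enforced by a static type checker such as mypy, not at run time.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Employee:
    name: str
    salary: int


@dataclass
class Manager(Employee):
    reports: int = 0


def total_salary(staff: Iterable[Employee]) -> int:
    """Consumes a read-only stream of Employee values."""
    return sum(person.salary for person in staff)


def highest_paid(managers: List[Manager],
                 key: Callable[[Manager], int]) -> Manager:
    """Asks for a function that knows how to rank Managers."""
    return max(managers, key=key)


def by_salary(person: Employee) -> int:
    """Ranks any Employee, so in particular any Manager."""
    return person.salary


managers = [Manager("Ann", 120, reports=4), Manager("Bob", 100, reports=9)]

# Covariance: Manager -> Employee induces
# Iterable[Manager] -> Iterable[Employee], same direction.
payroll = total_salary(managers)

# Contravariance: Manager -> Employee induces
# Callable[[Employee], int] -> Callable[[Manager], int], opposite direction.
top = highest_paid(managers, by_salary)
```

The function that ranks employees plays the role of the comparer here: something that can handle every `Employee` can certainly handle every `Manager`, so the substitution runs against the subtype arrow.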

    Category theory looks at this kind of situation in general, when maps between objects induce maps between associated objects. The first example I saw was in algebraic topology. There you associate an algebraic object with a topological space with the hope that some questions about topology can be answered by answering questions about their associated algebraic structures. A continuous function between topological spaces induces a map between the associated fundamental groups. And the arrows go in the same direction: you have a covariant functor.

    Category theory abstracts all this. It doesn’t care whether your objects are topological spaces or employees. The pattern is the same, and all it cares about is the patterns.
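The same pattern can be sketched with ordinary functions as the arrows. The helper names `compose`, `fmap`, and `contramap` below are assumptions chosen for this illustration, and the assertions only spot-check the functor laws on a small sample rather than proving them. Lists give a covariant example (arrows keep their direction) and predicates give a contravariant one (arrows reverse).

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(g: Callable[[B], C], f: Callable[[A], B]) -> Callable[[A], C]:
    """Ordinary composition: (g . f)(x) = g(f(x))."""
    return lambda x: g(f(x))


def fmap(f: Callable[[A], B]) -> Callable[[List[A]], List[B]]:
    """Covariant: f: A -> B induces List[A] -> List[B]."""
    return lambda xs: [f(x) for x in xs]


def contramap(
    f: Callable[[A], B],
) -> Callable[[Callable[[B], bool]], Callable[[A], bool]]:
    """Contravariant: f: A -> B induces Predicate[B] -> Predicate[A]."""
    return lambda pred: (lambda a: pred(f(a)))


def square(n: int) -> int:
    return n * n


def is_even(n: int) -> bool:
    return n % 2 == 0


words = ["a", "bb", "ccc"]

# Covariant law: fmap(g . f) == fmap(g) . fmap(f)
covariant_lhs = fmap(compose(square, len))(words)
covariant_rhs = compose(fmap(square), fmap(len))(words)

# Contravariant law: contramap(g . f) == contramap(f) . contramap(g)
# Note that the order of f and g is reversed on the right-hand side.
p1 = contramap(compose(square, len))(is_even)
p2 = compose(contramap(len), contramap(square))(is_even)
contravariant_lhs = [p1(w) for w in words]
contravariant_rhs = [p2(w) for w in words]

assert covariant_lhs == covariant_rhs
assert contravariant_lhs == contravariant_rhs
```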

    In Chris Burrows’ article, you could think of the map between Manager objects and Employee objects as extending the domain. But that could be misleading because the relationships category theory is concerned with are far more general. What matters is that there’s some mapping. In this case it’s such a simple mapping that it could be missed. You could have instead a more complex mapping, such as a map between the XML serialization of an object and its binary representation in memory.

  3. Dean Wampler says:

    I learned a long time ago about the Liskov Substitution Principle, which is closely related to Bertrand Meyer’s “Design by Contract”, at least as the latter behaves under inheritance (i.e., covariant vs. contravariant behaviors). Recently, this post argued that the Liskov Substitution Principle is contravariance.

    http://apocalisp.wordpress.com/2010/10/06/liskov-substitution-principle-is-contravariance/

    dean

  4. Looking forward to when you clean up and do another post about co- and contra-variance. This is what’s on my desk to read about it.

    http://www.math.ucsd.edu/~ctiee/tensors.pdf

    I get the turning around arrows part (or I think I get it), but there also seems to be a relationship to raising and lowering indices in GR. Maybe I just don’t understand GR well enough and that’s the real problem.

  5. Following is one of several good quotes from Chris Tiee about covariance and contravariance. The PDF is down (although I might post it to drop.io if I feel adventurous) … hopefully Chris wasn’t taking that page down on purpose.

    Think about a vector field. Do you want to take the derivative (x+h) – x of the TAILS of the vectors, or the HEADS?

    1.2. Do Vectors Have Location? You were always taught in vector calculus that “vectors have no location; they can be slid around with impunity.” This is blatantly false on a general manifold for two reasons. First, there is no real “canonical” way to do this sliding on a general manifold. In R^n there is a very natural and obvious identification of tangent spaces, namely sliding things to the origin. This is known to not cause any problems because the standard structure of R^n always implicitly assumes the presence of a flat (Euclidean) metric. Using a metric structure (or more precisely, a connection) the ability to slide vectors along makes somewhat of a comeback. One place where sliding is implicit is in differentiation of vector fields on R^n: in the difference quotient, $\partial V / \partial x^j = \lim_{h \to 0} \frac{V(x + h e_j) - V(x)}{h}$, the two vectors, although they are very close together, still do not technically live in the same tangent spaces, and hence technically cannot be subtracted without making their tails meet. This is the whole reason why covariant differentiation was originally invented.

    ————————–
    Chris also makes the point that covariant and contravariant are mis-named, essentially because the people who invented the terms were thinking too much in terms of Calculation and not enough in terms of Geometry.

    and All That.”) The terms contravariant and covariant are sort of relic terms from the bad old days and are confusing. The original model was that covariance means that decreasing the scale of space along the direction of such a tensor increases the magnitude, whereas contravariant ones get larger along with space. That sounds backward, right? You’d expect covariant to mean that it varied with the space, and contravariant to mean it varies against.

    The reason for this is because components were the main thing people thought of back in the day, rather than invariant vectors, so when scaling the space, one would change the size of the basis vectors, and so the components, whose transformation properties are opposite to those of bases, in actuality do have the right correspondence in scaling and so forth. A better picture of what is going on is that covariant things have “plane-type” directionality, in which direction is given by the transverse dimensions (like the planes of a stack “facing” in its direction) whereas contravariant things have “line-type” directionality, i.e. direction is given along whatever they are, such as an arrow “pointing.” This is all pretty vague without metric notions, so just think of covariant objects having direction arising from the object’s orthogonal complement (i.e. being oriented by a suitably generalized notion of “normal vector”), and contravariant ones as things whose direction arises from the space spanned by the object itself (“tangency”). Basically it’s much harder to picture in cases apart from 1- and (n − 1)-degree objects; “tangential” vs. “transverse” directions seem to me the best way to think about it. This does go somewhat awry with densitization, although there is a nice explanation for what happens here, at least for densitization of m-vectors (see §2.2).

    The modern mathematical terminology “covariant” and “contravariant” pertains to the category-theoretic notion of functors, meaning whether some map of manifolds induces a map of some corresponding space going in the same direction or oppositely. It so happens that maps of manifolds do induce maps on tensor spaces, and the relationship is also backward from the old terminology.
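The remark in the quote about components transforming oppositely to bases can be made concrete with a small worked example. This is only a sketch under a simplifying assumption that is not in the quoted text: every basis vector is rescaled by the same factor $\lambda > 0$. The symbols $e_i$, $v^i$, $\omega_i$, and $\lambda$ are notation introduced here.

```latex
% Rescale the basis: e'_i = \lambda e_i.
% A fixed vector v must satisfy v = v^i e_i = v'^i e'_i, so
\[
  v'^i = \lambda^{-1} v^i
  \qquad \text{(components vary against the basis: ``contravariant'')}.
\]
% A fixed covector \omega has components \omega_i = \omega(e_i), so
\[
  \omega'_i = \omega(e'_i) = \lambda\, \omega_i
  \qquad \text{(components vary with the basis: ``covariant'')}.
\]
% The pairing is unchanged, as it must be for basis-independent objects:
\[
  \omega'_i v'^i = (\lambda\, \omega_i)(\lambda^{-1} v^i) = \omega_i v^i .
\]
```

So the old names describe how the components respond to a change of basis, not how the underlying geometric object behaves.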
