When a program needs to work with different systems of units, it’s best to consistently use one system for all internal calculations and convert to another system for output if necessary. Rigidly following this convention can prevent bugs, such as the one that caused the crash of the Mars Climate Orbiter.

For example, maybe you need to work in degrees and radians. It would be sensible to do all calculations in radians, because that’s what software libraries expect, and output results in degrees, because that’s what humans expect.

Now suppose you have a function that takes in a length and doubles it, and another function takes in a length and triples it. Both functions take in length in kilometers but print the result in miles.

You would like the composition of the two functions to multiply a length by six. And as before, the composition would take in a speed in kilometers and return a speed in miles.

Here’s how we could implement this badly.

miles_per_km = 5/8 # approx def double(length_km): return 2*length_km*miles_per_km def triple(length_km): return 3*length_km*miles_per_km length_km = 8 d = double(length_km) print("Double: ", d) t = triple(d) print("Triple: ", t)

This prints

Double: 10.0 Triple: 18.75

The second output should be 30, not 18.5. The result is wrong because we converted from kilometers to miles twice. The correct implementation would be something like the following.

miles_per_km = 0.6213712 def double(length_km): d = 2*length_km print("Double: ", d*miles_per_km) return d def triple(length_km): t = 3*length_km print("Triple: ", t*miles_per_km) return t length_km = 8 d = double(length_km) t = triple(d)

This prints the right result.

Double: 10.0 Triple: 30.0

In abstract terms, we don’t want the composition of *f* and *g* to be simply *g* ∘ *f*.

We have a function *f* from *X* to *Y* that we think of as our core function, and a function *T* that translates the output. Say *f* doubles its input and *T* translates from kilometers to miles. Let *f** be the function that takes *X* to *TY*, i.e. the combination of *f* and translation.

Now take another function *g* from *Y* to *Z* and define *g** as the function that takes *Y* to *TZ*. We want the composition of *f** and *g** to be

*g** ∘ *f** = *T ∘ g ∘ f*.

In the example above, we only want to convert from kilometers to miles once. This is exactly what Kleisli composition does. (“Kleisli” rhymes with “highly.”)

Kleisli composition is conceptually simple. Once you understand what it is, you can probably think of times when it’s what you wanted but you didn’t have a name for it.

Writing code to encapsulate Kleisli composition takes some infrastructure (i.e. monads), and that’s a little complicated, but the idea of what you’re trying to achieve is not. Notice in the example above, what the functions print is not what they return; the print statements are a sort of side channel. That’s the mark of a monad.

## Kleisli categories

The things we’ve been talking about are formalized in terms of Kleisli categories. You start with a category *C* and define another category that has the same objects as *C* does but has a different notion of composition, i.e. Kleisli composition.

Given a monad *T* on *C*, the Kleisli category *C*_{T} has the same objects as *C*. An arrow *f** from *X* to *Y* in *C*_{T} corresponds to an arrow *f* from *X* to *TY* in *C*. In symbols,

Hom_{CT}(*X*, *Y*) = Hom_{C}(*X*, *TY*).

Mr. Kleisli’s motivation for defining his categories was to answer a more theoretical question—whether all monads arise from adjunctions—but more practically we can think of Kleisli categories as a way of formalizing a variation on function composition.