Currying is a simple but useful idea. (It’s named after logician Haskell Curry and has nothing to do with spicy cuisine.) If you have a function of two variables, you can think of it as a function of one variable that returns a function of one variable. So starting with a function f(x, y), we can think of this as a function that takes a number x and returns a function f(x, -) of y. The dash is a placeholder as in this recent post.
Calculus: Fubini’s theorem
If you’ve taken calculus then you saw this in the context of Fubini’s theorem:
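The displayed formula did not survive extraction; in its standard form it reads (a sketch, with X and Y standing for the two domains of integration):

$$\int_{X \times Y} f(x, y) \, d(x, y) \;=\; \int_Y \left( \int_X f(x, y) \, dx \right) dy$$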
To integrate the function of two variables f(x, y), you can temporarily fix y and integrate the remaining function of x. This gives you a number, the value of an integral, for each y, so the result is a function of y. Integrate that function of y, and you have the value of the original integral of the function of two variables.
The first time you see this you may think it’s a definition, but it’s not. You can define the integral on the left directly, and it will equal the result of the two nested integrations on the right. Or at least the two sides will often be equal. The conditions on Fubini’s theorem tell you exactly when the two sides are equal.
PDEs: Evolution equations
A more sophisticated version of the same trick occurs in partial differential equations. If you have an evolution equation, a PDE for a function of one time variable and several space variables, you can think of it as an ODE via currying. For each time value t, you get a function of the spatial variables. So you can think of your solution as a path in a space of functions. The spatial derivatives specify an operator on that space of functions.
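To make that concrete with one standard example (the heat equation is my choice of illustration, not one named here): currying u(t, x) into a path of functions t ↦ u(t, -) turns the PDE into an ODE on a function space,

$$u_t = \Delta u \quad\longrightarrow\quad \frac{d}{dt}\, u(t) = A\, u(t), \qquad u(t) = u(t, \cdot),\ A = \Delta,$$

where each u(t) is a function of the spatial variables and A is the Laplacian acting as an operator on that space of functions.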
(I’m glossing over details here because spelling everything out would take a lot of writing, and might obscure the big idea that is relevant for this post. If you’d like the full story, you can see, for example, my graduate advisor’s book. It was out of print when I studied it, but now it’s a cheap Dover paperback.)
In the Haskell programming language (also named after Haskell Curry) you get currying for free. In fact, there’s no other way to express a function of two variables. For example, suppose you want to implement the function f(x, y) = x² + y.
Prelude> f x y = x**2 + y
Then Haskell thinks of this as a function of one variable (i.e. x) that returns a function of one variable (i.e. f(x, -)), which itself returns a number (i.e. f(x, y)). You can see this by asking the REPL for the type of f:
Prelude> :info f
f :: Floating a => a -> a -> a
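The type `a -> a -> a` reads as `a -> (a -> a)`: apply f to one argument and you get back a function. A minimal sketch of this partial application (the name `g` is mine, not from the REPL session above):

```haskell
-- f as in the REPL session, written as a top-level definition
f :: Floating a => a -> a -> a
f x y = x**2 + y

-- Fixing x = 3 yields the one-variable function f(3, -)
g :: Floating a => a -> a
g = f 3

main :: IO ()
main = print (g 4)   -- 3**2 + 4 = 13.0
```

No special syntax is needed: `f 3` is an ordinary expression whose value is a function.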
Technically, Haskell, just like lambda calculus, only has functions of one variable. You could create a product datatype consisting of a pair of variables and have your function take that as an argument, but it’s still a function of one variable, though that variable is not atomic.
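A sketch of that pair-taking alternative (the name `fPair` is mine for illustration): the Prelude functions `curry` and `uncurry` convert between the two representations.

```haskell
-- f written as a function on a product type: one argument, but not atomic
fPair :: Floating a => (a, a) -> a
fPair (x, y) = x**2 + y

-- The Prelude's curry and uncurry convert between the two forms
fCurried :: Floating a => a -> a -> a
fCurried = curry fPair

main :: IO ()
main = do
  print (fPair (3, 4))              -- 13.0
  print (fCurried 3 4)              -- 13.0
  print (uncurry fCurried (3, 4))   -- 13.0, back to the pair form
```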
The way you’d formalize currying in category theory is to say that the following is a natural isomorphism:
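Reconstructing the displayed formula in its standard form: for objects A, B, and C in a cartesian closed category,

$$\operatorname{Hom}(A \times B,\, C) \;\cong\; \operatorname{Hom}(A,\, C^B),$$

naturally in all three variables, where the exponential object C^B plays the role of the space of functions from B to C.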
For more on what Hom means, see this post.
- Weakening the requirements for a group (especially section on semigroups)
- Categorical products
- Beta reduction
In concordance with Stigler’s law of eponymy, currying was not introduced by Curry but by Gottlob Frege. It was then developed by Moses Schönfinkel and developed further by Haskell Curry.
7 thoughts on “Currying in calculus, PDEs, programming, and categories”
Good work on explaining Currying. How on Earth did mathematicians go so long without this?
I often think that the insane way calculus deals with notations for variables is one of the major stumbling blocks to learning the subject. You can easily find textbook examples where x means three different things in one integral. There is also the problem of lacking a name for identity or square functions independent of the variable name. Students may fail to grasp when they are dealing with a value, unbound variable, function, functional or operator. No wonder so many remain lost.
In a lighter tone, would you call it schoenfinkling or fregeration?
As far as going without currying, Frege made it explicit, but I assume the idea had been around before. I wouldn’t be surprised if Frege got the idea from double integral calculations.
Calculus notation is challenging. Statistics notation is worse. I’m satisfied with the notation in Spivak’s Calculus, but I haven’t seen anything in statistics I’d recommend.
I’m a little more sympathetic to traditional calculus notation than I used to be. Despite some of its logical shortcomings, the notation often suggests the right thing to do in calculations. If you know what’s going on rigorously, and use conventional notation as a sort of mnemonic, it works well. For example, the derivative of an inverse is dx/dy = 1/(dy/dx).
Thanks for the cool post.
You actually can express functions of two arguments in Haskell without currying! In fact, you can use exactly the same trick that conventional mathematics does: define it as a function on the product. This is what Haskell’s uncurry combinator does. Just as you say of Haskell, technically conventional mathematics also only has functions of one variable. So you can implement multiple parameters either by changing the domain, or the codomain. The choice is yours (and the two options give rise to an adjunction!)
The choice to curry functions in Haskell and ML by default is made deliberately, because it’s often convenient to partially apply many functions. (Mathematicians have left themselves a back door for such situations: the notation $f_x(y)$ conveniently allows the partial application $f_x$, but computer programmers are too concerned about syntactic ambiguities to let so many different notations flourish, I suppose!)
A different default would be possible, and in fact requires almost no change to the compiler. Only the standard library sets the default. (Okay, and the desugaring for infix operators, as well, so that’s annoying.) A disadvantage of currying is that it leads to very poor error messages when arguments are left out by accident. So, in fact, I’ve done precisely this for when I teach Haskell to children. The environment, http://code.world, is standard Haskell, but using a custom Prelude that does not curry its functions.
The derivative of an inverse is dx/dy = 1/(dy/dx) only if x is a function of y alone. Perhaps you could give students a clear intuition for why it doesn’t generally work with functions of more than one variable.
Maybe I misunderstood your comment, because the derivative of an inverse is the inverse of the derivative, even for functions of more than one variable.
James: Asking for the inverse of a single-valued function of several variables is problematic to begin with: such a function cannot be invertible unless it fails even to be continuous. If you want to look at invertible functions of several variables, you’ll have to consider functions f: R^n -> R^n, in which case the derivative of the inverse is indeed the inverse of the derivative, as John mentioned.
In the context of programming languages, currying becomes significantly more interesting if the programming language also supports partial evaluation.