# Natural transformations

The ladder of abstractions in category theory starts with categories, then functors, then natural transformations. Unfortunately, natural transformations don’t seem very natural when you first see the definition. This is ironic since the original motivation for developing category theory was to formalize the intuitive notion of a transformation being “natural.” Historically, functors were defined in order to define natural transformations, and categories were defined in order to define functors, just the opposite of the order in which they are introduced now.

A category is a collection of objects and arrows between objects. Usually these “arrows” are functions, but in general they don’t have to be.

A functor maps a category to another category. Since a category consists of objects and arrows, a functor maps objects to objects and arrows to arrows.

A natural transformation maps functors to functors. Sounds reasonable, but what does that mean?

You can think of a functor as a way to create a picture of one category inside another. Suppose you have some category and pick out two objects in that category, A and B, and suppose there is an arrow f between A and B. Then a functor F would take A and B and give you objects FA and FB in another category, and an arrow Ff between FA and FB. You could do the same with another functor G. So the objects A and B and the arrow between them in the first category have counterparts under the functors F and G in the new category as in the two diagrams below.

A natural transformation α between F and G is something that connects these two diagrams into one diagram that commutes.

The natural transformation α is a collection of arrows in the new category, one for every object in the original category. So we have an arrow αA for the object A and another arrow αB for the object B. These arrows are called the components of α at A and B respectively.

Note that the components of α depend on the objects A and B but not on the arrow f. If f represents any other arrow from A to B in the original category, the same arrows αA and αB fill in the diagram.

Natural transformations are meant to capture the idea that a transformation is “natural” in the sense of not depending on any arbitrary choices. If a transformation does depend on arbitrary choices, the arrows αA and αB would not be reusable but would have to change when f changes.

The next post will discuss the canonical examples of natural and unnatural transformations.

Related: Applied category theory

# Tidying up trivial details

The following quote gives a good description of the value of abstract mathematics. The quote speaks specifically of “universal algebra,” but consistent with the spirit of the quote you could generalize it to other areas of mathematics, especially areas such as category theory.

Universal algebra is the study of features common to familiar algebraic systems … [It] places the algebraic notions in their proper setting; it often reveals connexions between seemingly different concepts and helps to systemize one’s thoughts. … [T]his approach does not usually solve the whole problem for us, but only tidies up a mass of rather trivial detail, allowing us to concentrate our powers on the hard core of the problem.

Emphasis added. Source: Universal Algebra by P. M. Cohn

Related: Applied category theory

From Drew Armstrong’s notes on adjoint functors:

Once upon a time, my opinion of category theory was the same as my opinion of Facebook: if I ignore it for long enough, hopefully it will go away. It is now my educated opinion that category theory will not go away, and in fact the language of category theory will continue to spread until it becomes the default foundation of mathematics.

# Turning math inside-out

Here’s one of the things about category theory that takes a while to get used to.

Mathematical objects are usually defined internally. For example, the Cartesian product P of two sets A and B is defined to be the set of all ordered pairs (ab) where a comes from A and b comes from B. The definition of P depends on the elements of A and B but it does not depend on any other sets.

Category theory turns this inside-out. Operations such as taking products are not defined in terms of elements of objects. Category theory makes no use of elements or subobjects [1]. It defines things by how they act, not their inner workings. People often stress what category theory does not depend on, but they less often stress what it does depend on. The definition of the product of two objects in any category depends on all objects in that category: The definition of the product of objects A and B contains the phrase “such that for any other object X …” [More on categorical products].

The payoff for this inside-out approach to products is that you can say something simultaneously about everything that acts like a product, whether it’s products of sets, products of fields (i.e. that they don’t exist), products of groups, etc. You can’t say something valid across multiple categories if you depend on details unique to one categories.

This isn’t unique to products. Universal properties are everywhere. That is, you see definitions containing “such that for any other object X …” all the time. In this sense, category theory is extremely non-local. The definition of a widget often depends on all widgets.

There’s a symmetry here. Traditional definitions depend on the internal workings of objects, but only on the objects themselves. There are no third parties involved in the definition. Categorical definitions have zero dependence on internal workings, but depend on the behavior of everything in the category. There are an infinite number of third parties involved! [2] You can have a definition that requires complete internal knowledge but zero external knowledge, or a definition that requires zero internal knowledge and an infinite amount of external knowledge.

Related: Applied category theory

* * *

[1] Category theory does have notions analogous to elements and subsets, but they are defined the same way everything else is in category theory, in terms of objects and morphisms, not by appealing to the inner structure of objects.

[2] You can have a category with a finite number of objects, but usually categories are infinite. In fact, they are usually so large that they are “classes” of objects rather than sets.

# Category theory and Koine Greek

When I was in college, I sat in on a communication workshop for Latin American preachers. This was unusual since I’m neither Latin American nor a preacher, but I’m glad I was there.

I learned several things in that workshop that I’ve used ever since. For example, when you’re gesturing about something moving forward in time, move your hand from left to right from the audience’s perspective. Since English speakers (and for the audience of this workshop, Spanish speakers) read from left to right, we think of time progressing from left to right. If you see someone talking about time moving forward, but you see motion from right to left, you feel a subtle cognitive dissonance. (Presumably you should reverse this when speaking to an audience whose primary language is Hebrew or Arabic.)

Another lesson from that workshop, the one I want to focus on here, is that you don’t always need to convey how you arrived at an idea. Specifically, the leader of the workshop said that if you discover something interesting from reading the New Testament in Greek, you can usually present your point persuasively using the text in your audience’s language without appealing to Greek. This isn’t always possible—you may need to explore the meaning of a Greek word or two—but you can use Greek for your personal study without necessarily sharing it publicly. The point isn’t to hide anything, only to consider your audience. In a room full of Greek scholars, bring out the Greek.

This story came up in a recent conversation with Brent Yorgey about category theory. You might discover something via category theory but then share it without discussing category theory. If your audience is well versed in category theory, then go ahead and bring out your categories. But otherwise your audience might be bored or intimidated, as many people would be listening to an argument based on the finer points of Koine Greek grammar. Microsoft’s LINQ software, for example, was inspired by category theory principles, but you’d be hard pressed to find any reference to this because most programmers don’t want to know or need to know where it came from. They just want to know how to use it.

Some things may sound profound when expressed in esoteric language, such as category theory or Koine Greek, that don’t seem so profound in more down-to-earth language. Expressing yourself in a different language helps filter out pedantry from useful ideas. (On the other hand, some things that looked like pure pedantry have turned out to be very useful. Some hairs are worth splitting.)

Sometimes you have to introduce a new terms because there isn’t a colloquial counterpart. Monads are a good example, a concept from category theory that has entered software development. A monad is what it is, and analogies to burritos and other foods don’t really help. Better to introduce the term and say plainly what it is.

More on applied category theory

# New Twitter account for functional programming and categories

I’m starting a new Twitter account @FunctorFact for functional programming and category theory.

These two subjects have a lot of overlap, and some tweets will combine both, but many will be strictly about one or the other. So some content will be strictly about programming, some strictly about math, and some combining ideas from both.

# Categorical products

## Introduction

There’s an odd sort of partisan spirit to discussions of category theory. They often have the flavor of “Category theory is great!” or “Category theory is a horrible waste of time!” You don’t see this sort of partisanship around, say, probability. Probability theory is what it is, and if you need it, you use it. If you don’t need it, you don’t use it. I think of category theory in a similar way. It’s good for some things and not for others.

In this post I’ll look at just one little piece of category theory, the definition of products, and use it to give a flavor of category theory in general.

## Initial objections

The first time I saw category theory’s definition of a product I thought it was a bizarre complication. “The product of A and B is an object P such that for any other object X …”

What is this X doing in our definition? It’s not our product, nor is it one of the things we’re taking the product of.  And why introduce a diagram? Is the product of two mathematical objects a picture?! Why not come out and say what a product is rather than saying what it does? It’s just ordered pairs, right?

Category theory is all about how things behave rather than what they’re made of inside. So you could say that talking about pairs of elements violates the rules of the game. But that raises the question of why play this game at all. What do we get in return for placing such severe and unusual restrictions on ourselves?

The answer is that we get to see broader connections. When we focus on behavior rather than internal composition, we can see that two things behave the same even though they look different inside. Software developers should be familiar with this idea: depend on interface rather than implementation.

## Definition

OK, so what is this mysterious definition of product? It’s a mouthful, but we’ll explain why it has to be what it is.

Given two objects A and B in some category, a product of A and B is an object P in that category and a pair of morphisms π1: PA and π2: PB such that for every object X with morphisms f1: X → A and f2: X → B, there exists a unique morphism f that makes the following diagram commute.

Whew! That’s a lot more work than saying a product is the set of ordered pairs (ab) with a from A and b from B. And it’s not the first definition of product a student should see. However, there are three reasons why it’s worth introducing later:

1. The ordered pair definition is not complete.
2. The categorical definition is not as complex as it seems.
3. The categorical definition makes new connections visible.

## Why not ordered pairs

Saying “a product is just ordered pairs” isn’t enough. You have to say how the product relates to the things it’s a product of. In the case of a Cartesian product of sets, the projections are so obvious that it’s hard to realize they’re even there, but in general they need to be specified.

Another reason the ordered pair definition isn’t complete is that you need to say how the product is structured. If you’re taking the product of groups, for example, then you have to say how the group operation is defined on these ordered pairs. (There’s more than one way to do this. See here.) Or if you’re taking the product of two topological spaces, then you have to say what the topology is on this set whose points are the ordered pairs.

The categorical definition doesn’t tell you how to construct a product, but it tells you how to know when you’ve found something that works. That’s the trade-off: in order to have a theory that exposes wider connections, it can’t be tied to a specific example. Whether that’s an acceptable trade-off depends on your aim.

To reach further with our theory, we have to look at how things behave rather than how they are constructed. So how does a product behave? It lets you take components: here’s the first component, here’s the second. That’s about it. The categorical definition formalizes this in terms of projections, and it says that this is a universal property of products: anything else that acts like a product factors uniquely through the product.

In general you can’t just say products are ordered pairs. Sometimes products are not pairs, and sometimes pairs are not products. So the ordered pair definition doesn’t always apply. And when it does apply, it keeps us from seeing how products relate to coproducts, limits, and other operations.

### When products are not pairs

Here’s an example of a product that’s not a pair. A partially ordered set can be viewed as a category. The elements of the set are the objects of the category, and there is an there is a morphism from a to b if a ≤ b. In that case the product of a and b is their minimum a ∧ b.

### When pairs are not products

Here’s an example of a pair that’s not a product. The category of fields does not generally have products. You can form ordered pairs of elements from two fields, but you can’t always define any operation on these pairs that will turn them into a field.

For example, the number of elements in a finite field must be a power of a prime. If you take a field of order 5 and a field of order 7, there are 35 ordered pairs of elements, but there is no field of order 35.

## But is it worth it?

The categorical definition of products is difficult to understand. It’s analogous to the δ-ε definition of limits: not the first thing you think of, but the rigorous definition that will generalize well into new situations.

Abstraction should follow experience, not precede it. You need to have multiple examples of products in you mind before you see any advantage to abstracting the idea of a product.

So what does the abstraction buy you? Maybe nothing! It depends on what you’re after. One thing it might do for you is help you to be more consistent. Programming language designers, for example, use category theory to make languages more consistent and easier to think about. A language might want to handle various kinds of products uniformly, even when the products look very different at first. In addition to consistently implementing what they should, category theory might guide designers to not implement what they shouldn’t. For example, above we said that it doesn’t make sense in general to take the product of two fields.

Category theory also suggests new questions. For example, duality is pervasive through out category theory. For every concept, there’s a co-concept. So once you identify a product in some context, it’s natural to ask what coproducts are, and these tend to be less obvious than products. And going back to consistency, category theory might guide you to handle dual concepts in a dual manner.

# Next areas of math to be applied

Not that long ago number theory was considered strictly pure math. Then came applications to cryptography. Now number theory is at the foundation of the online economy.

What are the next areas of pure math to find widespread application? Some people are saying algebraic topology and category theory.

[I saw a cartoon to this effect the other day but I can’t find it. If I remember correctly, someone was standing on a hill labeled “algebraic topology” and looking over at hills in the distance labeled with traditional areas of applied math. Differential equations, Fourier analysis, or things like that. If anybody can find that cartoon, please let me know.]

## Algebraic topology

The big idea behind algebraic topology is to turn topological problems, which are hard, into algebraic problems, which are easier. For example, you can associate a group with a space, the fundamental group, by looking at equivalence classes of loops. If two spaces have different fundamental groups, they can’t be topologically equivalent. The converse generally isn’t true: having the same fundamental group does not prove two spaces are equivalent. There’s some loss of information going from topology to algebra, which is a good thing. As long as information you need isn’t lost, you get a simpler problem to work with.

Fundamental groups are easy to visualize, but hard to compute. Fundamental groups are the lowest dimensional case of homotopy groups, and higher dimensional homotopy groups are even harder to compute. Homology groups, on the other hand, are a little harder to visualize but much easier to compute. Applied topology, at least at this point, is applied algebraic topology, and more specifically applied homology because homology is practical to compute.

People like Robert Ghrist are using homology to study, among other things, sensor networks. You start with a point cloud, such as the location of sensors, and thicken the points until they fuse into spaces that have interesting homology. This is the basic idea of persistent homology.  You’re looking for homology that persists over some range of thickening. As the amount of thickening increases, you may go through different ranges with different topology. The homology of these spaces tells you something about the structure of the underlying problem. This information might then be used as features in a machine learning algorithm. Topological invariants might prove to be useful features for classification or clustering, for example.

Most applications of topology that I’ve seen have used persistent homology. But there may be entirely different ways to apply algebraic topology that no one is looking at yet.

## Category theory

Category theory has been getting a lot of buzz, especially in computer science. One of the first ideas in category theory is to focus on how objects interact with each other, not on their internal structure. This should sound very familiar to computer scientists: focus on interface, not implementation. That suggests that category theory might be useful in computer science. Sometimes the connection between category theory and computer science is quite explicit, as in functional programming. Haskell, for example, has several ideas from category theory explicit in the language: monads, natural transformations, etc.

Outside of computer science, applications of category theory are less direct. Category theory can guide you to ask the right questions, and to avoid common errors. The mathematical term “category” was borrowed from philosophy for good reason. Mathematicians seek to avoid categorical errors, just as Aristotle and Kant did. I think of category theory as analogous to dimensional analysis in engineering or type checking in software development, a tool for finding and avoiding errors.

I used to be very skeptical of applications of category theory. I’m still skeptical, though not as much. I’ve seen category theory used as a smoke screen, and I’ve seen it put to real use. More about my experience with category theory here.

* * *

Topology illustration from Barcodes: The persistent topology of data by Robert Ghrist.

Category theory diagram from Category theory for scientists by David Spivak

# Why is an empty sum 0 and an empty product 1?

In response to my earlier post on why 0! should be 1, several people replied that 0! = 1 because an empty product is 1. You can define the factorial of an integer n as the product of all positive numbers less than or equal to n. There are no positive integers less than or equal to 0, so 0! is an empty product. But this raises the question of why an empty product should be 1.

You could say that an empty sum is 0 because 0 is the additive identity and an empty product is 1 because 1 is the multiplicative identity. If you’d like a simple answer, maybe you should stop reading here.

The problem with the answer above is that it doesn’t say why an operation on an empty set should be defined to be the identity for that operation. The identity is certainly a plausible candidate, but why should it make sense to even define an operation on an empty set, and why should the identity turn out so often to be the definition that makes things proceed smoothly?

The convention that the sum over an empty set should be defined as 0, and that a product over an empty set should be defined to be 1 works well in very general settings where “sum”, “product”, “0”, and “1” take on abstract meanings.

The ultimate generalization of products is the notion of products in category theory. Similarly, the ultimate generalization of sums is categorical co-products. (Co-products are sometimes called sums, but they’re usually called co-products due to a symmetry with products.) Category theory simultaneously addresses a wide variety of operations that could be called products or sums (co-products).

The particular advantage of bringing category theory into this discussion is that it has definitions of product and co-product that are the same for any number of objects, including zero objects; there is no special definition for empty products. Empty products and co-products are a consequence of a more general definition, not special cases defined by convention.

In the category of sets, products are Cartesian products. The product of a set with n elements and one with m elements is one with nm elements. Also in the category of sets, co-products are disjoint unions. The co-product of a set with n elements and one with m elements is one with n+m elements. These examples show a connection between products and sums in arithmetic and products and co-products in category theory.

You can find the full definition of a categorical product here. Below I give the definition leaving out details that go away when we look at empty products.

The product of a set of objects is an object P such that given any other object X … there exists a unique morphism from X to P such that ….

If you’ve never seen this before, you might rightfully wonder what in the world this has to do with products. You’ll have to trust me on this one. [1]

When the set of objects is empty, the missing parts of the definition above don’t matter, so we’re left with requiring that there is a unique morphism [2] from each object X to the product P. In other words, P is a terminal object, often denoted 1. So in category theory, you can say empty products are 1.

But that seems like a leap, since “1” now takes on a new meaning that isn’t obviously connected to the idea of 1 we learned as a child. How is an object such that every object has a unique arrow to it at all like, say, the number of noses on a human face?

We drew a connection between arithmetic and categories before by looking at the cardinality of sets. We could define the product of the numbers n and m as the number of elements in the product of a set with n elements and one with m elements. Similarly we could define 1 as the cardinality of the terminal element, also denoted 1. This is because there is a unique map from any set to the set with 1 element. Pick your favorite one-element set and call it 1. Any other choice is isomorphic to your choice.

Now for empty sums. The following is the definition of co-product (sum), leaving out details that go away when we look at empty co-products.

The co-product of a set of objects is an object S such that given any other object X … there exists a unique morphism from S to X such that ….

As before, when the set of objects is empty, the missing parts don’t matter. Notice that the direction of the arrow in the definition is reversed: there is a unique morphism from the co-product S to any object X. In other words, S is an initial object, denoted for good reasons as 0.  [3]

In set theory, the initial object is the empty set. (If that hurts your head, you’re not alone. But if you think of functions in terms of sets of ordered pairs, it makes a little more sense. The function that sends the empty set to another set is an empty set of ordered pairs!) The cardinality of the initial object 0 is the integer 0, just as the cardinality of the initial object 1 is the integer 1.

Related: Applied category theory

* * *

[1] Category theory has to define operations entirely in terms of objects and morphisms. It can’t look inside an object and describe things in terms of elements the way you’d usually do to define the product of two numbers or two sets, so the definition of product has to look very different. The benefit of this extra work is a definition that applies much more generally.

To understand the general definition of products, start by understanding the product of two objects. Then learn about categorical limits and how products relate to limits. (As with products, the categorical definition of limits will look entirely different from familiar limits, but they’re related.)

[2] Morphisms are a generalization of functions. In the category of sets, morphisms are functions.

[3] Sometimes initial objects are denoted by ∅, the symbol for the empty set, and sometimes by 0. To make things more confusing, a “zero,” spelled out as a word rather than a symbol, has a different but related meaning in category theory: an object that is both initial and terminal.

# The great reformulation of algebraic geometry

“Tate helped shape the great reformulation of arithmetic and geometry which has taken place since the 1950’s.” — Andrew Wiles

At the Heidelberg Laureate Forum I had a chance to interview John Tate. In his remarks below, Tate briefly comments on his early work on number theory and cohomology. Most of the post consists of his comments on the work of Alexander Grothendieck.

* * *

JT: My first significant work after my thesis was to determine the cohomology groups of class field theory. The creators of the theory, including my thesis advisor Emil Artin, didn’t think in terms of cohomology, but their work could be interpreted as finding the cohomology groups H0, H1, and H2.

I was invited to give a series of three talks at MIT on class field theory. I’d been at a party, and I came home and thought about what I’d talk about. And I got this great idea: I realized I could say what all the higher groups are. In a sense it was a disappointing answer, though it didn’t occur to me then, that there’s nothing new in them; they were determined by the great work that had already been done. For that I got the Cole prize in number theory.

Later when I gave a talk on this work people would say “This is number theory?!” because it was all about cohomology groups.

JC: Can you explain what the great reformulation was that Andrew Wiles spoke of? Was it this greater emphasis on cohomology?

JT: Well, in the class field theory situation it would have been. And there I played a relatively minor part. The big reformulation of algebraic geometry was done by Grothendieck, the theory of schemes. That was really such a great thing, that unified number theory and algebraic geometry. Before Grothendieck, going between characteristic 0, finite characteristic 2, 3, etc. was a mess.

Grothendieck’s system just gave the right framework. We now speak of arithmetic algebraic geometry, which means studying problems in number theory by using your geometric intuition. The perfect background for that is the theory of schemes. ….

Grothendieck ideas [about sheaves] were so simple. People had looked at such things in particular cases: Dedekind rings, Noetherian rings, Krull rings, …. Grothendieck said take any ring. … He just had an instinct for the right degree of generality. Some people make things too general, and they’re not of any use. But he just had an instinct to put whatever theory he thought about in the most general setting that was still useful. Not generalization for generalization’s sake but the right generalization. He was unbelievable.

He started schemes about the time I got serious about algebraic geometry, as opposed to number theory. But the algebraic geometers classically had affine varieties, projective varieties, … It seemed kinda weird to me. But with schemes you had a category, and that immediately appealed to me. In the classical algebraic geometry there are all these birational maps, or rational maps, and they’re not defined everywhere because they have singularities. All of that was cleared up immediately from the outset with schemes. ….

There’s a classical algebraic geometer at Harvard, Joe Harris, who works mostly over the complex numbers. I asked him whether Grothendieck made much of a difference in the classical case — I knew for number theorists he had made a tremendous difference — and Joe Harris said yes indeed. It was a revolution for classical algebraic geometry too.

Related: Applied number theory