In response to my earlier post on why 0! should be 1, several people replied that 0! = 1 because an empty product is 1. You can define the factorial of an integer n as the product of all positive integers less than or equal to n. There are no positive integers less than or equal to 0, so 0! is an empty product. But this raises the question of why an empty product should be 1.
You could say that an empty sum is 0 because 0 is the additive identity and an empty product is 1 because 1 is the multiplicative identity. If you’d like a simple answer, maybe you should stop reading here.
The problem with the answer above is that it doesn’t say why an operation on an empty set should be defined to be the identity for that operation. The identity is certainly a plausible candidate, but why should it make sense to even define an operation on an empty set, and why should the identity turn out so often to be the definition that makes things proceed smoothly?
The convention that a sum over an empty set should be defined to be 0, and that a product over an empty set should be defined to be 1, works well in very general settings where “sum”, “product”, “0”, and “1” take on abstract meanings.
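For a concrete illustration, Python's standard library happens to follow the same convention. The sketch below is only an illustration, and the helper name `fold` is made up for this example. It combines a list with a binary operation, starting from that operation's identity, so an empty list returns the identity.

```python
import math
import operator
from functools import reduce

def fold(op, identity, items):
    """Combine items with op, starting from op's identity element."""
    return reduce(op, items, identity)

empty_sum = sum([])                 # 0
empty_product = math.prod([])       # 1
zero_factorial = math.factorial(0)  # 1

# The same pattern for operations other than + and *
empty_concat = fold(operator.add, "", [])    # "" is the identity for concatenation
empty_union = fold(operator.or_, set(), [])  # set() is the identity for union
```

Here `empty_concat` is the empty string and `empty_union` is the empty set, the identities for their respective operations.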
The ultimate generalization of products is the notion of products in category theory. Similarly, the ultimate generalization of sums is categorical co-products. (Co-products are sometimes called sums, but they’re usually called co-products due to a symmetry with products.) Category theory simultaneously addresses a wide variety of operations that could be called products or sums (co-products).
The particular advantage of bringing category theory into this discussion is that it has definitions of product and co-product that are the same for any number of objects, including zero objects; there is no special definition for empty products. Empty products and co-products are a consequence of a more general definition, not special cases defined by convention.
In the category of sets, products are Cartesian products. The product of a set with n elements and one with m elements is one with nm elements. Also in the category of sets, co-products are disjoint unions. The co-product of a set with n elements and one with m elements is one with n+m elements. These examples show a connection between products and sums in arithmetic and products and co-products in category theory.
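Here is a small Python sketch of the two cardinality claims. The sets `A` and `B` are arbitrary examples, and tagging each element with 0 or 1 is just one common way to build a disjoint union.

```python
from itertools import product

A = {"a", "b", "c"}  # n = 3 elements
B = {0, 1}           # m = 2 elements

# Cartesian product: all ordered pairs (x, y) with x in A and y in B
cartesian = set(product(A, B))

# Disjoint union: tag each element with the set it came from,
# so the union stays disjoint even if A and B overlap
disjoint_union = {(0, x) for x in A} | {(1, y) for y in B}

print(len(cartesian))       # 6 = 3 * 2
print(len(disjoint_union))  # 5 = 3 + 2
```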
You can find the full definition of a categorical product here. Below I give the definition leaving out details that go away when we look at empty products.
The product of a set of objects is an object P such that given any other object X … there exists a unique morphism from X to P such that ….
If you’ve never seen this before, you might rightfully wonder what in the world this has to do with products. You’ll have to trust me on this one. [1]
When the set of objects is empty, the missing parts of the definition above don’t matter, so we’re left with requiring that there is a unique morphism [2] from each object X to the product P. In other words, P is a terminal object, often denoted 1. So in category theory, you can say empty products are 1.
But that seems like a leap, since “1” now takes on a new meaning that isn’t obviously connected to the idea of 1 we learned as a child. How is an object such that every object has a unique arrow to it at all like, say, the number of noses on a human face?
We drew a connection between arithmetic and categories before by looking at the cardinality of sets. We could define the product of the numbers n and m as the number of elements in the product of a set with n elements and one with m elements. Similarly we could define 1 as the cardinality of the terminal object, also denoted 1. In the category of sets, a terminal object is a one-element set, because there is a unique map from any set to a set with one element. Pick your favorite one-element set and call it 1. Any other choice is isomorphic to your choice.
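You can check the claim about one-element sets by brute force for small finite sets. In the sketch below, the helper `all_functions` is a made-up name; it represents each function as a dict and enumerates every one. The last line uses the fact that `itertools.product` called with no arguments yields a single empty tuple, so the Cartesian product of no sets has exactly one element.

```python
from itertools import product

def all_functions(domain, codomain):
    """List every function from domain to codomain, each as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

one = {"*"}  # any one-element set will do

for X in [set(), {1}, {1, 2, 3}]:
    print(len(all_functions(X, one)))  # 1 each time

# The Cartesian product of no sets at all
empty_product = list(product())
print(empty_product)  # [()]
```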
Now for empty sums. The following is the definition of co-product (sum), leaving out details that go away when we look at empty co-products.
The co-product of a set of objects is an object S such that given any other object X … there exists a unique morphism from S to X such that ….
As before, when the set of objects is empty, the missing parts don’t matter. Notice that the direction of the arrow in the definition is reversed: there is a unique morphism from the co-product S to any object X. In other words, S is an initial object, denoted for good reasons as 0. [3]
In the category of sets, the initial object is the empty set. (If that hurts your head, you’re not alone. But if you think of functions in terms of sets of ordered pairs, it makes a little more sense. The function that sends the empty set to another set is an empty set of ordered pairs!) The cardinality of the initial object 0 is the integer 0, just as the cardinality of the terminal object 1 is the integer 1.
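To make the ordered-pairs view concrete, here is a sketch in Python. The checker `is_function` is a made-up helper that tests whether a set of pairs defines a function between two small sets of integers. The empty set of pairs passes the test when the domain is empty, whatever the codomain, and nothing else does.

```python
def is_function(pairs, domain, codomain):
    """True if pairs, a set of (x, y) tuples, is a function domain -> codomain."""
    inputs = [x for x, _ in pairs]
    return (
        sorted(inputs) == sorted(domain)          # each input appears exactly once
        and all(y in codomain for _, y in pairs)  # each output lies in the codomain
    )

empty = set()

print(is_function(empty, set(), {1, 2, 3}))     # True: the empty function
print(is_function(empty, set(), set()))         # True: even into the empty set
print(is_function(empty, {1}, {1, 2, 3}))       # False: 1 has no image
print(is_function({(1, 2)}, set(), {1, 2, 3}))  # False: 1 is not in the domain
```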
Related: Applied category theory
* * *
[1] Category theory has to define operations entirely in terms of objects and morphisms. It can’t look inside an object and describe things in terms of elements the way you’d usually do to define the product of two numbers or two sets, so the definition of product has to look very different. The benefit of this extra work is a definition that applies much more generally.
To understand the general definition of products, start by understanding the product of two objects. Then learn about categorical limits and how products relate to limits. (As with products, the categorical definition of limits will look entirely different from familiar limits, but they’re related.)
[2] Morphisms are a generalization of functions. In the category of sets, morphisms are functions.
[3] Sometimes initial objects are denoted by ∅, the symbol for the empty set, and sometimes by 0. To make things more confusing, a “zero,” spelled out as a word rather than a symbol, has a different but related meaning in category theory: an object that is both initial and terminal.
Isn’t it approximately as accurate, and much more accessible, to say that sums or products over empty sets yield the respective identity because that makes associativity work intuitively?
Take three sequences A, B and C. (A op B) op C == A op (B op C). If A is the empty sequence, we’d like the RHS to reduce to just B op C. The only way to do this is if the operation over the empty set is the identity.
Michael: You give a good argument for why empty associative operations should be defined as identities. I don’t know whether this generalizes well. Maybe it does, but I haven’t thought about it.
Another simple argument is that the empty sum being zero makes the definition of multiplication in terms of addition work: A times B is (A + A + …. + A) where there are B copies of A. Similarly, the empty product being 1 makes the definition of exponentiation in terms of multiplication work.
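A quick Python sketch of this argument, for non-negative integer B. The function names `times` and `power` are made up for illustration. With zero copies the list is empty, and the built-in conventions give 0 and 1.

```python
import math

def times(a, b):
    """a * b computed as the sum of b copies of a."""
    return sum([a] * b)

def power(a, b):
    """a ** b computed as the product of b copies of a."""
    return math.prod([a] * b)

print(times(7, 3), times(7, 0))  # 21 0
print(power(7, 3), power(7, 0))  # 343 1
```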
Here is my argument for the identity: Let ⊗ be a commutative and associative operation. Then you can define ⨂ for nonempty finite sets in the obvious way. For disjoint nonempty sets A and B, we have (⨂A)⊗(⨂B)=⨂(A∪B).
If we want this identity to still hold when we extend ⨂ to empty sets, we have to define ⨂∅ as the neutral element of ⊗. (We’re out of luck if it doesn’t exist.)
I’d have to write it all down very formally to be sure, but I wonder if this all boils down to the convention of vacuous truth.
If $A\cap B = \varnothing$, then
$$\sum_{x\in A\cup B} x = \sum_{x\in A} x + \sum_{x\in B} x$$
Since $A\cap\varnothing=\varnothing$, we can take $B=\varnothing$, which gives $\sum_{x\in A} x = \sum_{x\in A} x + \sum_{x\in\varnothing} x$, so the sum $\sum_{x\in\varnothing} x$ has to be zero. Of course we could just make the empty set a special case, but the whole point of the empty set is that it can be treated just as any other set, not as a special case.
Sometimes, of course, it’s impossible. For example, what should $\operatorname{min}_{x\in\varnothing} x$ be? It has to be an element of $\varnothing$ (that’s a reasonable requirement for a minimum), but there are none. Still, the case with addition and multiplication is different: there is a reasonable way to extend the definition, and it doesn’t cause any troubles.
On an unrelated note: if $f(x)\ne0$, one could call $x$ a zero-order root of $f$. And Bezout’s theorem still holds.
I got stuck at:
Tnx for the great stats tutoring :)
Jim: That sentence was mangled! I don’t know how I didn’t catch that. I’ve rewritten the sentence in English. :)
Vacuous truth: indeed, p(x) for all x∈X={x1,x2,…} means p(x1) and p(x2) and … . Since the neutral element for “and” is true, a forall statement over the empty set is true.
If we order the booleans by false<true, "and" becomes "min" which has the top element true as identity. So if we don't insist that minimum gives an element from the set (it's about generalizations after all) or replace minimum by infimum, we again get true for the empty set, but now we can use the same reasoning for any ordered set: inf ∅ is the top element.
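Python's built-ins line up with this. The sketch below relies only on standard behavior: `all` and `any` return the identities for "and" and "or" on an empty list, while `min` insists on returning an element and so raises an error unless given a default. Using `math.inf` as that default is my choice here, standing in for the top element.

```python
import math

forall_empty = all([])  # True, the identity for "and"
exists_empty = any([])  # False, the identity for "or"

# min must return an element of its argument, so it fails on an empty list
try:
    min([])
    min_failed = False
except ValueError:
    min_failed = True

# Supplying the top element as a default turns min into an infimum
inf_empty = min([], default=math.inf)   # inf
sup_empty = max([], default=-math.inf)  # -inf

print(forall_empty, exists_empty, min_failed, inf_empty, sup_empty)
```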
Interestingly, we can again also use category theory as an argument for that fact: an ordered set gives rise to a category where the objects are the elements of the ordered set and there is exactly one morphism from x to y iff x≤y. In this category the product is the infimum, and the terminal object is the top element.
However, we must not forget the missing parts in the definition of products. They go away in the case of the empty product because they then form a vacuous truth. So we need a prior understanding of that before we can argue with category theory.
Given that 1! = 1 and that factorials satisfy (n – 1)! = n! / n when you go backwards through the sequence, setting n = 1 shows that 0! must be 1.
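The backward recurrence is easy to run. This sketch starts from 5!, an arbitrary starting point, and divides its way down. The function name `factorials_downward` is made up for illustration. The recurrence stops at n = 1, since the next step would divide by zero.

```python
import math

def factorials_downward(start):
    """Apply (n - 1)! = n! / n repeatedly, from start! down to 0!."""
    value = math.factorial(start)
    results = {start: value}
    for n in range(start, 0, -1):
        value //= n  # exact division: n always divides n!
        results[n - 1] = value
    return results

for n, value in factorials_downward(5).items():
    print(f"{n}! = {value}")
# 5! = 120, 4! = 24, 3! = 6, 2! = 2, 1! = 1, 0! = 1
```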
One way to get an empty set is to start with a non-empty set and remove all the members. If you remove x from a set S, then you update the sum by subtracting x and update the product by dividing by x (assuming x is not 0):
sum(S – {x}) = sum(S) – x
product(S – {x}) = product(S) / x
and if you remove all the elements:
sum(∅) = sum(S – S) = sum(S) – sum(S) = 0
product(∅) = product(S – S) = product(S) / product(S) = 1
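Here is the removal argument as a short Python sketch. The set `S` is an arbitrary example chosen so that no element is 0 and every division is exact.

```python
import math

S = {2, 3, 5, 7}                # example set; no element is 0
running_sum = sum(S)            # 17
running_product = math.prod(S)  # 210

for x in S:
    running_sum -= x       # sum(S - {x}) = sum(S) - x
    running_product //= x  # product(S - {x}) = product(S) / x, exact here

print(running_sum, running_product)  # 0 1
```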
If we define the set function f which computes the sum of the members of a nonempty set, it is clearly additive over unions of disjoint sets. To extend the set function domain to include the empty set and preserve this additivity, the only possible definition for f at the empty set is 0.
Similarly, if we define the set function g which computes the product of the members of a nonempty set, it is clearly multiplicative over unions of disjoint sets. To extend the set function domain to include the empty set and preserve this multiplicativity, the only possible definition for the value at the empty set is 1.