The **Moore-Penrose pseudoinverse** of a matrix is a way of coming up with something like an inverse for a matrix that doesn’t have an inverse. If a matrix does have an inverse, then the pseudoinverse is in fact the inverse. The Moore-Penrose pseudoinverse is also called a generalized inverse for this reason: it’s not just *like* an inverse, it actually *is* an inverse when that’s possible.

Given an *m* by *n* matrix *A*, the Moore-Penrose pseudoinverse *A*^{+} is the unique *n* by *m* matrix satisfying four conditions:

1. *A* *A*^{+} *A* = *A*
2. *A*^{+} *A* *A*^{+} = *A*^{+}
3. (*A* *A*^{+})* = *A* *A*^{+}
4. (*A*^{+} *A*)* = *A*^{+} *A*

The first equation says that *A* *A*^{+} is a left identity for *A*, and *A*^{+}*A* is a right identity for *A*.

The second equation says *A*^{+}*A* is a left identity for *A*^{+}, and *A* *A*^{+} is a right identity for *A*^{+}.

The third and fourth equations say that *A* *A*^{+} and *A*^{+}*A* are Hermitian.
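The four conditions are easy to check numerically. Here is a minimal sketch using NumPy's `numpy.linalg.pinv`; the random complex 4 by 3 matrix and the helper name `penrose_conditions` are illustrative choices of mine, not anything from the definition.

```python
import numpy as np

# Illustrative test matrix: a random complex 4 x 3 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
A_plus = np.linalg.pinv(A)  # a 3 x 4 matrix


def penrose_conditions(A, A_plus):
    """Return four booleans, one per defining equation (hypothetical helper)."""
    AAp = A @ A_plus
    ApA = A_plus @ A
    return [
        np.allclose(AAp @ A, A),            # A A+ A = A
        np.allclose(ApA @ A_plus, A_plus),  # A+ A A+ = A+
        np.allclose(AAp.conj().T, AAp),     # A A+ is Hermitian
        np.allclose(ApA.conj().T, ApA),     # A+ A is Hermitian
    ]


checks = penrose_conditions(A, A_plus)
print(A_plus.shape, checks)
```

The comparisons use `np.allclose` because the equations only hold up to floating point error.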

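If *A* is invertible, *A* *A*^{+} and *A*^{+}*A* are both the identity matrix. Otherwise *A* *A*^{+} and *A*^{+}*A* act an awful lot like the identity, as much as you could expect, maybe a little more than you’d expect: they are orthogonal projections, *A* *A*^{+} onto the range of *A* and *A*^{+}*A* onto the orthogonal complement of the null space of *A*.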
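A small numerical sketch of both cases, again assuming NumPy. The matrices `B` (invertible) and `C` (rank 1) are examples I made up for illustration.

```python
import numpy as np

# Invertible case: the pseudoinverse agrees with the ordinary inverse.
B = np.array([[2.0, 1.0],
              [1.0, 3.0]])  # determinant 5, so invertible
pinv_equals_inv = np.allclose(np.linalg.pinv(B), np.linalg.inv(B))

# Singular case: C has rank 1, so it has no inverse.
C = np.array([[1.0, 2.0],
              [2.0, 4.0]])
C_plus = np.linalg.pinv(C)
P = C @ C_plus  # C C+
Q = C_plus @ C  # C+ C

is_identity = np.allclose(P, np.eye(2))                        # not the identity
idempotent = np.allclose(P @ P, P) and np.allclose(Q @ Q, Q)   # but a projection
acts_as_identity_on_C = np.allclose(P @ C, C) and np.allclose(C @ Q, C)

print(pinv_equals_inv, is_identity, idempotent, acts_as_identity_on_C)
```

In the singular case the products are not the identity, but they are idempotent and leave *C* itself unchanged, which is one concrete sense in which they act like the identity.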

**Update**: See this post for the relationship between the singular value decomposition and pseudoinverses, and how to compute both in Python and Mathematica.

## Galois connections and adjoints

John Baez recently wrote that a **Galois connection**, a kind of categorical **adjunction**, is

“the best approximation to reversing a computation that can’t be reversed.”

That sounds like a pseudoinverse! And the first two equations defining a pseudoinverse look a lot like things you’ll see in the context of adjunctions, so the pseudoinverse must be an adjunction, right?

The question was raised on MathOverflow and Michal R. Przybylek answered

I do not think the concept of Moore-Penrose Inverse and the concept of categorical adjunction have much in common (except they both try to generalise the concept of inverse) …

and gives several reasons why. (Emphasis added.)

Too bad. It would have made a good connection. Applied mathematicians are likely to be familiar with Moore-Penrose pseudoinverses but not categorical adjoints. And pure mathematicians, depending on their interests, may be more familiar with adjoint functors than matrix pseudoinverses.

So what about John Baez’s comment? His comment was expository (and very helpful) but not meant to be rigorous. To make it rigorous you’d have to say precisely what you mean by “best approximation,” and when you define your terms carefully, in the language of category theory, you get adjoints. This means that the Moore-Penrose inverse, despite its many nice properties [1], doesn’t mesh well with categorical definitions. It may be the best approximate inverse from some perspectives, but not from a categorical one: it doesn’t compose well, and category theory values composition above all else.

Przybylek explains

… adjunctions compose… but Moore-Penrose pseudoinverses, generally, do not. … pseudoinverses are not stable under isomorphisms, thus are not categorical.

That’s the gist of his final point. Now let me fill in and expand slightly part of what I cut out.

If *f* : *A* → *B* is left adjoint to *f*^{+} : *B* → *A* and *g* : *B* → *C* is left adjoint to *g*^{+} : *C* → *B*, then the composition *gf* : *A* → *C* is left adjoint to the composition *f*^{+}*g*^{+} : *C* → *A*, but Moore-Penrose pseudoinverses do not compose this way in general.
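A tiny counterexample makes the failure concrete. This is a sketch assuming NumPy; the matrices `F` and `G` are my own choice, standing in for linear maps *f* and *g*, and are not taken from the MathOverflow answer.

```python
import numpy as np

# F represents a linear map f from R^1 to R^2, and G a map g from R^2 to R^1.
F = np.array([[1.0],
              [0.0]])
G = np.array([[1.0, 1.0]])

lhs = np.linalg.pinv(G @ F)                   # (gf)^+
rhs = np.linalg.pinv(F) @ np.linalg.pinv(G)   # f^+ g^+

agree = np.allclose(lhs, rhs)  # False for this pair
print(lhs, rhs, agree)
```

Here *GF* is the 1 by 1 matrix [1], whose pseudoinverse is [1], while *F*^{+}*G*^{+} works out to [1/2]. For invertible matrices the two sides would agree, since then the pseudoinverse is the ordinary inverse.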

This turns out to be an interesting example, but not of what I first expected. Rather than the pseudoinverse of a matrix being an example of an adjoint, it is an example of something that despite having convenient properties does not compose well from a categorical perspective.
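The remark that pseudoinverses are not stable under isomorphisms can be illustrated the same way. The sketch below reads “isomorphism” as an invertible change of basis, which is my interpretation rather than something spelled out in the quote, and the matrices `A` and `S` are made up for the example.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])  # rank 1, equal to its own pseudoinverse
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])  # invertible but not unitary
S_inv = np.linalg.inv(S)

pinv_of_conjugate = np.linalg.pinv(S @ A @ S_inv)   # (S A S^-1)^+
conjugate_of_pinv = S @ np.linalg.pinv(A) @ S_inv   # S A^+ S^-1

stable = np.allclose(pinv_of_conjugate, conjugate_of_pinv)  # False here
print(pinv_of_conjugate)
print(conjugate_of_pinv)
print(stable)
```

An ordinary inverse would commute with the change of basis. The pseudoinverse does when *S* is unitary, but not for a general invertible *S* like the one above.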

[1] The book Matrix Mathematics devotes about 40 pages to stating theorems about the Moore-Penrose pseudoinverse.

There are also other interesting generalised matrix inverses. For example, the Drazin inverse and inverses that are consistent with respect to changing the units in a linear system or with respect to similarity transforms [1].

[1] A Rank-Preserving Generalized Matrix Inverse for Consistency with Respect to Similarity, Jeffrey Uhlmann

https://arxiv.org/abs/1804.07334

Do you mean left and right identity instead of left and right inverse? If they were inverses, shouldn’t you end up with the identity matrix on the right of the equations?

Yes, that’s what I meant to say. Thanks. Just updated.