Putting topological data analysis in context

I got a review copy of The Mathematics of Data recently. Five of the six chapters are relatively conventional, a mixture of topics in numerical linear algebra, optimization, and probability. The final chapter, written by Robert Ghrist, is entitled Homological Algebra and Data. Those who grew up with Sesame Street may recall the song “Which one of these, is not like the other …”

When I first heard of topological data analysis (TDA), I was excited about the possibility of putting some beautiful mathematics to practical application. But it was hard for me to put TDA in context. How do you get actionable information out of it? If you find a seven-dimensional doughnut hiding in your data, that’s very interesting, but what do you do with that information?

Robert’s chapter in the book I’m reviewing has a nice introductory paragraph that helps put TDA in context. The section heading for the paragraph is “When is Homology Useful?”

Homological methods are, almost by definition, robust, relying on neither precise coordinates nor careful estimates for efficiency. As such, they are most useful in settings where geometric precision fails. With great robustness comes both great flexibility and great weakness. Topological data analysis is more fundamental than revolutionary: such techniques are not intended to supplant analytic, probabilistic, or spectral techniques. They can however reveal a deeper basis for why some data sets and systems behave the way they do. It is unwise to wield topological techniques in isolation, assuming that the weapons of unfamiliar “higher” mathematics are clad with incorruptible silver.

Robert’s background was in engineering and more conventional applied mathematics before he turned to applications of topology, and so he brings a broader perspective to TDA than someone trained in topology looking for ways to make topology useful. He also has a decade more experience applying TDA than when I interviewed him here. I’m looking forward to reading his new chapter carefully.

As I wrote about the other day, apparently the US Army believes that topological data analysis can be useful, presumably in combination with more quantitative methods. [1] More generally, it seems the Army is interested in mathematical models that are complementary to traditional models, models that are robust and flexible. The quote above cautions that with robustness and flexibility comes weakness, though ideally weakness that is offset by other models.


[1] Algebraic topology is quantitative in one sense and qualitative in another. It aims to describe qualitative properties using algebraic invariants. It’s quantitative in the sense of computing homology groups, but it’s not as directly quantitative as more traditional mathematical models. It’s quantitative at a higher level of abstraction.

A genius can admit finding things difficult

Karen Uhlenbeck

Karen Uhlenbeck has just received the Abel Prize. Many say that the Fields Medal is the analog of the Nobel Prize for mathematics, but others say that the Abel Prize is a better analog. The Abel Prize recognizes achievement over a career, whereas the Fields Medal is awarded only for work done before age 40.

I had a course from Karen Uhlenbeck in graduate school. She was obviously brilliant, but what I remember most from the class was her candor about things she didn’t understand. She was already famous at the time, having won a MacArthur genius award and other honors, so she didn’t have to prove herself.

When she presented the definition of a manifold, she made an offhand comment that it took her a month to really understand that definition when she was a student. She obviously understands manifolds now, having spent her career working with them.

I found her comment extremely encouraging. It shows it’s possible to become an expert in something you don’t immediately grasp, even if it takes you weeks to grok its most fundamental concept.

Uhlenbeck wasn’t just candid about things she found difficult in the past. She was also candid about things she found difficult at the time. She would grumble in the middle of a lecture things like “I can never remember this.” She was not a polished lecturer—far from it—but she was inspiring.

Related posts

(The connection between Karen Uhlenbeck, Ted Odell, and John Tate is that they were all University of Texas math faculty.)

Photo of Karen Uhlenbeck in 1982 by George Bergman [GFDL], via Wikimedia Commons

Summarizing homotopy groups of spheres

I don’t understand homotopy groups of spheres, and it’s OK if you don’t either. Nobody fully understands them. This post is really more about information compression than homotopy. That is, I’ll be looking at ways to summarize what is known without being overly concerned about what the results mean.

The task: map two integers to a list of integers

For each positive integer k and non-negative integer n, the kth homotopy group of the sphere Sⁿ is a finitely generated Abelian group, something that can be described by a finite list of numbers. So we’re simply writing a function that takes two integers as input and returns a list of integers. This function is implemented in an online calculator that lets you look up homotopy groups.

Computing homotopy groups of spheres is far from easy. The first Fields medal given to a topologist was for partial work along these lines. There are still groups that haven’t been computed, and potentially more Fields medals to win. But our task is much more modest: simply to summarize what has been discovered.

This is not going to be too easy, as suggested by the sample of results in the table below.

[Table of homotopy groups of spheres]

This table was taken from the Homotopy Type Theory book, and was in turn based on the Wikipedia article on homotopy groups of spheres.

Output data representation

To give an example of what we’re after, the table says that π13(S²), the 13th homotopy group of the 2-sphere, is Z12 × Z2. All we need to know is the subscripts on the Z’s, the orders of the cyclic factors, and so our function would take as input (13, 2) and return (12, 2).

The table tells us that π8(S⁴) = (Z2)². This is another way of writing Z2 × Z2, and so our function would take (8, 4) as input and return (2, 2).

When I said above that our function would return a list of integers, I glossed over one thing: some of the Z’s don’t have a subscript. That is, some of the factors are the group of integers, not the group of integers mod some finite number. So we need to add an extra symbol to indicate a factor with no subscript. I’ll use ∞ because the integers are the infinite cyclic group. For example, our function would take (7, 4) and return (∞, 12). I will also use 1 to denote the trivial group, the group with 1 element.

Some results are unknown, and so we’ll return an empty list for these.

The order of the numbers in the output doesn’t matter, but we will list the numbers in descending order because that appears to be conventional.


Some of the values of our function can be filled in by a general theorem, and some will simply be data.

If we call our function f, then there is a theorem that says f(k, n) = (1) if k < n. This accounts for the zeros in the upper right corner of the chart above.

There’s another theorem that says f(n+m, n) is independent of n if n > m + 1. These are the so-called stable homotopy groups.

The rest of the groups are erratic; we can’t do much better than just listing them as data.

(By the way, the corresponding results for homology rather than homotopy are ridiculously simple by comparison. For k > 0, the kth homology group of Sⁿ is isomorphic to the integers if k = n and is trivial otherwise.)
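To make the shape of the function f described in this post concrete, here is a minimal Python sketch. It is not the code behind the online calculator. The names f, KNOWN, and INF are my own, and the data table is only a tiny sample: the three entries quoted above, plus two standard values (π2(S²) = Z and π4(S³) = Z2) that I added so the stable-range rule has something to look up.

```python
import math

INF = math.inf  # stands for a Z factor with no subscript (the integers)

# Tiny hand-entered sample of known groups, keyed by (k, n).
# A real table would hold every computed group outside the two theorems.
KNOWN = {
    (13, 2): (12, 2),
    (8, 4): (2, 2),
    (7, 4): (INF, 12),
    (2, 2): (INF,),  # assumption added for illustration: pi_2(S^2) = Z
    (4, 3): (2,),    # assumption added for illustration: pi_4(S^3) = Z_2
}

def f(k, n):
    """Summarize the kth homotopy group of the n-sphere as a tuple."""
    if k < 1 or n < 0:
        raise ValueError("need k >= 1 and n >= 0")
    if k < n:
        return (1,)  # first theorem: trivial group when k < n
    m = k - n
    if n > m + 1:
        # Second theorem: f(n+m, n) does not depend on n in this range,
        # so look up the smallest stable case, n = m + 2.
        k, n = 2 * m + 2, m + 2
    # Empty tuple: either genuinely unknown or just missing from the sample.
    return KNOWN.get((k, n), ())

print(f(3, 5))    # (1,)
print(f(7, 4))    # (inf, 12)
print(f(10, 10))  # (inf,)
print(f(9, 3))    # ()
```

One weakness of this sketch is that an empty result cannot distinguish "nobody has computed this" from "I haven't typed it in yet." A fuller version would need a separate marker for one of the two.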

Most useful math class

A few years ago someone asked me what was my most useful undergraduate math class. My first thought was topology.

I have never directly applied topology for a client. Nobody has ever approached me wanting to know, for example, whether two objects were in the same homotopy class. But I believe topology was one of the most important classes I took for three reasons.

First, I learned how to prove things in that course. It was a small, interactive class with an excellent teacher (Jim Vick). I might have learned the same techniques in a different class, but for me I learned them in topology.

Second, the course built my confidence. I was apprehensive about taking the course because I knew nothing about it. The little I’d heard about topology—stretching coffee cups into donuts etc.—made me wonder what a class could possibly be like. I proved to myself that I could jump into something unfamiliar and do well.

Finally, the course gave me a solid foundation for analysis, and analysis I have applied more directly. I got a thorough understanding of foundational ideas like continuity and compactness, and a foretaste of measure theory. The course also provided my first brief exposure to category theory. To this day, my Pavlovian response to a mention of functors is to think of the fundamental group of a topological space.

I look back on topology the way many look back on a classical education, something not directly useful but indirectly very useful.

Agile software development and homotopy

One of the things I learned from my tenure as a software project manager was that a project is more likely to succeed if there’s a way to get where you want to go continuously. You want to move a project from A to B gradually, keeping a working code base all along the way. At the end of each day, the software may not be fully functional, but it should at least build. Anything that requires a big bang change, tearing the system apart for several days and putting it back together, is less likely to succeed.

This is very much like the idea of homotopy from topology, a continuous deformation of one thing into another. No discontinuities along the way — no ripping, no jumping suddenly from one thing to another.

Next areas of math to be applied

Not that long ago number theory was considered strictly pure math. Then came applications to cryptography. Now number theory is at the foundation of the online economy.

What are the next areas of pure math to find widespread application? Some people are saying algebraic topology and category theory.

[I saw a cartoon to this effect the other day but I can’t find it. If I remember correctly, someone was standing on a hill labeled “algebraic topology” and looking over at hills in the distance labeled with traditional areas of applied math. Differential equations, Fourier analysis, or things like that. If anybody can find that cartoon, please let me know.]

Algebraic topology

The big idea behind algebraic topology is to turn topological problems, which are hard, into algebraic problems, which are easier. For example, you can associate a group with a space, the fundamental group, by looking at equivalence classes of loops. If two spaces have different fundamental groups, they can’t be topologically equivalent. The converse generally isn’t true: having the same fundamental group does not prove two spaces are equivalent. There’s some loss of information going from topology to algebra, which is a good thing. As long as information you need isn’t lost, you get a simpler problem to work with.

Fundamental groups are easy to visualize, but hard to compute. Fundamental groups are the lowest dimensional case of homotopy groups, and higher dimensional homotopy groups are even harder to compute. Homology groups, on the other hand, are a little harder to visualize but much easier to compute. Applied topology, at least at this point, is applied algebraic topology, and more specifically applied homology because homology is practical to compute.

People like Robert Ghrist are using homology to study, among other things, sensor networks. You start with a point cloud, such as the location of sensors, and thicken the points until they fuse into spaces that have interesting homology. This is the basic idea of persistent homology.  You’re looking for homology that persists over some range of thickening. As the amount of thickening increases, you may go through different ranges with different topology. The homology of these spaces tells you something about the structure of the underlying problem. This information might then be used as features in a machine learning algorithm. Topological invariants might prove to be useful features for classification or clustering, for example.
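The thickening idea can be illustrated in the simplest case, counting connected components (0-dimensional homology). The sketch below is my own illustration, not code from Ghrist’s work. The function name h0_deaths is hypothetical, and I measure the scale as the distance between two points at the moment their components fuse, which corresponds to balls of half that radius touching. It uses only the standard library and needs Python 3.8 or later for math.dist.

```python
from itertools import combinations
from math import dist, inf

def h0_deaths(points):
    """Scales at which connected components disappear as points thicken.

    Every point starts as its own component at scale 0. A component
    "dies" at the pairwise distance where it fuses with another one.
    One component survives forever, recorded as inf.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process pairs of points from closest to farthest.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # two separate components fuse at scale d
            parent[ri] = rj
            deaths.append(d)
    deaths.append(inf)        # the last remaining component never dies
    return deaths

# Two small clusters far apart.
cloud = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0)]
print(h0_deaths(cloud))
```

For this cloud, three components die at scale 1 and one at scale 9. Between 1 and 9 there are exactly two components, and that long interval is the kind of feature that persists over a range of thickening. Detecting loops or higher-dimensional holes takes more machinery than this sketch, and in practice you would likely reach for a dedicated library.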

Most applications of topology that I’ve seen have used persistent homology. But there may be entirely different ways to apply algebraic topology that no one is looking at yet.

Category theory

Category theory has been getting a lot of buzz, especially in computer science. One of the first ideas in category theory is to focus on how objects interact with each other, not on their internal structure. This should sound very familiar to computer scientists: focus on interface, not implementation. That suggests that category theory might be useful in computer science. Sometimes the connection between category theory and computer science is quite explicit, as in functional programming. Haskell, for example, has several ideas from category theory explicit in the language: monads, natural transformations, etc.

Outside of computer science, applications of category theory are less direct. Category theory can guide you to ask the right questions, and to avoid common errors. The mathematical term “category” was borrowed from philosophy for good reason. Mathematicians seek to avoid categorical errors, just as Aristotle and Kant did. I think of category theory as analogous to dimensional analysis in engineering or type checking in software development, a tool for finding and avoiding errors.

I used to be very skeptical of applications of category theory. I’m still skeptical, though not as much. I’ve seen category theory used as a smoke screen, and I’ve seen it put to real use. More about my experience with category theory here.

* * *

Topology illustration from Barcodes: The persistent topology of data by Robert Ghrist.

Category theory diagram from Category theory for scientists by David Spivak

A subway topologist

One of my favorite books when I was growing up was the Mathematics volume in the LIFE Science Library. I didn’t own the book, but my uncle did, and I’d browse through the book whenever I visited him. I was too young at the time to understand much of what I was reading.

One of the pages that stuck in my mind was a photo of Samuel Eilenberg. His name meant nothing to me at the time, but the caption titled “A subway topologist” caught my imagination.

… Polish-born Professor Samuel Eilenberg sprawls contemplatively in his Greenwich Village apartment in New York City. “Sometimes I like to think lying down,” he says, “but mostly I like to think riding on the subway.” Mainly he thinks about algebraic topology — a field so abstruse that even among mathematicians few understand it. …

I loved the image of Eilenberg staring intensely at the ceiling or riding around on a subway thinking about math. Since then I’ve often thought about math while moving around, though usually not on a subway. I’ve only lived for a few months in an area with a subway system.

The idea that a field of math would be unknown to many mathematicians sounded odd. I had no idea at the time that mathematicians specialized.

Algebraic topology doesn’t seem so abstruse now. It’s a routine graduate course and you might get an introduction to it in an undergraduate course. The book was published in 1963, and I suppose algebraic topology would have been more esoteric at the time.

* * *

For daily tweets on topology and geometry, follow @TopologyFact on Twitter.
