Fredholm index

The previous post on kernels and cokernels mentioned that for a linear operator TV → W, the index of T is defined as the difference between the dimension of its kernel and the dimension of its cokernel:

index T = dim ker T − dim coker T.

The index was first called the Fredholm index, because of it came up in Fredholm’s investigation of integral equations. (More on this work in the next post.)

Robustness

The index of a linear operator is robust in the following sense. If V and W are Banach spaces and TV → W is a continuous linear operator, then there is an open set around T in the space of continuous operators from V to W on which the index is constant. In other words, small changes to T don’t change its index.

Small changes to T may alter the dimension of the kernel or the dimension of the cokernel, but they don’t alter their difference.

Relation to Fredholm alternative

The next post discusses the Fredholm alternative theorem. It says that if K is a compact linear operator on a Hilbert space and I is the identity operator, then the Fredholm index of IK is zero. The post will explain how this relates to solving linear (integral) equations.

Analogy to Euler characteristic

We can make an exact sequence with the spaces V and W and the kernel and cokernel of T as follows:

0 → ker TVW → coker T → 0

All this means is that the image of one map is the kernel of the next.

We can take the alternating sum of the dimensions of the spaces in this sequence:

dim ker T − dim V + dim W − dim coker T.

If V and W have the same finite dimension, then this alternating sum equals the index of T.

The Euler characteristic is also an alternating sum. For a simplex, the Euler characteristic is defined by

V − EF

where V is the number of vertices, E the number of edges, and F the number of faces. We can extend this to higher dimensions as the number of zero-dimensional object (vertices), minus the number of one-dimensional objects (edges), plus the number of two-dimensional objects, minus the number of three dimensional objects, etc.

A more sophisticated definition of Euler characteristic is the alternating sum of the dimensions of cohomology spaces. These also form an exact sequence.

The Atiyah-Singer index theorem says that for elliptic operators on manifolds, two kinds of index are equal: the analytical index and the topological index. The analytical index is essentially the Fredholm index. The topological index is derived from topological information about the manifold.

This is analogous to the Gauss-Bonnet theorem that says you can find the Euler characteristic, a topological invariant, by integrating Gauss curvature, an analytic calculation.

Other posts in this series

This is the middle post in a series of three. The first was on kernels and cokernels, and the next is on the Fredholm alternative.

The glass disk game

The glass disk game is played on a grid. You have translucent colored glass disks you can either place on an edge or a vertex.

There are two kinds of disks that can be placed on an edge: blue or yellow. A vertex with a blue and yellow disk looks green.

There are two kinds of disks that can be placed on a vertex: red or white. A vertex with a red and white disk looks pink.

For this post we only need red disks. We may use white and pink in a future post.

The following rules tell you how you are allowed to add disks.

Rule 1

If you have this configuration

then you may add a green disk in the middle.

Mnemonic: Think of copying the blue and yellow dots on the end and placing them both in the middle.

Rule 2

If you have this configuration

you may add a blue disk between the two blue disks.

If you have this configuration

you may add a yellow disk between the two yellow disks.

Interpretation

The rules above are all the rules of the game. You do not need to know any mathematics to play the game.

But the game does have a mathematical interpretation. The grid is a commutative diagram. A red disk means the horizontal diagram is exact at that vertex, i.e. the image of the function coming in from the left is the kernel of the function going out to the right.

A blue disk means a function is surjective (onto) and a yellow disk means a function is injective (one-to-one).

The diagram need not represent sets and functions. It could represent objects in a category along with morphisms. In that case blue disks represent epimorphisms and yellow disks monomorphisms.

Rule 1 is known as the five lemma. Rule 2 is called the four lemma.

The glass disk game is a pair of theorems in algebraic topology, or more generally homological algebra.

Limitations on Venn diagrams

Why do Venn diagrams almost always show the intersections of three sets and not more? Can Venn diagrams be generalized to show all intersections of more sets?

That depends on the rules you give yourself for generalization. If you require that your diagram consist of circles, then three is the limit. As John Venn put it in the original paper on Venn diagrams [1]

Beyond three terms circles fail us, since we cannot draw a fourth circle which shall intersect three others in the way required.

But Mr. Venn noted that you could create what we now call a Venn diagram using four ellipses and included the following illustration.

Venn diagram with four ellipses by John Venn

(It’s confusing that there is an X inside the diagram. Venn meant that to be an asterisk and not the same symbol as the X outside. He says in the paper “Thus the one which is asterisked is instantly seen to be ‘X that is Y and Z, but is not W’.” Maybe someone else, like the publisher, drew the diagram for him.)

So the answer to whether, or how far, it is possible to generalize the classic Venn diagram depends on permissible generalizations of a circle. If you replace circles with arbitrary closed curves then Venn diagrams exist for all orders. If you demand the curves have some sort of symmetry, there are fewer options. It’s possible to make a Venn diagram from five ellipses, and that may be the limit for ellipses.

A Venn diagram is a visualization device, and so an important question is what is the limit for the use of Venn diagrams as an effective visualization technique. This is an aesthetic / pedagogical question, and not one with an objective answer, but in my mind the answer is four. Venn’s diagram made from four ellipses is practical for visualization, though it takes more effort to understand than the typical three-circle diagram.

Although my upper bound of four is admittedly subjective, it may be possible to make it objective post hoc. A Venn diagram made from n curves divides the plane into 2n regions [2]. In order to use more than four curves, you either have to gerrymander the curves or else tolerate some regions being much smaller than others. The former makes the diagram hard to understand, and he latter makes it hard to label the regions.

I suspect that if you make precise what it means for a curve to be simple [3], such as using ellipses or convex symmetric curves, and specify a maximum ratio between the largest and smallest bounded regions, then four curves will be the maximum.

Update: Here are a couple useful references.

[1] John Venn. On the Diagrammatic and Mechanical Representation of Propositions and Reasonings. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science. July 1880.

[2] This includes the outside of the diagram representing the empty set. The diagram shows the intersection of 0, 1, 2, …, n sets, and the intersection of no sets is the empty set. This last statement might seem like an arbitrary definition, but it can be justified using category theory.

[3] Simple in the colloquial sense, which is more restrictive than the technical mathematical sense of a simple curve.

An unusual introduction to manifolds

Here is an introduction to manifolds (PDF, 23 MB) unlike any I’ve seen before. These notes by Brian Beckman devote a substantial amount of time to thinking about the problem of describing a location on a manifold, including an unexpected diversion into What3Words.

The notes are in the form of a Mathematica notebook. The link above is to a PDF rendering of the notebook. You can find the notebook itself here.

Here’s an excerpt from the abstract.

This tutorial offers a bridge between the abstract mathematics of manifolds and computational practice. Computational practice means writing simulation and control software in terms of matrices and vectors. We offer elementary examples fully explained and illustrated at length. …

Contemporary books and papers can be challenging. Many are written in highly abstract mathematical style (proofs without examples), with more generality than needed for applications (unphysical topologies), and with unfamiliar notation. If you didn’t learn fiber bundles, exterior calculus, and Lie theory in school and you want to catch up fast, this tutorial might be interesting to you. You’ll be equipped to read papers without choking on things you’ve never heard of, and get through to the juicy bits where we learn from the heavy math how to do calculations with matrices and vectors.

Adding tubes to knots

Several months ago I wrote a blog post about Lissajous curves and knots that included the image below.

Lissajous knot

Here’s an improved version of the same knot.

Lissajous knot plotted in Mathematica with Tube function

The original image was like tying the knot in thread. The new image is like tying it in rope, which makes it easier to see. The key was to use Mathematica’s Tube function to thicken the curve and to see the crossings. I also removed the bounding box in the image.

This was the original code:

    x[t_] := Cos[3 t]
    y[t_] := Cos[4 t + Sqrt[2]]
    z[t_] := Cos[5 t + Sqrt[3]]
   
    ParametricPlot3D[{x[t], y[t], z[t]}, {t, 0, 2 Pi}]

The new code keeps the definitions of x, y, and z but creates a plot using Tube.

    
    Graphics3D[
        Tube[
             Table[{x[i], y[i], z[i]}, {i, 0, 2 Pi, 0.01}], 
             0.05
        ], 
        Boxed -> False
    ]

I went back and changed out the plots in my post Total curvature of a knot with plots similar to the one above.

A connected topology for the integers

You can define a topology on the positive integers by choosing as an open basis sets the series of the form an + b where a and b are relatively prime positive integers. Solomon Golumb defines this topology in [1] and proves that it is connected.

But that paper doesn’t prove that proposed basis really is a basis. That is implicitly an exercise for the reader, and we carry out that exercise here. The proof will say that certain things can be computed, and we’ll go a step further an actually compute them with Python.

Proof

To prove that these sets form the basis of a topology, we need to establish two things.

  1. Every point is contained in some basis element.
  2. The intersection of basis elements is another basis element or empty.

The first is easy. For any positive number x, the sequence (x + 1)n + x contains x, and x + 1 is relatively prime to x. Any other number relatively prime to x would work just as well.

The second is more interesting. Consider two sequences: an + b and cm + d. A number x in the intersection satisfies the pair of equations

x = b mod a
x = d mod c.

If a and c are relatively prime, the Chinese Remainder Theorem says that the equation has a solution y, and the solutions are unique mod ac. So the intersection of the two series is acn + y.

If a and c are not relatively prime, let r be their greatest common divisor. Then the intersection of an + b and cm + d is another sequence of the same form if b and d are congruent mod r and empty otherwise.

Separating points

A topological space is connected if it is not the union of two disjoint open sets. An easy way to make a space connected is to not have any disjoint open sets. But Golumb’s topology has plenty of disjoint open sets.

In fact, his topology is Hausdorff, meaning that any two points can be separated by disjoint open sets. The proof is simple. Take two points s and t. Let p be any prime bigger than both s and t, and consider the two sets pn + s and pn + t.

Python calculation

Our proof split into two cases: when the strides of the two series are relatively prime and when they are not.

Relatively prime strides

To get concrete, suppose we have the sequences

2, 9, 16, 23, …

and

3, 14, 25, 35, …

or in other words 7n + 2 and 11n + 3. According to the theory above, these sequences intersect because 7 and 11 are relatively prime, and the intersection should be of the form 77n + y, and we should be able to compute y from the Chinese Remainder Theorem. We will use the function crt from SymPy. (See another example of using this function here.)

    >>> from sympy.ntheory.modular import crt
    >>> crt([7,11], [2, 3], symmetric=False)
    >>> (58, 77)

This reports that y = 58.

Now let’s verify that the intersection of our two series looks like 77n + 58.

    >>> A = set(2+7*n for n in range(100))
    >>> B = set(3+11*n for n in range(100))
    >>> sorted(A.intersection(B))
    [58, 135, 212, 289, 366, 443, 520, 597, 674]

Strides with a common factor: empty intersection

Now for the case where a and c have greatest common divisor r > 1. Consider, for example,

1, 16, 31, 46, …

and

2, 14, 26, 38, …

The sequences 15n + 1 and 12m + 2 do not intersect. The greatest common divisor of 15 and 12 is 3. Numbers in the first series are congruent to 1 mod 3 and numbers in the second series are congruent to 2 mod 3, so no number can be in both. If we didn’t realize this, we could call

    crt([15,12], [1, 2], symmetric=False)

and see that it returns None.

Strides with a common factor: finding the intersection

But now let’s look at

1, 16, 31, 46, …

and

13, 25, 37, 49, …

Now both sequences are congruent to 1 mod 3, so it’s at least feasible that the two series may intersect. And in fact they do.

    >>> crt([15,12], [1, 13], symmetric=False)
    (1, 60)

This says that the set of solutions are all congruent to 1 mod 60. Now 1 itself is not a solution, but 61 is, and so is 60n + 1 for all positive integers n. We can illustrate this as before by computing the intersection of the two series.

    >>> A = set(1+15*n for n in range(30))
    >>> B = set(13+12*n for n in range(30))
    >>> sorted(A.intersection(B))
    [61, 121, 181, 241, 301, 361]

Related posts

[1] Solomon W. Golomb. A Connected Topology for the Integers. The American Mathematical Monthly , Oct., 1959, pp. 663-665

Total curvature of a knot

Tie a knot in a rope and join the ends together. At each point in the rope, compute the curvature, i.e. how much the rope bends, and integrate this over the length of the rope. The Fary-Milnor theorem says the result must be greater than 4π. This post will illustrate this theorem by computing numerically integrating the curvature of a knot.

If p and q are relatively prime integers, then the following equations parameterize a knot.

x(t) = cos(pt) (cos(qt) + 2)
y(t) = sin(pt) (cos(qt) + 2)
z(t) = -sin(qt)

This is in fact a torus knot because the curve stays on the surface of a torus (doughnut) [1]. We can get a trefoil knot, for example, by setting p = 2 and q = 3.

Trefoil knot

We’ll use Mathematica to compute the total curvature of this knot and other examples. First the parameterization:

    x[t_, p_, q_] := Cos[p t] (Cos[q t] + 2)
    y[t_, p_, q_] := Sin[p t] (Cos[q t] + 2) 
    z[t_, p_, q_] := -Sin[q t]

We can plot torus knots as follows.

    k[t_, p_, q_] := { 
        x[t, p, q], 
        y[t, p, q], 
        z[t, p, q]
    }
    Graphics3D[Tube[Table[k[i, p, q], {i, 0, 2 Pi, 0.01}], 
        0.08], Boxed -> False]

Here’s a more complicated knot with p = 3 and q = 7.

(3, 7) knot

Before we can integrate the curvature with respect to arc length, we need an expression for an element of arc length as a function of the parameter t.

    ds[t_, p_, q_] :=  Sqrt[ 
        D[x[t, p, q], t]^2 + 
        D[y[t, p, q], t]^2 + 
        D[z[t, p, q], t]^2
    ]

Now we can compute the total curvature.

    total[p_, q_] := NIntegrate[
        ArcCurvature[k[t, p, q], t] ds[t, p, q], 
        {t, 0, 2 Pi}
    ]

We can use this to find that the total curvature of the (2,3) torus knot, the trefoil, is 17.8224, whereas 4π is 12.5664. So the Fary-Milnor theorem holds.

Let’s do one more example, this time a (1,4) knot.

unknot

You can see that this is not actually knot. This doesn’t contradict what we said above because 1 divides 4. When p or q equal 1, we get an unknot.

When we compute its total curvature we get 24.2737, more than 4π. The Fary-Milnor theorem doesn’t say that total curvature in excess of 4π is a sufficient condition for a loop to be knotted; it says it’s necessary. Total curvature less than 4π proves that something isn’t a knot, but curvature greater than 4π doesn’t prove anything.

More on curvature and knots

[1] If you change out the 2s in the parameterization with a larger number, it’s easier to see from the graphs that the curves are on a torus. For example, if we plot the (3,7) knot again, replacing 2’s in the definition of x(t) and y(t) with 5’s, we can kinda see a torus inside the knot.

(3, 7) knot torus

Stone-Weierstrass on a disk

A couple weeks ago I wrote about a sort of paradox, that Weierstrass’ approximation theorem could seem to contradict Morera’s theorem. Weierstrass says that the uniform limit of polynomials can be an arbitrary continuous function, and so may have sharp creases. But Morera’s theorem says that the uniform limit of polynomials is analytic and thus very smooth.

Both are true, but they involve limits in different spaces. Weierstrass’ theorem applies to convergence over a compact interval of the real line, and Morera’s theorem applies to convergence over compact sets in the complex plane. The uniform limit of polynomials is better behaved when it has to be uniform in two dimensions rather than just one.

This post is a sort of variation on the post mentioned above, again comparing convergence over the real line and convergence in the complex plane, this time looking at what happens when you conjugate variables [1].

There’s an abstract version of Weierstrass’ theorem due to Marshall Stone known as the Stone-Weierstrass theorem. It generalizes the real interval of Weierstrass’ theorem to any compact Hausdorff space [2]. The compact Hausdorff space we care about for this post is the unit disk in the complex plane.

There are two versions of the Stone-Weierstrass theorem, one for real-valued functions and one for complex-valued functions. I’ll state the theorems below for those who are interested, but my primary interest here is a corollary to the complex Stone-Weierstrass theorem. It says that any continuous complex-valued function on the unit disk can be approximated as closely as you like with polynomials in z and the conjugate of z with complex coefficients, i.e. by polynomials of the form

p(z, \bar{z})

When you throw in conjugates, things change a lot. The uniform limit of polynomials in z alone must be analytic, very well behaved. But the uniform limit of polynomials in z and z conjugate is merely continuous. It can have all kinds of sharp edges etc.

Conjugation opens up a lot of new possibilities, for better or for worse. As I wrote about here, an analytic function can only do two things to a tiny disk:  stretch it or rotate it. It cannot flip it over, as conjugation does.

By adding or subtracting a variable and its conjugate, you can separate out the real and imaginary parts. The the parts are no longer inextricably linked and this allows much more general functions. The magic of complex analysis, the set of theorems that seem too good to be true, depends on the real and imaginary parts being tightly coupled.

***

Now for those who are interested, the statement of the Stone-Weierstrass theorems.

Let X be a compact Hausdorff space. The set of real or complex valued functions on X forms an algebra. That is, it is closed under taking linear combinations and products of its elements.

The real Stone-Weierstrass theorem says a subalgebra of the continuous real-valued functions on X is dense if it contains a non-zero constant function and if it separates points. The latter condition means that for any two points, you can find functions in the subalgebra that take on different values on the two points.

If we take the interval [0, 1] as our Hausdorff space, the Stone-Weierstrass theorem says that the subalgebra of polynomials is dense.

The complex Stone-Weierstrass theorem says a subalgebra of he continuous complex-valued functions on X is dense if it contains a non-zero constant function, separates points, and is closed under taking conjugates. The statement above about polynomials in z and z conjugate follows immediately.

***

[1] For a complex variable z of the form a + bi where a and b are real numbers and i is the imaginary unit, the conjugate is abi.

[2] A Hausdorff space is a general topological space. All it requires is that for any two distinct points in the space, you can find disjoint open sets that each contain one of the points.

In many cases, a Hausdorff space is the most general setting where you can work without running into difficulties, especially if it is compact. You can often prove something under much stronger assumptions, then reuse the proof to prove the same result in a (compact) Hausdorff space.

See this diagram of 34 kinds of topological spaces, moving from strongest assumptions at the top to weakest at the bottom. Hausdorff is almost on bottom. The only thing weaker on that diagram is T1, also called a Fréchet space.

In a T1 space, you can separate points, but not simultaneously. That is, given two points p1 and p2, you can find an open set U1 around p1 that does not contain p2, and you can find an open set U2 around p2 that does not contain p1. But you cannot necessarily find these sets in such a way that U1 and U2 are disjoint. If you could always choose these open sets to be distinct, you’d have a Hausdorff space.

Here’s an example of a space that is T1 but not Hausdorff. Let X be an infinite set with the cofinite topology. That is, open sets are those sets whose complements are finite. The complement of p1 is an open set containing p2 and vice versa. But you cannot find disjoint open sets separating two points because there are no disjoint open sets. The intersection of any pair of open sets must contain an infinite number of points.

Putting topological data analysis in context

I got a review copy of The Mathematics of Data recently. Five of the six chapters are relatively conventional, a mixture of topics in numerical linear algebra, optimization, and probability. The final chapter, written by Robert Ghrist, is entitled Homological Algebra and Data. Those who grew up with Sesame Street may recall the song “Which one of these, is not like the other …”

When I first heard of topological data analysis (TDA), I was excited about the possibility of putting some beautiful mathematics to practical application. But it was hard for me to put TDA in context. How do you get actionable information out of it? If you find a seven-dimensional doughnut hiding in your data, that’s very interesting, but what do you do with that information?

Robert’s chapter in the book I’m reviewing has a nice introductory paragraph that helps put TDA in context. The section heading for the paragraph is “When is Homology Useful?”

Homological methods are, almost by definition, robust, relying on neither precise coordinates nor careful estimates for efficiency. As such, they are most useful in settings where geometric precision fails. With great robustness comes both great flexibility and great weakness. Topological data analysis is more fundamental than revolutionary: such techniques are not intended to supplant analytic, probabilistic, or spectral techniques. They can however reveal a deeper basis for why some data sets and systems behave the way they do. It is unwise to wield topological techniques in isolation, assuming that the weapons of unfamiliar “higher” mathematics are clad with incorruptible silver.

Robert’s background was in engineering and more conventional applied mathematics before he turned to applications of topology, and so he brings a broader perspective to TDA than someone trained in topology looking for ways to make topology useful. He also has a decade more experience applying TDA than when I interviewed him here. I’m looking forward to reading his new chapter carefully.

As I wrote about the other day, apparently the US Army believes that topological data analysis can be useful, presumably in combination with more quantitative methods. [1] More generally, it seems the Army is interested in mathematical models that are complementary to traditional models, models that are robust and flexible. The quote above cautions that with robustness and flexibility comes weakness, though ideally weakness that is offset by other models.

More on topological data analysis

[1] Algebraic topology is quantitative in one sense and qualitative in another. It aims to describe qualitative properties using algebraic invariants. It’s quantitative in the sense of computing homology groups, but it’s not as directly quantitative as more traditional mathematical models. It’s quantitative at a higher level of abstraction.

A genius can admit finding things difficult

Karen Uhlenbeck

Karen Uhlenbeck has just received the Abel Prize. Many say that the Fields Medal is the analog of the Nobel Prize for mathematics, but others say that the Abel Prize is a better analog. The Abel prize is a recognition of achievement over a career whereas the Fields Medal is only awarded for work done before age 40.

I had a course from Karen Uhlenbeck in graduate school. She was obviously brilliant, but what I remember most from the class was her candor about things she didn’t understand. She was already famous at the time, having won a MacArthur genius award and other honors, so she didn’t have to prove herself.

When she presented the definition of a manifold, she made an offhand comment that it took her a month to really understand that definition when she was a student. She obviously understands manifolds now, having spent her career working with them.

I found her comment about extremely encouraging. It shows it’s possible to become an expert in something you don’t immediately grasp, even if it takes you weeks to grok its most fundamental concept.

Uhlenbeck wasn’t just candid about things she found difficult in the past. She was also candid about things she found difficult at the time. She would grumble in the middle of a lecture things like “I can never remember this.” She was not a polished lecturer—far from it—but she was inspiring.

More posts on UT math profs

Photo of Karen Uhlenbeck in 1982 by George Bergman [GFDL], via Wikimedia Commons