Fredholm index

The previous post on kernels and cokernels mentioned that for a linear operator T: V → W, the index of T is defined as the difference between the dimension of its kernel and the dimension of its cokernel:

index T = dim ker T − dim coker T.

The index was first called the Fredholm index, because it came up in Fredholm’s investigation of integral equations. (More on this work in the next post.)

Robustness

The index of a linear operator is robust in the following sense. If V and W are Banach spaces and T: V → W is a continuous linear operator, then there is an open set around T in the space of continuous operators from V to W on which the index is constant. In other words, small changes to T don’t change its index.

Small changes to T may alter the dimension of the kernel or the dimension of the cokernel, but they don’t alter their difference.
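In finite dimensions the situation is extreme: by the rank–nullity theorem the index of any map T: V → W equals dim V − dim W, no matter what T is, so no perturbation whatsoever can change it. Here’s a quick sketch with NumPy (the matrices are arbitrary examples):

```python
import numpy as np

def index(T):
    """Fredholm index of a linear map T: R^n -> R^m, as dim ker - dim coker."""
    m, n = T.shape
    r = np.linalg.matrix_rank(T)
    return (n - r) - (m - r)    # by rank-nullity, this is always n - m

# A rank-1 map from R^3 to R^2: dim ker = 2, dim coker = 1, index = 1
T = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# A small perturbation typically restores full rank, changing dim ker to 1
# and dim coker to 0, but leaving the index at 1.
rng = np.random.default_rng(42)
T_perturbed = T + 1e-6 * rng.standard_normal(T.shape)

print(index(T), index(T_perturbed))  # 1 1
```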

Relation to Fredholm alternative

The next post discusses the Fredholm alternative theorem. It says that if K is a compact linear operator on a Hilbert space and I is the identity operator, then the Fredholm index of I − K is zero. The post will explain how this relates to solving linear (integral) equations.

Analogy to Euler characteristic

We can make an exact sequence with the spaces V and W and the kernel and cokernel of T as follows:

0 → ker T → V → W → coker T → 0

All this means is that the image of one map is the kernel of the next.

We can take the alternating sum of the dimensions of the spaces in this sequence:

dim ker T − dim V + dim W − dim coker T.

If V and W have the same finite dimension, then this alternating sum equals the index of T.

The Euler characteristic is also an alternating sum. For a simplex, the Euler characteristic is defined by

V − E + F

where V is the number of vertices, E the number of edges, and F the number of faces. We can extend this to higher dimensions as the number of zero-dimensional objects (vertices), minus the number of one-dimensional objects (edges), plus the number of two-dimensional objects, minus the number of three-dimensional objects, etc.
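For instance, counting the faces of each dimension on the boundary of a tetrahedron (a quick sketch in Python) gives the Euler characteristic of a sphere:

```python
from itertools import combinations

# Boundary of a tetrahedron on vertices {0, 1, 2, 3}: every 1-, 2-, and
# 3-element subset of the vertices is a face (vertex, edge, triangle).
vertices = range(4)
V, E, F = (len(list(combinations(vertices, k))) for k in (1, 2, 3))

print(V - E + F)  # 4 - 6 + 4 = 2, the Euler characteristic of a sphere
```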

A more sophisticated definition of Euler characteristic is the alternating sum of the dimensions of cohomology spaces. These also form an exact sequence.

The Atiyah-Singer index theorem says that for elliptic operators on manifolds, two kinds of index are equal: the analytical index and the topological index. The analytical index is essentially the Fredholm index. The topological index is derived from topological information about the manifold.

This is analogous to the Gauss-Bonnet theorem that says you can find the Euler characteristic, a topological invariant, by integrating Gauss curvature, an analytic calculation.

Other posts in this series

This is the middle post in a series of three. The first was on kernels and cokernels, and the next is on the Fredholm alternative.

Linear KdV dispersion

The Korteweg–De Vries (KdV) equation

u_t - 6 u\, u_x + u_{xxx} = 0

is a nonlinear PDE used to model shallow water waves. The linear counterpart omits the nonlinear term in the middle.

u_t + u_{xxx} = 0

This variant is useful in itself, but also for understanding the nonlinear KdV equation.

Solitons

Solutions to the linear KdV equation spread out over time. The nonlinear term in the KdV equation counterbalances the dispersion term u_{xxx} so that solutions to the nonlinear PDE behave in some ways like linear solutions.

Solutions to a nonlinear equation don’t add, and yet they sorta add. Here’s a description from [1].

At the heart of these observations is the discovery that these nonlinear waves can interact strongly and then continue thereafter almost as if there had been no interaction at all. This persistence of the wave led Zabusky and Kruskal to coin the term ‘soliton’ (after photon, proton, etc.) to emphasize the particle-like character of these waves which retain their identities in a collision.

I added the emphasis on almost because many descriptions leave out this important qualifier.

Solution to linear KdV

There is a compact expression for the solution to the linear KdV equation if u, u_x, and u_{xx} all go to 0 as |x| → ∞. If u(x, 0) = f(x), then the solution is

u(x, t) = (3t)^{-1/3} \int_{-\infty}^\infty f(y) \text{ Ai}\!\left( \frac{x-y}{(3t)^{1/3}} \right) \,dy

Here Ai(z) is the Airy function. This function has come up several times here. For example, I’ve written about the Airy function in the context of triple factorials and in connection with Bessel functions.
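Here is a numerical sanity check of the formula, a sketch with Gaussian initial data f(y) = exp(−y²) chosen for convenience: evaluate the integral by quadrature, then verify with finite differences that the result satisfies u_t + u_{xxx} = 0 and returns to f as t → 0.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import airy

# Initial condition; a Gaussian decays fast enough for the formula to apply
f = lambda y: np.exp(-y**2)

def u(x, t):
    """Airy-kernel solution of u_t + u_xxx = 0 with u(x, 0) = f(x)."""
    s = (3*t)**(1/3)
    val, _ = quad(lambda y: f(y) * airy((x - y)/s)[0], -10, 10, limit=400)
    return val / s   # airy(z)[0] is Ai(z)

# Check 1: as t -> 0 the kernel approaches a delta function, so u(x, t) -> f(x)
print(abs(u(1.0, 1e-3) - f(1.0)))   # small

# Check 2: finite differences confirm u_t + u_xxx = 0 at (x, t) = (1, 1)
ht, hx = 0.005, 0.025
u_t = (u(1.0, 1 + ht) - u(1.0, 1 - ht)) / (2*ht)
u_xxx = (u(1 + 2*hx, 1) - 2*u(1 + hx, 1) + 2*u(1 - hx, 1) - u(1 - 2*hx, 1)) / (2*hx**3)
print(abs(u_t + u_xxx))             # small
```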

Aside on third order equations

Third order differential equations are uncommon in applications, and third order linear ODEs are especially rare. When third order equations do come up, they are usually nonlinear PDEs, like the KdV equation. The linear KdV equation is an example of a linear third order PDE that arises in applications.

 

[1] P. G. Drazin and R. S. Johnson. Solitons: an introduction. Cambridge University Press. 1989.

Closed-form minimal surface solutions

Differential equations, especially nonlinear differential equations, rarely have a closed-form solution, but sometimes it happens. As I wrote about a year ago

It is unusual for a nonlinear PDE to have a closed-form solution, but it is not unheard of. There are numerous examples of nonlinear PDEs, equations with important physical applications, that have closed-form solutions.

This post will present some closed-form solutions of the minimal surface equation

(1 + u_x^2)\, u_{yy} - 2 u_x u_y u_{xy} + (1 + u_y^2)\, u_{xx} = 0

One trivial class of closed-form solutions is the planes

u(x, y) = ax + by + c.

There are three non-trivial classes of solutions as far as I know. Jean Baptiste Marie Meusnier discovered two of these in 1776, namely the helicoid

u(x, y) = tan⁻¹(y/x)

and the catenoid

u(x, y) = cosh⁻¹(a √(x² + y²)) / a

Heinrich Scherk discovered another closed form solution in 1830:

u(x, y) = log( cos(ay) / cos(ax) ) / a


The surface formed by the graph of the solution is known as Scherk’s surface. You could imagine that if the edges of this surface were made of wire and the wire were dipped in soapy water, the resulting soap film would take the shape of Scherk’s surface.

Note that the closed-form solutions satisfy the minimal surface PDE itself, but do not satisfy any given boundary conditions, unless the boundary values you’d like to specify happen to be exactly the values this function has.
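These solutions are easy to verify symbolically. Here’s a sketch using SymPy that substitutes the helicoid and Scherk’s surface into the left-hand side of the minimal surface equation (the catenoid can be checked the same way):

```python
import sympy as sp

x, y = sp.symbols('x y')
a = sp.symbols('a', positive=True)

def minimal_surface_residual(u):
    """Left side of the minimal surface equation; zero iff u is a solution."""
    ux, uy = sp.diff(u, x), sp.diff(u, y)
    return ((1 + ux**2) * sp.diff(u, y, 2)
            - 2 * ux * uy * sp.diff(u, x, y)
            + (1 + uy**2) * sp.diff(u, x, 2))

helicoid = sp.atan(y/x)
scherk = sp.log(sp.cos(a*y)/sp.cos(a*x)) / a

print(sp.simplify(minimal_surface_residual(helicoid)))  # 0
print(sp.simplify(minimal_surface_residual(scherk)))    # 0
```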

Fundamental solution

The “fundamental solution” to a PDE solves the equation with the right-hand side set to δ. Intuitively, you can think of the delta function as striking something with a hammer in order to see how it rings.

An aside on rigor

A novice might be OK with the explanation above.

A sophomore might rightly object that this doesn’t make sense. This delta “function” isn’t even a function. How can you set one side of a differential equation to something that isn’t even a function?

An expert would understand that calling δ a function is just a convenient figure of speech for a rigorous construction using distribution theory. You can find a high-level introduction here.

Bell curve meme.

As with many of the bell curve memes, the horizontal axis is really experience rather than intelligence. “Whatever you say” could be an intelligent response to someone talking about things they understand but you don’t. And objecting that something doesn’t make sense (as stated) is an intelligent response when you’re exposed to a metaphor that you didn’t realize was a metaphor. A mature response is to appreciate the value of rigor and the value of metaphor.

Why fundamental

The reason a fundamental solution is called “fundamental” is that once you have the fundamental solution, you can find more solutions by convolving the right-hand side with it.

So if L is a linear differential operator and F is a fundamental solution, i.e.

L F = δ

then the convolution f = F ∗ h is a solution to

L f = h.
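As a one-dimensional illustration (a sketch, not from the post): F(x) = |x|/2 is a fundamental solution of L = d²/dx², so convolving it with a smooth right-hand side h produces a function whose second derivative recovers h.

```python
import numpy as np

# F(x) = |x|/2 is a fundamental solution of L = d^2/dx^2,
# so f = F * h should satisfy f'' = h for smooth h.
x = np.linspace(-10, 10, 2001)      # odd length keeps the kernel centered
dx = x[1] - x[0]
h = np.exp(-x**2)                   # right-hand side, concentrated near 0
F = np.abs(x) / 2                   # fundamental solution of d^2/dx^2
f = np.convolve(h, F, mode="same") * dx   # discrete approximation of F * h

# Away from the edges of the grid, f'' recovers h
f2 = np.gradient(np.gradient(f, dx), dx)
interior = np.abs(x) < 6
print(np.max(np.abs(f2[interior] - h[interior])))   # small
```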

Poisson’s equation

The fundamental solution to Poisson’s equation

∇² f = h

depends on dimension.

For dimension d > 2 the solution is proportional to r^{2−d}, where r is the radial distance to the origin.

For dimension d = 2 the solution is proportional to log r.

This is an example of the phenomenon alluded to in the article titled A Zeroth Power Is Often a Logarithm Yearning to Be Free by Sanjoy Mahajan. If we naively stuck d = 2 into the fundamental solution r^{2−d} for higher dimensions, we’d get r⁰ = 1, which doesn’t work. But if we’d read Mahajan’s article, we might guess log r and then verify that it works.
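Both cases can be checked symbolically. For a radial function u(r) in d dimensions the Laplacian reduces to u″ + (d − 1)u′/r; the SymPy sketch below shows that r^{2−d} is harmonic away from the origin in every dimension, while log r is harmonic only when d = 2:

```python
import sympy as sp

r, d = sp.symbols('r d', positive=True)

def radial_laplacian(u):
    """Laplacian of a radial function u(r) in d dimensions."""
    return sp.diff(u, r, 2) + (d - 1)/r * sp.diff(u, r)

# r^(2-d) is harmonic away from the origin for every dimension d
print(sp.simplify(radial_laplacian(r**(2 - d))))   # 0

# log r is harmonic away from the origin only when d = 2
print(sp.simplify(radial_laplacian(sp.log(r))))    # (d - 2)/r**2
```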

I give a couple more examples of “logarithms yearning to be free” in this post.

 

Delay differential equations

Sometimes the future state of a system depends not only on the current state (position, velocity, acceleration, etc.) but also on the previous state. Equations for modeling such systems are known as delay differential equations (DDEs), difference differential equations, retarded equations, etc. In a system with hysteresis, it matters not only where you are but how you got there.

The most basic theory of delay differential equations is fairly simple. Suppose you have an equation like the following.

u′(t) + u(t − ω) = f(t).

To uniquely determine a solution, we’d need an initial condition, and we’d need more than the value of u(0). We’d need a function g(t) that gives the value of u on the entire interval [0, ω].

So we initially have the value of u over [0, ω]. Next, over the interval [ω, 2ω] the value of u(t − ω) is known: we can replace that term in the DDE with g(t − ω). After we’ve solved the equation over [ω, 2ω], we can use the solution to solve the equation over [2ω, 3ω], and so on. This process is called the method of steps.
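Here is a sketch of the method of steps in NumPy for the toy problem u′(t) = −u(t − 1), i.e. ω = 1 and f = 0, with constant history g(t) = 1 on [0, 1]. On each interval the delayed term is known from the previous interval, so each step is just integrating a known function:

```python
import numpy as np

# Method of steps for u'(t) = -u(t - 1) with history u(t) = 1 on [0, 1].
# On each interval [k, k+1] the delayed term u(t - 1) is already known,
# so the DDE reduces to integrating a known function (trapezoid rule here).
omega, n = 1.0, 1000
h = omega / n
ts = [np.linspace(0, omega, n + 1)]
us = [np.ones(n + 1)]                 # history g(t) = 1 on [0, omega]

for _ in range(3):                    # advance over [1,2], [2,3], [3,4]
    rhs = -us[-1]                     # u'(t) = -u(t - omega), known values
    u_new = np.empty(n + 1)
    u_new[0] = us[-1][-1]             # continuity at the interval boundary
    u_new[1:] = u_new[0] + np.cumsum((rhs[1:] + rhs[:-1]) / 2 * h)
    ts.append(ts[-1] + omega)
    us.append(u_new)

# Exact values at t = 2, 3, 4 are 0, -1/2, -1/6
print(us[1][-1], us[2][-1], us[3][-1])
```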

Although you can solve DDEs using the method of steps, this might not be the best approach. It might be more computationally efficient, or theoretically convenient, to use another method, such as Laplace transforms. The method of steps might convince you that a solution exists, but it might not, for example, be the best way to determine the limiting behavior of solutions.

 

Laplace transform inversion theorems

Laplace transforms, as presented in a typical differential equations course, are not very useful. Laplace transforms are useful, just not as usually presented.

The use of Laplace transforms is typically presented as follows:

  1. Transform your differential equation into an algebraic equation.
  2. Solve the algebraic equation.
  3. Invert the transform to obtain your solution.

This is correct, but step 3 is typically presented in a misleading way. For pedagogical reasons, students are only given problems for which the last step is easy. They’re given a table with functions on the left and transforms on the right, and they compute an inverse transform by recognizing the result of step 2 in the right column.

Because of the limitations listed above, Laplace transforms, as presented in an introductory course, can only solve problems that could just as easily be solved by other methods presented in the same course.

What good is it, in an undergraduate classroom setting, if you reduce a problem to inverting a Laplace transform but the inverse problem doesn’t have a simple solution?

Of course in practice, rather than in a classroom, it might be very useful to reduce a complicated problem to the problem of inverting a Laplace transform. The latter problem may not be trivial, but it’s a standard problem. You could ask someone to solve the inversion problem who does not understand where the transform of the solution came from.

Laplace inversion theorem

The most well-known Laplace inversion theorem states that if f is a function and F is the Laplace transform of f, then you can recover f from F via the following integral.

f(x) = \frac{1}{2\pi i} \int_{c -\infty i}^{c + \infty i} \exp(sx) \, F(s)\, ds

It’s understandable that you wouldn’t want to present this to most differential equation students. It’s not even clear what the right hand side means, much less how you would calculate it. As for what it means, it says you can calculate the integral along any line parallel to the imaginary axis. In practice, the integral may be evaluated using contour integration, in particular using the so-called Bromwich contour.
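In practice the contour integral is usually evaluated numerically. The sketch below uses mpmath’s invertlaplace with Talbot’s method, which deforms the Bromwich contour, to recover f(t) = t e^{−t} from its transform 1/(s + 1)², an illustrative transform chosen because its only singularity lies on the negative real axis, where Talbot’s method works well:

```python
import mpmath as mp

# Recover f(t) = t * exp(-t) from its Laplace transform F(s) = 1/(s + 1)^2
# using Talbot's method, which evaluates the Bromwich integral numerically
# along a deformed contour.
F = lambda s: 1/(s + 1)**2

for t in (0.5, 1.0, 2.0):
    print(t, mp.invertlaplace(F, t, method='talbot'), t*mp.exp(-t))
```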

It might be difficult to invert the Laplace transform, either numerically or analytically, but at least this is a separate problem from whatever led to this. Maybe the original problem was more difficult, such as a complicated delay differential equation.

Post-Widder theorem

There is a lesser-known theorem for inverting a Laplace transform, the Post-Widder theorem. It says

f(x) = \lim_{n\to\infty} \frac{(-1)^n}{n!} \left( \frac{n}{x} \right)^{n+1} F^{(n)}\left( \frac{n}{x} \right)

where F^{(n)} is the nth derivative of F. This may not be an improvement—it might be much worse than evaluating the integral above—but it’s an option. It doesn’t involve functions of a complex variable, so in that sense it is more elementary [1].
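The theorem is easy to experiment with symbolically. Taking F(s) = 1/(s + 1), the transform of e^{−t} (an illustrative choice), SymPy computes F^{(n)} exactly, and the Post-Widder approximations converge, slowly, to e^{−x}:

```python
import sympy as sp

s = sp.symbols('s', positive=True)
F = 1/(s + 1)                     # Laplace transform of f(t) = exp(-t)

def post_widder(F, x, n):
    """nth Post-Widder approximation to the inverse transform at x."""
    x = sp.Rational(x)            # exact arithmetic throughout
    Fn = sp.diff(F, s, n)
    val = (-1)**n / sp.factorial(n) * (n/x)**(n+1) * Fn.subs(s, n/x)
    return float(val)

for n in (1, 10, 100):
    print(n, post_widder(F, 1, n))   # approaches exp(-1) = 0.3679... as n grows
```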

Related posts

[1] The use of the word elementary in mathematics can be puzzling. Particularly in the context of number theory, elementary essentially means “without using complex variables.” An elementary proof may be far more difficult to follow than a proof using complex variables.

Separable functions in different contexts

I was skimming through the book Mathematical Reflections [1] recently. The authors discuss a set of generalizations [2] of the Star of David theorem from combinatorics, which says

\gcd\left(\binom{n - 1}{r - 1}, \binom{n}{r+1}, \binom{n+1}{r}\right) = \gcd\left(\binom{n-1}{r}, \binom{n+1}{r+1}, \binom{n}{r-1}\right)

The theorem is so named because if you draw a Star of David by connecting points in Pascal’s triangle, the binomial coefficients on each side of the equation correspond to the vertices of one of the star’s two overlapping triangles.

diagram illustrating the Star of David Theorem

One such theorem was the following.

\binom{n - \ell}{r - \ell} \binom{n - k}{r} \binom{n - k - \ell}{r - k} = \binom{n-k}{r-k} \binom{n - \ell}{r} \binom{n - k - \ell}{r - \ell}

This theorem also has a geometric interpretation, connecting vertices within Pascal’s triangle.

The authors point out that the binomial coefficient is a separable function of three variables, and that their generalized Star of David theorem is true for any separable function of three variables.

The binomial coefficient C(n, k) is a function of two variables, but you can think of it as a function of three variables: n, k, and n − k. That is

{n \choose k} = f(n) \, g(k) \, g(n-k)

where f(n) = n! and g(k) = 1/k!.
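Given this factorization, each side of the generalized identity above multiplies out to the same f and g values, so the two sides must be equal. A quick numerical spot-check with Python’s math.comb, over parameters chosen so all six binomial coefficients are defined:

```python
from math import comb

def lhs(n, r, k, l):
    return comb(n - l, r - l) * comb(n - k, r) * comb(n - k - l, r - k)

def rhs(n, r, k, l):
    return comb(n - k, r - k) * comb(n - l, r) * comb(n - k - l, r - l)

# Spot-check the identity over a grid of admissible parameters
checks = [(n, r, k, l)
          for n in range(6, 13) for r in range(1, 6)
          for k in range(r) for l in range(r)
          if r + k + l <= n]
print(all(lhs(*c) == rhs(*c) for c in checks))  # True
```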

I was surprised to see the term separable function outside of a PDE context. My graduate work was in partial differential equations, and so when I hear separable function my mind goes to separation of variables as a technique for solving PDEs.

Coincidentally, I was looking at separable coordinate systems recently. These are coordinate systems in which the Helmholtz equation can be solved by a separable function, i.e. coordinate systems in which the separation of variables technique will work. The Laplacian can take on very different forms in different coordinate systems, and if possible you’d like to choose a coordinate system in which a PDE you care about is separable.

Related posts

[1] Peter Hilton, Derek Holton, and Jean Pedersen. Mathematical Reflections. Springer, 1996.

[2] Hilton et al refer to a set of theorems as generalizations of the Star of David theorem, but these theorems are not strictly generalizations in the sense that the original theorem is clearly a special case of the generalized theorems. The theorems are related, and I imagine with more effort I could see how to prove the older theorem from the newer ones, but it’s not immediately obvious.

Closed-form solutions to nonlinear PDEs

The traditional approach to teaching differential equations is to present a collection of techniques for finding closed-form solutions to ordinary differential equations (ODEs). These techniques seem completely unrelated [1] and have arcane names such as integrating factors, exact equations, variation of parameters, etc.

Students may reasonably come away from an introductory course with the false impression that it is common for ODEs to have closed-form solutions, because it is common in the course.

My education reacted against this. We were told from the beginning that differential equations rarely have closed-form solutions and that therefore we wouldn’t waste time learning how to find such solutions. I didn’t learn the classical solution techniques until years later when I taught an ODE class as a postdoc.

I also came away with a false impression, the idea that differential equations almost never have closed-form solutions in practice, especially nonlinear equations, and above all partial differential equations (PDEs). This isn’t far from the truth, but it is an exaggeration.

I specialized in nonlinear PDEs in grad school, and I don’t recall ever seeing a closed-form solution. I heard rumors of a nonlinear PDE with a closed form solution, the KdV equation, but I saw this as the exception that proves the rule. It was the only nonlinear PDE of practical importance with a closed-form solution, or so I thought.

It is unusual for a nonlinear PDE to have a closed-form solution, but it is not unheard of. There are numerous examples of nonlinear PDEs, equations with important physical applications, that have closed-form solutions.

Yesterday I received a review copy of Analytical Methods for Solving Nonlinear Partial Differential Equations by Daniel Arrigo. If I had run across a book by that title as a grad student, it would have sounded as eldritch as a book on the biology of Bigfoot or the geography of Atlantis.

A few pages into the book there are nine exercises asking the reader to verify closed-form solutions to nonlinear PDEs:

  1. a nonlinear diffusion equation
  2. Fisher’s equation
  3. Fitzhugh-Nagumo equation
  4. Burgers’ equation
  5. Liouville’s equation
  6. Sine-Gordon equation
  7. Korteweg–De Vries (KdV) equation
  8. modified Korteweg–De Vries (mKdV) equation
  9. Boussinesq’s equation

These are not artificial examples crafted to have closed-form solutions. These are differential equations that were formulated to model physical phenomena such as groundwater flow, nerve impulse transmission, and acoustics.

It remains true that differential equations, and especially nonlinear PDEs, typically must be solved numerically in applications. But the number of nonlinear PDEs with closed-form solutions is not insignificant.

Related posts

[1] These techniques are not as haphazard as they seem. At a deeper level, they’re all about exploiting various forms of symmetry.

 

Blow up in finite time

A few years ago I wrote a post about approximating the solution to a differential equation even though the solution did not exist. You can ask a numerical method for a solution at a point past where the solution blows up to infinity, and it will dutifully give you a finite answer. The result is meaningless, but the method will report it anyway.

The more you can know about the solution to a differential equation before you attempt to solve it numerically the better. At a minimum, you’d like to know whether there even is a solution before you compute it. Unfortunately, a lot of theorems along these lines are local in nature: the theorem assures you that a solution exists in some interval, but doesn’t say how big that interval might be.

Here’s a nice theorem from [1] that tells you that a solution is going to blow up in finite time, and it even tells you what that time is.

The initial value problem

y′ = g(y)

with y(0) = y₀ and g(y) > 0 blows up at time T if and only if the integral

\int_{y_0}^\infty \frac{1}{g(t)} \, dt

converges to T.
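As a quick check on a standard example (not from [1]): y′ = y² with y(0) = 1 has solution y = 1/(1 − t), which blows up at T = 1, and the integral in the theorem gives exactly that:

```python
import numpy as np
from scipy.integrate import quad

# y' = y^2 with y(0) = 1 has solution y = 1/(1 - t), blowing up at T = 1.
# The theorem predicts the blow-up time as the integral of 1/g from y0 to infinity.
g = lambda y: y**2
y0 = 1.0
T, _ = quad(lambda y: 1/g(y), y0, np.inf)
print(T)  # 1.0
```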

Note that it is not necessary to first find a solution then see whether the solution blows up.

Note also that an upper (or lower) bound on the integral gives you an upper (or lower) bound on T. So the theorem is still useful if the integral is hard to evaluate.

This theorem applies only to autonomous differential equations, i.e. equations in which the right-hand side depends only on the solution y and not on the solution’s argument t. The differential equation alluded to at the top of the post is not autonomous, and so the theorem above does not apply. There are non-autonomous extensions of the theorem presented here (see, for example, [2]), but I do not know of one that would cover that particular equation.

[1] Duff Campbell and Jared Williams. Exploring finite-time blow-up. Pi Mu Epsilon Journal, Vol. 11, No. 8 (Spring 2003), pp. 423–428

[2] Jacob Hines. Exploring finite-time blow-up of separable differential equations. Pi Mu Epsilon Journal, Vol. 14, No. 9 (Fall 2018), pp. 565–572

When is a function of two variables separable?

Given a function f(x, y), how can you tell whether f can be factored into the product of a function g(x) of x alone and a function h(y) of y alone? Depending on how an expression for f is written, it may or may not be obvious whether f(x, y) can be separated into g(x) h(y).

There are several situations in which you might want to know whether a function is separable. For example, the ordinary differential equation

y′ = f(x, y)

can be solved easily when f(x, y) = g(x) h(y).

You might want to do something similar for a partial differential equation, using separation of variables, possibly choosing a coordinate system that allows the separation of variables trick to work.

Aside from applications to differential equations, you might want to know whether a polynomial in two variables can be factored into the product of polynomials in each variable separately.

In [1] David Scott gives a simple necessary condition for f to be separable:

f f_{xy} = f_x f_y

Here the subscripts indicate partial derivatives.

It’s easy to see this condition is necessary. Scott shows the condition is also sufficient under some mild technical assumptions.

As an example, determine the value of k such that the differential equation

y′ = 6xy² + 3y² − 4x + k

is separable.

Scott’s equation

f f_{xy} = f_x f_y

becomes

(6xy² + 3y² − 4x + k)(12y) = (6y² − 4)(12xy + 6y)

which holds if and only if k = −2.
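A SymPy sketch confirms the computation by imposing Scott’s condition and solving for k:

```python
import sympy as sp

x, y, k = sp.symbols('x y k')
f = 6*x*y**2 + 3*y**2 - 4*x + k

# Scott's condition: f * f_xy - f_x * f_y must vanish identically
condition = sp.expand(f * sp.diff(f, x, y) - sp.diff(f, x) * sp.diff(f, y))
print(condition)               # 12*k*y + 24*y, zero iff k = -2
print(sp.solve(condition, k))  # [-2]
```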

Related posts

[1] David Scott. When is an Ordinary Differential Equation Separable? The American Mathematical Monthly, Vol. 92, No. 6, pp. 422–423