The separation of variables technique for solving partial differential equations looks like a magic trick the first time you see it. The lecturer, or author if you’re more self-taught, makes an audacious assumption, like pulling a rabbit out of a hat, and it works.

For example, you might first see the heat equation

*u*_{t} = *c*² *u*_{xx}.

The professor asks you to assume the solution has the form

*u*(*x*, *t*) = *X*(*x*) *T*(*t*),

i.e. the solution can be separated into the product of a function of *x* alone and a function of *t* alone.
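For readers who want to see where the assumption leads, here is the standard calculation spelled out. Substituting *u* = *X*(*x*) *T*(*t*) into the heat equation gives

*X*(*x*) *T*′(*t*) = *c*² *X*″(*x*) *T*(*t*).

Dividing both sides by *c*² *X*(*x*) *T*(*t*) separates the variables:

*T*′(*t*) / (*c*² *T*(*t*)) = *X*″(*x*) / *X*(*x*).

The left side depends only on *t* and the right side only on *x*, so both sides must equal a constant, conventionally written −λ. The single PDE has become two ODEs: *T*′ = −λ*c*² *T* and *X*″ + λ*X* = 0.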

Following that you might see Laplace’s equation on a rectangle

*u*_{xx} + *u*_{yy} = 0

with the analogous assumption that

*u*(*x*, *y*) = *X*(*x*) *Y*(*y*),

i.e. the product of a function of *x* alone and a function of *y* alone.

There are several possible responses to this assumption.

1. Whatever you say, doc.
2. How can you assume that?
3. How do you know you’re not missing any possibilities?
4. What made someone think to try this?

As with many things, separation of variables causes the most consternation for moderately sophisticated students. The least sophisticated students are untroubled, and the most sophisticated students can supply their own justification (at least after the fact).

One response to question (2) is “Bear with me. I’ll show that this works.”

Another response would be “OK, how about assuming the solution is a *sum* of such functions? That’s a much larger space to search. And besides, we *are* going to take sums of such solutions in a few minutes.” One could argue from functional analysis or approximation theory that sums of separable functions are dense in a reasonable space of functions [1].

This is a solid explanation, but it’s kind of anachronistic: most students see separation of variables long before they see functional analysis or approximation theory. But it would be a satisfying response for someone who is seeing all this for the second time. Maybe they were exposed to separation of variables as an undergraduate and now they’re taking a graduate course in PDEs. In an undergraduate class a professor could do a little foreshadowing, giving the students a taste of approximation theory.
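One way to give that taste concretely is numerical. The sketch below (my own illustration, with an arbitrarily chosen test function) samples *f*(*x*, *y*) = sin(*x* + *y*) on a grid and uses the SVD to write the samples as a sum of separable, outer-product terms. Since sin(*x* + *y*) = sin *x* cos *y* + cos *x* sin *y*, two separable terms already reproduce it to machine precision.

```python
import numpy as np

# Sample f(x, y) = sin(x + y) on a grid. This function happens to be a
# sum of exactly two separable terms: sin(x)cos(y) + cos(x)sin(y).
x = np.linspace(0, np.pi, 50)
y = np.linspace(0, np.pi, 60)
F = np.sin(x[:, None] + y[None, :])   # F[i, j] = f(x_i, y_j)

# The SVD writes the sampled function as a sum of outer products,
# i.e. a sum of separable functions X_k(x) * Y_k(y).
U, s, Vt = np.linalg.svd(F)

# Keep only the two largest terms of that sum.
F2 = sum(s[k] * np.outer(U[:, k], Vt[k, :]) for k in range(2))

# The truncation error is essentially zero: two separable terms suffice.
print(np.max(np.abs(F - F2)))
```

For a function that isn’t exactly low rank, the same truncated sum gives the best separable approximation in the least-squares sense, which is the flavor of the approximation-theory argument above.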

Existence of solutions is easier to prove than uniqueness in this case because you can concretely construct a solution. This goes back to the “it works” justification. That argument deserves more respect than a sophomoric student might give it. Mathematics research is not nearly as deductive as mathematics education. You often have to make inspired guesses and then show that they work.

Addressing question (3) requires saying something about uniqueness. A professor could simply assert that there are uniqueness theorems that allow you to go from “I’ve found something that works” to “and so it must be the only thing that works.” Or one could sketch a uniqueness theorem. For example, you might apply a maximum principle to show that the difference between any two solutions is zero.
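As an alternative sketch, using the energy method rather than the maximum principle: suppose *u*₁ and *u*₂ both solve the heat equation on an interval with the same initial and boundary conditions. Then *w* = *u*₁ − *u*₂ solves the heat equation with zero initial and boundary data. Define the energy

*E*(*t*) = ∫ *w*(*x*, *t*)² *dx*.

Differentiating under the integral and using the equation,

*E*′(*t*) = 2 ∫ *w* *w*_{t} *dx* = 2*c*² ∫ *w* *w*_{xx} *dx* = −2*c*² ∫ (*w*_{x})² *dx* ≤ 0,

where the last step is integration by parts using the zero boundary values. Since *E*(0) = 0, and *E* is nonnegative and nonincreasing, *E*(*t*) = 0 for all *t*. Therefore *w* ≡ 0 and the two solutions coincide.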

Question (4) is in some sense the most interesting question. It’s not a mathematical question *per se* but a question about how people do mathematics. I don’t know what was going through the mind of the first person to try separation of variables, or even who this person was. But a plausible line of thinking is that ordinary differential equations are easier than partial differential equations. How might you reduce a PDE to an ODE? Well, if the solution could be factored into functions of one variable, …

The next post will illustrate separation of variables by solving the wave equation on a disk.


[1] Also, there’s the mind-blowing Kolmogorov–Arnol’d theorem, which says any continuous function of several variables can be written using only sums and compositions of continuous functions of one variable. It doesn’t say the functions involved can be made smooth, but it suggests that combinations of one-variable functions are more expressive than you might have imagined.

## Comments

“You often have to make inspired guesses and then show that they work.”

That’s referred to as an ansatz.

I’ve seen books use that term, but I’ve never heard anyone say it.

More than a little bit of a wooly-headed question from the Poet’s gallery —

Reading this post just now, for the first time ever I thought of undergrad factoring, in the sense of “prime vs. non-prime,” in a functional sense. “Prime” multivariable functions would be inseparable into single-variable factors, while “non-prime” functions would be separable.

Since non-primes outnumber primes, perhaps non-prime (i.e. separable) functions outnumber non-separable functions, and if so, this separable form would not be relatively uncommon.

More poetry – if we can, in fact, even talk about something like “primeness” in a functional sense, might we leverage other tools from number theory to explore PDE solutions?

Having nowhere near the kind of adult mathematics to know where to explore this, I have to ask: Is anything like this analogy explored in functional analysis? Is there a “number theory” of functions? Does some Category Wizard have a map that covers this?

Meta-comment: the crisp formatting and uncluttered thinking of these blog posts makes it easy to see these kind of parallels.

Ross: Your comment made me think about the Kolmogorov-Arnol’d theorem and so I updated the post to link to it.

This idea only made sense to me when I learned about tensor products: if A and B are vector spaces, not everything in A ⊗ B can be written as a ⊗ b, but the tensor products of basis vectors do form a basis for A ⊗ B.