Justifying separation of variables

The separation of variables technique for solving partial differential equations looks like a magic trick the first time you see it. The lecturer, or author if you’re more self-taught, makes an audacious assumption, like pulling a rabbit out of a hat, and it works.

For example, you might first see the heat equation

u_t = c^2 u_{xx}

The professor asks you to assume the solution has the form

u(x, t) = X(x) T(t).

i.e. the solution can be separated into the product of a function of x alone and a function of t alone.

Following that you might see Laplace’s equation on a rectangle

u_{xx} + u_{yy} = 0

with the analogous assumption that

u(x, y) = X(x) Y(y),

i.e. the product of a function of x alone and a function of y alone.

There are several possible responses to this assumption.

  1. Whatever you say, doc.
  2. How can you assume that?
  3. How do you know you’re not missing any possibilities?
  4. What made someone think to try this?

As with many things, separation of variables causes the most consternation for the moderately sophisticated students. The least sophisticated students are untroubled, and the most sophisticated students can supply their own justification (at least after the fact).

One response to question (2) is “Bear with me. I’ll show that this works.”

Another response would be “OK, how about assuming the solution is a sum of such functions. That’s a much larger space to look in. And besides, we are going to take sums of such solutions in a few minutes.” One could argue from functional analysis or approximation theory that sums of separable functions are dense in a reasonable space of functions [1].

This is a solid explanation, but it’s kind of anachronistic: most students see separation of variables long before they see functional analysis or approximation theory. But it would be a satisfying response for someone who is seeing all this for the second time. Maybe they were exposed to separation of variables as an undergraduate and now they’re taking a graduate course in PDEs. In an undergraduate class a professor could do a little foreshadowing, giving the students a taste of approximation theory.

Existence of solutions is easier to prove than uniqueness in this case because you can concretely construct a solution. This goes back to the “it works” justification. This argument deserves more respect than a sophomoric student might give it. Mathematics research is not nearly as deductive as mathematics education. You often have to make inspired guesses and then show that they work.

Addressing question (3) requires saying something about uniqueness. A professor could simply assert that there are uniqueness theorems that allow you to go from “I’ve found something that works” to “and so it must be the only thing that works.” Or one could sketch a uniqueness theorem. For example, you might apply a maximum principle to show that the difference between any two solutions is zero.
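
For concreteness, here is a sketch of the energy-method variant of such an argument (an alternative to the maximum principle mentioned above) for the heat equation on an interval [0, L] with fixed boundary values, assuming enough smoothness to differentiate under the integral sign. If u₁ and u₂ solve the same problem, let w = u₁ − u₂ and define

E(t) = \int_0^L w(x,t)^2 \, dx

Then

E'(t) = 2\int_0^L w\, w_t \, dx = 2c^2 \int_0^L w\, w_{xx} \, dx = -2c^2 \int_0^L w_x^2 \, dx \leq 0

after integrating by parts, since w vanishes on the boundary. But E(0) = 0 because the two solutions share the same initial data, and E is nonnegative and nonincreasing, so E ≡ 0 and hence w ≡ 0.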

Question (4) is in some sense the most interesting question. It’s not a mathematical question per se but a question about how people do mathematics. I don’t know what was going through the mind of the first person to try separation of variables, or even who this person was. But a plausible line of thinking is that ordinary differential equations are easier than partial differential equations. How might you reduce a PDE to an ODE? Well, if the solution could be factored into functions of one variable, …
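
To make that last step concrete, here is where the assumption leads for the heat equation above (a standard sketch; the boundary conditions determine which separation constants are allowed). Substituting u(x, t) = X(x) T(t) into the equation gives

X(x) T'(t) = c^2 X''(x) T(t)

and dividing by c² X(x) T(t) gives

\frac{T'(t)}{c^2 T(t)} = \frac{X''(x)}{X(x)} = -\lambda

The left side depends only on t and the right side only on x, so both must equal a constant, written −λ here. The PDE thus splits into two ODEs, T′ + λc²T = 0 and X″ + λX = 0, which is exactly the reduction one was hoping for.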

The next post will illustrate using separation of variables by solving the wave equation on a disk.


[1] Also, there’s the mind-blowing Kolmogorov-Arnol’d theorem. This theorem says any continuous function of several variables can be written as a sum of continuous separable functions. It doesn’t say you can make the functions in your sum smooth, but it suggests that sums of separable functions are more expressive than you might have imagined.

Visual geometry

If you’re puzzled by the title of this post, allow me to explain.

A natural reaction would be “Isn’t geometry intrinsically visual?” Indeed, geometry is motivated by things we can visualize. But modern developments of geometry have become heavy with formal machinery, so much so that one could reasonably ask “What happened to the geometry?”

Tristan Needham has a new book entitled Visual Differential Geometry and Forms that aims to put the geometry back into a first course on differential geometry. I expect it’s a good read based on having read his previous book Visual Complex Analysis.

I just got a review copy in the mail, and flipping through the book I can see that it lives up to its title. It has lots of illustrations, just as you’d expect from a book on differential geometry if you hadn’t taken courses in the subject.

Volunteer-generated errata pages

I picked up a used copy of Quaternions and Rotation Sequences by Jack B. Kuipers for a project I’m starting to work on. The feedback I’ve seen on the book says it has good content but also has lots of typos. My copy has a fair number of corrections that someone penciled in. Someone on Amazon alluded to an errata page for the book but I’ve been unable to find it.

This made me wonder more generally: Is there a project to create errata pages? I’m thinking especially of mathematical reference books. I’m not concerned with spelling errors and such, but rather errors in equations that could lead to hours of debugging.

I would be willing to curate and host errata pages for a few books I care about, but it would be better if this were its own site, maybe a Wiki.

I don’t want to duplicate someone else’s effort. So if there’s already a site for community-generated errata pages, I could add a little content there. But if there isn’t such a project out there already, maybe someone would like to start one.

***

[1] Update: Jan Van lent found the errata page. See the first comment. Apparently the changes that were penciled into my book were copied from the author’s errata list. Also, these changes were applied to the paperback edition of the book.

Random drug screening

Suppose in a company of N employees, m are chosen randomly for drug screening. In two independent screenings, what is the probability that someone will be picked both times? It may be unlikely that any given individual will be picked twice, while being very likely that someone will be picked twice.

Imagine m employees being given a red ticket representing the first screening, and m being given a blue ticket representing the second screening. The tickets can be passed out in

{N \choose m}^2

different ways. Of these, the number of ways the tickets could be passed out so that no one has both a red and a blue ticket is

{N \choose m} {N-m \choose m}

because you can first pass out the red tickets, then choose m of the employees who did not get a red ticket for the blue tickets. And so the probability that no one will be picked twice, the probability that nobody holds both a red and a blue ticket, is

\frac{ {N-m \choose m} }{ {N \choose m} }

Now let’s plug in some numbers. Suppose a company has 100 employees and 20 are randomly screened each time. In two screenings, there is only a 0.7% chance that no one will be tested twice. Said another way, there’s a 99.3% chance that at least one person will be screened twice. Any given individual has a 20% chance of being selected each time, and so a 4% chance of being picked twice.
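
As a sanity check, here is a minimal Python sketch of the calculation (the function name no_repeat_probability is mine, not standard):

    from math import comb

    def no_repeat_probability(N, m):
        """Probability that no employee is chosen in both of two
        independent random screenings of m out of N employees."""
        return comb(N - m, m) / comb(N, m)

    p = no_repeat_probability(100, 20)
    print(p)        # approximately 0.0066, i.e. about 0.7%
    print(1 - p)    # approximately 0.993, i.e. about 99.3%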

A variation on this problem is to compute the expected number of overlaps between two screenings. Any given employee is picked both times with probability (m/N)², so by linearity of expectation the expected number of people tested twice is m²/N. With N = 100 and m = 20, we expect four people to be tested twice.

By the way, what if over half the employees are tested each time? For example, if a company of 100 people tests 60 people each time, it’s certain to test somebody twice. But the derivation above still works. The general definition of binomial coefficients takes care of this because the numerator will be zero if m is larger than half N. The number of ways to choose 60 things from a set of 40 things, for instance, is zero.

Blog email subscription

As I mentioned a couple weeks ago, Feedburner, the service I’ve been using for blog email subscriptions, is shutting down. I’m switching over to MailerLite. The new email subscription is up and running. You can sign up here if you’d like.

If you’re already subscribed via Feedburner, there’s no need to sign up again with MailerLite. Some time in the next few weeks I will import all the email addresses from Feedburner into MailerLite. There will be some formatting changes and hopefully that will be the only difference you notice.

You could also subscribe via my RSS feed or follow one of my Twitter accounts if you’d like.

Recommending division

A friend and I were discussing how to analyze his data one time and at the end of the conversation he said “So, basically you’re recommending division.” And indeed I was. The conclusion was to divide one thing by another.

I’ve also recommended to clients that they use an extended Kalman filter or homomorphic encryption; sometimes fancy math is called for. But often they need something simple.

However, knowing about division is not enough to appropriately recommend division. A ten-year-old child should know how to carry out division, but would be unlikely to realize when data is best analyzed by dividing one thing by another.

I saw something the other day saying that there are no child prodigies in applied math. Child prodigies flourish in closed worlds. Applied math is an open world. Not textbook math—that’s a closed world—but the skillful application of math to the messy real world.

It doesn’t take decades of experience to carry out division, but it may take decades of experience to wield it well.

Fourier, Gauss, and Heisenberg

Several weeks ago I wrote about the Fourier uncertainty principle which gives a lower bound on the product of the variance of a function f and the variance of its Fourier transform. This post expands on the earlier post by quoting some results from a recent paper [1].

Gaussian density

The earlier post said that the inequality in the Fourier uncertainty principle is exact when f is proportional to a Gaussian probability density. G. H. Hardy proved this result in 1933 in the form of the following theorem.

Let f be a square-integrable function on the real line and assume f and its Fourier transform satisfy the following bounds

\begin{align*} |f(x)| \leq& \,C \exp(-a|x|^2) \\ |\hat{f}(\xi)| \leq& \,C \exp(-b|\xi|^2\,) \\ \end{align*}

for some constant C. If ab > 1/4, then f = 0. And if ab = 1/4, then f(x) = c exp(−ax²) for some constant c.

Let’s translate this into probability terms by setting

\begin{align*} a =& \,\frac{1}{2\sigma^2} \\ b =& \,\frac{1}{2\tau^2} \end{align*}

Now Hardy’s theorem says that if f is bounded by a multiple of a Gaussian density with variance σ² and its Fourier transform is bounded by a multiple of a Gaussian density with variance τ², then either f is identically zero or the product of the two variances is at least 1. And if the product of the variances equals 1, then f is a multiple of a Gaussian density with variance σ².
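
Just to check the algebra behind that translation: with a = 1/(2σ²) and b = 1/(2τ²),

ab = \frac{1}{4\sigma^2\tau^2}

so the condition ab > 1/4, which forces f = 0, is exactly σ²τ² < 1. A nonzero f satisfying both bounds therefore requires σ²τ² ≥ 1, with equality corresponding to the case ab = 1/4.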

Heisenberg uncertainty

Theorem 3 in [1] says that if u(t, x) is a solution to the free Schrödinger equation

\partial_t u = i \Delta u

then the values of u at two different times satisfy a condition analogous to Hardy’s theorem. In fact, the authors show that this result is equivalent to Hardy’s theorem.

Specifically, if u is a sufficiently smooth solution and

\begin{align*} |u(0,x)| \leq& \,C \exp(-\alpha|x|^2) \\ |u(T,x)| \leq& \,C \exp(-\beta|x|^2) \\ \end{align*}

then αβ > (4T)⁻² implies u(t, x) = 0, and αβ = (4T)⁻² implies

u(t,x) = c \exp(-(\alpha + i/(4T))|x|^2)


[1] Aingeru Fernández-Bertolin and Eugenia Malinnikova. Dynamical versions of Hardy’s uncertainty principle: A survey. Bulletin of the American Mathematical Society. DOI: https://doi.org/10.1090/bull/1729

More readable lambda calculus

Lambda calculus is simple. The definitions and rules of lambda calculus would easily fit on an index card.

But you can’t really “read” lambda calculus so much as you can mentally execute it. Not many people can look at more than a few characters of lambda calculus and have any idea what it represents, though it’s not hard to mechanically manipulate it. I suppose in hindsight that’s what you’d expect from a theoretical model for computing.

The post The Programmer’s Ring by Loup Vaillant gives a notation for lambda calculus that I hadn’t seen before, notation that in my opinion is easier to understand. The author combines two ideas: de Bruijn indices (which I’d seen before), and replacing lambdas with brackets (which I had not seen).

The idea of de Bruijn indices is to replace variables with numbers. Vaillant describes de Bruijn indices, saying

… variables are not referenced by name, but by stack offset: the last variable is accessed with the number 0, the second last with the number 1, and so on …

This would be a terrible idea in an ordinary programming language, but in lambda calculus it actually helps. Lambda calculus is so low-level that the variables aren’t meaningful anyway. It’s not like you can look at a few lines of lambda calculus and say “Oh, this is calculating net present value, so this variable must be the interest rate. It would be easier to read if they just called it interest_rate instead of 42 because it’s the 42nd variable.”

Wikipedia describes de Bruijn indices differently than Vaillant:

Each De Bruijn index is a natural number that represents an occurrence of a variable in a λ-term, and denotes the number of binders that are in scope between that occurrence and its corresponding binder.

It’s not immediately obvious that these two definitions of a de Bruijn index are the same, and in fact they’re not. But the only difference is that the first definition numbers from 0 and the second numbers from 1. This post will count from 0, following Vaillant.

After rewriting lambda calculus expressions to use de Bruijn indices, Vaillant fully parenthesizes the expressions (using square brackets, saving parentheses for grouping, as Mathematica does), then deletes the λs: every bracketed expression starts with a λ, so the λ itself is redundant. Also, you can delete the name of the variable that the λ takes since the variable numbering takes care of that.

OK, so what does all this buy us? As Vaillant points out, this notation is more concise, and it makes some patterns easier to recognize. For example, the famous Y combinator

    Y = λf. (λx. f (x x)) λx. f (x x)

becomes

    Y = [ [1 (0 0)] [1 (0 0)] ]

The latter makes the repetition inside more apparent.

As another example, let’s look at Church encoding for numerals:

    
    0 → λf. λx. x
    1 → λf. λx. f x
    2 → λf. λx. f (f x)
    3 → λf. λx. f (f (f x))
    …

Here’s what Church numerals would look like in the notation described above.

    
    0 → [[ 0 ]]
    1 → [[ 1 0 ]]
    2 → [[ 1 (1 0) ]]
    3 → [[ 1 (1 (1 0)) ]]
    …

You can convert a Church numeral to its corresponding integer by adding all the numbers in the lambda calculus expression.
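
As a quick illustration of the encoding itself (a minimal sketch using Python lambdas to stand in for the λ-terms above; the names are mine), a Church numeral can be converted to an ordinary integer by applying it to a successor function and zero:

    # A Church numeral n takes a function f and a value x
    # and applies f to x a total of n times.
    zero  = lambda f: lambda x: x
    one   = lambda f: lambda x: f(x)
    two   = lambda f: lambda x: f(f(x))
    three = lambda f: lambda x: f(f(f(x)))

    def to_int(church):
        """Convert a Church numeral to an int by applying it to
        the successor function and 0."""
        return church(lambda n: n + 1)(0)

    print([to_int(n) for n in (zero, one, two, three)])  # [0, 1, 2, 3]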

Vaillant’s notation is more formal than traditional lambda calculus notation, but lambda calculus is formal anyway, so you might as well carry the formality a step further if it makes things easier.


What does RIPEMD stand for?

The RIPEMD-160 secure hash function may be best known these days for its role as part of the implementation of Bitcoin. I’ve wondered what “RIPEMD” stands for, and today I stumbled on an explanation [1]:

“RIPEMD” stands for “RIPE Message Digest,” where “RIPE” stands for “RACE Integrity Primitives Evaluation” and where “RACE” stands for “Research and Development in Advanced Communications Technologies in Europe”—a nice example of a recursive abbreviation.

This deserves a diagram:

I created the diagram above with DITAA.


[1] Alasdair McAndrew. Introduction to Cryptography with Open-Source Software. CRC Press, 2011.

Time dilation in SF and GPS

I’m reading Voyage to Alpha Centauri and ran into a question about relativity. The book says in one place that their ship is moving at 56.7% of the speed of light, and in another place it says that time moves about 20% slower for them relative to folks on Earth. Are those two statements consistent?

It wouldn’t bother me if they weren’t consistent. I ordinarily wouldn’t bother to check such things. But I remember looking into time dilation before and being surprised how little effect velocity has until you get very close to the speed of light. I couldn’t decide whether the relativistic effect in the novel sounded too large or too small.

If a stationary observer is watching a clock moving at velocity v, during one second of the observer’s time,

\sqrt{1 - \frac{v^2}{c^2}}

seconds will have elapsed on the moving clock.

Even at 20% of the speed of light, the moving clock only appears to slow down by about 2%.

If, as in the novel, a spaceship is moving at 56.7% of the speed of light, then for every second an Earth-bound observer experiences, someone on the ship will experience √(1 − 0.567²) = 0.82 seconds. So time would run about 20% slower on the ship, as the novel says.
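
Here is a minimal Python check of these numbers (the function name dilation_factor is mine):

    from math import sqrt

    def dilation_factor(beta):
        """Seconds elapsed on a clock moving at v = beta * c
        for each second of a stationary observer's time."""
        return sqrt(1 - beta**2)

    print(dilation_factor(0.567))  # about 0.82, roughly 20% slower
    print(dilation_factor(0.2))    # about 0.98, roughly 2% slower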

The author must have either done this calculation or asked someone to do it for him. I had a science fiction author ask me for something a while back, though I can’t remember right now what it was.

Small velocities

You can expand the expression above in a Taylor series to get

\sqrt{1 - \frac{v^2}{c^2}} = 1 -\frac{v^2}{2c^2} -\frac{v^4}{8c^4} + \cdots

and so for velocities much smaller than the speed of light, the effect of time dilation is approximately 0.5 v²/c², a quadratic function of velocity. You can use this to confirm the comment above that when v/c = 0.2, the effect of time dilation is about 2%.

GPS satellites travel at about 14,000 km/hour, and so the effect of time dilation is on the order of 1 part in 10¹⁰. This would seem insignificant, except it amounts to milliseconds per year, and so it does make a practical difference.

For something moving 100 times slower, like a car, time dilation would be 10,000 times smaller. So time in a car driving at 90 miles per hour slows down by one part in 10¹⁴ relative to a stationary observer.
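
A rough numerical check of these orders of magnitude, using the quadratic approximation above (a sketch; the speeds and unit conversions are approximate):

    # Time dilation for small velocities is approximately 0.5 * (v/c)^2.
    c = 299_792_458                 # speed of light in m/s

    v_gps = 14_000 * 1000 / 3600    # about 3,900 m/s
    v_car = 90 * 1609.34 / 3600     # 90 mph, about 40 m/s

    print(0.5 * (v_gps / c)**2)     # about 8e-11, on the order of 1 part in 10^10
    print(0.5 * (v_car / c)**2)     # about 9e-15, on the order of 1 part in 10^14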

Tape measures

The math in the section above is essentially the same as the math in the post explaining why it doesn’t matter much if a tape measure doesn’t run exactly straight when measuring a large distance. They both expand an expression derived from the Pythagorean theorem in a Taylor series.