Laplacian in various coordinate systems

The recent post on the wave equation on a disk showed that the Laplace operator has a different form in polar coordinates than it does in Cartesian coordinates. In general, the Laplacian is not simply the sum of the second derivatives with respect to each variable.

Mathematica has a function, unsurprisingly called Laplacian, that will compute the Laplacian of a given function in a given coordinate system. If you give it a specific function, it will compute the Laplacian of that function. But you can also give it a general function to find a general formula.

For example,

    Simplify[Laplacian[f[r, θ], {r, θ}, "Polar"]]

returns

\frac{f^{(0,2)}(r,\theta )}{r^2}+\frac{f^{(1,0)}(r,\theta )}{r}+f^{(2,0)}(r,\theta )

This is not immediately recognizable as the Laplacian from this post

\Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2}

because Mathematica is using multi-index notation, which is a little cumbersome for simple cases, but much easier to use than classical notation when things get more complicated. The superscript (0,2), for example, means do not differentiate with respect to the first variable and differentiate twice with respect to the second variable. In other words, take the second derivative with respect to θ.

Here’s a more complicated example with oblate spheroidal coordinates. Such coordinates come in handy when you need to account for the fact that our planet is not exactly spherical but is more like an oblate spheroid.

When we ask Mathematica

    Simplify[Laplacian[f[ξ, η, φ], {ξ, η, φ}, "OblateSpheroidal"]]

the result isn’t pretty.

(Csc[\[Eta]]^2*Sech[\[Xi]]^2*Derivative[0, 0, 2][f][\[Xi], \[Eta], \[Phi]] + (2*(Cot[\[Eta]]*Derivative[0, 1, 0][f][\[Xi], \[Eta], \[Phi]] + Derivative[0, 2, 0][f][\[Xi], \[Eta], \[Phi]] + Tanh[\[Xi]]*Derivative[1, 0, 0][f][\[Xi], \[Eta], \[Phi]] + Derivative[2, 0, 0][f][\[Xi], \[Eta], \[Phi]]))/ (Cos[2*\[Eta]] + Cosh[2*\[Xi]]))/\[FormalA]^2

I tried using TeXForm and editing it into something readable, but after spending too much time on this I gave up and took a screenshot. But as ugly as the output is, it would be uglier (and error prone) to do by hand.

Mathematica supports the following 12 coordinate systems in addition to Cartesian coordinates:

  • Cylindrical
  • Bipolar cylindrical
  • Elliptic cylindrical
  • Parabolic cylindrical
  • Circular parabolic
  • Confocal paraboloidal
  • Spherical
  • Bispherical
  • Oblate spheroidal
  • Prolate spheroidal
  • Conical
  • Toroidal

These are all orthogonal, meaning that surfaces where one variable is held constant meet at right angles. Most curvilinear coordinate systems used in practice are orthogonal because this simplifies a lot of things.

Laplace’s equation is separable in Stäckel coordinate systems. These are all these coordinate systems except for toroidal coordinates. And in fact Stäckel coordinates are the only coordinate systems in which Laplace’s equation is separable.

It’s often the case that Laplace’s equation is separable in orthogonal coordinate systems, but not always. I don’t have a good explanation for why toroidal coordinates are an exception.

If you’d like a reference for this sort of thing, Wolfram Neutsch’s tome Coordinates is encyclopedic. However, it’s expensive new and hard to find used.

Vibrating circular membranes

'Snare drum' by YannickWhee is licensed under CC BY 2.0

This post will tie together many things I’ve blogged about before. The previous post justified separation of variables. This post will illustrate separation of variables.

Also, this post will show why you might care about Bessel functions and their zeros. I’ve written about Bessel functions before, and said that Bessel functions are to polar coordinates what sines and cosines are to rectangular coordinates. This post will make this analogy more concrete.

Separation of variables

The separation of variables technique is typically presented in three contexts in introductory courses on differential equations:

  1. One-dimensional heat equation
  2. One-dimensional wave equation
  3. Two-dimensional (rectangular) Laplace equation

My initial motivation for writing this post was to illustrate separation of variables outside the most common examples. Separation of variables requires PDEs to have a special form, but not as special as the examples above might imply. A secondary motivation was to show Bessel functions in action.

Radially symmetric wave equation

Suppose you have a thin membrane, like a drum head, and you want to model its vibrations. By “thin” I mean that the membrane is sufficiently thin that we can adequately model it as a two-dimensional surface bobbing up and down in three dimensions. It’s not so thick that we need to model the material in more detail.

Let u(x, y, t) be the height of the membrane at location (x, y) and time t. The wave equation is a PDE modeling the motion of the membrane by

\frac{\partial^2 u}{\partial t^2} = c^2 \Delta u. width=

where Δ is the Laplacian operator. In rectangular coordinates the Laplacian is given by

\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
We’re interested in a circular membrane, and so things will be much easier if we work in polar coordinates. In polar coordinates the Laplacian is given by

\Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2}

We will assume that our boundary conditions and initial conditions are radially symmetric, and so our solution will be radially symmetric, i.e. derivatives with respect to θ are zero. And in our case the wave equation simplifies to

\frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} \right)

Boundary and initial conditions

Let a be the radius of our membrane. We will assume our membrane is clamped down on its boundary, like a drum head, which means we have the boundary condition

u(a, t) = 0

for all t ≥ 0. We assume the initial displacement and initial velocity of the membrane are given by

\begin{align*} u(r, 0) &= f(r) \\ \frac{\partial u}{\partial r}(r, 0) &= g(r) \end{align*}

for all r between 0 and a.

Separating variables

Now we get down to separation of variables. Because we’re assuming radial symmetry, we’re down to a function of two variables: r for the distance from the center and t for time. We assume

u(r, t) = R(r) T(t)

and stick it into the wave equation. A little calculation shows that

\frac{T''}{c^2 T} = \frac{1}{R} \left(R'' + \frac{1}{r} R'\right)

The left side is a function of t alone, and the right side is a function of r alone. The only way this can be true for all t and all r is for both sides to be constant. Call this constant -λ². Why? Because looking ahead a little we find that this will make things easier shortly.

Separation of variables allowed us to reduce our PDE to the following pair of ODEs.

r R'' + R' +\lambda^2 r R &= 0 \\ T'' + c^2 \lambda^2 T &= 0

The solutions to the equation for R are linear combinations of the Bessel functions J0r) and Y0r) [1].

And the solutions to the equation for T are linear combinations of the trig functions cos(cλt) and sin(cλt).

The boundary condition u(a, t) = 0 implies the boundary condition R(a) = 0. This implies that λa must be a zero of our Bessel function, and that all the Y0 terms drop out. This means that our solution is

u(r,t) = \sum_{n=1}^\infty J_0(\lambda_n r) \left( A_n \cos(c \lambda_n t) + B_n \sin(c \lambda_n t)\right)

where

\lambda_n = \frac{\alpha_n}{a}.

Here αn are the zeros of the Bessel function J0.

The coefficients An and Bn are determined by the initial conditions. Specifically, you can show that

\begin{align*} A_n &= \frac{2}{\phantom{ca}a^2 J_1^2(\alpha_n)} \int_0^a f(r)\, J_0(\lambda_n r) \, r\, dr \\ B_n &= \frac{2}{ca \alpha_n J_1^2(\alpha_n)} \int_0^a g(r)\, J_0(\lambda_n r) \, r\, dr \\ \end{align*}

The function J1 in the expression for the coefficients is another Bessel function.

The functions J0 and J1 are so important in applications that even the otherwise minimalist Unix calculator bc includes these functions. (As much as I appreciate Bessel functions, this still seems strange to me.) And you can find functions for zeros of Bessel functions in many libraries, such as scipy.special.jn_zeros in Python.

Related posts

[1] This is why introductory courses are unlikely to include an example in polar coordinates. Separation of variables itself is no harder in polar coordinates than in rectangular coordinates, and it shows the versatility of the method to apply it in a different setting.

But the resulting ODEs have Bessel functions for solutions, and it’s understandable that an introductory course might not want to go down this rabbit trail, especially since PDEs are usually introduced at the end of a differential equation class when professors are rushed and students are tired.

[2] Snare drum image above “Snare drum” by YannickWhee is licensed under CC BY 2.0

Justifying separation of variables

The separation of variables technique for solving partial differential equations looks like a magic trick the first time you see it. The lecturer, or author if you’re more self-taught, makes an audacious assumption, like pulling a rabbit out of a hat, and it works.

For example, you might first see the heat equation

ut = c² uxx.

The professor asks you to assume the solution has the form

u(x, t) = X(x) T(t).

i.e. the solution can be separated into the product of a function of x alone and a function of t alone.

Following that you might see Laplace’s equation on a rectangle

uxx + uyy = 0

with the analogous assumption that

u(x, y) = X(x) Y(y),

i.e. the product of a function of x alone and a function of y alone.

There are several possible responses to this assumption.

  1. Whatever you say, doc.
  2. How can you assume that?
  3. How do you know you’re not missing any possibilities?
  4. What made someone think to try this?

As with many things, separation of variables causes the most consternation for the moderately sophisticated students. The least sophisticated students are untroubled, and the most sophisticated student can supply their own justification (at least after the fact).

One response to question (2) is “Bear with me. I’ll show that this works.”

Another response would be “OK, how about assuming the solution is a sum of such functions. That’s a much larger space to look in. And besides, we are going to take sums of such solutions in a few minutes.” One could argue from functional analysis or approximation theory that the sums of separable functions are dense in reasonable space of functions [1].

This is a solid explanation, but it’s kind of anachronistic: most students see separation of variables long before they see functional analysis or approximation theory. But it would be a satisfying response for someone who is seeing all this for the second time. Maybe they were exposed to separation of variables as an undergraduate and now they’re taking a graduate course in PDEs. In an undergraduate class a professor could do a little foreshadowing, giving the students a taste of approximation theory.

Existence of solutions is easier to prove than uniqueness in this case because you can concretely construct a solution. This goes back to the “it works” justification. This argument deserves more respect than a sophomoric student might give it. Mathematics research is not nearly as deductive and mathematics education. You often have to make inspired guesses and then show that they work.

Addressing question (3) requires saying something about uniqueness. A professor could simply assert that there are uniqueness theorems that allow you to go from “I’ve found something that works” to “and so it must be the only thing that works.” Or one could sketch a uniqueness theorem. For example, you might apply a maximum principle to show that the difference between any two solutions is zero.

Question (4) is in some sense the most interesting question. It’s not a mathematical question per se but a question about how people do mathematics. I don’t know what was going through the mind of the first person to try separation of variables, or even who this person was. But a plausible line of thinking is that ordinary differential equations are easier than partial differential equations. How might you reduce a PDE to an ODE? Well, if the solution could be factored into functions of one variable, …

The next post will illustrate using separation of variables by solving the wave equation on a disk.

Related posts

[1] Also, there’s the mind-blowing Kolmogorov-Arnol’d theorem. This theorem says any continuous function of several variables can be written as a sum of continuous separable functions. It doesn’t say you can make the functions in your sum smooth, but it suggests that sums of separable functions are more expressive than you might have imagined.

Fourier, Gauss, and Heisenberg

Several weeks ago I wrote about the Fourier uncertainty principle which gives a lower bound on the product of the variance of a function f and the variance of its Fourier transform. This post expands on the earlier post by quoting some results from a recent paper [1].

Gaussian density

The earlier post said that the inequality in the Fourier uncertainty principle is exact when f is proportional to a Gaussian probability density. G. H. Hardy proved this result in 1933 in the form of the following theorem.

Let f be a square-integrable function on the real line and assume f and its Fourier transform satisfy the following bounds

\begin{align*} |f(x)| \leq& \,C \exp(-a|x|^2) \\ |\hat{f}(\xi)| \leq& \,C \exp(-b|\xi|^2\,) \\ \end{align*}

for some constant C. Then if ab > 1/4, then f = 0. And if ab = 1/4, f(x) = c exp(−ax²) for some constant c.

Let’s translate this into probability terms by setting

\begin{align*} a =& \,\frac{1}{2\sigma^2} \\ b =& \,\frac{1}{2\tau^2} \end{align*}

Now Hardy’s theorem says that if f is bounded by a multiple of a Gaussian density with variance σ² and its Fourier transform is bounded by a multiple of a Gaussian density with variance τ², then the product of the two variances is no greater than 1. And if the product of the variances equals 1, then f is a multiple of a Gaussian density with variance σ².

Heisenberg uncertainty

Theorem 3 in [1] says that if u(tx) is a solution to the free Schrödinger’s equation

\partial_t u = i \Delta u

then u at different points in time satisfies a theorem similar to Hardy’s theorem. In fact, the authors show that this theorem is equivalent to Hardy’s theorem.

Specifically, if u is a sufficiently smooth solution and

\begin{align*} |u(0,x)| \leq& \,C \exp(-\alpha|x|^2) \\ |u(T,x)| \leq& \,C \exp(-\beta|x|^2) \\ \end{align*}

then αβ > (4T)−2 implies u(t, x) = 0, and αβ = (4T)−2 implies

u(t,x) = c \exp(-(\alpha + i/(4T))|x|^2)

Related posts

[1] Aingeru Fernández-Bertolin and Eugenia Malinnikova. Dynamical versions of Hardy’s uncertainty principle: A survey. Bulletin of the American Mathematical Society. DOI: https://doi.org/10.1090/bull/1729

A stiffening spring

Imagine a spring with stiffness k1 attached to a ceiling and a mass m1 handing from the spring.

There’s a second spring attached to the first mass with stiffness k2 and a mass m2 handing from that.

mass spring system

The motion of the system is described by the pair of differential equations

\begin{align*} m_1 x_1'' &= -k_1x_1 + k_2(x_2-x_1) \\ m_2 x_2'' &= -k_2(x_2 - x_1) \end{align*}

If the second spring were infinitely stiff, the two masses would be joined with a rigid rod, and so the system would act like a single mass m1 + m2 hanging from the first spring. This motion would be described by the single differential equation

(m_1 + m_2) x_3'' &= -k_1x_3

So it’s not surprising that as k2 gets stiffer, the solution to the two-mass system converges to the solution to the system with a single combined mass. This is proven in [1].

However, what is missing from [1] is any visualization of how the solution to the two-mass system converges to that of the combined-mass system.

The plot below shows solutions for k2 equal to 10, 100, and 1000, and finally the system to the combined-mass system, labeled k2 = ∞. I used k1 = 1, m1 = 3, and m2 = 5.

solution plots

The coupled-mass system has a high-frequency component due to the oscillation of the second mass relative to the first one.

The authors in [1] show that the amplitude of the high-frequency component decays as k2 goes to infinity, though this is not apparent from the plots above. This is due to a limitation of the numerical method used to produce the plots.

Analytical solution

The numerical solution above raises two questions. First, how fast should the amplitude of high frequency component decay. Second, why did the numerical method apparently get the frequency of this component correct but the amplitude wrong?

The second question is more difficult and will have to wait for another post. The first question, however, we can settle fairly quickly.

The authors in [1] make the simplifying assumption that the two masses are equal to 1. They then define show that

x_2(t) = \frac{d_1}{\alpha_2} \cos(\alpha_2 t) + \frac{d_2}{\alpha_2} \sin(\alpha_2 t) + \frac{d_3}{\alpha_4} \cos(\alpha_4 t) + \frac{d_4}{\alpha_4} \cos(\alpha_4 t)

where the d‘s are constants that depend on initial conditions but not on the spring stiffnesses. and

\begin{align*} a &= -(k_2 + 2k_2) \\ b &= \sqrt{k_1^2 + 4k_2^2} \\ \alpha_2 &= \sqrt{(b-a)/2} \\ \alpha_4 &= \sqrt{-(b+a)/2} \end{align*}

α4, the frequency of the low frequency component, approaches a finite limit as k2 → ∞,

α2, the frequency of the high frequency component, is approximately √(2k2) for large k2.

The amplitude of the high frequency component should be inversely proportional to its frequency.

More differential equation posts

[1] K. E. Clark and S. Hill. The Effects of a Stiffening Spring. The College Mathematics Journal , Nov., 1999, Vol. 30, No. 5 (Nov., 1999), pp. 379-382

A generalization of sine and cosine

David Shelupsky [1] suggested a generalization of sine and cosine based on solutions to the system of differential equations

\begin{align*} \frac{d}{dt} \alpha_s(t) &= \phantom{-}\beta_s^{s-1}(t) \\ \frac{d}{dt} \beta_s(t) &= -\alpha_s^{s-1}(t) \end{align*}

with initial conditions αs(0) = 0 and βs(0) = 1.

If s = 2, then α(t) = sin(t) and β(t) = cos(t). The differential equations above reduce to the familiar fact that the derivative of sine is cosine, and the derivative of cosine is negative sine.

For larger even values of s, the functions αs and βs look like sine and cosine respectively, though flatter at their maxima and minima. Numerical experiments suggest that the solutions are periodic and the period increases with s. [2]

Here’s a plot for s = 4.

The first zero of α(t) is at 3.7066, greater than π. In the plot t ranges from 0 to 4π, but the second period isn’t finished.

If we look at the phase plot, i.e (α(t), β(t)), we get a shape that I’ve blogged about before: a squircle!

This is because, as Shelupsky proved,

\alpha_s^s(t) + \beta_s^s(t) = 1

Odd order

The comments above mostly concern the case of even s. When s is odd, functions αs and βs don’t seem much like sine or cosine. Here are plots for s = 3

and s = 5.

Other generalizations of sine and cosine

[1] David Shelupsky. A Generalization of the Trigonometric Functions. The American Mathematical Monthly, Dec. 1959, pp. 879-884

[2] After doing my numerical experiments I looked back more carefully at [1] and saw that the author proves that the solutions for even values of s are periodic, and that the periods increase with s, converging to 4 as s goes to infinity.

Ripples and hyperbolas

I ran across a paper [1] this morning on the differential equation

y‘ = sin(xy).

The authors recommend having students explore numerical solutions to this equation and discover theorems about its solutions.

Their paper gives numerous theorems relating solutions and the hyperbolas xy = a: how many times a solution crosses a hyperbola, at what angle, under what conditions a solution can be tangent to a hyperbola, etc.

The plot above is based on a plot in the original paper, but easier to read. It wasn’t so easy to make nice plots 40 years ago. In the original plot the solutions and the asymptotes were plotted with the same thickness and color, making them hard to tell apart.

More differential equation posts

[1] Wendell Mills, Boris Weisfeiler and Allan M. Krall. Discovering Theorems with a Computer: The Case of y‘ = sin(xy). The American Mathematical Monthly, Nov., 1979, Vol. 86, No. 9 (Nov., 1979), pp. 733-739

Driving vibrations with sawtooth waves

The previous post looked at driving a vibrating system with square waves rather than the more customary sine waves.

You could think of a square wave as a crude approximation to a sine wave. A sawtooth wave is another crude approximation to a sine wave, and so it would be interesting to see how systems driven by a sawtooth forcing function differ from those driven by a square or sinusoidal forcing function.

Sine wave, sawtooth wave, and square wave

With a square wave, the differences were most pronounced at low frequencies, i.e. when the driving frequency was low relative to the natural frequency. The same is true for sawtooth waves, but with some interesting differences.

As before we will look at the equation

u″ + u = sin(ωt)

and with sine replaced with a variation, this time a sawtooth wave rather than a square. We will use initial conditions u(0) = u‘(0) = 1.

We start with ω = 1, i.e. driving the system at its natural frequency.

Resonance with sawtooth forcing

We see the resonance we’d expect when driving a system at its natural frequency. The amplitudes are lower for the sawtooth forcing function than the sinusoidal forcing function.

When we back away a little from the resonant frequency, setting ω = 0.9, we see beats.

Sawtooth wave beats

When we reduce ω to 0.5, we get resonance again for the sawtooth forcing solutions.

Beats for sawtooth-driven solution

At ω = 0.4 we see something like beats again.

Sawtooth driven beats

And at ω = 1/3 it looks like we have resonance again for the sawtooth-driven system.

Sawtooth driven resonance

We also see beats at ω = 1/n for all integer n. I’ve tried this for n up to 12. But at other frequencies that are not reciprocals of integers I see periodic solutions.

I have only investigated this numerically. Do any of you know of analytical results that apply here?

Update: There’s a simple reason why driving frequencies of 1/n do indeed create resonance: The Fourier series for the driving function has a component at the resonant frequency. Thanks to gmvh for pointing this out.

The square wave has a similar resonance at driving frequency 1/n, but only when n is an odd number. I missed that because I didn’t happen to try any frequencies of that form. Here’s an example with n = 5.

Driving oscillations with square waves

What happens when you drive a vibrating system with a square wave rather than a sine wave? Do you still see the same kinds of behavior, such as beats and resonance? When does the difference between a square wave and a sine wave matter most? Those are the questions this post will address.

Background

Basic mechanical oscillations are modeled by the equation

m u″ + γ u′ + k u = F sin(ωt + φ)

You can think of m, γ, and k as mass, damping, and spring constant. The same equation describes non-mechanical oscillations, such as electrical systems. Sometimes the terms such as “mass” are still used when they are metaphorical rather than literal. I wrote a four-part series of posts on mechanical vibrations a while back starting here.

The term on the right hand side is the forcing function. F is the amplitude of the driving force and ω is its frequency.

The natural frequency of a system modeled by the equation above is

ω02 = k/m.

When F = 0, the solutions to the differential equation will have this frequency. When F is not zero, the solutions a component with the natural frequency and a component with the driving frequency. When the two frequencies are different, you get beats. When the two frequencies are the same, you get resonance. More on beats and resonance here.

In this post we will look at solutions to the equation above where the forcing function is a square wave rather than a sine wave. That is, we will replace

sin(ωt + φ)

with

sign( sin(ωt + φ) )

where

sign(x)

is 1 when x is positive and -1 when x is negative.

Exploration

To simplify things a little, we will set the damping term γ to zero, and set the phase φ in the driving function to zero. Also, we will set m, k, and F all equal to 1. We will focus on varying only the driving frequency ω.

We need initial conditions for our differential equations, so let’s pick u(0) = 1 and u‘(0) = 0.

When we drive the system at its natural frequency, i.e. with ω = 1, we get the same kind of resonance from a sine wave and a square wave.

Square wave resonance

Update: There are more resonant frequencies [1].

When we lower the driving frequency to ω = 0.5 we see more of a difference.

Square beats omega = 0.5

In general we see more difference as ω gets smaller, and more difference when the period 1/ω is not an integer. For example, here is ω = 0.25, period T= 4:

Square wave beats omega = 0.25

And here is ω = 0.3, period T = 1.66.

Square wave beats, omega = 0.3

Here’s an example of driving a system with a square wave at a frequency higher than the natural frequency.

Square wave beats, omega = 1.7

Here the difference between the solutions for the square wave and sine wave are closer together. Apparently the higher frequency makes more difference than the non-integer period.

In the examples above, the solution with a square wave forcing function starts out flat. That’s because of our initial conditions u(0) = 1 and u‘(0) = 0. If we change the initial velocity condition to u‘(0) = 1, we get smoother solutions.

Here are the solutions with u‘(0) = 1 and ω = 0.25

Square wave beats, omega = 0.25, initial velocity 1

and ω = 0.3.

Square wave beats, omega = 0.30, initial velocity 1

Changing the initial velocity made more of a difference when the period was an integer.

See the next post for systems driven by a sawtooth wave rather than a square wave.

Related posts

[1] Sine-driven systems exhibit resonance if and only if the driving frequency matches the natural frequency. While writing the next post on sawtooth forcing functions, I accidentally discovered resonance at lower frequencies. The same can happen here with square wave-driven systems if ω = 1/(2n + 1). For example, here’s a plot with ω = 1/5. The reason resonance shows up for odd periods is that the square wave has Fourier components at every odd frequency.

Symplectic Euler

This post will look at simple numerical approaches to solving the predator-prey (Lotka-Volterra) equations. It turns out that the simplest approach does poorly, but a slight variation does much better.

Following [1] we will use the equations

u‘ = u (v − 2)
v‘ = v (1 − u)

Here u represents the predator population over time and v represents the prey population. When the prey v increase, the predators u increase, leading to a decrease in prey, which leads to a decrease in predators, etc. The exact solutions are periodic.

Euler’s method replaces the derivatives with finite difference approximations to compute the solution in increments of time of size h. The explicit Euler method applied to our example gives

u(th) = u(t) + h u(t) (v(t) − 2)
v(th) = v(t) + h v(t) (1 − u(t)).

The implicit Euler method gives

u(th) = u(t) + h u(t + h) (v(t + h) − 2)
v(th) = v(t) + h v(t + h) (1 − u(t + h)).

This method is implicit because the unknowns, the value of the solution at the next time step, appear on both sides of the equation. This means we’d either need to do some hand calculations first, if possible, to solve for the solutions at time t + h, or use a root-finding method at each time step to solve for the solutions.

Implicit methods are more difficult to implement, but they can have better numerical properties. See this post on stiff equations for an example where implicit Euler is much more stable than explicit Euler. I won’t plot the implicit Euler solutions here, but the implicit Euler method doesn’t do much better than the explicit Euler method in this example.

It turns out that a better approach than either explicit Euler or implicit Euler in our example is a compromise between the two: use explicit Euler to advance one component and use implicit Euler on the other. This is known as symplectic Euler for reasons I won’t get into here but would like to discuss in a future post.

If we use explicit Euler on the predator equation but implicit Euler on the prey equation we have

u(th) = u(t) + h u(t) (v(t + h) − 2)
v(th) = v(t) + h v(t + h) (1 − u(t)).

Conversely, if we use implicit Euler on the predator equation but explicit Euler on the prey equation we have

u(th) = u(t) + h u(t + h) (v(t) − 2)
v(th) = v(t) + h v(t) (1 − u(t + h)).

Let’s see how explicit Euler compares to either of the symplectic Euler methods.

First some initial setup.

    import numpy as np

    h  = 0.08  # Step size
    u0 = 6     # Initial condition
    v0 = 2     # Initial condition
    N  = 100   # Numer of time steps

    u = np.empty(N)
    v = np.empty(N)
    u[0] = u0
    v[0] = v0

Now the explicit Euler solution can be computed as follows.

    for n in range(N-1):
        u[n+1] = u[n] + h*u[n]*(v[n] - 2)
        v[n+1] = v[n] + h*v[n]*(1 - u[n])

The two symplectic Euler solutions are

    
    for n in range(N-1):
        v[n+1] = v[n]/(1 + h*(u[n] - 1))
        u[n+1] = u[n] + h*u[n]*(v[n+1] - 2)

and

    for n in range(N-1):
        u[n+1] = u[n] / (1 - h*(v[n] - 2))
        v[n+1] = v[n] + h*v[n]*(1 - u[n+1])

Now let’s see what our solutions look like, plotting (u(t), v(t)). First explicit Euler applied to both components:

And now the two symplectic methods, applying explicit Euler to one component and implicit Euler to the other.

Next, let’s make the step size 10x smaller and the number of steps 10x larger.

Now the explicit Euler method does much better, though the solutions are still not quite periodic.

The symplectic method solutions hardly change. They just become a little smoother.

More differential equations posts

[1] Ernst Hairer, Christian Lubich, Gerhard Wanner. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations.