Sensitive dependence on initial conditions

The following problem illustrates how the smallest changes to a problem can have large consequences. As explained at the end of the post, this problem is a little artificial, but it illustrates difficulties that come up in realistic problems.

Suppose you have a simple differential equation,

y'' - y = 0

where y(t) gives displacement in meters at a time measured in seconds. Assume initial conditions y(0) = 1 and y'(0) = -1.

Then the unique solution is y(t) = exp(-t), which is positive and decreasing. But a numerical solution to the equation might turn around and start increasing, or it might go negative.

Suppose the initial conditions are y(0) = 1 as before but now y'(0) = -1 ± ε. You could think of the ε as a tiny measurement error in the initial velocity, or a limitation in representing the initial velocity as a number in a computer.

The following graph shows the solutions corresponding to ε = 10⁻⁶, 0, and −10⁻⁶.

[Plot of the three solutions: ε positive, zero, and negative.]

The exact solution is

 y(t) = \left(1 \mp \frac{\varepsilon}{2}\right)\exp(-t) \pm \frac{\varepsilon}{2}\exp(t)

If the initial velocity is exactly -1, then ε = 0 and we only have the solution exp(-t). But if the initial velocity is a little more than -1, there is an exponentially increasing component of the solution that will eventually overtake the exponentially decreasing component. And if the initial velocity is a little less than -1, there is a negative component increasing exponentially in magnitude. Eventually it will overtake the positive component and cause the solution to become negative.

The solution starts to misbehave (i.e. change direction or go negative) at

t = \frac{1}{2} \log\left( \frac{2}{\varepsilon} \mp 1 \right)

If ε is small, 2/ε is much larger than 1, and so the final 1 above makes little difference. So we could say the solution goes bad at approximately

t = \frac{1}{2} \log\left( \frac{2}{\varepsilon} \right)

whether the error is positive or negative. If ε is ±10⁻⁶, for example, the solution goes bad around t = 7.25 seconds. (The term we dropped only affects the 6th decimal place.)

On a small time scale, less than 7.25 seconds, we would not notice the effect of the (initially) small exponentially growing component. But eventually it matters, a lot. We can delay the point where the solution goes bad by controlling the initial velocity more carefully, but this doesn’t buy us much time. If we make ε ten times smaller, we postpone the time when the solution goes bad to 8.41, only a little over a second later.

[Looking at the plot, the solution does not obviously go wrong at t = 7.25. But it does go qualitatively wrong at that point: a solution that should be positive and decreasing becomes either negative or increasing.]

Trying to extend the quality of the solution by making ε smaller is a fool’s errand. Every order of magnitude decrease in ε only prolongs the inevitable by an extra second or so.
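To make this concrete, here's a small Python sketch (NumPy assumed) that evaluates the approximate breakdown time ½ log(2/ε) for a few values of ε. Each factor of ten decrease in ε buys only ½ log 10 ≈ 1.15 additional seconds.

import numpy as np

# Approximate time at which the solution goes bad: t = (1/2) log(2/eps)
for eps in [1e-6, 1e-7, 1e-8, 1e-9]:
    t_bad = 0.5*np.log(2/eps)
    print(f"eps = {eps:.0e}  ->  t = {t_bad:.2f} seconds")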

This is a somewhat artificial example. The equation and initial conditions were selected to make the sensitive dependence obvious. The sensitive dependence is itself sensitive: small changes to the problem, such as adding a small first derivative term, make it better behaved.

However, sensitive dependence is a practical phenomenon. Numerical methods can, for example, pick up an initially small component of an increasing solution while trying to compute a decreasing solution. Sometimes the numerical sensitivity is a reflection of a sensitive physical system. But if the physical system is stable and the numerical solution is not, the problem may not be numerical. The problem may be a bad model. The numerical difficulties may be trying to tell you something, in which case increasing the accuracy of the numerical method may hide a more basic problem.
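As an illustration of the first point, here's a sketch using SciPy's solve_ivp on the equation above (tolerances chosen arbitrarily). The exact solution exp(-t) is about 2 × 10⁻⁹ at t = 20, but the computed solution will typically have picked up a growing exp(t) component well before then.

from scipy.integrate import solve_ivp
import numpy as np

# Write y'' - y = 0 as a first order system: y1' = y2, y2' = y1
def rhs(t, y):
    return [y[1], y[0]]

sol = solve_ivp(rhs, [0, 20], [1.0, -1.0], rtol=1e-10, atol=1e-12)
print("numerical y(20):", sol.y[0, -1])
print("exact     y(20):", np.exp(-20.0))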

Related post: Approximating a solution that doesn’t exist

Relating Airy and Bessel functions

The Airy functions Ai(x) and Bi(x) are independent solutions to the differential equation

y'' - xy = 0

For negative x they act something like sin(x) and cos(x). For positive x they act something like exp(x) and exp(-x). This isn’t surprising if you look at the differential equation. If you replace x with a negative constant, you get sines and cosines, and if you replace it with a positive constant, you get positive and negative exponentials.

The Airy functions can be related to Bessel functions as follows:

\mathrm{Ai}(x) = \left\{ \begin{array}{ll} \frac{1}{3}\sqrt{\phantom{-}x}\, \left(I_{-1/3}(\hat{x}) - I_{1/3}(\hat{x})\right) & \mbox{if } x > 0 \\ \frac{1}{3}\sqrt{-x}\, \left(J_{-1/3}(\hat{x}) + J_{1/3}(\hat{x})\right) & \mbox{if } x < 0 \end{array} \right.

and

\mathrm{Bi}(x) = \left\{ \begin{array}{ll} \sqrt{\phantom{-}x/3}\, \left(I_{-1/3}(\hat{x}) + I_{1/3}(\hat{x})\right) & \mbox{if } x > 0 \\ \sqrt{-x/3}\, \left(J_{-1/3}(\hat{x}) - J_{1/3}(\hat{x})\right) & \mbox{if } x < 0 \end{array} \right.

Here J is a “Bessel function of the first kind” and I is a “modified Bessel function of the first kind.” Also

\hat{x} = \frac{2}{3} \left(\sqrt{|x|}\right)^3

To verify the equations above, and to show how to compute these functions in Python, here’s some code.

The SciPy function airy computes both functions, and their first derivatives, at once. I assume that’s because it doesn’t take much longer to compute all four functions than to compute one. The code for Ai2 and Bi2 below uses np.where instead of if ... else so that it can operate on NumPy vectors all at once. You can plot Ai and Ai2 and see that the two curves lie on top of each other. The same holds for Bi and Bi2.

 

from scipy.special import airy, jv, iv
from numpy import sqrt, where

def Ai(x):
    (ai, ai_prime, bi, bi_prime) = airy(x)
    return ai

def Bi(x):
    (ai, ai_prime, bi, bi_prime) = airy(x)
    return bi

def Ai2(x):
    third = 1.0/3.0
    hatx = 2*third*(abs(x))**1.5
    return where(x > 0,
        third*sqrt( x)*(iv(-third, hatx) - iv(third, hatx)),
        third*sqrt(-x)*(jv(-third, hatx) + jv(third, hatx)))

def Bi2(x):
    third = 1.0/3.0
    hatx = 2*third*(abs(x))**1.5
    return where(x > 0,
        sqrt( x/3.0)*(iv(-third, hatx) + iv(third, hatx)),
        sqrt(-x/3.0)*(jv(-third, hatx) - jv(third, hatx)))

 

There is a problem with Ai2 and Bi2: they return nan at 0. A more careful implementation would avoid this problem, but that’s not necessary since these functions are only for illustration. In practice, you’d simply use airy and it does the right thing at 0.
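Here's a quick numerical check of the identities at a few test points away from 0; the differences should be on the order of floating point error. The errstate context manager silences the harmless warnings from taking the square root of negative numbers in the branch that where discards.

import numpy as np

x = np.array([-5.0, -1.0, 0.5, 2.0, 5.0])
with np.errstate(invalid="ignore"):
    print(np.max(np.abs(Ai(x) - Ai2(x))))
    print(np.max(np.abs(Bi(x) - Bi2(x))))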


Hilbert space methods for PDE

When I was in grad school, my advisor asked me to study his out-of-print book, Hilbert Space Methods in Partial Differential Equations. I believe I had a photocopy of a photocopy; I don’t recall ever seeing the original book. I pored over that stack of copies line by line while preparing for my qualifying exams.

Then this evening I was browsing a used book store and was shocked to find a copy of the book, a Dover reprint.

It was an odd feeling to find what was once a precious and mysterious book available for $5.99 as part of a rag-tag assortment of mostly elementary/popular used math books.


Mean residual time

If something has survived this far, how much longer is it expected to survive? That’s the question answered by mean residual time.

For a positive random variable X, the mean residual time for X is a function eX(t) given by

e_X(t) = E(X - t \mid X > t) = \int_t^\infty  \frac{1 - F_X(x)}{1-F_X(t)} \, dx

provided the expectation and integral converge. Here F(t) is the CDF, the probability that X is less than t.

For an exponential distribution, the mean residual time is constant. For a Pareto (power law) distribution, the mean residual time is proportional to t. This has an interesting consequence, known as the Lindy effect.

Now let’s turn things around. Given a function e(t), can we find a density function for a positive random variable with that mean residual time? Yes.

The equation above yields a differential equation for F, the CDF of the distribution.

If we differentiate both sides of

e(t) (1 - F(t)) = \int_t^\infty \left(1 - F(x)\right)\, dx

with respect to t and rearrange, we get the first order differential equation

F'(t) + g(t)\, F(t) = g(t)

where

g(t) = \frac{e'(t) + 1}{e(t)}

The initial condition must be F(0) = 0 because we’re looking for the distribution of a positive random variable, i.e. the probability of X being less than zero must be 0. The solution is then

F(t) = 1 - \frac{e(0)}{e(t)} \exp\left( -\int_0^t \frac{dx}{e(x)} \right)

This means that for a desired mean residual time, you can use the equation above to create a CDF to match. The derivative of the CDF is the PDF, so differentiate both sides to get the density.
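As a sanity check, here's a short SymPy sketch of the formula. For a constant mean residual time e(t) = c it recovers the exponential distribution's CDF, consistent with the remark above that the exponential distribution has constant mean residual time.

from sympy import symbols, exp, integrate, simplify

t, x, c = symbols('t x c', positive=True)

def recovered_cdf(e):
    # F(t) = 1 - (e(0)/e(t)) exp( -integral_0^t dx/e(x) )
    return simplify(1 - e(0)/e(t)*exp(-integrate(1/e(x), (x, 0, t))))

print(recovered_cdf(lambda s: c))   # 1 - exp(-t/c)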

Negative damping

An earlier post looked at the effect of damping on free vibrations. We looked at the equation

m u'' + γ u' + k u = 0

where the coefficients m, γ, and k were all positive. But what if some of these terms are negative?

Let’s assume that m is positive. Otherwise multiply the equation above by -1. What happens if γ or k are negative?

The term γ, when positive, takes energy out of the system. A negative value of γ would be a term that adds energy, a sort of negative damping. The behavior of the solutions is determined by the eigenvalues of the system, that is the roots of the equation

m x² + γ x + k = 0.

If γ is negative, the eigenvalues have positive real part and so the amplitude of the solutions increases exponentially. If γ² < 4mk then the eigenvalues are complex and so the solutions have an oscillating component. If γ² = 4mk then there is one repeated, positive eigenvalue. And if γ² > 4mk there are two distinct real eigenvalues, both positive, since their product k/m and their sum −γ/m are both positive. The general solution is a linear combination of the two corresponding exponential solutions, and as time increases the component with the larger eigenvalue dominates.

In theory, the solution could consist purely of the more slowly growing component. But in practice, if there is even the tiniest component of the faster-growing solution, that component will eventually dominate. A numerical solution, for example, would eventually be dominated by the faster-growing solution.

Now what about negative springs? Instead of being a restoring force, a negative spring would be a sort of amplifier, reinforcing rather than resisting displacement. If k is negative, the discriminant γ² – 4mk is positive, so the eigenvalues are real and there is no oscillation. In this case there is one positive and one negative eigenvalue; the component corresponding to the positive eigenvalue grows exponentially and eventually dominates the solution.
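Here's a small sketch (coefficient values made up for illustration) computing the eigenvalues in both cases with NumPy.

import numpy as np

m, k = 1.0, 1.0
# Negative damping, positive spring: both eigenvalues real and positive
print(np.roots([m, -5.0, k]))
# Positive damping, negative spring: one positive and one negative eigenvalue
print(np.roots([m, 1.0, -k]))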

Related: If you’re interested in differential equations, check out @diff_eq on Twitter.

Pulp science fiction and vibrations

This morning I ran across Pulp-o-mizer and decided my series of posts on mechanical vibrations could use a little sensational promotion.


Damped, driven oscillations

This is the final post in a four-part series on vibrating systems. The first three parts covered free undamped vibrations, free damped vibrations, and undamped forced vibrations. Now we consider driven, damped vibrations. We are looking at the equation

m u'' + γ u' + k u = F cos ωt

where all coefficients are positive.

As in the previous post, we need to find one solution to the equation with the forcing term F cos ωt and add it to the general solution to the homogeneous (free) equation.

The particular solution we’re looking for is simply R cos(ωt – φ), but the dependence of R and φ on the coefficients m, γ, and k is complicated. We’ll come back to this later, but first we’ll see what we can understand without knowing the exact form of R and φ.

The solutions to the unforced equation decay exponentially over time because of damping. The long-term behavior of the system, known as the steady state solution, is just R cos(ωt – φ). The frequency of the steady state solution is the same as the frequency of the forcing function, though the phase will be different in general.

To find the amplitude and phase of the steady state solution, we first define Δ > 0 by

Δ² = m²(ω₀² – ω²)² + γ² ω²

where ω₀ = √(k/m) is the natural frequency of the system.

Then R = F/Δ. Note that Δ is small when the driving frequency ω is close to the natural frequency. That’s not where Δ is minimized, though when γ is small the minimum of Δ occurs close to the natural frequency. The minimum of Δ, and hence the maximum of the amplitude R, occurs at a lower frequency, namely

ω² = ω₀² – γ²/2m².

The phase φ satisfies

cos φ = m(ω₀² – ω²)/Δ

and

sin φ = γω/Δ.

For low frequency ω, the phase φ is small, i.e. the response is nearly in phase with the excitation. For ω = ω₀, i.e. forcing at the natural frequency, the response lags the excitation by π/2, a quarter cycle.
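Here's a sketch (coefficients chosen arbitrarily) that computes the steady state amplitude and phase over a range of driving frequencies and checks where the amplitude peaks.

import numpy as np

m, gamma, k, F = 1.0, 0.2, 4.0, 1.0
omega0 = np.sqrt(k/m)
omega = np.linspace(0.1, 4.0, 2000)

Delta = np.sqrt(m**2*(omega0**2 - omega**2)**2 + gamma**2*omega**2)
R = F/Delta
phi = np.arctan2(gamma*omega, m*(omega0**2 - omega**2))

print("amplitude peaks near   ", omega[np.argmax(R)])
print("predicted peak at      ", np.sqrt(omega0**2 - gamma**2/(2*m**2)))
print("phase at omega0 (~pi/2):", phi[np.argmin(np.abs(omega - omega0))])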

Undamped forced vibrations

This is the third in a series of four blog posts on mechanical vibrations. The first two posts were

  • Part I: Introduction and free undamped vibrations
  • Part II: Free damped vibrations

We’re looking at solutions to the differential equation

m u'' + γ u' + k u = F cos ωt

The first two posts considered “free” vibrations, systems with no external input. The coefficient F is zero in such systems. Now we consider the effect of an external driving force of the form F cos ωt. We’ll leave out damping in this post, that is, we will assume γ = 0. The final post in this series will cover damped, driven oscillations, i.e. the case of γ and F both positive.

When the right-hand side of a linear differential equation is 0, we say the equation is homogeneous. When the right-hand side is not zero, we say the equation is non-homogeneous.

The general solution to a non-homogeneous linear differential equation is the general solution to the corresponding homogeneous equation plus any particular solution to the non-homogeneous equation. That is, all we have to do is find one solution to the equation with the forcing term, and add it to the general solution to the equation without the forcing term.

Recall that the natural frequency of our equation without forcing is ω₀, defined by

ω₀² = k/m.

and that the general solution without forcing is

u(t) = R cos(ω₀t – φ)

where the amplitude R and the phase φ are determined by the initial conditions. To find the general solution to the non-homogeneous equation we need one particular solution, and here’s one:

F cos(ωt) / (m(ω₀² – ω²))

provided that the natural frequency ω₀ is not equal to the driving frequency ω. This says that the general solution is the sum of two periodic functions, one with the natural frequency ω₀ and one with the frequency of the driving force ω. In general, the sum of two periodic functions can be fairly complicated, and not necessarily periodic. More on that here.

Next we will look at two phenomena that the solutions can illustrate: beats and resonance. The latter happens when the natural and driving frequencies are equal.

Beats

If the mass starts at position 0 with velocity 0, then the amplitudes of the homogeneous and non-homogeneous solutions are the same. The solution is

F/(m(ω₀² – ω²)) ( cos(ωt) – cos(ω₀t) )

We can apply the trig identity

cos(A) – cos(B) = 2 sin( (B – A)/2 ) sin( (A + B)/2 )

to rewrite the solution as the amplitude 2F/(m(ω₀² – ω²)) times

sin( (ω₀ – ω) t/2 ) sin( (ω₀ + ω) t/2 )

If the natural and driving frequencies are similar, i.e. |ω₀ – ω| is small, then the first factor oscillates at a low frequency and modulates the second, high frequency factor. This phenomenon is known as beats. You can hear it, for example, when two instruments are nearly but not quite in tune. If one is playing an A 440 and the other an A 441, you’ll hear a pitch of 440.5 getting louder and softer at a frequency of one cycle per second.

Here’s an example of beats, plotting sin(t) sin(10t). The solid blue line is the product. The dotted green lines are sin(t) and -sin(t).
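Here's a sketch reproducing a plot like that, assuming NumPy and Matplotlib are available.

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 4*np.pi, 1000)
plt.plot(t, np.sin(t)*np.sin(10*t), "b-")          # the product
plt.plot(t, np.sin(t), "g:", t, -np.sin(t), "g:")  # the slowly varying envelope
plt.show()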

Incidentally, this is related to how AM radio works: the broadcast content is multiplied by the carrier wave.

Resonance

When we wrote down the solution above, we assumed the natural and driving frequencies were not equal. If they are equal, i.e. if the system is being driven at its natural frequency, we have resonance. In that case, the solution to the non-homogeneous equation is

(F/(2mω)) t sin ωt.

Because the solution is proportional to t sin ωt, this says we have oscillations whose amplitude grows linearly with time. In theory, the oscillations can become arbitrarily large. In reality there is always some damping, which eventually puts some limit on the amplitude of the oscillations. Also, the system may break before damping would rein in the oscillations.

Here’s an example of resonance, plotting 0.25 t sin(t) in solid blue and ± 0.25 t in dotted green.

Next

The final post in this series will look at the effect of damping on forced vibrations.

Free damped vibrations

This is the second post in a series on vibrations determined by the equation

m u'' + γ u' + k u = F cos ωt

The first post in the series looked at the simplest case, γ = 0 and F = 0. Now we’ll look at the more realistic and more interesting case of damping γ > 0. We won’t consider forcing yet, so F = 0 and our equation reduces to

m u'' + γ u' + k u = 0.

The effect of damping obviously depends on the amount of damping, i.e. the size of γ relative to other terms. Damping removes energy from the system and so the amplitude of the oscillations goes to zero over time, regardless of the amount of damping. However, the system can have three qualitatively different behaviors: under-damping, critical damping, and over-damping.

The characteristic polynomial of the equation is

m x² + γ x + k

This polynomial has either two complex roots, one repeated real root, or two real roots depending on whether the discriminant γ² – 4mk is negative, zero, or positive. These three states correspond to under-damping, critical damping, and over-damping.

Under-damping

When damping is small, the system vibrates at first approximately as if there were no damping, but the amplitude of the solutions decreases exponentially. As long as γ² < 4mk the system is under-damped and the solution is

u(t) = R exp( –γt/2m ) cos( μt – φ )

where

μ = √(4mk – γ²) / 2m.

The amplitude R and the phase φ are determined by the initial conditions.

As an example, let γ = 1, m = 1, and k = 5. This implies μ = √19/2. Set the initial conditions so that R = 6 and φ = 1. The solid blue line below is the solution 6 exp(–t/2) cos(μt – 1), and the dotted green lines above and below are the exponential envelope 6 exp(–t/2).
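Here's a sketch reproducing that figure, assuming NumPy and Matplotlib.

import numpy as np
import matplotlib.pyplot as plt

m, gamma, k = 1.0, 1.0, 5.0
mu = np.sqrt(4*m*k - gamma**2)/(2*m)   # quasi-frequency, sqrt(19)/2 here
R, phi = 6.0, 1.0

t = np.linspace(0, 10, 500)
envelope = R*np.exp(-gamma*t/(2*m))
plt.plot(t, envelope*np.cos(mu*t - phi), "b-")
plt.plot(t, envelope, "g:", t, -envelope, "g:")
plt.show()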

And here is a video of the mass-spring-dashpot system in action made using the code discussed here.

Note that the solution is not simply an oscillation at the natural frequency multiplied by an exponential term. The damping changes the frequency of the periodic term, though the change is small when the damping is small.

The factor μ is called the quasi-frequency. Recall from the previous post that the natural frequency, the frequency of the solution with no damping, is denoted ω₀. The quasi-frequency is related to the natural frequency by

μ = √(1 – γ²/4mk) ω₀ ≈ (1 – γ²/8mk) ω₀.

The approximation above holds for small γ since a Taylor expansion shows that √(1 – x) ≈ 1 – x/2 for small x.

This says that the quasi-frequency is lower than the natural frequency by a relative amount of approximately γ²/8mk. So when γ is small there is little change to the natural frequency, but the change grows quadratically as γ increases.

Critical damping

Critical damping occurs when γ² = 4mk. In this case the solution is given by

u(t) = (A + Bt) exp( –γt/2m ).

The position of the mass will asymptotically approach 0. Depending on the initial velocity, it will either go monotonically to zero or reach some maximum displacement before it turns around and goes to 0.

For an example, let m = 1 and k = 5 as before. For critical damping, γ must equal √20. Choose initial conditions so that A = 1 and B = 10.

Over-damping

When γ² > 4mk the system is over-damped. When damping is that large, the system does not oscillate at all.

The solution is

u(t) = A exp(r1 t) + B exp(r2 t)

where r1 and r2 are the two roots of

m x² + γ x + k = 0.

Because m, γ, and k are all positive, both roots are negative and solutions decay exponentially to 0.

Coming up

The next two posts in the series will add a forcing term, i.e. F > 0 in our differential equation. The next post will look at undamped driven vibrations, and the final post will look at damped driven vibrations.

Fourier series before Fourier

I always thought that Fourier was the first to come up with the idea of expressing general functions as infinite sums of sines and cosines. Apparently this isn’t true.

The idea that various functions can be described in terms of Fourier series … was for the first time proposed by Daniel Bernoulli (1700–1782) to solve the one-dimensional wave equation (the equation of motion of a string) about 50 years before Fourier. … However, no one contemporaneous to D. Bernoulli accepted the idea as a general method, and soon the study was forgotten.

Source: The Nonlinear World

Perhaps Fourier’s name stuck to the idea because he developed it further than Bernoulli did.


Mechanical vibrations

My favorite topic in an introductory differential equations course is mechanical and electrical vibrations. I enjoyed learning about it as a student and I enjoyed teaching it later. (Or more accurately, I enjoyed being exposed to it as a student and really learning it later when I had to teach it.)

I find this subject interesting for three reasons.

  1. The same equations describe a variety of mechanical and electrical systems.
  2. You can get practical use out of some relatively simple math.
  3. The solutions display a wide variety of behavior as you vary the coefficients.

This is the first of a four-part series of posts on mechanical vibrations. The posts won’t be consecutive: I’ll write about other things in between.

Simple mechanical vibrations satisfy the following differential equation:

m u'' + γ u' + k u = F cos ωt

We could simply write down the general solution and be done with it. But the focus here won’t be on finding the solutions but rather on understanding how the solutions behave.

We’ll think of our equation as modeling a system with a mass attached to a spring and a dashpot. All coefficients are constant. m is the mass, γ is the damping from the dashpot, and k is the restoring force from the spring. The driving force has amplitude F and frequency ω. The solution u(t) gives the position of the mass at time t.

More complicated vibrations, such as a tall building swaying in the wind, can be approximated by this simple setting.

The same differential equation could model an electrical circuit with an inductor, resistor, and capacitor. In that case replace mass m with the inductance L, damping γ with resistance R, and spring constant k with the reciprocal of capacitance C. Then the equation gives the charge on the capacitor at time t.

We will assume m and k are positive. The four blog posts will correspond to γ zero or positive, and F zero or non-zero. Since γ represents damping, the system is called undamped when γ = 0 and damped when γ is greater than 0. And since F is the amplitude of the forcing function, the system is called free when F = 0 and forced otherwise. So the plan for the four posts is

  1. Free, undamped vibrations (this post)
  2. Free, damped vibrations
  3. Undamped, forced vibrations
  4. Damped, driven vibrations

Free, undamped vibrations

With no damping and no forcing, our equation is simply

m u'' + k u = 0

and we can write down the solution

u(t) = A sin ω₀t + B cos ω₀t

where

ω₀² = k/m.

The value ω₀ is called the natural frequency of the system because it gives the frequency of vibration when there is no forcing function.

The values of A and B are determined by the initial conditions, i.e. u(0) = B and u′(0) = A ω₀.

Since the sine and cosine components have the same frequency ω₀, we can use a trig identity to combine them into a single function

u(t) = R cos(ω₀t – φ)

The amplitude R and phase φ are related to the parameters A and B by

A = R sin φ, B = R cos φ.
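Here's a minimal sketch (initial conditions made up for illustration) that converts initial position and velocity into amplitude and phase form and verifies the two forms agree.

import numpy as np

m, k = 1.0, 4.0
u0, v0 = 1.0, 0.5              # initial position and velocity

omega0 = np.sqrt(k/m)
B, A = u0, v0/omega0           # u(t) = A sin(omega0 t) + B cos(omega0 t)
R = np.hypot(A, B)
phi = np.arctan2(A, B)         # so that A = R sin(phi), B = R cos(phi)

t = np.linspace(0, 10, 50)
print(np.allclose(A*np.sin(omega0*t) + B*np.cos(omega0*t),
                  R*np.cos(omega0*t - phi)))   # True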

That’s it for undamped free vibrations: the solutions are just sine waves. The next post in the series will make things more realistic and more interesting by adding damping.

Update: Here’s an animation of undamped free oscillation using code by Stéfan van der Walt described here.

Differential Equations and the City

This afternoon I got a review copy of X and the City: Modeling Aspects of Urban Life by John A. Adam. It’s a book about mathematical modeling, taking all its examples from urban life: public transportation, growth, pollution, etc. I’ve only skimmed through the book so far, but it looks like most of the applications involve differential equations. Some depend on algebra or probability.

The book looks interesting. I hope to say more about it once I’ve had a chance to read it. The examples are all short, so it may be an easy book to read a little at a time.

I also got a review copy of The Book of Inkscape today, and I’m expecting several other books soon. It may take a while to get through these since this is a busy time for me. When it rains, it pours.

Castles and quantum mechanics

How are castles and quantum mechanics related? One connection is rook polynomials.

The rook is the chess piece that looks like a castle, and used to be called a castle. It can move vertically or horizontally, any number of spaces.

A rook polynomial is a polynomial whose coefficients give the number of ways rooks can be arranged on a chess board without attacking each other. The coefficient of xᵏ in the polynomial Rm,n(x) is the number of ways you can arrange k rooks on an m by n chessboard such that no two rooks are in the same row or column.

The rook polynomials are related to the Laguerre polynomials by

Rm,n(x) = n! xⁿ Lₙᵐ⁻ⁿ(–1/x)

where Lₙᵏ(x) is an “associated Laguerre polynomial.” These polynomials satisfy Laguerre’s differential equation

x y'' + (k + 1 – x) y' + n y = 0,

an equation that comes up in numerous contexts in physics. In quantum mechanics, these polynomials arise in the solution of the Schrödinger equation for the hydrogen atom.
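Here's a quick numerical check of the relation using SciPy, evaluated on a 2 by 2 board, whose rook polynomial is 1 + 4x + 2x²: one way to place no rooks, four ways to place one rook, and two ways to place two non-attacking rooks.

from math import factorial
from scipy.special import eval_genlaguerre

def rook_poly(m, n, x):
    # R_{m,n}(x) = n! x^n L_n^(m-n)(-1/x), for x != 0
    return factorial(n)*x**n*eval_genlaguerre(n, m - n, -1.0/x)

x = 0.7
print(rook_poly(2, 2, x))      # Laguerre-based formula
print(1 + 4*x + 2*x**2)        # direct rook polynomial of a 2 by 2 board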

Related: Relations between special functions

Easiest and hardest classes to teach

I’ve taught a variety of math classes, and statistics has been the hardest to teach. The thing I find most challenging is coming up with homework problems. Most exercises are either blatantly artificial or extremely tedious. It’s hard to find moderately realistic problems that don’t take too long to work out.

The course I’ve found easiest to teach has been differential equations. The course has a flat structure: there’s a list of techniques to cover, all roughly the same level of difficulty. There are no deep analytic or philosophical issues to skirt around as there are in statistics. And it’s not hard to come up with practical applications that can be worked out fairly easily.

Related post: Impure math

Boundary conditions are the hard part

What we call “differential equations” are usually not just differential equations. They also have associated initial conditions or boundary conditions.

With ordinary differential equations (ODEs), the initial conditions are often an afterthought. First you find a full set of solutions, then you plug in initial conditions to get a specific solution.

Partial differential equations (PDEs) have boundary conditions (and maybe initial conditions too). Since people typically learn ODEs first, they come to PDEs expecting boundary values to play a role analogous to initial conditions for ODEs. In a very limited sense they do, but in general boundary values are quite different.

The hard part about PDEs is not the PDEs themselves; the hard part is the boundary conditions. Finding solutions to differential equations in the interior of a domain is easy compared to making the equations have the specified behavior on the boundary.

No model can take everything into account. You have to draw some box around that part of the world that you’re going to model and specify what happens when your imaginary box meets the rest of the universe. That’s the hard part.
