Double roots and ODEs

This post will resolve a sort of paradox. The process of solving a difference or differential equation is different when the characteristic equation has a double root. But intuitively there shouldn’t be much difference between having a double root and having two roots very close together.

I’ll first say how double roots affect finding solutions to difference and differential equations, then I’ll focus on the differential equations and look at the difference between a double root and two roots very close together.

Double roots

The previous post looked at a second order difference equation and a second order differential equation. Both are solved by analogous techniques. The first step in solving either the difference equation

a yₙ + b yₙ₋₁ + c yₙ₋₂ = 0

or the differential equation

ay″ + by′ + cy = 0

is to find the roots of the characteristic equation

aλ² + bλ + c = 0.

If there are two distinct roots to the characteristic equation, then each gives rise to an independent solution to the difference or differential equation. The roots may be complex, and that changes the qualitative behavior of the solution, but it doesn’t change the solution process.

But what if the characteristic equation has a double root λ? One solution is found the same way: λⁿ for the difference equation and exp(λt) for the differential equation. We know from general theorems that a second independent solution must exist, but how do we find it?

For the difference equation a second solution is nλⁿ, and for the differential equation a second solution is t exp(λt).
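
As a quick sanity check, here is a short Mathematica verification that t exp(5t) solves an equation whose characteristic polynomial has 5 as a double root. The particular equation below is chosen just for illustration.

    (* The characteristic equation of y'' - 10 y' + 25 y = 0 has the double root 5. *)
    y[t_] := t Exp[5 t]
    Simplify[y''[t] - 10 y'[t] + 25 y[t]]  (* returns 0 *)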

Close roots

For the rest of the post I’ll focus on differential equations, though similar remarks apply to difference equations.

If λ = 5 is a double root of our differential equation then the general solution is

α exp(5t) + β t exp(5t)

for some constants α and β. But if our roots are λ = 4.999 and λ = 5.001 then the general solution is

α exp(4.999 t) + β exp(5.001 t)

Surely these two solutions can’t be that different. If the parameters for the differential equation are empirically determined, can we even tell the difference between a double root and a pair of distinct roots a hair’s breadth apart?

On the other hand, β exp(5t) and β t exp(5t) are quite different for large t. If t = 100, the latter is 100 times larger! However, this bumps up against the Greek letter paradox: assuming that parameters with the same name in two equations mean the same thing. When we apply the same initial conditions to the two solutions above, we get different values of α and β for each, so comparing exp(5t) and t exp(5t) is the wrong comparison. The right comparison is between the solutions to two initial value problems.

We’ll look at two examples, one with λ positive and one with λ negative.

Example with λ > 0

If λ > 0, then solutions grow exponentially as t increases. The size of β t exp(λt) will eventually matter, regardless of how small β is. However, any error in the problem, such as measurement error in the initial conditions or error in computing the solution numerically, will also grow exponentially. The difference between a double root and two nearby roots may matter less than, say, how accurately the initial conditions are known.

Let’s look at

y″ – 10 y′ + 25 y = 0

and

y″ – 10 y′ + 24.9999 y = 0

both with initial conditions y(0) = 3 and y′(0) = 4.

The characteristic equation for the former has a double root λ = 5 and that of the latter has roots 4.99 and 5.01.

The solution to the first initial value problem is

3 exp(5t) – 11 t exp(5t)

and the solution to the second initial value problem is

551.5 exp(4.99t) – 548.5 exp(5.01t).

Notice that the coefficients are very different, illustrating the Greek letter paradox.

The expressions for the two solutions look different, but they’re indistinguishable for small t, say less than 1. But as t gets larger the solutions diverge. The same would be true if we solved the first equation twice, with y(0) = 3 and y(0) = 3.01.
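
If you’d like to reproduce these solutions, here’s a minimal Mathematica sketch. DSolve returns expressions equivalent to the ones above, and plotting their difference shows the solutions agreeing at first and then pulling apart.

    sol1 = DSolve[{y''[t] - 10 y'[t] + 25 y[t] == 0, y[0] == 3, y'[0] == 4}, y, t];
    sol2 = DSolve[{y''[t] - 10 y'[t] + 24.9999 y[t] == 0, y[0] == 3, y'[0] == 4}, y, t];
    (* nearly identical for small t, then the curves diverge *)
    Plot[Evaluate[(y[t] /. sol1) - (y[t] /. sol2)], {t, 0, 3}]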

Example with λ < 0

Now let’s look at

y″ + 10 y′ + 25 y = 0

and

y″ + 10 y′ + 24.9999 y = 0

both with initial conditions y(0) = 3 and y′(0) = 4. Note that the velocity coefficient changed sign from -10 to 10.

Now the characteristic equation of the first differential equation has a double root at -5, and the second has roots -4.99 and -5.01.

The solution to the first initial value problem is

3 exp(-5t)  + 19 t exp(-5t)

and the solution to the second initial value problem is

951.5 exp(-4.99t) – 948.5 exp(-5.01t).

Again the written forms of the two solutions look quite different, but their plots are indistinguishable.

Unlike the case λ > 0, now with λ < 0 the difference between the solutions is bounded. Before, the solutions started out close together and eventually diverged. Now the solutions stay close together forever.

Here’s a plot of the difference between the two solutions.

So the maximum separation between the two solutions occurs somewhere around t = 0.5 where the solutions differ in the sixth decimal place.
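
Here’s a short Mathematica sketch that produces a plot like the one described, using the two closed-form solutions above.

    y1[t_] := 3 Exp[-5 t] + 19 t Exp[-5 t]
    y2[t_] := 951.5 Exp[-4.99 t] - 948.5 Exp[-5.01 t]
    (* the difference stays tiny and peaks near t = 0.5 *)
    Plot[y1[t] - y2[t], {t, 0, 3}]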

Conclusion

Whether the characteristic equation has a double root or two close roots makes a significant difference in the written form of the solution to a differential equation. If the roots are positive, the solutions are initially close together then diverge. If the roots are negative, the solutions are always close together.

Difference equations and differential equations

Difference equations are very much analogous to differential equations. Difference equations are more elementary, but differential equations are more familiar.

It seems odd to use a more advanced thing to explain a simpler thing, like saying a quartet is a symphony orchestra with only four instruments. But if for some reason you managed to become familiar with orchestras before learning what a quartet is, this explanation would be helpful.

Differential equations are a standard part of the curriculum for science and engineering students, but difference equations are not. And not without reason: differential equations are a prerequisite for future courses but difference equations are not. But if a curriculum were to include difference equations and differential equations, it might make sense to teach the former first because the topic is more tangible.

To illustrate the similarity between the two topics, this post will solve the difference equation [1]

Fₙ₊₂ = Fₙ₊₁ + Fₙ

and the differential equation

y″ = y′ + y

The difference equation defines the Fibonacci numbers, with one set of initial conditions, and the Lucas numbers with different initial conditions. The differential equation is analogous in ways we’ll see below.

Both equations are solved by assuming the form of the solution with a free parameter. Two values of the parameter will be found via a quadratic equation, giving two independent solutions. All solutions are linear combinations of these two solutions.

Difference equation example

The Fibonacci numbers start with F₀ = 0 and F₁ = 1. From there it’s obvious that F₂ = 1, F₃ = 2, and so forth.

The Lucas numbers start with L₀ = 2 and L₁ = 1. From there we see L₂ = 3, L₃ = 4, etc.

Bootstrapping the solution starting from initial conditions is trivial. However, this does not give us a general expression for the solution, and it does not let us explore the solutions of the difference equation for all initial conditions.

So let’s go back and solve the difference equation first, then apply initial conditions, analogous to what one typically does for ordinary differential equations.

Assume solutions have the form Fₙ = xⁿ for some x to be determined. Then

xⁿ⁺² − xⁿ⁺¹ − xⁿ = 0

for all non-negative n and so

x² – x – 1 = 0

and the two roots are the golden ratio

ϕ = (1 + √5)/2

and its complement

1 – ϕ = (1 – √5)/2.

So the general solution to the recurrence relation

Fₙ₊₂ = Fₙ₊₁ + Fₙ

is

Fₙ = aϕⁿ + b(1 − ϕ)ⁿ

for constants a and b to be determined by initial conditions. For the Fibonacci numbers, a = 1/√5 and b = –a. For the Lucas numbers, a = b = 1.
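
As a quick check, the following Mathematica snippet evaluates these closed forms and recovers the first several Fibonacci and Lucas numbers.

    phi = GoldenRatio;
    fib[n_]   := Simplify[(phi^n - (1 - phi)^n)/Sqrt[5]]
    lucas[n_] := Simplify[phi^n + (1 - phi)^n]
    Table[fib[n],   {n, 0, 10}]  (* {0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55} *)
    Table[lucas[n], {n, 0, 10}]  (* {2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123} *)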

Our assumption of the form of the solution is justified because it worked. There can’t be any other solutions of another form because clearly the initial conditions determine the third element of the sequence, and the second and third elements determine the fourth, etc.

Differential equation example

Assume solutions have the form y = exp(kt) for some k to be determined. Sticking this into the differential equation shows

k² exp(kt) – k exp(kt) – exp(kt) = 0

and so

k² – k – 1 = 0.

This is the same equation as above, and so the roots are ϕ and 1 – ϕ. The general solution is

y(t) = a exp(ϕt) + b exp((1-ϕ)t)

where the constants a and b are determined by the initial values of y and y′.

It’s not as clear in this case that we’ve found all the solutions and that our solution is unique given initial conditions, but it’s true.
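
To make the analogy with the Lucas numbers concrete, here’s a small Mathematica check. The initial conditions y(0) = 2 and y′(0) = 1 are chosen to mirror L₀ = 2 and L₁ = 1, and they make both coefficients equal to 1, just as for the Lucas numbers.

    DSolve[{y''[t] == y'[t] + y[t], y[0] == 2, y'[0] == 1}, y[t], t]
    (* equivalent to Exp[ϕ t] + Exp[(1 - ϕ) t], i.e. a = b = 1 *)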

[1] Someone might object that this is a recurrence relation and not a difference equation: each term is defined in terms of its predecessors, but not in terms of function differences. But you could rewrite the equation in terms of differences.

If ΔFₙ = Fₙ − Fₙ₋₁, then we can rewrite our equation as Δ²F − 3ΔF + F = 0, since Δ²Fₙ − 3ΔFₙ + Fₙ = (Fₙ − 2Fₙ₋₁ + Fₙ₋₂) − 3(Fₙ − Fₙ₋₁) + Fₙ = −(Fₙ − Fₙ₋₁ − Fₙ₋₂).

Oscillations in RLC circuits

Electrical and mechanical oscillations satisfy analogous equations. This is the basis of using the word “analog” in electronics. You could study a mechanical system by building an analogous circuit and measuring that circuit in a lab.

Mass, dashpot, spring

Years ago I wrote a series of four posts about mechanical vibrations.

Everything in these posts maps over to electrical vibrations with a change of notation.

That series looked at the differential equation

m u'' + \gamma u' + k u = F \cos\omega t

where m is mass, γ is damping from a dashpot, and k is the stiffness of a spring.

Inductor, resistor, capacitor

Now we replace our mass, dashpot, and spring with an inductor, resistor, and capacitor.

Imagine a circuit with an L henry inductor, an R ohm resistor, and a C farad capacitor in series. Let Q(t) be the charge in coulombs over time and let E(t) be an applied voltage, i.e. an AC power source.

Charge formulation

One can use Kirchhoff’s law to derive

LQ'' + RQ' + \frac{1}{C} Q = E(t)

Here we have the correspondences

\begin{align*} u &\leftrightarrow Q \\ m &\leftrightarrow L \\ \gamma &\leftrightarrow R \\ k &\leftrightarrow 1/C \end{align*}

So charge is analogous to position, inductance is analogous to mass, resistance is analogous to damping, and capacitance is analogous to the reciprocal of stiffness.

The reciprocal of capacitance is called elastance, so we can say elastance is analogous to stiffness.

Current formulation

It’s more common to see the differential equation above written in terms of current I.

I = \frac{dQ}{dt}

If we take the derivative of both sides of

LQ'' + RQ' + \frac{1}{C} Q = E(t)

we get

LI'' + RI' + \frac{1}{C} I = E'

Natural frequency

With mechanical vibrations, as shown here, the natural frequency is

\omega_0 = \sqrt{\frac{k}{m}}

and with electrical oscillations this becomes

\omega_0 = \frac{1}{\sqrt{LC}}

Steady state

When a mechanical or electrical system is driven by a sinusoidal forcing function, the system eventually settles down to a solution that is proportional to a phase-shifted version of the driving function.

To be more explicit, the solution to the differential equation

m u'' + \gamma u' + k u = F \cos\omega t

has a transient component that decays exponentially and a steady state component proportional to cos(ωt-φ). The same is true of the equation

LI'' + RI' + \frac{1}{C} I = E'

The proportionality constant is conventionally denoted 1/Δ and so the steady state solution is

\frac{F}{\Delta} \cos(\omega t - \phi)

for the mechanical case and

\frac{T}{\Delta} \cos(\omega t - \phi)

for the electrical case.

The constant Δ satisfies

\Delta^2 = m^2(\omega_0^2 -\omega^2)^2 + \gamma^2 \omega^2

for the mechanical system and

\Delta^2 = L^2(\omega_0^2 -\omega^2)^2 + R^2 \omega^2

for the electrical system.

When the damping coefficient γ or the resistance R is small, the maximum amplitude occurs when the driving frequency ω is near the natural frequency ω0.
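
Here’s a small Mathematica sketch of that resonance behavior. The component values below are made up for illustration; they are not from any particular circuit. (C is a reserved symbol in Mathematica, so the capacitance is called c.)

    L = 1; R = 0.2; c = 0.01;       (* 1 henry, 0.2 ohm, 0.01 farad -- illustrative values *)
    w0 = 1/Sqrt[L c];               (* natural frequency *)
    Delta[w_] := Sqrt[L^2 (w0^2 - w^2)^2 + R^2 w^2]
    Plot[1/Delta[w], {w, 0, 2 w0}]  (* the amplitude factor peaks near w = w0 *)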

More on damped, driven oscillations here.

Lyapunov exponents

Chaotic systems are unpredictable. Or rather chaotic systems are not deterministically predictable in the long run.

You can make predictions if you weaken one of these requirements. You can make deterministic predictions in the short run, or statistical predictions in the long run.

Lyapunov exponents are a way to measure how quickly the short run turns into the long run. There is a rigorous definition of a Lyapunov exponent, but I often hear the term used metaphorically. If something is said to have a large Lyapunov exponent, it quickly becomes unpredictable. (Unpredictable as far as deterministic prediction goes.)

In a chaotic system, the uncertainty around the future state of the system grows exponentially. If you start with a small sphere of initial conditions at time t = 0, a sphere of diameter ε, that sphere becomes an ellipsoid over time. The logarithm of the ratio of the maximum ellipsoid diameter to the diameter of the initial sphere, per unit time, is the Lyapunov exponent λ. (Technically you have to average this log ratio over all possible starting points.)

With a Lyapunov exponent λ the uncertainty in your solution is bounded by ε exp(λt).

You can predict the future state of a chaotic system if one of ε, λ, or t is small enough. The larger λ is, the smaller t must be, and vice versa. So in this sense the Lyapunov exponent tells you how far out you can make predictions; eventually your error bars are so wide as to be worthless.

A Lyapunov exponent isn’t something you can easily calculate exactly except for toy problems, but it is something you can easily estimate empirically. Pick a cloud of initial conditions and see what happens to that cloud over time.
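
As a toy illustration of that empirical approach, here’s a Mathematica sketch using the logistic map with r = 4, an example of my choosing rather than one from this post. Its Lyapunov exponent is known to be log 2, and tracking how fast two nearby starting points separate gives a rough estimate of it.

    f[x_] := 4 x (1 - x)             (* logistic map with r = 4 *)
    x0 = 0.3; eps = 10.^-9; n = 20;
    a = Nest[f, x0, n];
    b = Nest[f, x0 + eps, n];
    Log[Abs[a - b]/eps]/n            (* rough estimate; the exact value is Log[2] ≈ 0.693 *)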

A closer look at zero spacing

A few days ago I wrote about Kneser’s theorem. This theorem tells whether the differential equation

u″(x) + h(x) u(x) = 0

will oscillate indefinitely, i.e. whether it will have an infinite number of zeros.

Today’s post will look at another theorem that gives more specific information about the spacing of the zeros.

Kneser’s theorem said that the decay rate of h determines the oscillatory behavior of the solution u. Specifically, if h decays more slowly than ¼x⁻² then oscillations will continue, but otherwise they will eventually stop.

Zero spacing

A special case of a theorem given in [1] says that an upper bound on h gives a lower bound on zero spacing, and a lower bound on h gives an upper bound on zero spacing.

Specifically, if h is bounded above by M, then the spacing between zeros is no less than π/√M. And if h is bounded below by M′ > 0, then the spacing between zeros is no more than π/√M′.
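
The constant-coefficient case shows where the π/√M spacing comes from. Here’s a one-line Mathematica sanity check, not part of the theorem’s proof.

    DSolve[u''[x] + m u[x] == 0, u[x], x]
    (* C[1] Cos[Sqrt[m] x] + C[2] Sin[Sqrt[m] x]: every solution is a shifted
       sinusoid with period 2π/√m, so consecutive zeros are π/√m apart *)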

Let’s look back at a plot from the post on Kneser’s theorem:

The spacing between the oscillations appears to be increasing linearly. We could have predicted that from the theorem in this post. The coefficient h(x) is roughly on the order of 1/x², and so the spacing between the zeros is going to be on the order of x.

Specifically, in the earlier post we looked at

h(x) = (1 + x/10)^−p

where p was 1.9 and 2.1, two exponents chosen to be close to either side of the boundary in Kneser’s theorem. That post used q and t rather than h and x, but I changed notation here to match [1].

Suppose we’ve just seen a zero at x*. Then from that point on h is bounded above by h(x*) and so the distance to the next zero is bounded below by π/√h(x*). That tells you where to start looking for the next zero.

Our function h is bounded below only by 0, and so we can’t apply the theorem above globally. But we could look some finite distance ahead, and thus get a positive lower bound, and that would lead to an upper bound on the location of the next zero. So we could use theory to find an interval to search for our next zero.

More general theorem

The form of the theorem I quoted from [1] was a simplification. The full theorem considers the differential equation

u″(x) + g(x) u′(x) + h(x) u(x) = 0

The more general theorem looks at upper and lower bounds on

h(x) − ½ g′(x) − ¼ g(x)²,

but in our case g = 0 and so the hypotheses reduce to bounds on just h.

Example

Exercise 3 from [1] says to look at the zeros of a solution to

u″(x) + ½x² u′(x) + (10 + 2x) u(x) = 0.

Here we have

h(x) − ½ g′(x) − ¼ g(x)² = 10 + 2x − x − 1 = 9 + x

and so the lower bound of 9 tells us zeros are spaced at most π/3 apart.

Let’s look at a plot of the solution using Mathematica.

    s = NDSolve[{u''[x] + 0.5 x^2 u'[x] + (10 + 2 x) u[x] == 0, 
                 u[0] == 1, u'[0] == 1 }, u, {x, 0, 5}]
    Plot[Evaluate[u[x] /. s], {x, 0, 5}]

This produces the following.

There are clearly zeros between 0 and 1, between 1 and 2, and between 2 and 3. It’s hard to see whether there are any more. Let’s see exactly where the roots are that we’re sure of.

    FindRoot[u[x] /. s, {x, 0, 1}]
    {x -> 0.584153}

    FindRoot[u[x] /. s, {x, 1, 2}]
    {x -> 1.51114}

    FindRoot[u[x] /. s, {x, 2, 3}]
    {x -> 2.41892}

The distance between the first two zeros is 0.926983 and the distance between the second and third zeros is 0.90778. This is consistent with our theorem because π/3 > 1 and the spacing between our zeros is less than 1.

Are there more zeros we can’t see in the plot above? When I asked Mathematica to evaluate

    FindRoot[u[x] /. s, {x, 2, 4}]

it failed with several warning messages. But we know from our theorem that if there is another zero, it has to be less than a distance of π/3 from the last one we found. We can use this information to narrow down where we want Mathematica to look.

    FindRoot[u[x] /. s, {x, 2.41892, 2.41892 + Pi/3}]

Now Mathematica says there is indeed another solution.

    {x -> 3.42523}

Is there yet another solution? If so, it can’t be more than a distance π/3 from the one we just found.

    FindRoot[u[x] /. s, {x, 3.42523, 3.42523 + Pi/3}]

This returns 3.42523 again, so Mathematica didn’t find another zero. Assuming Mathematica is correct, this means there cannot be any more zeros.

On the interval [0, 5] the quantity in the theorem is bounded above by 14, so an additional zero would need to be at least π/√14 away from the latest one. So if there’s another zero, it’s in the interval [4.26486, 4.47243]. If we ask Mathematica to find a zero in that interval, it fails with warning messages. We can plot the solution over that interval and see that it’s never zero.

[1] Protter and Weinberger. Maximum Principles in Differential Equations. Springer-Verlag. 1984. Page 46.

Slowing oscillations

Consider the second order linear differential equation

y''(t) + ky(t) = 0

You could think of y as giving the height of a mass attached to a spring at time t, where k is the stiffness of the spring. The solutions are sines and cosines with frequency √k. Think about how the solution depends on the spring constant k. If k decreases, corresponding to softening the spring, the frequency of the solutions decreases, and the period between oscillations increases.

Now suppose we replace the constant k with a decreasing positive function q(t). It seems that by controlling the decay rate of q we could control the oscillations of our system. For q decaying slowly but not too slowly, the solutions would continue to oscillate, crossing back and forth across the zero position, but with increasing time between crossings.

Below some critical rate of decay, the solutions may stop crossing zero at all. It seems plausible that if q decreases quickly enough, the frequency could decrease along with it at just the right rate so that the next oscillation never fully arrives.

Adolf Kneser (1862–1930) proved that the intuition described above is indeed correct. He proved that if

\limsup _{t \to \infty} t^2 q(t) < \frac{1}{4}

the solutions to

y'' + q(t)y = 0

will eventually stop oscillating, but if

\liminf _{t \to \infty} t^2 q(t) > \frac{1}{4}

they will oscillate forever.

Example

We’ll use initial conditions y(0) = 1 and y′(0) = 0.

Let q(t) = (1 + 0.1 t)^−1.9. This choice of q decays more slowly than 0.25 t⁻² and so our solutions should oscillate indefinitely. And that appears to be the case:

The period between zero crossings is increasing, but according to Kneser’s theorem the crossings will never stop.
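
Here’s a Mathematica sketch of the kind of computation behind these plots; the time range is my choice.

    q[t_] := (1 + 0.1 t)^-1.9
    s = NDSolve[{y''[t] + q[t] y[t] == 0, y[0] == 1, y'[0] == 0}, y, {t, 0, 1000}];
    (* the zero crossings continue, spaced farther and farther apart *)
    Plot[Evaluate[y[t] /. s], {t, 0, 1000}, PlotPoints -> 100]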

Next we redo our calculation with q(t) = (1 + 0.1 t)^−2.1. Now we’re on the other side of Kneser’s theorem and so we expect oscillations to eventually stop.

There are a lot more zero crossings than are evident in the plot; they just can’t be seen because the horizontal scale is so large.

According to Kneser’s theorem there is a last time the solution crosses zero, and I believe it occurs around 2×1026.

Hypergeometric equation

I’ve asserted numerous times here that hypergeometric functions come up very often in applied math, but I haven’t said why. This post will give one reason why.

One way to classify functions is in terms of the differential equations they satisfy. Elementary functions satisfy simple differential equations. For example, the exponential function satisfies

y′ = y

and sine and cosine satisfy

y″ + y = 0.

Second order linear differential equations are common and important because Newton’s laws of motion lead to second order differential equations. For the rest of the post, differential equations will be assumed to be second order and linear, with coefficients that are analytic except possibly at isolated singularities.

ODEs can be classified in terms of the singularities of their coefficients. The equations above have no singularities and so their solutions are some of the most fundamental functions.

If an ODE has two or fewer regular singular points [1], it can be solved in terms of elementary functions. The next step in complexity is to consider differential equations whose coefficients have three regular singular points. Here’s the kicker:

Every ODE with three regular singular points can be transformed into the hypergeometric ODE.

So one way to think of the hypergeometric differential equation is that it is the standard ODE with three regular singular points. These singularities are at 0, 1, and ∞. If a differential equation has singularities elsewhere, a change of variables can move the singularities to these locations.

The hypergeometric differential equation is

x(1 − x) y″ + (c − (a + b + 1)x) y′ − ab y = 0.

The hypergeometric function F(a, b; c; x) satisfies this equation, and if c is not an integer, then a second independent solution to the equation is

x¹⁻ᶜ F(1 + a − c, 1 + b − c; 2 − c; x).
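
As a quick check, here’s a Mathematica spot check with arbitrarily chosen parameters (a = 1/2, b = 1/3, c = 5/4 are my choices, not special values). The series expansion of the left side of the hypergeometric ODE vanishes term by term.

    a = 1/2; b = 1/3; c = 5/4;
    y[x_] := Hypergeometric2F1[a, b, c, x]
    Series[x (1 - x) y''[x] + (c - (a + b + 1) x) y'[x] - a b y[x], {x, 0, 8}]
    (* O[x]^9: every coefficient through order 8 is zero *)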

To learn more about the topic of this post, see Second Order Differential Equations: Special Functions and Their Classification by Gerhard Kristensson.

[1] This includes singular points at infinity. A differential equation is said to have a singularity at infinity if, under the change of variables t = 1/x, the transformed equation has a singularity at t = 0. For example, Airy’s equation

y″ + x y = 0

has no singularities in the finite complex plane, and it has no solution in terms of elementary functions. This does not contradict the statement above because the equation has a singularity at infinity.
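
To see this concretely, here’s a quick computation of my own, not from the original post. Substituting t = 1/x and w(t) = y(1/t) turns y″ + x y = 0 into

w″(t) + (2/t) w′(t) + (1/t⁵) w(t) = 0,

and a pole of order five in the coefficient of w is too strong for a regular singular point, so the singularity at infinity is an irregular one.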

Quadruple factorials and Legendre functions

Last week I claimed that double, triple, and quadruple factorials came up in applications. The previous post showed how triple factorials come up in solutions to Airy’s equation. This post will show how quadruple factorials come up in solutions to Legendre’s equation.

Legendre’s differential equation is

(1-x^2) y'' - 2x y' + \nu(\nu+1)y = 0

The Legendre functions Pν and Qν are independent solutions to the equation for given values of ν.

When ν is of the form n + ½ for an integer n, the values of Pν and Qν at 0 involve quadruple factorials and also Gauss’s constant.

For example, if ν = 1/2, 5/2, 9/2, …, then Pν(0) is given by

(-1)^{(2\nu - 1)/4} \frac{\sqrt{2}(2\nu - 2)!!!!}{(2\nu)!!!!\pi g}

Source:  An Atlas of Functions
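
Here’s a numeric spot check of the ν = 1/2 case in Mathematica. For ν = 1/2 the sign factor is 1 and both quadruple factorials are empty products equal to 1, so the formula reduces to √2/(πg), where g is Gauss’s constant 1/AGM(1, √2).

    g = 1/ArithmeticGeometricMean[1, Sqrt[2]];   (* Gauss's constant *)
    N[Sqrt[2]/(Pi g)]      (* 0.53935... *)
    N[LegendreP[1/2, 0]]   (* 0.53935... *)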

Triple factorials and Airy functions

Last week I wrote a post on multifactorials in which I said that

Double factorials come up fairly often, and sometimes triple, quadruple, or higher multifactorials do too.

This post gives a couple examples of triple factorials in practice.

One example I wrote about last week: triple factorials come up when evaluating the gamma function at an integer plus 1/3. As shown here,

\Gamma\left(n + \frac{1}{3}\right) = \frac{(3n-2)!!!}{3^n} \Gamma\left(\frac{1}{3}\right)
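
Here’s a quick numerical spot check of this identity in Mathematica for n = 3, where (3n − 2)!!! = 7!!! = 7 · 4 · 1 = 28.

    N[Gamma[3 + 1/3]]       (* 2.77816 *)
    N[28/3^3 Gamma[1/3]]    (* 2.77816 *)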

Another example involves solutions to Airy’s differential equation

y'' - xy = 0

One pair of independent solutions consists of the Airy functions Ai(x) and Bi(x). Another pair consists of the functions

\begin{align*} f(x) &= 1 + \frac{1}{3!}x^3 + \frac{1\cdot 4}{6!}x^6 + \frac{1\cdot 4 \cdot 7}{9!}x^9 + \cdots \\ g(x) &= x + \frac{2}{4!}x^4 + \frac{2\cdot 5}{7!}x^7 + \frac{2\cdot 5 \cdot 8}{10!}x^{10} + \cdots \end{align*}

given in A&S 10.4.3. Because both pairs of functions solve the same linear ODE, each is a linear combination of the other.

Notice that the numerators are triple factorials, and so the series above can be rewritten as

\begin{align*} f(x) &= \sum_{n=0}^\infty \frac{(3n+1)!!!}{(3n+1)!} x^{3n}\\ g(x) &= \sum_{n=0}^\infty \frac{(3n+2)!!!}{(3n+2)!} x^{3n+1} \end{align*}
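
To see that these sums match the series displayed above, here’s a short Mathematica sketch that defines the triple factorial directly as a product. The helper name is mine; Mathematica has no built-in triple factorial.

    tripleFactorial[k_] := Times @@ Range[k, 1, -3]
    Sum[tripleFactorial[3 n + 1]/(3 n + 1)! x^(3 n), {n, 0, 3}]
    (* 1 + x^3/6 + x^6/180 + x^9/12960, i.e. 1 + x^3/3! + 4 x^6/6! + 28 x^9/9! *)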

The next post gives an example of quadruple factorials in action.

Nonlinear ODEs with stable limit cycles

It’s not obvious when a nonlinear differential equation will have a periodic solution, or asymptotically approach a periodic solution. But there are theorems that give sufficient conditions. This post is about one such theorem.

Consider the equation

x″ + f(x) x′ + g(x) = 0

where f and g are analytic and define F(x) to be the integral of f(t) from 0 to x. The following six hypotheses are sufficient to guarantee that the equation above has a stable limit cycle [1].

  1. g is an odd function,
  2. g is positive for positive x,
  3. f is an even function,
  4. f(0) < 0,
  5. F(x) → ∞ as x → ∞,
  6. F has a unique positive zero.

These hypotheses are satisfied, for example, by Van der Pol’s equation.

Let’s make up coefficient functions f and g that satisfy the conditions above and look for the limit cycle.

We can let g(x) = x³ and f(x) = x² – 1. And here’s what we get:

The graph shows phase portraits for two different initial conditions, given in the legend of the plot. Note that the limit cycle appears solid blue because the trajectories of both the dashed line and the dotted line run around the periodic attractor.
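
Here’s a Mathematica sketch along those lines. The two initial conditions below are illustrative guesses, since the plot’s legend isn’t reproduced here.

    eq = x''[t] + (x[t]^2 - 1) x'[t] + x[t]^3 == 0;
    s1 = NDSolve[{eq, x[0] == 0.1, x'[0] == 0}, x, {t, 0, 50}];
    s2 = NDSolve[{eq, x[0] == 3, x'[0] == 3}, x, {t, 0, 50}];
    (* both trajectories wind onto the same closed curve, the stable limit cycle *)
    ParametricPlot[Evaluate[{First[{x[t], x'[t]} /. s1], First[{x[t], x'[t]} /. s2]}],
                   {t, 0, 50}, PlotStyle -> {Dashed, Dotted}]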

[1] You can find this theorem in Topics in Ordinary Differential Equations by William Lakin and David Sanchez.