A more direct approach to series solutions

In the previous post we found a solution to

y'' + 7y' + 12y = x^2

using operator calculus, i.e. treating the differential operator D like a number and doing tricks with it. See the earlier post for a justification of why we can get away with unorthodox manipulations.

We can generalize the method of the previous post to say that a solution to any differential equation of the form

p(D) y = f(x)

is given by

y = \frac{1}{p(D)} f(x).

In the previous post we had

 p(D) = D^2 + 7D + 12


f(x) = x^2

but the method works more generally.

We then find a power series for 1/p(D), most likely by partial fraction decomposition, and apply the result to f(x). There may be a fair amount of labor left, but it’s purely calculus; all the differential equation work is done.

Conceptually, this method subsumes other differential equation techniques such as undetermined coefficients and power series solutions.

Let’s use this method to find an approximate solution to

y'' + 7y' + 12y = \zeta(x)

where ζ(x) is the Riemann zeta function.

In the previous post we worked out that

 \frac{1}{D^2 + 7D + 12} = \frac{1}{D+3} - \frac{1}{D+4} = \frac{1}{12} - \frac{7}{144}D + \frac{37}{1728}D^2 + {\cal O}(D^3)

and so an approximate solution to our differential equation near 0 is

y(x) = \frac{1}{12}\zeta(x) - \frac{7}{144} \zeta'(x) + \frac{37}{1728}\zeta''(x) + \cdots

Numerically this works out to

y(x) ≈ -0.0416667 + 0.0446706 x - 0.0429602 x^2.

If you want more terms, carry the series for 1/(D² + 7D + 12) out to more terms.
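Generating the operator series is mechanical enough to delegate to a computer algebra system. Here's a sketch using Python's sympy (my choice of tool; the post doesn't depend on it):

```python
from sympy import symbols, series, Rational

# Treat the differential operator D as a formal variable and expand 1/p(D)
D = symbols('D')
expansion = series(1/(D**2 + 7*D + 12), D, 0, 3)
print(expansion)  # 1/12 - 7*D/144 + 37*D**2/1728 + O(D**3)
```

Passing a larger final argument to `series` produces as many terms as you like.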

Related links

Operator calculus

Students who take a course in differential equations don’t really learn how to solve differential equations. The problems whose solutions they reproduce were solved over 300 years ago. The methods taught in undergraduate ODE classes are in some sense mnemonics, a way to remember a solution discovered long ago.

Abandon hope of originality

The more scrupulous students in an ODE class cringe at being told “Guess a solution of the form …” because it seems like cheating. They may think “But I really want to learn how to solve things on my own.” This objection isn’t often addressed in part because it’s not often said out loud. But if it were explicitly stated, here’s one way to answer it.

That’s great that you want to learn how to solve differential equations on your own. But solving differential equations, in the sense of finding closed-form solutions by hand, is really hard, and unlikely to succeed. The differential equations that come up in applications, that have closed-form solutions, and whose solutions can be calculated by hand in under an hour have been known since the Bernoullis. If this is an exaggeration, it’s a slight exaggeration.

You will learn some useful things in this class, and you may even get your first exposure to ideas that later lead you to do original mathematics [1]. But if so, that original mathematics probably won’t be finding new closed-form solutions to useful differential equations.

This changes the perspective of the ODE class. Instead of pretending that students are going to learn how to solve differential equations ex nihilo, we can be honest that students are going to learn how to “look up” solutions to solved problems. Learning how to look up these solutions is not trivial, and in fact it takes about a semester to learn. You can’t literally look up the solution to a particular differential equation in all its specifics, but you can learn to follow the recipes for classes of equations.

The role of rigor

There is a time for everything under the sun. A time for rigor and a time for handwaving. A time to embrace epsilons and a time to shun epsilons.

You can learn a lot of important reusable analysis techniques in the context of differential equations. And if that’s the goal, being rigorous is good. But if the goal is to recall the solution to a solved problem, then it’s expedient to use dirty hacks; think of them almost as mnemonics rather than as solution techniques.

Once you have a candidate solution in hand, you can always rigorously verify that it works by sticking it into the differential equation. In that sense you’re not sacrificing rigor at all.

Operator calculus

Operator calculus treats the differential operator D as a formal symbol and does calculus with it as if it were a number. This is commonly called an abuse of notation, but it has a certain elegance to it. The manipulations could be made rigorous, though not by students in an undergraduate ODE class.

Operator calculus is very powerful, and with great power comes great responsibility. It is not often taught to undergraduates, and for good reason. Sometimes formal operations are justified and sometimes they’re not. Misuse can lead to results that are simply wrong and not just illegally obtained. With that said, let’s jump in.

Let’s find a solution [2] to

y'' + 7y' + 12y = x^2.

Writing the differential operator as D the equation becomes

(D^2 + 7D + 12)y = x^2

and so the solution is obviously (!)

y = \frac{1}{D^2 + 7D + 12} x^2.

“You can’t just divide by a differential operator like that!” Oh yeah? We’re going to do more than that. We’re going to break it into partial fractions and power series!

 \frac{1}{D^2 + 7D + 12} = \frac{1}{D+3} - \frac{1}{D+4} = \frac{1}{12} - \frac{7}{144}D + \frac{37}{1728}D^2 + {\cal O}(D^3)

The last term means terms involving D³ and higher powers of D.

Now the first derivative of x² is 2x, the second derivative is 2, and all higher derivatives are 0. So when we apply the operator series above to x² only the terms up to D² contribute anything. From this we can tell that

\frac{1}{12}x^2 - \frac{7}{72}x + \frac{37}{864}

is a solution to our differential equation. You may be deeply skeptical of how we got here, and good for you if you are, but it’s easy to show that this is in fact a solution to our differential equation.
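That verification can be sketched with Python's sympy (tooling assumed; the post doesn't specify any):

```python
import sympy as sp

x = sp.symbols('x')
# candidate particular solution produced by the operator series
y = x**2/12 - sp.Rational(7, 72)*x + sp.Rational(37, 864)
# plug it into y'' + 7y' + 12y and compare against the right hand side x^2
residual = sp.expand(sp.diff(y, x, 2) + 7*sp.diff(y, x) + 12*y - x**2)
print(residual)  # 0
```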

Is this worth it?

There are other ways to find a solution to our differential equation. For example, you might have made an inspired guess that the solution is a quadratic polynomial and solved for the coefficients of that polynomial.

On the other hand, the operator calculus approach is more general. We could use a similar approach to solve differential equations with a wide variety of right hand sides. And the approach is systematic enough to be programmed into a computer algebra system.

Related posts

[1] That was my experience. I specialized in PDEs in grad school and learned a lot that was useful later outside of PDEs.

[2] The general solution to linear equations like this is the general solution to the corresponding homogeneous equation (i.e. with the right hand side set to 0) plus a “particular solution” (i.e. any solution to the equation with the original right hand side restored). Any particular solution will do, because any two particular solutions differ by a solution to the homogeneous equation.


Which one is subharmonic?

The Laplace operator Δ of a function of n variables is defined by

\Delta f = \sum_{i=1}^n \frac{\partial^2 f}{\partial x^2_i}

If Δ f = 0 in some region Ω, f is said to be harmonic on Ω.  In that case f takes on its maximum and minimum over Ω at locations on the boundary ∂Ω of Ω. Here is an example of a harmonic function over a square which clearly takes on its maximum on two sides of the boundary and its minimum on the other two sides.

Plot of x^2 - y^2 over [-1,1] cross [-1,1].

The theorem above can be split into two theorems and generalized:

If Δ f ≥ 0, then f takes on its maximum on ∂Ω.

If Δ f ≤ 0, then f takes on its minimum on ∂Ω.

These two theorems are called the maximum principle and minimum principle respectively.

Now just as functions with Δf equal to zero are called harmonic, functions with Δf non-negative or non-positive are called subharmonic and superharmonic. Or is it the other way around?

If Δ f ≥ 0 in Ω, then f is called subharmonic in Ω. And if Δ f ≤ 0 then f is called superharmonic. Equivalently, f is superharmonic if –f is subharmonic.

The names subharmonic and superharmonic may seem backward: the theorem with the greater than sign is for subharmonic functions, and the theorem with the less than sign is for superharmonic functions. Shouldn’t the sub-thing be less than something and the super-thing greater?

Indeed they are, but you have to look at f and not Δf. That’s the key.

If a function f is subharmonic on Ω, then f is below the harmonic function interpolating f from ∂Ω into the interior of Ω. That is, if g satisfies Laplace’s equation

\begin{align*} \Delta g &= 0 \mbox{ on }\Omega \\ g &= f \mbox{ on } \partial\Omega \end{align*}

then f ≤ g on Ω.

For example, let f(x) = ||x|| and let Ω be the unit ball in ℝn. Then Δ f ≥ 0 and so f is subharmonic. (The norm function f has a singularity at the origin, but this example can be made rigorous.) Now f is constantly 1 on the boundary of the ball, and the constant function 1 is the unique solution of Laplace’s equation on the unit ball with this boundary condition, and clearly f is less than 1 on the interior of the ball.
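For a concrete check in three dimensions, here's a short sympy sketch (tooling assumed) computing Δ||x|| away from the origin:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.sqrt(x**2 + y**2 + z**2)  # f(x) = ||x|| in R^3
# Laplacian: sum of second partials
lap = sum(sp.diff(f, v, 2) for v in (x, y, z))
print(sp.simplify(lap))  # simplifies to 2/sqrt(x**2 + y**2 + z**2), nonnegative away from 0
```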

Related posts

Double roots and ODEs

This post will resolve a sort of paradox. The process of solving a difference or differential equation is different when the characteristic equation has a double root. But intuitively there shouldn’t be much difference between having a double root and having two roots very close together.

I’ll first say how double roots effect finding solutions to difference and differential equations, then I’ll focus on the differential equations and look at the difference between a double root and two roots close together.

Double roots

The previous post looked at a second order difference equation and a second order differential equation. Both are solved by analogous techniques. The first step in solving either the difference equation

ayₙ + byₙ₋₁ + cyₙ₋₂ = 0

or the differential equation

ay″ + by′ + cy = 0

is to find the roots of the characteristic equation

aλ² + bλ + c = 0.

If there are two distinct roots to the characteristic equation, then each gives rise to an independent solution to the difference or differential equation. The roots may be complex, and that changes the qualitative behavior of the solution, but it doesn’t change the solution process.

But what if the characteristic equation has a double root λ? One solution is found the same way: λⁿ for the difference equation and exp(λt) for the differential equation. We know from general theorems that a second independent solution must exist, but how do we find it?

For the difference equation a second solution is nλⁿ, and for the differential equation t exp(λt) is our second solution.
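A quick symbolic check (sympy, tooling assumed): the characteristic polynomial with double root λ is (s − λ)², i.e. the equation y″ − 2λy′ + λ²y = 0, and t exp(λt) satisfies it:

```python
import sympy as sp

t, lam = sp.symbols('t lam')
# (s - lam)^2 = s^2 - 2 lam s + lam^2, so the ODE with double root lam is
# y'' - 2 lam y' + lam^2 y = 0; check that t exp(lam t) satisfies it
y = t*sp.exp(lam*t)
residual = sp.simplify(sp.diff(y, t, 2) - 2*lam*sp.diff(y, t) + lam**2*y)
print(residual)  # 0
```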

Close roots

For the rest of the post I’ll focus on differential equations, though similar remarks apply to difference equations.

If λ = 5 is a double root of our differential equation then the general solution is

α exp(5t) + β t exp(5t)

for some constants α and β. But if our roots are λ = 4.999 and λ = 5.001 then the general solution is

α exp(4.999 t) + β exp(5.001 t)

Surely these two solutions can’t be that different. If the parameters for the differential equation are empirically determined, can we even tell the difference between a double root and a pair of distinct roots a hair’s breadth apart?

On the other hand, β exp(5t) and β t exp(5t) are quite different for large t. If t = 100, the latter is 100 times larger! However, this bumps up against the Greek letter paradox: assuming that parameters with the same name in two equations mean the same thing. When we apply the same initial conditions to the two solutions above, we get different values of α and β for each, so comparing exp(5t) and t exp(5t) is the wrong comparison. The right comparison is between the solutions to two initial value problems.

We’ll look at two examples, one with λ positive and one with λ negative.

Example with λ > 0

If λ > 0, then solutions grow exponentially as t increases. The size of β t exp(λt) will eventually matter, regardless of how small β is. However, any error in the differential equation, such as measurement error in the initial conditions or error in computing the solution numerically, will also grow exponentially. The difference between a double root and two nearby roots may matter less than, say, how accurately the initial conditions are known.

Let’s look at

y″ – 10 y′ + 25 y = 0

and

y″ – 10 y′ + 24.9999 y = 0

both with initial conditions y(0) = 3 and y′(0) = 4.

The characteristic equation for the former has a double root λ = 5 and that of the latter has roots 4.99 and 5.01.

The solution to the first initial value problem is

3 exp(5t) – 11 t exp(5t)

and the solution to the second initial value problem is

551.5 exp(4.99t) – 548.5 exp(5.01t).

Notice that the coefficients are very different, illustrating the Greek letter paradox.

The expressions for the two solutions look different, but they’re indistinguishable for small t, say less than 1. But as t gets larger the solutions diverge. The same would be true if we solved the first equation twice, with y(0) = 3 and y(0) = 3.01.
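Here's a numerical sketch (Python, assumed tooling) that recovers the coefficients 551.5 and −548.5 from the initial conditions and confirms the two solutions agree closely for small t:

```python
import numpy as np

# y = A exp(4.99 t) + B exp(5.01 t) with y(0) = 3, y'(0) = 4:
#   A + B = 3 and 4.99 A + 5.01 B = 4
A, B = np.linalg.solve([[1.0, 1.0], [4.99, 5.01]], [3.0, 4.0])
print(A, B)  # 551.5 and -548.5, up to floating point error

# compare with the double-root solution 3 exp(5t) - 11 t exp(5t) for small t
t = np.linspace(0, 1, 101)
close_roots = A*np.exp(4.99*t) + B*np.exp(5.01*t)
double_root = 3*np.exp(5*t) - 11*t*np.exp(5*t)
print(np.max(np.abs(close_roots - double_root)))  # small relative to solutions of size ~1000
```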

Example with λ < 0

Now let’s look at

y″ + 10 y′ + 25 y = 0

and

y″ + 10 y′ + 24.9999 y = 0

both with initial conditions y(0) = 3 and y′(0) = 4. Note that the velocity coefficient changed sign from -10 to 10.

Now the characteristic equation of the first differential equation has a double root at -5, and the second has roots -4.99 and -5.01.

The solution to the first initial value problem is

3 exp(-5t)  + 19 t exp(-5t)

and the solution to the second initial value problem is

951.5 exp(-4.99t) – 948.5 exp(-5.01t).

Again the written forms of the two solutions look quite different, but their plots are indistinguishable.

Unlike the case λ > 0, now with λ < 0 the difference between the solutions is bounded. Before, the solutions started out close together and eventually diverged. Now the solutions stay close together forever.

Here’s a plot of the difference between the two solutions.

So the maximum separation between the two solutions occurs somewhere around t = 0.5 where the solutions differ in the sixth decimal place.
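The plot can be reproduced numerically; this Python sketch (assumed tooling) locates the maximum separation:

```python
import numpy as np

t = np.linspace(0, 5, 2001)
y_double = 3*np.exp(-5*t) + 19*t*np.exp(-5*t)
y_close = 951.5*np.exp(-4.99*t) - 948.5*np.exp(-5.01*t)
diff = np.abs(y_close - y_double)
# location and size of the maximum separation
print(t[np.argmax(diff)], diff.max())  # near t = 0.5, on the order of 1e-5
```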


Whether the characteristic equation has a double root or two close roots makes a significant difference in the written form of the solution to a differential equation. If the roots are positive, the solutions are initially close together then diverge. If the roots are negative, the solutions are always close together.

More differential equation posts

Difference equations and differential equations

Difference equations are very much analogous to differential equations. Difference equations are more elementary, but differential equations are more familiar.

It seems odd to use a more advanced thing to explain a simpler thing, like saying a quartet is a symphony orchestra with only four instruments. But if for some reason you managed to become familiar with orchestras before learning what a quartet is, this explanation would be helpful.

Differential equations are a standard part of the curriculum for science and engineering students, but difference equations are not. And not without reason: differential equations are a prerequisite for future courses but difference equations are not. But if a curriculum were to include difference equations and differential equations, it might make sense to teach the former first because the topic is more tangible.

To illustrate the similarity between the two topics, this post will solve the difference equation [1]

Fₙ₊₂ = Fₙ₊₁ + Fₙ

and the differential equation

y″ = y′ + y

The difference equation defines the Fibonacci numbers, with one set of initial conditions, and the Lucas numbers with different initial conditions. The differential equation is analogous in ways we’ll see below.

Both equations are solved by assuming the form of the solution with a free parameter. Two values of the parameter will be found via a quadratic equation, giving two independent solutions. All solutions are linear combinations of these two solutions.

Difference equation example

The Fibonacci numbers start with F₀ = 0 and F₁ = 1. From there it’s obvious that F₂ = 1, F₃ = 2, and so forth.

The Lucas numbers start with L₀ = 2 and L₁ = 1. From there we see L₂ = 3, L₃ = 4, etc.

Bootstrapping the solution starting from initial conditions is trivial. However, this does not give us a general expression for the solution, and it does not let us explore the solutions of the difference equation for all initial conditions.

So let’s go back and solve the difference equation first, then apply initial conditions, analogous to what one typically does for ordinary differential equations.

Assume solutions have the form Fₙ = xⁿ for some x to be determined. Then

xⁿ⁺² – xⁿ⁺¹ – xⁿ = 0

for all non-negative n and so

x² – x – 1 = 0

and the two roots are the golden ratio

ϕ = (1 + √5)/2

and its complement

1 – ϕ = (1 – √5)/2.

So the general solution to the recurrence relation

Fₙ₊₂ = Fₙ₊₁ + Fₙ

is

Fₙ = aϕⁿ + b(1 – ϕ)ⁿ

for constants a and b to be determined by initial conditions. For the Fibonacci numbers, a = 1/√5 and b = –a. For the Lucas numbers, a = b = 1.
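These closed forms are easy to check numerically; a short Python sketch (assumed tooling):

```python
from math import sqrt

phi = (1 + sqrt(5))/2   # golden ratio
psi = (1 - sqrt(5))/2   # its complement 1 - phi

def fib(n):   # a = 1/sqrt(5), b = -a
    return round((phi**n - psi**n)/sqrt(5))

def lucas(n): # a = b = 1
    return round(phi**n + psi**n)

print([fib(n) for n in range(8)])    # [0, 1, 1, 2, 3, 5, 8, 13]
print([lucas(n) for n in range(8)])  # [2, 1, 3, 4, 7, 11, 18, 29]
```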

Our assumption of the form of the solution is justified because it worked. There can’t be any other solutions of another form because clearly the initial conditions determine the third element of the sequence, and the second and third elements determine the fourth, etc.

Differential equation example

Assume solutions have the form y = exp(kt) for some k to be determined. Sticking this into the differential equation shows

k² exp(kt) – k exp(kt) – exp(kt) = 0

and so

k² – k – 1 = 0.

This is the same equation as above, and so the roots are ϕ and 1 – ϕ. The general solution is

y(t) = a exp(ϕt) + b exp((1-ϕ)t)

where the constants a and b are determined by the initial value of y and y‘.

It’s not as clear in this case that we’ve found all the solutions and that our solution is unique given initial conditions, but it’s true.

Related posts

[1] Someone might object that this is a recurrence relation and not a difference equation: each term is defined in terms of its predecessors, but not in terms of function differences. But you could rewrite the equation in terms of differences.

If ΔFₙ = Fₙ – Fₙ₋₁ then we can rewrite our equation as Δ²F – 3ΔF + F = 0.
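A quick numerical check of this rewriting (Python, assumed tooling):

```python
# Check that the Fibonacci numbers satisfy delta^2 F - 3 delta F + F = 0,
# where (delta F)_n = F_n - F_{n-1}
F = [0, 1]
for _ in range(20):
    F.append(F[-1] + F[-2])

for n in range(2, len(F)):
    d1 = F[n] - F[n-1]             # delta F at n
    d2 = F[n] - 2*F[n-1] + F[n-2]  # delta^2 F at n
    assert d2 - 3*d1 + F[n] == 0
print("identity holds")
```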

Oscillations in RLC circuits

Electrical and mechanical oscillations satisfy analogous equations. This is the basis of using the word “analog” in electronics. You could study a mechanical system by building an analogous circuit and measuring that circuit in a lab.

Mass, dashpot, spring

Years ago I wrote a series of four posts about mechanical vibrations:

Everything in these posts maps over to electrical vibrations with a change of notation.

That series looked at the differential equation

m u'' + \gamma u' + k u = F \cos\omega t

where m is mass, γ is damping from a dashpot, and k is the stiffness of a spring.

Inductor, resistor, capacitor

Now we replace our mass, dashpot, and spring with an inductor, resistor, and capacitor.

Imagine a circuit with an L henry inductor, an R ohm resistor, and a C farad capacitor in series. Let Q(t) be the charge in coulombs over time and let E(t) be an applied voltage, i.e. an AC power source.

Charge formulation

One can use Kirchhoff’s law to derive

L Q'' + R Q' + \frac{1}{C} Q = E(t)

Here we have the correspondences

\begin{align*} u &\leftrightarrow Q \\ m &\leftrightarrow L \\ \gamma &\leftrightarrow R \\ k &\leftrightarrow 1/C \end{align*}

So charge is analogous to position, inductance is analogous to mass, resistance is analogous to damping, and capacitance is analogous to the reciprocal of stiffness.

The reciprocal of capacitance is called elastance, so we can say elastance is analogous to stiffness.

Current formulation

It’s more common to see the differential equation above written in terms of current I.

I = \frac{dQ}{dt}

If we take the derivative of both sides of

L Q'' + R Q' + \frac{1}{C} Q = E(t)

we get

LI'' + RI' + \frac{1}{C} I = E'

Natural frequency

With mechanical vibrations, as shown here, the natural frequency is

\omega_0 = \sqrt{\frac{k}{m}}

and with electrical oscillations this becomes

\omega_0 = \frac{1}{\sqrt{LC}}

Steady state

When a mechanical or electrical system is driven by a sinusoidal forcing function, the system eventually settles down to a solution that is proportional to a phase-shifted copy of the driving function.

To be more explicit, the solution to the differential equation

m u'' + \gamma u' + k u = F \cos\omega t

has a transient component that decays exponentially and a steady state component proportional to cos(ωt-φ). The same is true of the equation

LI'' + RI' + \frac{1}{C} I = E'

The proportionality constant is conventionally denoted 1/Δ and so the steady state solution is

\frac{F}{\Delta} \cos(\omega t - \phi)

for the mechanical case and

\frac{T}{\Delta} \cos(\omega t - \phi)

for the electrical case.

The constant Δ satisfies

\Delta^2 = m^2(\omega_0^2 -\omega^2)^2 + \gamma^2 \omega^2

for the mechanical system and

\Delta^2 = L^2(\omega_0^2 -\omega^2)^2 + R^2 \omega^2

for the electrical system.

When the damping coefficient γ or the resistance R is small, the maximum amplitude occurs when the driving frequency ω is near the natural frequency ω0.
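A small numerical illustration (Python, assumed tooling, made-up component values): sweep the driving frequency and watch the amplitude factor 1/Δ peak near ω0.

```python
import numpy as np

L, R, C = 1.0, 0.1, 1.0      # hypothetical component values, lightly damped
w0 = 1/np.sqrt(L*C)          # natural frequency

w = np.linspace(0.5, 1.5, 10001)
Delta = np.sqrt(L**2*(w0**2 - w**2)**2 + R**2*w**2)
w_peak = w[np.argmin(Delta)] # amplitude 1/Delta is largest where Delta is smallest
print(w0, w_peak)            # the peak is close to the natural frequency
```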

More on damped, driven oscillations here.

Lyapunov exponents

Chaotic systems are unpredictable. Or rather chaotic systems are not deterministically predictable in the long run.

You can make predictions if you weaken one of these requirements. You can make deterministic predictions in the short run, or statistical predictions in the long run.

Lyapunov exponents are a way to measure how quickly the short run turns into the long run. There is a rigorous definition of a Lyapunov exponent, but I often hear the term used metaphorically. If something is said to have a large Lyapunov exponent, it quickly becomes unpredictable. (Unpredictable as far as deterministic prediction.)

In a chaotic system, the uncertainty around the future state of the system grows exponentially. If you start with a small sphere of initial conditions at time t = 0, a sphere of diameter ε, that sphere becomes an ellipsoid over time. The logarithm of the ratio of the maximum ellipsoid diameter to the diameter of the initial sphere, divided by the elapsed time, gives the Lyapunov exponent λ. (Technically you have to take a long-time limit and average over all possible starting points.)

With a Lyapunov exponent λ the uncertainty in your solution is bounded by ε exp(λt).

You can predict the future state of a chaotic system if one of ε, λ, or t is small enough. The larger λ is, the smaller t must be, and vice versa. So in this sense the Lyapunov exponent tells you how far out you can make predictions; eventually your error bars are so wide as to be worthless.

A Lyapunov exponent isn’t something you can easily calculate exactly except for toy problems, but it is something you can easily estimate empirically. Pick a cloud of initial conditions and see what happens to that cloud over time.
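For instance, here's an empirical estimate (Python, assumed tooling) for the logistic map x → 4x(1 − x), a toy system whose Lyapunov exponent is known to be log 2:

```python
from math import log

# logistic map x -> 4x(1 - x); its Lyapunov exponent is known to be log 2
x = 0.123456
for _ in range(1000):           # discard a transient
    x = 4*x*(1 - x)

n = 100_000
total = 0.0
for _ in range(n):
    total += log(abs(4 - 8*x))  # log |f'(x)| along the orbit
    x = 4*x*(1 - x)

est = total/n
print(est)  # close to log(2) = 0.693...
```

For a flow rather than a map, the analogous approach follows a cloud of nearby trajectories and tracks how fast they spread.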

Related posts

A closer look at zero spacing

A few days ago I wrote about Kneser’s theorem. This theorem tells whether the differential equation

u″(x) + h(x) u(x) = 0

will oscillate indefinitely, i.e. whether it will have an infinite number of zeros.

Today’s post will look at another theorem that gives more specific information about the spacing of the zeros.

Kneser’s theorem said that the decay rate of h determines the oscillatory behavior of the solution u. Specifically, if x²h(x) eventually stays above ¼ then oscillations will continue, but if it stays below ¼ they will eventually stop.

Zero spacing

A special case of a theorem given in [1] says that an upper bound on h gives a lower bound on zero spacing, and a lower bound on h gives an upper bound on zero spacing.

Specifically, if h is bounded above by M, then the spacing between zeros is at least π/√M. And if h is bounded below by a positive constant M′, then the spacing between zeros is at most π/√M′.
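Here's a numerical sanity check (Python with scipy, assumed tooling) on a made-up example with h(x) = 10 + 2x. On [0, 5] this h is bounded between 10 and 20, so consecutive zeros should be between π/√20 and π/√10 apart:

```python
import numpy as np
from scipy.integrate import solve_ivp

# u'' + h(x) u = 0 with h(x) = 10 + 2x, so 10 <= h <= 20 on [0, 5].
# Zero spacing should then lie between pi/sqrt(20) and pi/sqrt(10).
def rhs(x, y):
    return [y[1], -(10 + 2*x)*y[0]]

sol = solve_ivp(rhs, [0, 5], [1.0, 0.0], dense_output=True, rtol=1e-10, atol=1e-12)
x = np.linspace(0, 5, 100001)
u = sol.sol(x)[0]
zeros = x[:-1][np.sign(u[:-1]) != np.sign(u[1:])]  # crude sign-change detection
gaps = np.diff(zeros)
print(gaps.min(), gaps.max())  # inside [pi/sqrt(20), pi/sqrt(10)] ~ [0.70, 0.99]
```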

Let’s look back at a plot from the post on Kneser’s theorem:

The spacing between the oscillations appears to be increasing linearly. We could have predicted that from the theorem in this post. The function h is roughly on the order of 1/x², and so the spacing between the zeros is going to be on the order of x.

Specifically, in the earlier post we looked at

h(x) = (1 + x/10)⁻ᵖ

where p was 1.9 and 2.1, two exponents chosen to be close to either side of the boundary in Kneser’s theorem. That post used q and t rather than h and x, but I changed notation here to match [1].

Suppose we’ve just seen a zero at x*. Then from that point on h is bounded above by h(x*) and so the distance to the next zero is bounded below by π/√h(x*). That tells you where to start looking for the next zero.

Our function h is bounded below only by 0, and so we can’t apply the theorem above globally. But we could look some finite distance ahead, and thus get a positive lower bound, and that would lead to an upper bound on the location of the next zero. So we could use theory to find an interval to search for our next zero.

More general theorem

The form of the theorem I quoted from [1] was a simplification. The full theorem considers the differential equation

u″(x) + g(x) u′(x) + h(x) u(x) = 0

The more general theorem looks at upper and lower bounds on

h(x) – ½ g′(x) – ¼ g(x)².

but in our case g = 0 and so the hypotheses reduce to bounds on just h.


Exercise 3 from [1] says to look at the zeros of solution to

u″(x) + ½x² u′(x) + (10 + 2x) u(x) = 0.

Here we have

h(x) – ½g′(x) – ¼g(x)² = 10 + 2x – x – 1 = 9 + x

and so the lower bound of 9 tells us zeros are spaced at most π/3 apart.

Let’s look at a plot of the solution using Mathematica.

    s = NDSolve[{u''[x] + 0.5 x^2 u'[x] + (10 + 2 x) u[x] == 0, 
                 u[0] == 1, u'[0] == 1 }, u, {x, 0, 5}]
    Plot[Evaluate[u[x] /. s], {x, 0, 5}]

This produces the following.

There are clearly zeros between 0 and 1, between 1 and 2, and between 2 and 3. It’s hard to see whether there are any more. Let’s see exactly where the roots are that we’re sure of.

    FindRoot[u[x] /. s, {x, 0, 1}]
    {x -> 0.584153}

    FindRoot[u[x] /. s, {x, 1, 2}]
    {x -> 1.51114}

    FindRoot[u[x] /. s, {x, 2, 3}]
    {x -> 2.41892}

The distance between the first two zeros is 0.926983 and the distance between the second and third zeros is 0.90778. This is consistent with our theorem because π/3 > 1 and the spacing between our zeros is less than 1.

Are there more zeros we can’t see in the plot above? When I asked Mathematica to evaluate

    FindRoot[u[x] /. s, {x, 2, 4}]

it failed with several warning messages. But we know from our theorem that if there is another zero, it has to be less than a distance of π/3 from the last one we found. We can use this information to narrow down where we want Mathematica to look.

    FindRoot[u[x] /. s, {x, 2.41892, 2.41892 + Pi/3}]

Now Mathematica says there is indeed another solution.

    {x -> 3.42523}

Is there yet another solution? If so, it can’t be more than a distance π/3 from the one we just found.

    FindRoot[u[x] /. s, {x, 3.42523, 3.42523 + Pi/3}]

This returns 3.42523 again, so Mathematica didn’t find another zero. Assuming Mathematica is correct, this means there cannot be any more zeros.

On the interval [0, 5] the quantity in the theorem is bounded above by 14, so an additional zero would need to be at least π/√14 away from the latest one. So if there’s another zero, it’s in the interval [4.26486, 4.47243]. If we ask Mathematica to find a zero in that interval, it fails with warning messages. We can plot the solution over that interval and see that it’s never zero.

[1] Protter and Weinberger. Maximum Principles in Differential Equations. Springer-Verlag. 1984. Page 46.

Slowing oscillations

Consider the second order linear differential equation

y''(t) + ky(t) = 0

You could think of y as giving the height of a mass attached to a spring at time t, where k is the stiffness of the spring. The solutions are sines and cosines with frequency √k. Think about how the solution depends on the spring constant k. If k decreases, corresponding to softening the spring, the frequency of the solutions decreases, and the period between oscillations increases.

Now suppose we replace the constant k with a decreasing positive function q(t). It seems that by controlling the decay rate of q we could control the oscillations of our system. For q decaying slowly but not too slowly, the solutions would continue to oscillate, crossing back and forth across the zero position, but with increasing time between crossings.

Beyond some critical rate of decay, the solutions may stop crossing the x axis at all. It seems plausible that if q decreases quickly enough, the period could grow at just the right rate so that the next oscillation never fully arrives.

Adolf Kneser (1862–1930) proved that the intuition described above is indeed correct. He proved that if

\limsup _{t \to \infty} t^2 q(t) < \frac{1}{4}

the solutions to

y'' + q(t)y = 0

will eventually stop oscillating, but if

\liminf _{t \to \infty} t^2 q(t) > \frac{1}{4}

they will oscillate forever.


We’ll set initial conditions y(0) = 1 and y′(0) = 0.

Let q(t) = (1 + 0.1t)^−1.9. This choice of q decays more slowly than 0.25t⁻² and so our solutions should oscillate indefinitely. And that appears to be the case:

The period between zero crossings is increasing, but according to Kneser’s theorem the crossings will never stop.
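This can be checked numerically; a Python sketch (assumed tooling) counting zero crossings and their spacing:

```python
import numpy as np
from scipy.integrate import solve_ivp

# y'' + q(t) y = 0 with q(t) = (1 + 0.1 t)^(-1.9), y(0) = 1, y'(0) = 0
def rhs(t, y):
    return [y[1], -(1 + 0.1*t)**(-1.9)*y[0]]

sol = solve_ivp(rhs, [0, 200], [1.0, 0.0], dense_output=True, rtol=1e-10, atol=1e-12)
t = np.linspace(0, 200, 200001)
y = sol.sol(t)[0]
crossings = t[:-1][np.sign(y[:-1]) != np.sign(y[1:])]
print(len(crossings))      # plenty of zero crossings on [0, 200]
print(np.diff(crossings))  # and the gaps between them grow
```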

Next we redo our calculation with q(t) = (1 + 0.1t)^−2.1. Now we’re on the other side of Kneser’s theorem and so we expect oscillations to eventually stop.

There are a lot more zero crossings than are evident in the plot; they just can’t be seen because the horizontal scale is so large.

According to Kneser’s theorem there is a last time the solution crosses zero, and I believe it occurs around 2×1026.

More differential equation posts

Hypergeometric equation

I’ve asserted numerous times here that hypergeometric functions come up very often in applied math, but I haven’t said why. This post will give one reason why.

One way to classify functions is in terms of the differential equations they satisfy. Elementary functions satisfy simple differential equations. For example, the exponential function satisfies

y′ = y

and sine and cosine satisfy

y″ + y = 0.

Second order linear differential equations are common and important because Newton’s second law leads to second order differential equations. For the rest of the post, differential equations will be assumed to be second order and linear, with coefficients that are analytic except possibly at isolated singularities.

ODEs can be classified in terms of the singularities of their coefficients. The equations above have no singularities and so their solutions are some of the most fundamental functions.

If an ODE has two or fewer regular singular points [1], it can be solved in terms of elementary functions. The next step in complexity is to consider differential equations whose coefficients have three regular singular points. Here’s the kicker:

Every ODE with three regular singular points can be transformed into the hypergeometric ODE.

So one way to think of the hypergeometric differential equation is that it is the standard ODE with three regular singular points. These singularities are at 0, 1, and ∞. If a differential equation has singularities elsewhere, a change of variables can move the singularities to these locations.

The hypergeometric differential equation is

x(1 – x) y″ + (c – (a + b + 1)x) y′ – ab y = 0.

The hypergeometric function F(a, b; c; x) satisfies this equation, and if c is not an integer, then a second independent solution to the equation is

x¹⁻ᶜ F(1 + a – c, 1 + b – c; 2 – c; x).
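A symbolic spot check (Python with sympy, assumed tooling): build the series for F(a, b; c; x) from its coefficient recurrence and verify that it satisfies the hypergeometric equation through the truncation order.

```python
import sympy as sp

x, a, b, c = sp.symbols('x a b c')
N = 8
# coefficients of 2F1(a, b; c; x): A_{n+1} = A_n (a+n)(b+n) / ((c+n)(n+1))
A = [sp.Integer(1)]
for n in range(N):
    A.append(A[-1]*(a + n)*(b + n)/((c + n)*(n + 1)))
y = sum(A[n]*x**n for n in range(N + 1))

# substitute the truncated series into the hypergeometric ODE;
# every coefficient below the truncation order x^N should cancel
ode = sp.expand(x*(1 - x)*sp.diff(y, x, 2) + (c - (a + b + 1)*x)*sp.diff(y, x) - a*b*y)
print(all(sp.simplify(ode.coeff(x, n)) == 0 for n in range(N)))  # True
```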

To learn more about the topic of this post, see Second Order Differential Equations: Special Functions and Their Classification by Gerhard Kristensson.

Related posts

[1] This includes singular points at infinity. A differential equation is said to have a singularity at infinity if under the change of variables t = 1/x the equation in u has a singularity at 0. For example, Airy’s equation

y″ – x y = 0

has no singularities in the finite complex plane, and it has no solution in terms of elementary functions. This does not contradict the statement above because the equation has a singularity at infinity.