Elementary solutions to differential equations

Differential equations rarely have closed-form solutions. Some do, and these are emphasized in textbooks.

For this post we want to look specifically at homogeneous second order linear equations:

y ” + a(x) y‘ + b(x) y = 0.

If the coefficient functions a and b are constant, then the solution can be written down in terms of elementary functions, i.e. functions a first year calculus student would recognize. This would include, for example, polynomials, sines, and cosines, but would not include, the gamma function, Bessel functions, Airy functions, etc.

If the coefficients a and b are not constant, the differential equation usually does not have an elementary solution. In fact, you might wonder if it is ever possible in that case for the differential equation to have an elementary solution. Experience would suggest not.

A paper by Kovacic [1] thoroughly answers this question. The author gives algorithms for determining whether elementary solutions exist and how to find them if they do exist. The following example comes from that paper.

Consider the equation

y” + ry = 0

where [2]

r(x) = (4 x6 – 8 x5 + 12 x4 + 4 x3 + 7 x2 – 20 x + 4)/4 x4.


y(x) = x-3/2 (x2 – 1) exp(x2/2 – x – 1/x)

is a solution, which the following Mathematica code verifies by evaluating to 0.

r[x_] := (4 x^6 - 8 x^5 + 12 x^4 + 4 x^3 + 7 x^2 - 20 x + 4)/(4 x^4)
f[x_] := x^(-3/2) (x^2 - 1) Exp[x^2/2 - x - 1/x]
Simplify[D[f[x], {x, 2}] - r[x] f[x]]

Related posts

[1] Jerald J. Kovacic. An algorithm for solving second order linear homogeneous differential equations. Journal of Symbolic Computation (1986) 2, 3–43.

[2] There’s an error in the paper, where the denominator of r is given as 4x rather than 4x4.

Numerical methods blog posts

I recently got a review copy of Scientific Computing: A Historical Perspective by Bertil Gustafsson. I thought that thumbing through the book might give me ideas for new topics to blog about. It still may, but mostly it made me think of numerical methods I’ve already blogged about.

In historical order, or at least in the order of Gustafsson’s book:

The links above include numerical methods I have written about. What about the methods I have not written about?


I like to write fairly short, self-contained blog posts, and I’ve written about algorithms compatible with that. The methods in Gustafsson’s book that I haven’t written about are mostly methods in partial differential equations. I don’t see how to write short, self-contained posts about numerical methods for PDEs. In any case, my impression is that not many readers would be interested. If you are one of the few readers who would like to see more about PDEs, you may enjoy the following somewhat personal rambling about PDEs.

I studied PDEs in grad school—mostly abstract theory, but also numerical methods—and expected PDEs to be a big part of my career. I’ve done some professional work with differential equations, but the demand for other areas, particularly probability and statistics, has been far greater. In college I had the impression that applied math was practically synonymous with differential equations. I think that was closer to being true a generation or two ago than it is now.

My impression of the market demand for various kinds of math is no doubt colored by my experience. When I was at MD Anderson we had one person in biomathematics working in PDEs, and then none when she left. There may have been people at MDACC doing research into PDEs for modeling cancer, but they weren’t in the biostatistics or biomathematics departments. I know that people are working with differential equations in cancer research, but not at the world’s largest cancer center, and so I expect there aren’t many doing research in that area.

Since leaving MDACC to work as a consultant I’ve seen little demand for differential equations, more in Kalman filters than in differential equations per se. A lot of companies hire people to numerically solve PDEs, but there don’t seem to be many who want to use a consultant for such work. I imagine most companies with an interest in PDEs are large and have a staff of engineers and applied mathematicians working together. There’s more demand for statistical consulting because companies are likely to have an occasional need for statistics. The companies that need PDEs, say for making finite element models of oil reservoirs or airplanes, have an ongoing need and hire accordingly.

Interlaced zeros of ODEs

Sturm’s separation theorem says that the zeros of independent solutions to an equation of the form

y'' + p(x)y' + q(x)y = 0

alternate. That is, between any two consecutive zeros of one solution, there is exactly one zero of the other solution. This is an important theorem because a lot of differential equations of this form come up in applications.

If we let p(x) = 0 and q(x) = 1, then sin(x) and cos(x) are independent solutions and we know that their zeros interlace. The zeros of sin(x) are of the form nπ, and the zeros of cos(x) are multiples of (n + 1/2)π.

What’s less obvious is that if we take two different linear combinations of sine and cosine, as long as they’re not proportional, then their zeros interlace as well. For example, we could take f(x) = 3 sin(x) + 5 cos(x) and g(x) = 7 sin(x) – 11 cos(x). These are also linearly independent solutions to the same differential equation, and so the Sturm separation theorem says their roots have to interlace.

If we take p(x) = 1/x and q(x) = 1 – (ν/x)² then our differential equation becomes Bessel’s equation, and the Bessel functions Jν and Yν are independent solutions. Here’s a little Python code to show how the zeros alternate when ν = 3.

    import matplotlib.pyplot as plt
    from scipy import linspace
    from scipy.special import jn, yn

    x = linspace(4, 30, 100)
    plt.plot(x, jn(3, x), "-")
    plt.plot(x, yn(3, x), "-.")
    plt.legend(["$J_3$", "$Y_3$"])
    plt.axhline(y=0,linewidth=1, color="k")

Plotting Bessel functions J_3 and Y_3

Related posts

System of Jacobi elliptic functions

Jacobi’s elliptic functions are sorta like trig functions. His functions sn and cn have names that reminiscent of sine and cosine for good reason. These functions come up in applications such as the nonlinear pendulum (i.e. when θ is too
large to assume θ is a good enough approximation to sin θ) and in conformal mapping.

I ran across an article [1] yesterday that shows how Jacobi’s three elliptic functions—sn, cn, and dn—could be defined by one dynamical system

\begin{eqnarray*} x' &=& yz \\ y' &=& -zx \\ z' &=& -k^2 xy \end{eqnarray*}

with initial conditions x(0) = 0, y(0) = 1, and z(0) = 1.

The parameter k is the modulus. (In Mathematica’s notation below, k² is the parameter. See this post on parameterization conventions.) As k decreases to 0, sn converges to sine, cn to cosine, and dn to 1. As k increases to 1, sn converges tanh, and cn and dn converge to sech. So you could think of k as a knob you turn to go from being more like circular functions (ordinary trig functions) to more like hyperbolic functions.

Since we have a dynamical system, let’s plot the solution, varying the modulus each time.The Jacobi functions are periodic (in fact they’re doubly periodic) and so the plots will be closed loops.

f[t_, m_] = {JacobiSN[t, m], JacobiCN[t, m], JacobiDN[t, m]}
ParametricPlot3D[f[t, 0.707], {t, 0, 10}, AspectRatio -> 1]

phase portrait k = 1/2

ParametricPlot3D[f[t, 0.99], {t, 0, 20}, AspectRatio -> 1]

phase portrait k = 1/2

ParametricPlot3D[f[t, 0.01], {t, 0, 20}, AspectRatio -> 1]

phase portrait k = 1/2

Note that this last plot is nearly flat because the modulus is small and so z is nearly constant. The small modulus also makes the phase portrait nearly circular because x is approximately sine and y is approximately cosine.

[1] Kenneth R. Meyer. Jacobi Elliptic Functions from a Dynamical Systems Point of View. The American Mathematical Monthly, Vol. 108, No. 8 (Oct., 2001), pp. 729-737

Duffing equation for nonlinear oscillator

The Duffing equation is an ordinary differential equation describing a nonlinear damped driven oscillator.

\frac{d^2x}{dt} + h\frac{dx}{dt} + \Omega_0^2 x + \mu x^3 = F \cos \omega t

If the parameter μ were zero, this would be a damped driven linear oscillator. It’s the nonlinear x³ term that makes things nonlinear and interesting.

Using an analog computer in 1961, Youshisuke Ueda discovered that this system was chaotic. It was the first discovery of chaos in a second-order non-autonomous periodic system. Lorenz discovered his famous Lorenz attractor around the same time, though his system is third order.

Ueda’s system has been called “Ueda’s strange attractor” or the “Japanese attractor.”

In the linear case, when μ = 0, the phase portrait simply spirals outward from the origin towards its steady state.

But things get much more interesting when μ = 1. Here’s Mathematica code to numerically solve the differential equation and plot the phase portrait.

duff[t_] = 
    NDSolve[{x''[t] + 0.05 x'[t] + x[t] + x[t]^3 == 7.5 Cos[t], 
             x[0] == 0, x'[0] == 1}, x[t], {t, 0, 100}][[1, 1, 2]]
ParametricPlot[{duff[t], duff'[t]}, {t, 0, 50}, AspectRatio -> 1]

Here’s the same system integrated out further, to t = 200 rather 50.

Source: Wanda Szemplińska-Stupnicka. Chaos, Bifurcations and Fractals Around Us.

Asymptotic solution to ODE

Our adventure starts with the following ordinary differential equation:

y' = y = \frac{1}{x}

Analytic solution

We can solve this equation in closed-form, depending on your definition of closed-form, by multiplying by an integrating factor.

e^x y' + e^x y = \frac{e^x}{x}

The left side factors to

(e^x y)'

and so

y = e^{-x} \left( \int^x \frac{e^t}{t}\, dt + c\right)

The indefinite integral above cannot be evaluated in elementary terms, though it can be evaluated in terms of special functions.

We didn’t specify where the integration starts or the constant c. You can make the lower limit of integration whatever you want, and the value of c will be whatever is necessary to satisfy initial conditions.

The initial conditions hardly matter for large x because the constant c is multiplied by a negative exponential. As we’ll see below, the solution y decays like 1/x while the effect of the initial condition decays exponentially.

Asymptotic series solution

I wrote a few months ago about power series solutions for differential equations, but that approach won’t work here, not over a very large range. If y has a Taylor series expansion, so does it’s derivative, and so y‘ + y has a Taylor series. But the right side of our differential equation has a singularity at 0.

What if instead of assuming y has a Taylor series expansion, we assume it has a Laurant series expansion? That is, instead of assuming y is a weighted sum of positive powers of x, we assume it’s a weighted sum of negative powers of x:

y = \sum_{n=1}^\infty a_nx^{-n}

Then formally solving for the coefficients we find that

y = \frac{1}{x} + \frac{1}{x^2} + \frac{2}{x^3} + \cdots + \frac{(n-1)!}{x^n} + \cdots


There’s one problem with this approach: The series diverges for all x because n! grows faster than xn.

And yet the series gives a useful approximation! With a convergent series, we fix x and let n go to infinity. The divergent series above is an asymptotic series, and so we fix n and let x go to infinity.

To see how well the asymptotic solution compares to the analytic solution, we’ll pick the lower limit of integration to be 1 and pick c = 0 in the analytic solution.

The plot above shows the analytic solution, and the first and second order asymptotic solutions. (The nth order solution takes the first n terms of the asymptotic series.) Note that the three curves differ substantially at first but quickly converge together.

Now let’s look at the relative error.

That’s interesting. Eventually the second order approximation is more accurate than the first order, but not at first. In both cases the relative error hits a local minimum, bounces back up, then slowly decreases.

Does this pattern continue if we move on to a third order approximation. Yes, as illustrated below.

Superasymptotic series

The graphs above suggests that higher order approximations are more accurate, eventually, but we might do better than any particular order by letting the order vary, picking the optimal order for each x. That’s the idea behind superasymptotics. For each x, the superasymptotic series sums up terms in the asymptotic series until the terms start getting larger.

When this approach works, it can produce a series that converges exponentially fast, even though each truncated asymptotic series only converges polynomially fast.

Update: To get an idea how accuracy varies jointly with the argument x and the asymptotic approximation order n, consider the following graph. Note that the horizontal axis is now n, not x. The different curves correspond to different values of x, generally moving down as x increases.

Relative error as a function of asymptotic order

Every curve is non-monotone because for every value of x, increasing the order only helps to a point. Then the error gets worse as the series diverges. But as you can see from the graph, you can do quite well by picking the optimal approximation order for a given x.

Related posts

Liouville-Green approximation

Suppose you have a differential equation of the form

\frac{d^2y}{dx^2} = f(x)y

If the function f(x) is constant, the differential equation is easy to solve in closed form. But when it is not constant, it is very unlikely that closed form solutions exist. But there may be useful closed form approximate solutions.

There is no linear term in the equation above, but there is a standard change of variables to eliminate a linear term and put any second order linear equation with variable coefficients in the form above.

The Liouville-Green approximate solutions are linear combinations of the functions

f^{-1/4}(x)\exp\left(\pm\int f^{1/2}(x)\, dx \right )

The constant of integration doesn’t matter. It would only change the solution by a constant multiple, so any indefinite integral will do.

To try out the Liouville-Green approximation, let’s look at the equation

\frac{d^2y}{dx^2} = x^2y

Here f(x) = x² and so the LG approximation solutions are

A\, \frac{\exp\left(x^2/2 \right )}{ \sqrt{x} } + B\, \frac{\exp\left(-x^2/2 \right )}{ \sqrt{x} }

Let’s see how these solutions behave compare to a numerical solution of the equation. To make things more definite, we need initial conditions. Let’s say AB = 1, which corresponds to initial conditions y(1) = 2.25525 and y‘(1) = -0.0854354.

Here’s a plot of the L-G approximation and a numerical solution.

Plot of Liouville-Green approximation and numerical solution

Here’s the Mathematica code that was used to create the plot above.

s = NDSolve[
       y''[x] == x^2 y[x], 
       y[1] == 2.25525, 
       y'[1] == -0.0854354
    {x, 1, 3}
g[x_] = (Exp[-x^2/2] + Exp[x^2/2])/Sqrt[x]
    {Evaluate[y[x] /. s], g[x]}, 
    {x, 1, 3}, 
    PlotLabels -> {"Numerical", "LG"}

For more on Liouville-Green approximations, see Olver’s classic book Asymptotics and Special Functions.

Related postMechanical vibrations

Differential equations and recurrence relations

Series solutions to differential equations can be grubby or elegant, depending on your perspective.

Power series solutions

At one level, there’s nothing profound going on. To find a series solution to a differential equation, assume the solution has a power series, stick the series into the equation, and solve for the coefficients. The mechanics are tedious, but you just follow your nose.

But why should this work? Why should the solution have a power series? And why should it be possible to find the coefficients? After all, there are infinitely many of them!

As for why the solution should have a power series, sometimes it doesn’t. But there is a complete theory to tell you when the solution should have a power series or not. And when it doesn’t have a power series, it may have more general series solution.

Recurrence relations

The tedious steps in the middle of the solution process may obscure what’s happening between the first and last step; the hard part in the middle grabs our attention. But the big picture is that series solution process transforms a differential equation for a function y(x) into a recurrence relation for the coefficients in the power series for y. We turn a continuous problem into a discrete problem, a hard problem into an easy problem.

To look at an example, consider Airy’s equation:

y” – xy = 0.

I won’t go into all the details of the solution process—my claim above is that these details obscure the transformation I want to emphasize—but skip from the first to the last step.

If you assume y has a power series at 0 with coefficients an, then we find that the coefficients satisfy the recurrence relation

(n + 2)(n + 1)an+2 = an-1

for n ≥ 1. Initial conditions determine a0 and a1, and it turns out a2 must be 0. The recurrence relation shows how these three coefficients determine all the other coefficients.

Holonomic functions and sequences

There’s something interesting going on here, a sort of functor mapping differential equations to recurrence relations. If we restrict ourselves to differential equations and recurrence relations with polynomial coefficients, we get into the realm of holonomic functions.

A holonomic function is one that satisfies a linear differential equation with polynomial coefficients. A holonomic sequence is a sequence that satisfies a linear recurrence relation with polynomial coefficients. Holonomic functions are the generating functions for holonomic sequences.

Holonomic functions are interesting for a variety of reasons. They’re a large enough class of functions to include many functions of interest, like most special functions from analysis and a lot of generating functions from combinatorics. And yet they’re a small enough class to have nice properties, such as closure under addition and multiplication.

Holonomic functions are determined by a finite list of numbers, the coefficients of the polynomials in their defining differential equation. This means they’re really useful in computer algebra because common operations ultimately map one finite set of numbers to another.


How to eliminate the first order term from a second order ODE

Authors will often say that “without loss of generality” they will assume that a differential equation has no first order derivative term. They’ll explain that there’s no need to consider

y'' + p(x) y' + q(x) y = 0

because a change of variables can turn the above equation into one of the form

y'' + q(x) y = 0

While this is true, the change of variables is seldom spelled out. I’ll give the change of variables explicitly here in case this is helpful to someone in the future. Define u(x) and r(x) by

u(x) = \exp\left( \frac{1}{2} \int^x p(t)\,dt\right ) y(x)


r(x) = q(x) - \frac{1}{2}p'(x) - \frac{1}{4}p(x)^2

With this change of variables

u'' + r(x) u = 0

Proof: Calculate u” + ru and use the fact that y satisfies the original differential equation. The calculation is tedious but routine.

Example: Suppose we start with

xy'' + y' + x^3 y = 0

Then dividing by x we get

y'' + \frac{1}{x}y' + x^2 y = 0

Now applying the change of variables above gives

u'' + \left(x^2 + \frac{1}{4x^2}\right)u = 0
and our original y is u / √ x.

Making a problem easier by making it harder

In the oral exam for my PhD, my advisor asked me a question about a differential equation. I don’t recall the question, but I remember the interaction that followed.

I was stuck, and my advisor countered by saying “Let me ask you a harder question.” I was still stuck, and so he said “Let me ask you an even harder question.” Then I got it.

By “harder” he meant “more general.” He started with a concrete problem, then made it progressively more abstract until I recognized it. His follow-up questions were logically harder but psychologically easier.

This incident came to mind when I ran across an example in Lawrence Evans’ control theory course notes. He uses the example to illustrate what he calls an example of mathematical wisdom:

It is sometimes easier to solve a problem by embedding it within a larger class of problems and then solving the larger class all at once.

The problem is to evaluate the integral of the sinc function:

\int_0^\infty \frac{\sin(x)}{x}\, dx

He does so by introducing the more general problem of evaluating the function

I(\alpha) = \int_0^\infty \exp(-\alpha x) \frac{\sin(x)}{x}\, dx

which reduces to the sinc integral when α = 0.

We can find the derivative of I(α) by differentiating under the integral sign and integrating by parts twice.

I'(\alpha) &=& \int_0^\infty \frac{\partial}{\partial \alpha} \exp(-\alpha x) \frac{\sin(x)}{x}\, dx \\ &=& \int_0^\infty \exp(-\alpha x) \sin(x) \, dx \\ &=& - \frac{1}{1 + \alpha^2}


I(\alpha) = - \arctan(\alpha) + C

As α goes to infinity, I(α) goes to zero, and so C = π/2 and I(0) = π/2.

Incidentally, note that instead of computing an integral in order to solve a differential equation as one often does, we introduced a differential equation in order to compute an integral.