Structure in jazz and math

Last night I went to a concert by the Branford Marsalis Quartet. One of the things that impressed me about the quartet was how creative they are while also being squarely within a tradition. People who are not familiar with jazz may not realize how structured it is and how much it respects tradition. The spontaneous and creative aspects of jazz are more obvious than the structure. In some ways jazz is more tightly structured than classical music. To use Francis Schaeffer’s phrase, there is form and freedom, freedom within form.

Every field has its own structure, its tropes, its traditions. Someone unfamiliar with the field can be overwhelmed, not having the framework that an insider has to understand things. They may think something is completely original when in fact the original portion is small.

In college I used to browse the journals in the math library and be completely overwhelmed. I didn’t learn until later that usually very little in a journal article is original, and even the original part isn’t that original. There’s a typical structure for a paper in PDEs, for example, just as there are typical structures for romantic comedies, symphonies, or space operas. A paper in partial differential equations might look like this:

  1. Motivation / previous work
  2. Weak formulation of PDE
  3. Craft function spaces and PDE as operator
  4. A priori estimates imply operator properties
  5. Well posedness results
  6. Regularity

An expert knows these structures. They know what’s boilerplate, what’s new, and just how new the new part is. When I wrote something up for my PhD advisor I remember him saying “You know what I find most interesting?” and pointing to one inequality. The part he found interesting, the only part he found interesting, was not that special from my perspective. It was all hard work for me, but only one part of it stood out as slightly original to him. An expert in partial differential equations sees a PDE paper the way a professional musician listens to another or the way a chess master sees a chess board.

While a math journal article may look totally incomprehensible, an expert in that specialization might see 10% of it as somewhat new. An interesting contrast to this is the “abc conjecture.” Three and a half years ago Shinichi Mochizuki proposed a proof of this conjecture. But his approach is so entirely idiosyncratic that nobody has been able to understand it. Even after a recent conference held for the sole purpose of penetrating this proof, nobody but Mochizuki really understands it. So even though most original research is not that original, once in a while something really new comes out.

Mathematics of medical plastics

In this interview, I talk with Ray Rilling about applying mathematics to manufacturing medical plastics.

Photo of Ray Rilling

JC: Ray, could you start by saying a little about yourself?

RR: Sure. My name is Ray Rilling, and I am the Director of Technology at Putnam Plastics.

My initial training was cellular biology with an emphasis on epidemiology, but I decided to move to a more applied (and lucrative) field. I’ve been in medical plastics for the last 23 years.

When I saw that epidemiology was a crowded field, I pursued substantial amounts of additional coursework that allowed me to get into the medical engineering plastics space. In lots of ways I think this pick-and-choose approach was much better than a traditional bio-medical engineering degree would have been. It allowed me to focus on exactly what I needed to learn.

JC: Thanks. And tell me a bit about where you work.

RR: Putnam Plastics is a 30 year old medical plastics manufacturer, specializing in polymer-based medical components (i.e. plastic tubing for medical usage).

JC: I did some consulting work for a company that manufactures stents, so I imagine your companies have a few things in common. What is your day-to-day mathematical toolkit?

RR: We cross a lot of boundaries and there are many layers of computational needs. The majority of the computations are basic algebraic formulas to calculate simple things like annular tooling draw down ratios or the ultimate tensile of a tube. From there, you may find that calculating the burst pressure of a composite or may require the solving of a linear system.

The most challenging computations that we perform are for our complex polymer extrusion tools. These are based on Computational Fluid Dynamics (CFD) and Finite or Boundary Element Methods (FEM / BEM). Many complexities emerge because polymer flow is compressible, non-isothermal, and viscoelastic in nature. It does not allow us to use conventional Navier-Stokes or other common equations. In all reality, we rely on proprietary equations to point us in the right direction. No model has ever provided a perfect solution. We combine our experience and expertise with the applied calculations.

The most complex of the computations are performed with third party software packages and often require the use of complex meshing, materials testing for inputs, and can even require us to create modified code to suit the specific application. We work with some very creative minds and there is never a shortage of new challenges.

Aside from the design engineering mathematics, we think statistically about our results. This can be to analyze the process capability, equivalency, and even product risk analysis. This statistical analysis can be intertwined with the design engineering aspects and deal with a range of biological and medical constants (i.e. devices might need to handle liquid nitrogen or remain in bodies for years or decades).

One of my mentors used to say, “The best equation is teamwork.” Teamwork is a sort of expectation, a crowd-sourced form of expert opinion. It allows you bring a focus on what the real variables are.

Some calculations can take weeks, so having a good appreciation of how to approach the challenge is important. You can get away with not knowing what’s under the hood. But that’s dangerous. It is much better to get things done more quickly, and with better understanding. Especially when a client is waiting on results.

JC: What is some math you wish you learned more of?

Vector and tensor-based physics. As the bar in the medical device industry increases, so do the expectations of the products that we make. Being able take large number of variables or inputs and transform them into a workable model is extremely valuable. My dream is to be able to reliably calculate extrusion tooling, product failure modes, and performance properties before I start cutting steel. But in our time and age, we still rely on some development iterations and calculate what we can.

I also wish I had more applied materials science. When I am developing something in the lab, sometimes the product does not perform the way you want it to. Everytime I start learning something new, like applied surface chemistry or the effects of radiation on polymers, I think of 100 things that I could have done in previous projects.

JC: Did you learn any math you haven’t had much use for yet? I’ve used most of the math I learned, though sometimes there’s been a gap of twenty years between learning something and applying it.

RR: I actually use pretty much all of the math I’ve learned. But that’s likely because I was able to pick and choose because I went back and did supplemental studies. Even the epidemiology is useful.

JC: I wanted to pick up on what you said earlier about physical models. Could you say more about the interplay of physical and mathematical models at Putnan?

RR: The models are never completely right. We use them as a tool. They often fail in three key ways:

  1. We find additional variables we need to account for or constrain. In many cases our first attempt failed because the product did not perform properly in an unaccounted for way. At the end of the day it might take many iterations before we have all of the performance properties that are needed.
  2. We are translating soft or subjective numbers from physicians or engineers. A model will provide concrete numbers. It is often important to create a baseline product that can take the customers subjective description and correlate it to a model.
  3. We need to additionally constrain for cost effectiveness (i.e. minimize the amount of expensive wire in a catheter).

Those three trade offs mean that we are never able to just take a model print out and run with it. There always has to be some back and forth.

JC: It’s interesting that you consider shortcomings besides inaccuracy as failures. Someone new to applying math in the real world might think of your first point but the other points. The latter two aren’t failures from an academic perspective, but they certainly are from a business perspective.

* * *

Other interviews:

It all boils down to linear algebra

When I was in college, my view of applied math was something like the following.

Applied math is mostly mathematical physics. Mathematical physics is mostly differential equations. Numerical solution of differential equations boils down to linear algebra. Therefore the heart of applied math is linear algebra.

I still think there’s a lot of truth in the summary above. Linear algebra is very important, and a great deal of applied math does ultimately depend on efficient solutions of large linear systems. The weakest link in the argument may be the first one: there’s a lot more to applied math than mathematical physics. Mathematical physics hasn’t declined, but other areas have grown. Still, areas of applied math outside of mathematical physics and outside of differential equations often depend critically on linear algebra.

I’d certainly recommend that someone interested in applied math become familiar with numerical linear algebra. If you’re going to be an expert in differential equations, or optimization, or many other fields, you need to be at leas familiar with numerical linear algebra if you’re going to compute anything. As Stephen Boyd points out in his convex optimization class, many of the breakthroughs in optimization over the last 20 years have at their core breakthroughs in numerical linear algebra. Improved algorithms have sped up the solution of very large systems more than Moore’s law has.

It may seem questionable that linear algebra is at the heart of applied math because it’s linear. What about nonlinear applications, such as nonlinear PDEs? Nonlinear differential equations lead to nonlinear algebraic equations when discretized. But these nonlinear systems are solved via iterations of linear systems, so we’re back to linear algebra.

Impulse response

You may expect that a burst of input will cause a burst of output. Sometimes that’s the case, but often a burst of input results in a long, smoothly decreasing succession of output. You may not get immediate results, but long-term results. This is true of life in general, but it’s also true in a precise sense of differential equations.

One of the surprises from differential equations is that an infinitely concentrated input usually results in a diffuse output. A fundamental solution to a differential equation is a solution to the equation with a Dirac delta as the forcing function. In a sense, your input is so concentrated that it’s not actually a function. And yet the output may be a nice continuous function, and not one that is not particularly concentrated.

The situation is analogous to striking a bell. The input, the hammer blow to the bell, is extremely short, but the response of the bell is long and smooth. Solving a differential equation with a delta function as input is like learning about a bell by listening to how it rings when you strike it. A better analogy would be striking the bell in many places; a fundamental solution actually solves for a delta function with a position argument, not just a single delta function.

If you’re curious how this informal talk of “infinitely concentrated” input and delta “functions” can be made rigorous, start by reading this post.

Related post: Life lessons from differential equations

Life lessons from differential equations

Ten life lessons from differential equations:

  1. Some problems simply have no solution.
  2. Some problems have no simple solution.
  3. Some problems have many solutions.
  4. Determining that a solution exists may be half the work of finding it.
  5. Solutions that work well locally may blow up when extended too far.

Related posts:


Sensitive dependence on initial conditions

The following problem illustrates how the smallest changes to a problem can have large consequences. As explained at the end of the post, this problem is a little artificial, but it illustrates difficulties that come up in realistic problems.

Suppose you have a simple differential equation,

y'' - y = 0

where y(t) gives displacement in meters at a time measured in seconds. Assume initial conditions y(0) = 1 and y'(0) = -1.

Then the unique solution is y(t) = exp(-t). The solution is a decreasing, positive solution. But a numerical solution to the equation might turn around and start increasing, or it might go negative.

Suppose the initial conditions are y(0) = 1 as before but now y'(0) = -1 ± ε. You could think of the ε as a tiny measurement error in the initial velocity, or a limitation in representing the initial velocity as a number in a computer.

The following graph shows the solutions corresponding to ε = 10-6, 0, and -10-6.

plot of 3 solutions, epsilon positive, negative, zero

The exact solution is

 y(t) = \left(1 \mp \frac{\varepsilon}{2}\right)\exp(-t) \pm \frac{\varepsilon}{2}\exp(t)

If the initial velocity is exactly -1, then ε = 0 and we only have the solution exp(-t). But if the initial velocity is a little more than -1, there is an exponentially increasing component of the solution that will eventually overtake the exponentially decreasing component. And if the initial velocity is a little less than -1, there is a negative component increasing exponentially in magnitude. Eventually it will overtake the positive component and cause the solution to become negative.

The solution starts to misbehave (i.e. change direction or go negative) at

t = \frac{1}{2} \log\left( \frac{2}{\varepsilon} \mp 1 \right)

If ε is small, 2/ε is much larger than 1, and so the final 1 above makes little difference. So we could say the solution goes bad at approximately

t = \frac{1}{2} \log\left( \frac{2}{\varepsilon} \right)

whether the error is positive or negative. If ε is ±0.000001, for example, the solution goes bad around t = 7.25 seconds. (The term we dropped only effects the 6th decimal place.)

On a small time scale, less than 7.25 seconds, we would not notice the effect of the (initially) small exponentially growing component. But eventually it matters, a lot. We can delay the point where the solution goes bad by controlling the initial velocity more carefully, but this doesn’t buy us much time. If we make ε ten times smaller, we postpone the time when the solution goes bad to 8.41, only a little over a second later.

[Looking at the plot, the solution does not obviously go wrong at t = 7.25. But it does go qualitatively wrong at that point: a solution that should be positive and decreasing becomes either negative or increasing.]

Trying to extend the quality of the solution by making ε smaller is a fool’s errand. Every order of magnitude decrease in ε only prolongs the inevitable by an extra second or so.

This is a somewhat artificial example. The equation and initial conditions were selected to make the sensitive dependence obvious. The sensitive dependence is itself sensitive: small changes to the problem, such as adding a small first derivative term, make it better behaved.

However, sensitive dependence is a practical phenomena. Numerical methods can, for example, pick up an initially small component of an increasing solution while trying to compute a decreasing solution. Sometimes the numerical sensitivity is a reflection of a sensitive physical system. But if the physical system is stable and the numerical solution is not, the problem may not be numerical. The problem may be a bad model. The numerical difficulties may be trying to tell you something, in which case increasing the accuracy of the numerical method may hide a more basic problem.

Related post: Approximating a solution that doesn’t exist

Relating Airy and Bessel functions

The Airy functions Ai(x) and Bi(x) are independent solutions to the differential equation

y'' - xy = 0

For negative x they act something like sin(x) and cos(x). For positive x they act something like exp(x) and exp(-x). This isn’t surprising if you look at the differential equation. If you replace x with a negative constant, you get sines and cosines, and if you replace it with a positive constant, you get positive and negative exponentials.

The Airy functions can be related to Bessel functions as follows:

\mathrm{Ai}(x) = \left\{ 	\begin{array}{ll} 		\frac{1}{3}\sqrt{\phantom{-}x} \left(I_{-1/3}(\hat{x}) - I_{1/3}(\hat{x})\right) & \mbox{if } x > 0 \\<br /><br /><br /><br />
                \\<br /><br /><br /><br />
		\frac{1}{3}\sqrt{-x} \left(J_{-1/3}(\hat{x}) + J_{1/3}(\hat{x})\right) & \mbox{if } x < 0 	\end{array} \right.


\mathrm{Bi}(x) = \left\{ 	\begin{array}{ll} 		\sqrt{\phantom{-}x/3} \left(I_{-1/3}(\hat{x}) + I_{1/3}(\hat{x})\right) & \mbox{if } x > 0 \\<br />                 \\<br /> 		\sqrt{-x/3} \left(J_{-1/3}(\hat{x}) - J_{1/3}(\hat{x})\right) & \mbox{if } x < 0 	\end{array} \right.

Here J is a “Bessel function of the first kind” and I is a “modified Bessel function of the first kind.” Also

\hat{x} = \frac{2}{3} \left(\sqrt{|x|}\right)^3

To verify the equations above, and to show how to compute these functions in Python, here’s some code.

The SciPy function airy computes both functions, and their first derivatives, at once. I assume that’s because it doesn’t take much longer to compute all four functions than to compute one. The code for Ai2 and Bi2 below uses np.where instead of if ... else so that it can operate on NumPy vectors all at once. You can plot Ai and Ai2 and see that the two curves lie on top of each other. The same holds for Bi and Bi2.


from scipy.special import airy, jv, iv
from numpy import sqrt, where

def Ai(x):
    (ai, ai_prime, bi, bi_prime) = airy(x)
    return ai

def Bi(x):
    (ai, ai_prime, bi, bi_prime) = airy(x)
    return bi

def Ai2(x):
    third = 1.0/3.0
    hatx = 2*third*(abs(x))**1.5
    return where(x > 0,
        third*sqrt( x)*(iv(-third, hatx) - iv(third, hatx)),
        third*sqrt(-x)*(jv(-third, hatx) + jv(third, hatx)))

def Bi2(x):
    third = 1.0/3.0
    hatx = 2*third*(abs(x))**1.5
    return where(x > 0,
        sqrt( x/3.0)*(iv(-third, hatx) + iv(third, hatx)),
        sqrt(-x/3.0)*(jv(-third, hatx) - jv(third, hatx)))


There is a problem with Ai2 and Bi2: they return nan at 0. A more careful implementation would avoid this problem, but that’s not necessary since these functions are only for illustration. In practice, you’d simply use airy and it does the right thing at 0.

Related links:

Hilbert space methods for PDE

When I was in grad school, my advisor asked me to study his out-of-print book, Hilbert Space Methods in Partial Differential Equations. I believe I had a photocopy of a photocopy; I don’t recall ever seeing the original book. I pored over that stack of copies line by line while preparing for my qualifying exams.

Then this evening I was browsing a used book store and was shocked to find a copy of the book, a Dover reprint (ISBN 0486474437).

It was an odd feeling to find what was once a precious and mysterious book available for $5.99 as part of a rag-tag assortment of mostly elementary/popular used math books.

Related posts:

Mean residual time

If something has survived this far, how much longer is it expected to survive? That’s the question answered by mean residual time.

For a positive random variable X, the mean residual time for X is a function eX(t) given by

e_X(t) = E(X - t \mid X > t) = \int_t^\infty  \frac{1 - F_X(x)}{1-F_X(t)} \, dx

provided the expectation and integral converge. Here F(t) is the CDF, the probability that X is less than t.

For an exponential distribution, the mean residual time is constant. For a Pareto (power law) distribution, the mean residual time is proportional to t. This has an interesting consequence, known as the Lindy effect.

Now let’s turn things around. Given function a function e(t), can we find a density function for a positive random variable with that mean residual time? Yes.

The equation above yields a differential equation for F, the CDF of the distribution.

If we differentiate both sides of

e(t) (1 - F(t)) = \int_t^\infty 1 - F(x)\, dx

with respect to t and rearrange, we get the first order differential equation

F'(t) + g(t)\, F(t) = g(t)


g(t) = \frac{e'(t) + 1}{e(t)}

The initial condition must be F(0) = 0 because we’re looking for the distribution of a positive random variable, i.e. the probability of X being less than zero must be 0. The solution is then

F(t) = 1 - \frac{e(0)}{e(t)} \exp\left( -\int_0^t \frac{dx}{e(x)} \right)

This means that for a desired mean residual time, you can use the equation above to create a CDF function to match. The derivative of the CDF function gives the PDF function, so differentiate both sides to get the density.

Negative damping

An earlier post looked at the effect of damping on free vibrations. We looked at the equation

m u'' + γ u' + k u = 0

where the coefficients m, γ, and k were all positive. But what if some of these terms are negative?

Let’s assume that m is positive. Otherwise multiply the equation above by -1. What happens if γ or k are negative?

The term γ, when positive, takes energy out of the system. A negative value of γ would be a term that adds energy, a sort of negative damping. The behavior of the solutions is determined by the eigenvalues of the system, that is the roots of the equation

m x2 + γ x + k = 0.

If γ is negative, the eigenvalues have positive real part and so the amplitude of the solutions increases exponentially. If γ2 < 4mk then the eigenvalues are complex and so the solutions have an oscillating component. If γ2 = 4mk then there is one repeated, positive eigenvalue. But if γ2 > 4mk the system has one positive and one negative eigenvalue. The solution corresponding to the negative eigenvalue decays exponentially. The other solution increases exponentially. The general solution is a linear combination of these two solutions. As time increases, only the exponentially increasing component of the solution matters because the effect of the other component goes to zero.

In theory, the solution could consist purely of the exponentially decaying component. But in practice, if there is even the tiniest component of the exponentially increasing solution, this component will eventually dominate. A numerical solution, for example, would eventually be dominated by the exponentially increasing solution.

Now what about negative springs? Instead of being a restoring force, a negative spring would be a sort of amplifier, reinforcing rather than resisting displacement. The discriminant γ2 – 4mk will be positive if k is negative. There will be no oscillation because the eigenvalues have no complex part. Also, there will be one positive and one negative eigenvalue, and so the solutions grow exponentially as described above.

Related: If you’re interested in differential equations, check out @diff_eq on Twitter.

Pulp science fiction and vibrations

This morning I ran across Pulp-o-mizer and decided my series of posts on mechanical vibrations could use a little sensational promotion.

The posts are

Damped, driven oscillations

This is the final post in a four-part series on vibrating systems. The first three parts were

and now we consider driven, damped vibrations. We are looking at the equation

m u'' + γ u' + k u = F cos ωt

where all coefficients are positive.

As in the previous post, we need to find one solution to the equation with the forcing term F cos ωt and add it to the general solution to the homogeneous (free) equation.

The particular solution we’re looking for is simply R cos(ωt– φ), but the dependence of R and φ on the coefficients m, γ, and k is complicated. We’ll come back to this later, but first we’ll see what we can understand without knowing the exact form of R and φ.

The solutions to the unforced equation decay exponentially over time because of damping. The long-term behavior of the system, known as the steady state solution, is just R cos(ωt– φ). The frequency of the steady state solution is the same as the frequency of the forcing function, though the phase will be different in general.

To find the amplitude and phase of the steady state solution, we first define Δ > 0 by

Δ2 = m202 – ω2)2 + γ2 ω2

where ω0 = √(k/m) is the natural frequency of the system.

Then R = F /Δ . Note that Δ is small when the driving frequency ω is close to the natural frequency. That’s not where Δ is minimized, though when γ is small the minimum of Δ occurs close to the natural frequency. The minimum of Δ, and hence the maximum of the amplitude R, occurs at a lower frequency, namely

ω2 = ω02 – γ2 / 2m2.

The phase φ satisfies

cos φ = m02 – ω2)/ Δ


sin φ = γω/ Δ.

For low frequency ω, the phase φ is small, i.e. the response is nearly in phase with the excitation. For ω =ω0, i.e. forcing at the natural frequency, the response lags the excitation by π/2, a quarter cycle.

Undamped forced vibrations

This is the third in a series of four blog posts on mechanical vibrations. The first two posts were

We’re looking at solutions to the differential equation

m u'' + γ u' + k u = F cos ωt

The first two posts considered “free” vibrations, systems with no external input. The coefficient F is zero in such systems. Now we consider a the effect of an external driving force of the form F cos ωt. We’ll leave out damping effects in this post, that is, we will assume γ = 0. The final post in this series will cover damped, driven oscillations, i.e. the case of γ and F both positive.

When the right-hand side of a linear differential equation is 0, we say the equation is homogeneous. When the right-hand side is not zero, we say the equation is non-homogeneous.

The general solution to a non-homogeneous linear differential equation is the general solution to the corresponding homogeneous equation plus any particular solution to the non-homogeneous equation. That is, we all we have to do is find one solution to the equation with the forcing term, and add it to the general solution to the equation without the forcing term.

Recall that the natural frequency of our equation without forcing is ω defined by

ω02 = k/m.

and that the general solution without forcing is

u(t) = R cos(ω0t – φ)

where the amplitude R and the phase φ are determined by the initial conditions. To find the general solution to the non-homogeneous we need to find one solution to the non-homogeneous equation, and here’s one:

F cos(ωt) /(m02 – ω2))

provided that the natural frequency ω0 is not equal to the driving frequency ω. This says that the general solution is the sum of two periodic functions, one with the natural frequency ω0 and one with the frequency of the driving force ω. In general, the sum of two periodic functions can be fairly complicated, and not necessarily periodic. More on that here.

Next we will look at two phenomena that the solutions can illustrate: beats and resonance. The latter happens when the natural and driving frequencies are equal.


If the starts at position 0 and with velocity 0, then the amplitudes of the homogeneous and non-homogeneous solutions are the same. The solution is

F /(m02 – ω2)) ( cos(ωt) – cos(ω0t) )

We can apply the trig identity

cos(A) – cos(B) = 2 sin( (AB)/2 ) sin(( A+B)/2 )

to rewrite the solution as the amplitude 2F /(m02 – ω2)) times

sin( (ω0 – ω) t/2 ) sin( (ω0 + ω) t/2 )

If the natural and driving frequencies are similar, i.e. |ω0 – ω| is small, then the first term is a low frequency factor multiplying the second high frequency term. This phenomena is known as beats. You can hear it, for example, when two instruments are nearly but not quite in tune. If one is playing an A 440 and the other an A 441, you’ll hear a pitch of 440.5 getting louder and softer at a frequency of one cycle per second.

Here’s an example of beats, plotting sin(t) sin(10t). The solid blue line is the product. The dotted green lines are sin(t) and -sin(t).

illustrating beats

Incidentally, this is related to how AM radio works: the broadcast content is multiplied by the carrier wave.


When we wrote down the solution above, we assumed the natural and driving frequencies were not equal. If they are, if the system is being driven at its natural frequency, we have resonance. In that case, the solution to the non-homogeneous equation is

(F/2mω) t sin ωt.

Because the solution is proportional to t sin ωt, this says we have oscillations whose amplitude grows linearly with time. In theory, the oscillations can become arbitrarily large. In reality there is always some damping, which eventually puts some limit on the amplitude of the oscillations. Also, the system may break before damping would reign in the oscillations.

Here’s an example of resonance, plotting 0.25 t sin(t) in solid blue and ± 0.25 t in dotted green.


The final post in this series will look at the effect of damping on forced vibrations.

Free damped vibrations

This is the second post in a series on vibrations determine by the equation

m u'' + γ u' + k u = F cos ωt

The first post in the series looked at the simplest case, γ = 0 and F = 0. Now we’ll look at the more realistic and more interesting case of damping γ > 0. We won’t consider forcing yet, so F = 0 and our equation reduces to

m u'' + γ u' + k u = 0.

The effect of damping obviously depends on the amount of damping, i.e. the size of γ relative to other terms. Damping removes energy from the system and so the amplitude of the oscillations goes to zero over time, regardless of the amount of damping. However, the system can have three qualitatively different behaviors: under-damping, critical damping, and over-damping.

The characteristic polynomial of the equation is

m x2 + γ x + k

This polynomial has either two complex roots, one repeated real root, or two real roots depending on whether the discriminant γ2 – 4mk is negative, zero, or positive. These three states correspond to under-damping, critical damping, and over-damping.


When damping is small, the system vibrates at first approximately as if there were no damping, but the amplitude of the solutions decreases exponentially. As long as γ2 < 4mk the system is under-damped and the solution is

u(t) = R exp( –γt/2m ) cos( μt– φ )


μ = √(4mk – γ2) / 2m.

The amplitude R and the phase φ are determined by the initial conditions.

As an example, let γ =1, m = 1, and k = 5. This implies μ =  √19/2. Set the initial conditions so that R = 6 and φ = 1. The solid blue line below is the solution 6 exp(-t/2) cos(t– 1), and the dotted green lines above and below are the exponential envelope 6 exp(-t/2).

And here is a video of the mass-spring-dashpot system in action made using the code discussed here.

Note that the solution is not simply an oscillation at the natural frequency multiplied by an exponential term. The damping changes the frequency of the periodic term, though the change is small when the damping is small.

The factor μ is called the quasi-frequency. Recall from the previous post that the natural frequency, the frequency of the solution with no damping, is denoted ω0. The quasi-frequency is related to the natural frequency by

μ = √(1 – γ2/4mk) ω0 ≈ (1 – γ2/8mk) ω0.

The approximation above holds for small γ since a Taylor expansion shows that √(1 – x) ≈ 1 – x/2 for small x.

This says that the quasi-frequency decreases the natural frequency by a factor of approximately γ2/8mk. So when γ is small there is little change to the natural frequency, but the amount of change grows quadratically as γ increases.

Critical damping

Critical damping occurs when γ2 = 4mk. In this case the solution is given by

u(t) = (A + Bt) exp( – γt/2m).

The position of the mass will asymptotically approach 0. Depending on the initial velocity, it will either go monotonically to zero or reach some maximum displacement before it turns around and goes to 0.

For an example, let m = 1 and k = 5 as before. For critical damping, γ must equal √20. Choose initial conditions so that A = 1 and B = 10.


When γ2 > 4mk the system is over-damped. When damping is that large, the system does not oscillate at all.

The solution is

u(t) = A exp(r1 t) + B exp(r2 t)

where r1 and r2 are the two roots of

m x2 + γ x + k = 0.

Because γ > 0, both roots are negative and solutions decay exponentially to 0.

Coming up

The next two posts in the series will add a forcing term, i.e. F > 0 in our differential equation. The next post will look at undamped driven vibrations, and the final post will look at damped driven vibrations.

Fourier series before Fourier

I always thought that Fourier was the first to come up with the idea of expressing general functions as infinite sums of sines and cosines. Apparently this isn’t true.

The idea that various functions can be described in terms of Fourier series … was for the first time proposed by Daniel Bernoulli (1700–1782) to solve the one-dimensional wave equation (the equation of motion of a string) about 50 years before Fourier. … However, no one contemporaneous to D. Bernoulli accepted the idea as a general method, and soon the study was forgotten.

Source: The Nonlinear World

Perhaps Fourier’s name stuck to the idea because he developed it further than Bernoulli did.

Related posts: