Mechanical vibrations

My favorite topic in an introductory differential equations course is mechanical and electrical vibrations. I enjoyed learning about it as a student and I enjoyed teaching it later. (Or more accurately, I enjoyed being exposed to it as a student and really learning it later when I had to teach it.)

I find this subject interesting for three reasons.

  1. The same equations describe a variety of mechanical and electrical systems.
  2. You can get practical use out of some relatively simple math.
  3. The solutions display a wide variety of behavior as you vary the coefficients.

This is the first of a four-part series of posts on mechanical vibrations. The posts won’t be consecutive: I’ll write about other things in between.

Simple mechanical vibrations satisfy the following differential equation:

m u'' + γ u' + k u = F cos ωt

We could simply write down the general solution and be done with it. But the focus here won’t be finding the solutions but rather understanding how the solutions behave.

We’ll think of our equation as modeling a system with a mass attached to a spring and a dash pot. All coefficients are constant. m is the mass, γ is the damping from the dash pot, and k is the spring constant, which sets the strength of the restoring force from the spring. The driving force has amplitude F and frequency ω. The solution u(t) gives the position of the mass at time t.

More complicated vibrations, such as a tall building swaying in the wind, can be approximated by this simple setting.

The same differential equation could model an electrical circuit with an inductor, resistor, and capacitor. In that case replace mass m with the inductance L, damping γ with resistance R, and spring constant k with the reciprocal of capacitance C. Then the equation gives the charge on the capacitor at time t.

We will assume m and k are positive. The four blog posts will correspond to γ zero or positive, and F zero or non-zero. Since γ represents damping, the system is called undamped when γ = 0 and damped when γ is greater than 0. And since F is the amplitude of the forcing function, the system is called free when F = 0 and forced otherwise. So the plan for the four posts is to cover the four combinations: free undamped, free damped, forced undamped, and forced damped vibrations.

Free, undamped vibrations

With no damping and no forcing, our equation is simply

m u'' + k u = 0

and we can write down the solution

u(t) = A sin ω₀t + B cos ω₀t

where

ω₀² = k/m.

The value ω₀ is called the natural frequency of the system because it gives the frequency of vibration when there is no forcing function.

The values of A and B are determined by the initial conditions, i.e. u(0) = B and u'(0) = A ω₀.

Since the sine and cosine components have the same frequency ω₀, we can use a trig identity to combine them into a single function

u(t) = R cos(ω₀t – φ)

The amplitude R and phase φ are related to the parameters A and B by

A = R sin φ, B = R cos φ.
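Here is a minimal Python sketch of that conversion, not code from the original post. Expanding R cos(ω₀t – φ) with the cosine difference identity gives R sin φ as the coefficient of the sine term and R cos φ as the coefficient of the cosine term, so R and φ are the polar coordinates of the point (B, A). The function names and the values of m, k and the initial conditions are made up for illustration.

```python
import math

def coefficients(u0, v0, m, k):
    """Return (A, B, omega0) for u = A sin(omega0 t) + B cos(omega0 t)
    given initial position u0 and initial velocity v0."""
    omega0 = math.sqrt(k / m)      # natural frequency
    return v0 / omega0, u0, omega0

def amplitude_phase(A, B):
    """Return (R, phi) such that A sin(x) + B cos(x) = R cos(x - phi)."""
    R = math.hypot(A, B)
    phi = math.atan2(A, B)         # sin(phi) = A/R, cos(phi) = B/R
    return R, phi

# Illustrative values only: m = 2 and k = 8 give omega0 = 2.
A, B, omega0 = coefficients(u0=1.0, v0=2.0, m=2.0, k=8.0)
R, phi = amplitude_phase(A, B)

# The two forms of the solution agree at arbitrary times.
for t in (0.0, 0.3, 1.7):
    two_term = A * math.sin(omega0 * t) + B * math.cos(omega0 * t)
    one_term = R * math.cos(omega0 * t - phi)
    assert math.isclose(two_term, one_term, abs_tol=1e-12)
```

Using atan2 rather than atan(A/B) keeps the phase in the right quadrant and still works when B = 0.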

That’s it for undamped free vibrations: the solutions are just sine waves. The next post in the series will make things more realistic and more interesting by adding damping.

Update: Here’s an animation of undamped free oscillation using code by Stéfan van der Walt described here.


Unicode to LaTeX

I’ve run across a couple web sites that let you enter a LaTeX symbol and get back its Unicode value. But I didn’t find a site that does the reverse, going from Unicode to LaTeX, so I wrote my own.

Unicode / LaTeX Conversion

If you enter Unicode, it will return LaTeX. If you enter LaTeX, it will return Unicode. It interprets a string starting with “U+” as a Unicode code point, and a string starting with a backslash as a LaTeX command.

[Screenshot of the Unicode / LaTeX conversion page]

For example, the screenshot above shows what happens if you enter U+221E and click “convert.” You could also enter \infty and get back U+221E.

However, if you go from Unicode to LaTeX to Unicode, you won’t always end up where you started. There may be multiple Unicode values that map to a single LaTeX symbol. This is because Unicode is semantic and LaTeX is not. For example, Unicode distinguishes between the Greek letter Ω and the symbol Ω for ohms, the unit of electrical resistance, but LaTeX does not.
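The round-trip problem is easy to see in a toy version of such a converter. The sketch below is not the code behind the page. It is a hypothetical four-entry lookup table, with a `convert` function that follows the same input rules: “U+” means a code point, a leading backslash means a LaTeX command.

```python
# Hypothetical, tiny lookup table for illustration; a real converter
# would need thousands of entries.
UNICODE_TO_LATEX = {
    0x221E: r"\infty",
    0x03A9: r"\Omega",   # GREEK CAPITAL LETTER OMEGA
    0x2126: r"\Omega",   # OHM SIGN
    0x03B1: r"\alpha",
}

# Going back, each LaTeX command keeps only the first code point listed.
LATEX_TO_UNICODE = {}
for code_point, command in UNICODE_TO_LATEX.items():
    LATEX_TO_UNICODE.setdefault(command, code_point)

def convert(text):
    """Map 'U+XXXX' to a LaTeX command, or a LaTeX command to 'U+XXXX'."""
    text = text.strip()
    if text.upper().startswith("U+"):
        return UNICODE_TO_LATEX[int(text[2:], 16)]
    if text.startswith("\\"):
        return "U+{:04X}".format(LATEX_TO_UNICODE[text])
    raise ValueError("expected 'U+XXXX' or a LaTeX command starting with a backslash")

print(convert("U+221E"))            # \infty
print(convert(r"\infty"))           # U+221E
print(convert(convert("U+2126")))   # U+03A9: the ohm sign came back as Greek omega
```

Which code point a command maps back to is a design choice. Here it is simply the first one in the table.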

* * *

For daily tips on LaTeX and typography, follow @TeXtip on Twitter.


Letters that fell out of the alphabet

Mental Floss had an interesting article called 12 letters that didn’t make the alphabet. A more accurate title might be 12 letters that fell out of the modern English alphabet.

I thought it would have been better if the article had included the Unicode values of the letters, so I did a little research and created the following table.

Letter               Code point(s)
Insular g            U+A77D, U+1D79
Thorn with stroke    U+A764, U+A765
Tironian ond         U+204A
Long s               U+017F

Once you know the Unicode code point for a symbol, you can find out more about it, for example, here.
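If you have Python handy, the standard library's `unicodedata` module is another way to look up a code point. The helper name `describe` is mine, and which names are available depends on the Unicode version your Python was built with, so treat this as a sketch.

```python
import unicodedata

def describe(code_point_text):
    """Return (character, official Unicode name) for a string like 'U+017F'."""
    ch = chr(int(code_point_text[2:], 16))
    # The second argument is a fallback for code points missing from the database.
    return ch, unicodedata.name(ch, "<no name in this Unicode database>")

for cp in ("U+A77D", "U+1D79", "U+A764", "U+A765", "U+204A", "U+017F"):
    ch, name = describe(cp)
    print(cp, name)
```

For the pairs in the table, the names show which code point is the capital letter and which is the small one.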

Related posts:

Entering Unicode characters in Windows and Linux.

To enter a Unicode character in Emacs, you can type C-x 8 <return>, then enter the value.

Offended by conditional probability

It’s a simple rule of probability that if A makes B more likely, B makes A more likely. That is, if the conditional probability of A given B is larger than the probability of A alone, then the conditional probability of B given A is larger than the probability of B alone. In symbols,

Prob( A | B ) > Prob( A ) ⇒ Prob( B | A ) > Prob( B ).

The proof is trivial: Apply the definition of conditional probability and observe that if Prob( AB ) / Prob( B ) > Prob( A ), then Prob( AB ) / Prob( A ) > Prob( B ).

Let A be the event that someone was born in Arkansas and let B be the event that this person has been president of the United States. There are five living current and former US presidents, and one of them, Bill Clinton, was born in Arkansas, a state with about 1% of the US population. Knowing that someone has been president increases your estimation of the probability that this person is from Arkansas. Similarly, knowing that someone is from Arkansas should increase your estimation of the chances that this person has been president.

The chances that an American selected at random has been president are very small, but as small as this probability is, it goes up if you know the person is from Arkansas. In fact, it goes up by the same proportion as the opposite probability. Knowing that someone has been president increases their probability of being from Arkansas by a factor of 20, so knowing that someone is from Arkansas increases the probability that they have been president by a factor of 20 as well. This is because

Prob( A | B ) / Prob( A ) = Prob( B | A ) / Prob( B ).
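To see the factor of 20 come out both ways, here is a small check with exact fractions. The five presidents, the one Arkansan and the 1% figure come from the example above. The total population of 300 million is a round number I made up, and it cancels out of the ratios anyway.

```python
from fractions import Fraction

def lift(p_joint, p_a, p_b):
    """Return (P(A|B)/P(A), P(B|A)/P(B)); both equal P(AB) / (P(A) P(B))."""
    p_a_given_b = p_joint / p_b
    p_b_given_a = p_joint / p_a
    return p_a_given_b / p_a, p_b_given_a / p_b

population = 300_000_000            # assumed round number, for illustration
arkansans = population // 100       # about 1% of the population
presidents = 5
arkansan_presidents = 1

p_a = Fraction(arkansans, population)             # born in Arkansas
p_b = Fraction(presidents, population)            # has been president
p_ab = Fraction(arkansan_presidents, population)  # both

print(lift(p_ab, p_a, p_b))   # (Fraction(20, 1), Fraction(20, 1))
```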

This isn’t controversial when we’re talking about presidents and where they were born. But it becomes more controversial when we apply the same reasoning, for example, to deciding who should be screened at airports.

When I jokingly said that being an Emacs user makes you a better programmer, it appears a few Vim users got upset. Whether they were serious or not, it does seem that they thought “Hey, what does that say about me? I use Vim. Does that mean I’m a bad programmer?”

Assume for the sake of argument that Emacs users are better programmers, i.e.

Prob( good programmer | Emacs user )  >  Prob( good programmer ).

We’re not assuming that Emacs users are necessarily better programmers, only that a larger proportion of Emacs users are good programmers. And we’re not saying anything about causality, only probability.

Does this imply that being a Vim user lowers your chance of being a good programmer? i.e.

Prob( good programmer | Vim user )  <  Prob( good programmer )?

No, because being a Vim user is a specific alternative to being an Emacs user, and there are programmers who use neither Emacs nor Vim. What the above statement about Emacs would imply is that

Prob( good programmer | not an Emacs user )  <  Prob( good programmer ).

That is, if knowing that someone uses Emacs increases the chances that they are a good programmer, then knowing that they are not an Emacs user does indeed lower the chances that they are a good programmer, if we have no other information. In general

Prob( A | B ) > Prob( A ) ⇒ Prob( A | not B ) < Prob( A ).

To take a more plausible example, suppose that spending four years at MIT obtaining a computer science degree makes you a better programmer. Then knowing that someone has a CS degree from MIT increases the probability that this person is a good programmer. But if that’s true, it must also be true that absent any other information, knowing that someone does not have a CS degree from MIT decreases the probability that this person is a good programmer. If a larger proportion of good programmers come from MIT, then a smaller proportion must not come from MIT.
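The general statement follows from the law of total probability. Here is a short derivation in LaTeX, assuming 0 < Prob(B) < 1 so that both conditional probabilities are defined; \bar{B} denotes “not B.”

```latex
\[
  P(A) = P(A \mid B)\,P(B) + P(A \mid \bar{B})\,\bigl(1 - P(B)\bigr)
\]
Solving for $P(A \mid \bar{B})$ and using $P(A \mid B) > P(A)$:
\[
  P(A \mid \bar{B})
    = \frac{P(A) - P(A \mid B)\,P(B)}{1 - P(B)}
    < \frac{P(A) - P(A)\,P(B)}{1 - P(B)}
    = P(A).
\]
```

In other words, P(A) is a weighted average of the two conditional probabilities, so if one is above the average, the other must be below it.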

* * *

This post uses the ideas of information and conditional probability interchangeably. If you’d like to read more on that perspective, I recommend Probability Theory: The Logic of Science by E. T. Jaynes.

Lighten up and be logical

I had a little fun on Twitter this morning. From @UnixToolTip I said

Some of the best programmers use Emacs. Therefore, if you use Emacs, you’ll be a great programmer. #cargocultlogic

and from @CompSciFact I said

Some of the best programmers have beards. Therefore, growing a beard will make you a better programmer. #cargocultlogic

The serious implication behind the joke is that mimicking the superficial characteristics of a good programmer will not make you a good programmer.

Apparently most people thought these were funny, but as usual, some people got bent out of shape. They didn’t realize these were meant to be funny, or at least intentionally illogical, despite the cargo cult hash tag. They thought I was slamming vi(m) or being sexist.

Those who were offended by my humorous logic were not being logical.

Pretend for a moment that the statements above were meant seriously. If I said that using Emacs makes you a great programmer, that doesn’t mean that you can’t be a good programmer unless you use Emacs. Maybe using vi(m) makes you a better programmer too. And if I really believed that growing a beard makes you a better programmer, that doesn’t imply that people who do not grow beards are doomed to mediocrity. Maybe childbirth also makes you a better programmer, even though that option is not available to some. In logic symbols, the statement p ⇒ q does not imply !p ⇒ !q.
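If you'd rather check that mechanically than take my word for it, a brute-force truth table does the job. This is just a sketch, and the helper `implies` is my own name for material implication.

```python
from itertools import product

def implies(a, b):
    """Material implication: a => b is false only when a is true and b is false."""
    return (not a) or b

# Cases where p => q holds but !p => !q fails.
counterexamples = [
    (p, q)
    for p, q in product([False, True], repeat=2)
    if implies(p, q) and not implies(not p, not q)
]
print(counterexamples)   # [(False, True)]

# The contrapositive !q => !p, by contrast, always agrees with p => q.
assert all(
    implies(p, q) == implies(not q, not p)
    for p, q in product([False, True], repeat=2)
)
```

The single counterexample, p false and q true, is the programmer who is good without using Emacs.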

I have two suggestions for the Twittersphere:

  1. Lighten up. Don’t take everything so seriously.
  2. If you’re going to play the logic card, be consistent.

Endeavour Selections on Facebook

I started Endeavour Selections on Facebook a little over a year ago for people who want to read the non-technical posts here but who are not so interested in math or computing. The page didn’t take off, so I stopped posting to it. But now there seems to be more interest in it, so I’m giving it another go.

I will post a few other things on the Facebook page besides blog articles, but I’ll keep the math over here.

* * *

Some people have asked about getting blog posts via email. Yes, you can do that. On the right side of the blog, there is a little box where you can enter your email address. Then each morning you’ll get an email message containing the post(s) from the previous day.

Generalized Fourier transforms

How do you take the Fourier transform of a function when the integral that would define its transform doesn’t converge? The answer is similar to how you can differentiate a non-differentiable function: you take a theorem from ordinary functions and make it a definition for generalized functions. I’ll outline how this works below.

Generalized functions are linear functionals on smooth (ordinary) functions. Given an ordinary function f, you can create a generalized function that maps a smooth test function φ to the integral of fφ.

There are other kinds of generalized functions. For example, the Dirac delta “function” is really the generalized function δ that maps φ to φ(0). This is the formalism behind the hand-wavy nonsense about a function infinitely concentrated at 0 and integrating to 1. “Integrating” the product δφ is really applying the linear functional δ to φ.

Now for absolutely integrable functions f and g, we have

\int_{-\infty}^\infty \hat{f} g = \int_{-\infty}^\infty f \hat{g}

In words, the integral of the Fourier transform of f times g equals the integral of f times the Fourier transform of g. This is the theorem we use as motivation for our definition.

Now suppose f is a function that doesn’t have a classical Fourier transform. We make f into a generalized function and define its Fourier transform as the linear functional that maps a test function φ to the integral of f times the Fourier transform of φ.

More generally, the Fourier transform of a generalized function f is the linear functional that maps a test function φ to the action of f on the Fourier transform of φ.

This allows us to say, for example, that the Fourier transform of the constant function f(x) = 1 is 2πδ, an exercise left for the reader.
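For readers who want a hint, here is one way the computation can go. The constant 2π depends on the Fourier transform convention, which the post does not spell out. The sketch below assumes the convention with no constant in front of the forward transform, and it writes ⟨f, φ⟩ for the action of f on φ.

```latex
% Assumed convention (not stated in the post):
\[
  \hat{\varphi}(\omega) = \int_{-\infty}^\infty \varphi(x)\, e^{-i\omega x}\,dx,
  \qquad
  \varphi(x) = \frac{1}{2\pi} \int_{-\infty}^\infty \hat{\varphi}(\omega)\, e^{i\omega x}\,d\omega .
\]
% Apply the definition of the generalized transform to f = 1,
% then evaluate the inversion formula at x = 0:
\[
  \langle \hat{1}, \varphi \rangle
    = \langle 1, \hat{\varphi} \rangle
    = \int_{-\infty}^\infty \hat{\varphi}(\omega)\,d\omega
    = 2\pi\, \varphi(0)
    = \langle 2\pi\delta, \varphi \rangle .
\]
```

With a different normalization the same argument goes through, but the multiple of δ changes.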

The Heisenberg uncertainty principle for ordinary functions says that the flatter a function is, the more concentrated its Fourier transform and vice versa. Generalized Fourier transforms take this to an extreme. The Fourier transforms of the flattest functions, i.e. constant functions, are multiples of the most concentrated function, the delta (generalized) function.



Automatic delimiter sizes in LaTeX

I recently read a math book in which delimiters never adjusted to the size of their content or the level of nesting. This isn’t unusual in articles, but books usually pay more attention to typography.

Here’s a part of an equation from the book:

\varphi^{-1} (\int \varphi(f+g) \,d\mu)

Larger outer parentheses make the equation much easier to read, especially as part of a complex equation. It’s clear at a glance that the function φ⁻¹ applies to the result of the integral.

\varphi^{-1} \left(\int \varphi(f+g) \,d\mu\right)

The first equation was typeset using

\varphi^{-1} ( \int \varphi(f+g) \,d\mu )

The latter used \left and \right to tell LaTeX that the parentheses should grow to match the size of the content between them.

\varphi^{-1} \left( \int \varphi(f+g) \,d\mu \right)

You can use \left and \right with more delimiters than just parentheses: braces, brackets, ceiling, floor, etc. And the left and right delimiters do not need to match. You could make a half-open interval, for example, with \left( on one side and \right] on the other.

For every \left delimiter there must be a corresponding \right delimiter. However, you can make one of the pair empty by using a period as its mate. For example, you could start an expression with \left[ and end it with \right. which would create a left bracket as tall as the tallest thing between that bracket and the corresponding \right. command. Note that \right. causes nothing to be displayed, not even a period.

The most common example of a delimiter with no mate may be a curly brace on the left with no matching brace on the right. In that case you’d need to open with \left\{. The backslash in front of the brace is necessary to tell LaTeX that you want a literal brace and that you’re not just using the brace for grouping.
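Here are two small examples of unmatched delimiters: a half-open interval, and a piecewise definition with a lone left brace. The particular formulas are my own illustrations. They use only an array environment and \mbox, so they should not need any extra packages.

```latex
% A half-open interval: the left and right delimiters need not match.
\[ \left( a, \frac{b}{2} \right] \]

% A lone left brace: \left\{ opens, and \right. closes without printing anything.
\[
  |x| = \left\{
    \begin{array}{rl}
       x & \mbox{if } x \geq 0 \\
      -x & \mbox{if } x < 0
    \end{array}
  \right.
\]
```

The brace grows to the height of the two-row array, and \right. satisfies the pairing rule.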


Differentiating bananas and co-bananas

I saw a tweet this morning from Patrick Honner pointing to a blog post asking how you might teach derivatives of sines and cosines differently.

One thing I think deserves more emphasis is that “co” in cosine etc. stands for “complement” as in complementary angles. The cosine of an angle is the sine of the complementary angle. For any function f(x), its complement is the function f(π/2 – x).

When memorizing a table of trig functions and their derivatives, students notice a pattern. You can turn one formula into another by replacing every function with its co-function and adding a negative sign on one side. For example,

(d/dx) tan(x) = sec²(x)

and so

(d/dx) cot(x) = – csc²(x)

In words, the derivative of tangent is secant squared, and the derivative of cotangent is negative cosecant squared.

The explanation of this pattern has nothing to do with trig functions per se. It’s just the chain rule applied to f(π/2 – x).

(d/dx) f(π/2 – x) = – f'(π/2 – x).

Suppose you have some function banana(x) and its derivative is kiwi(x). Then the cobanana function is banana(π/2 – x), the cokiwi function is kiwi(π/2 – x), and the derivative of cobanana(x) is –cokiwi(x). In trig-like notation

(d/dx) ban(x) = kiw(x)

implies

(d/dx) cob(x) = – cok(x).
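You can check the rule numerically without knowing anything about cotangents. In the sketch below I take banana to be tangent and kiwi to be secant squared, build the co-functions from the definition f(π/2 – x), and compare a finite-difference derivative of cobanana with –cokiwi. The helper names and the step size are arbitrary choices.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def co(f):
    """Return the co-function x -> f(pi/2 - x)."""
    return lambda x: f(math.pi / 2 - x)

banana = math.tan
kiwi = lambda x: 1 / math.cos(x) ** 2   # sec^2, the derivative of tan

cobanana = co(banana)   # this is cot
cokiwi = co(kiwi)       # this is csc^2

x = 0.7
print(derivative(cobanana, x), -cokiwi(x))   # the two values agree closely
```

Swapping in any other differentiable banana and its derivative kiwi works the same way, since only the chain rule is involved.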

Now what is unique to sines and cosines is that the second derivative gives you the negative of what you started with. That is, the sine and cosine functions satisfy the differential equation y'' = –y. That doesn’t necessarily happen with bananas and kiwis. If the derivative of banana is kiwi, that doesn’t imply that the derivative of kiwi is negative banana. If the derivative of kiwi is negative banana, then kiwis and bananas must be linear combinations of sines and cosines because all solutions to y'' = –y have the form a sin(x) + b cos(x).

Footnote: Authors are divided over whether the cokiwi function should be abbreviated cok or ckw.

Related post: How many trig functions are there?

Pretty squiggles

Here’s an image that came out of something I was working on this morning. I thought it might make an interesting border somewhere.

The blue line is sin(x), the green line 0.7 sin(φ x), and the red line is their sum. Here φ is the golden ratio (1 + √5)/2. Even though the blue and green curves are both periodic, their sum is not because the ratio of their frequencies is irrational. So you could make this image as long as you’d like and the red curve would never exactly repeat.
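If you'd like to draw something similar yourself, here is a sketch in Python. The functions and colors come from the description above. The x range, figure size and number of sample points are guesses on my part, and the plotting part assumes matplotlib is installed.

```python
import math

PHI = (1 + math.sqrt(5)) / 2   # golden ratio

def curves(x):
    """Return the (blue, green, red) values at x."""
    blue = math.sin(x)
    green = 0.7 * math.sin(PHI * x)
    return blue, green, blue + green

def plot(x_max=60.0, n=3000):
    """Draw the three curves; matplotlib is imported only when this is called."""
    import matplotlib.pyplot as plt
    xs = [x_max * i / (n - 1) for i in range(n)]
    blue, green, red = zip(*(curves(x) for x in xs))
    plt.figure(figsize=(12, 2))
    plt.plot(xs, blue, "b", xs, green, "g", xs, red, "r")
    plt.axis("off")
    plt.show()
```

Calling `plot()` with a larger `x_max` extends the border as far as you like.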

Visualization, modeling, and surprises

This afternoon Hadley Wickham gave a great talk on data analysis. Here’s a paraphrase of something profound he said.

Visualization can surprise you, but it doesn’t scale well.
Modeling scales well, but it can’t surprise you.

Visualization can show you something in your data that you didn’t expect. But some things are hard to see, and visualization is a slow, human process.

Modeling might tell you something slightly unexpected, but your choice of model restricts what you’re going to find once you’ve fit it.

So you iterate. Visualization suggests a model, and then you use your model to factor out some feature of the data. Then you visualize again.


For daily tips on data science, follow @DataSciFact on Twitter.


Overconfidence pays

From Thinking, Fast and Slow:

Experts who acknowledge the full extent of their ignorance may expect to be replaced by more confident competitors who are better able to gain the trust of clients.

I believe Hanlon’s razor applies here: ignorance is a better explanation than dishonesty. I imagine most overconfident predictions are sincere. Unfortunately, sincere ignorance is often rewarded.


Randomized studies of productivity

A couple days ago I wrote a blog post quoting Cal Newport suggesting that four hours of intense concentration a day is as much as anyone can sustain. That post struck a chord and has gotten a lot of buzz on Hacker News and Twitter. Most of the feedback has been agreement, but a few people have complained that this four-hour limit is based only on anecdotes, not scientific data.

Realistic scientific studies of productivity are often not feasible. For example, people often claim that programming language X makes them more productive than language Y. How could you conduct a study where you randomly assign someone a programming language to use for a career? You could do some rinky-dink study where you have 30 CS students do an artificial assignment using language X and 30 using Y. But that’s not the same thing, not by a long shot.

If someone, for example Rich Hickey, says that he’s three times more productive using one language than another, you can’t test that assertion scientifically. But what you can do is ask whether you think you are similar to that person and whether you work on similar problems. If so, maybe you should give their recommended language a try.

Suppose you wanted to test whether people are more productive when they concentrate intensely for two hours in the morning and two hours in the afternoon. You couldn’t just randomize people to such a schedule. That would be like randomizing some people to run a four-minute mile. Many people are not capable of such concentration. They lack either the mental stamina or the opportunity to choose how they work. So you’d have to start with people who have the stamina and opportunity to work the way you want to test. Then you’d randomize some of these people to working longer, fractured work days. Is that even possible? How would you keep people from concentrating? Harrison Bergeron anyone? And if it’s possible, would it be ethical?

Real anecdotal evidence is sometimes more valuable than artificial scientific data. As Tukey said, it’s better to solve the right problem the wrong way than to solve the wrong problem the right way.
