How nonlinearity affects a pendulum

The equation of motion for a pendulum is the differential equation

\theta'' + \frac{g}{\ell}\sin \theta = 0

where g is the acceleration due to gravity and ℓ is the length of the pendulum. When this is presented in an introductory physics class, the instructor will immediately say something like “we’re only interested in the case where θ is small, so we can rewrite the equation as

\theta'' + \frac{g}{\ell} \theta = 0

Questions

This raises a lot of questions, or at least it should.

  1. Why not leave sin θ alone?
  2. What justifies replacing sin θ with just θ?
  3. How small does θ have to be for this to be OK?
  4. How do the solutions to the exact and approximate equations differ?

First, sine is a nonlinear function, making the differential equation nonlinear. The nonlinear pendulum equation cannot be solved using mathematics that students in an introductory physics class have seen. There is a closed-form solution, but only if you extend “closed-form” to mean more than the elementary functions a student would see in a calculus class.

Second, the approximation is justified because sin θ ≈ θ when θ is small. That’s true, but it’s kinda subtle. Here’s a post unpacking that.

The third question doesn’t have a simple answer, though simple answers are often given. An instructor could make up an answer on the spot and say “less than 10 degrees” or something like that. A more thorough answer requires answering the fourth question.

I addressed how nonlinearity affects the solutions in a post a couple of years ago. This post will expand a bit on that one.

Longer period

The primary difference between the nonlinear and linear pendulum equations is that the solutions to the nonlinear equation have longer periods. The solution to the linear equation is a cosine. Solving the equation determines the frequency, amplitude, and phase shift of the cosine, but qualitatively it’s just a cosine. The solution to the corresponding nonlinear equation, with sin θ rather than θ, is not exactly a cosine, but it looks a lot like a cosine, only the period is a little longer [1].

OK, the nonlinear pendulum has a longer period, but how much longer? The period is increased by a factor f(θ0) where θ0 is the initial displacement.

You can find the exact answer in my earlier post. The exact answer depends on a special function called the “complete elliptic integral of the first kind,” but to a good approximation

f(\theta) \approx \frac{1}{\sqrt{\cos(\theta/2)}}

The earlier post compares this approximation to the exact function.

Linear solution with adjusted period

Since the nonlinear pendulum equation is roughly the same as the linear equation with a longer period, you can approximate the solution to the nonlinear equation by solving the linear equation but increasing the period. How good is that approximation?

Let’s do an example with θ0 = 60° = π/3 radians. Then sin θ0 = 0.866 but θ0, in radians, is 1.047, so we definitely can’t say sin θ0 is approximately θ0. To make things simple, let’s set ℓ = g. Also, assume the pendulum starts from rest, i.e. θ'(0) = 0.
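Here’s a sketch of how both equations could be integrated numerically. This is my own code, assuming SciPy is available; it’s not necessarily how the plots below were produced.

```python
import numpy as np
from scipy.integrate import solve_ivp

theta0 = np.pi / 3   # 60 degrees, starting from rest
# with ell = g, the coefficient g/ell is 1

def nonlinear(t, y):
    # y[0] is theta, y[1] is theta'
    return [y[1], -np.sin(y[0])]

def linear(t, y):
    return [y[1], -y[0]]

t = np.linspace(0, 20, 2001)
sol_nl  = solve_ivp(nonlinear, (0, 20), [theta0, 0], t_eval=t, rtol=1e-9, atol=1e-9)
sol_lin = solve_ivp(linear,    (0, 20), [theta0, 0], t_eval=t, rtol=1e-9, atol=1e-9)
```

The linear solution should match θ0 cos t, and plotting sol_nl.y[0] alongside sol_lin.y[0] shows the drift between the two.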

Here’s a plot of the solutions to the nonlinear and linear equations.

Obviously the solution to the nonlinear equation has a longer period. In fact it’s 7.32% longer. (The approximation above would have estimated 7.46%.)
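Here’s how the exact factor and the approximation could be computed. This is my own sketch; one thing to watch for is that SciPy’s ellipk takes the parameter m = k², not the modulus k.

```python
import numpy as np
from scipy.special import ellipk  # complete elliptic integral of the first kind K(m)

def period_factor(theta0):
    # exact ratio of the nonlinear period to the linear period
    return 2 / np.pi * ellipk(np.sin(theta0 / 2)**2)

def period_factor_approx(theta0):
    # the approximation f(theta) = 1/sqrt(cos(theta/2)) from above
    return 1 / np.sqrt(np.cos(theta0 / 2))

theta0 = np.pi / 3
print(period_factor(theta0))         # about 1.0732
print(period_factor_approx(theta0))  # about 1.0746
```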

Here’s a plot comparing the solution of the nonlinear equation and the solution to the linear equation with the period stretched by 7.32%.

The solutions differ by less than the width of the plotting line, so the difference is too small to see. But the difference becomes visible when we subtract the two solutions.

Here’s a plot of the difference between the two solutions.


[1] The period of a pendulum depends on its length ℓ, and so we can think of the nonlinear term effectively replacing ℓ by a longer effective length ℓeff.

 

Approximation to solve an oblique triangle

The previous post gave a simple and accurate approximation for the smaller angle of a right triangle. Given a right triangle with sides a, b, and c, where a is the shortest side and c is the hypotenuse, the angle opposite side a is approximately

A \approx \frac{3a}{b + 2c}

in radians. The previous post worked in degrees, but here we’ll use radians.

If the triangle is oblique rather than a right triangle, there is an approximation for the angle A that doesn’t require inverse trig functions, though it does require square roots. The approximation is derived in [1] using the same series that is the basis of the approximation in the earlier post, the power series for 2 csc(x) + cot(x).

For an oblique triangle, the approximation is

A \approx \frac{6 \sqrt{(s - b)(s - c)}}{2\sqrt{bc} + \sqrt{s(s-a)}}

where s is the semiperimeter.

s = \frac{a + b + c}{2}

For comparison, we can find the exact value of A using the law of cosines.

a^2 = b^2 + c^2 - 2 bc \cos A

and so

A = \cos^{-1}\left(\frac{b^2 + c^2 - a^2}{2bc}\right)

Here’s a little Python script to see how accurate the approximation is.

from math import sqrt, acos

def approx(a, b, c):
    "approximate the angle opposite a"
    s = (a + b + c)/2
    return 6*sqrt((s - b)*(s - c)) / (2*sqrt(b*c) + sqrt(s*(s - a)))

def exact(a, b, c):
    "exact value of the angle opposite a"    
    return acos((b**2 + c**2 - a**2)/(2*b*c))

a, b, c = 6, 7, 12
print( approx(a, b, c) )
print( exact(a, b, c) )

This prints

0.36387538476776243
0.36387760856668505

showing that in our example the approximation is good to five decimal places.

[1] H. E. Stelson. Note on the approximate solution of an oblique triangle without tables. American Mathematical Monthly. Vol 56, No. 2 (February, 1949), pp. 84–95.

Simple approximation for solving a right triangle

Suppose you have a right triangle with sides a, b, and c, where a is the shortest side and c is the hypotenuse. Then the following approximation from [1] for the angle A opposite side a seems too simple and too accurate to be true. In degrees,

A ≈ 172° a / (b + 2c).

The approximation above only involves simple arithmetic. No trig functions. Not even a square root. It could be carried out with pencil and paper or even mentally. And yet it is surprisingly accurate.

If we use the 3, 4, 5 triangle as an example, the exact value of the smallest angle is

A = arctan(3/4) × 180°/π ≈ 36.8699°

and the approximate value is

A ≈  3 × 172° / (4 + 2×5) = 258°/7 ≈ 36.8571°,

a difference of 0.0128°. When the angle is more acute the approximation is even better.
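The comparison above is easy to script. Here’s a quick check of my own:

```python
from math import atan, degrees

def approx_angle(a, b, c):
    # A is approximately 172 degrees times a/(b + 2c)
    # for the smallest angle of a right triangle
    return 172 * a / (b + 2*c)

a, b, c = 3, 4, 5
print(approx_angle(a, b, c))   # 36.857...
print(degrees(atan(a / b)))    # 36.8698...
```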

Derivation

Where does this magical approximation come from? It boils down to the series

2 csc(x) + cot(x) = 3/x + x³/60 + O(x⁵)

where x is in radians. When x is small,  x³/60 is extremely small and so we have

2 csc(x) + cot(x) ≈ 3/x.

Apply this approximation with csc(x) = c/a and cot(x) = b/a, and you have

x ≈ 3a/(b + 2c)

in radians. Multiply by 180°/π to convert to degrees, and note that 540/π ≈ 172.

Discovery

It’s unmotivated to say “just expand 2 csc(x) + cot(x) in a series.” Where did that come from?

There’s a line in [1] that says “It can be seen, either from tables or from a consideration of power series that the radian measure of a small angle lies approximately one-third of the way from the sine to the tangent.” In other words

3x ≈ 2 sin(x) + tan(x).

You can verify that by adding the power series and noting that the cubic terms cancel out.
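Here’s a quick numerical check, using x = 0.1 as an arbitrary test value. The difference should be on the order of x⁵ since the cubic terms cancel.

```python
from math import sin, tan

x = 0.1
err = 2*sin(x) + tan(x) - 3*x
print(err)  # on the order of 1e-6, consistent with the cubic terms canceling
```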

But that’s just the beginning. The author then makes the leap to conjecturing that if the weighted arithmetic mean gives a good approximation, maybe the weighted harmonic mean gives an even better approximation, and that leads to considering

2 csc(x) + cot(x) ≈ 3/x.

Extension

See the next post for an extension to oblique triangles. Not as simple, but based on the same trick.

 

[1] J. S. Frame. Solving a right triangle without tables. The American Mathematical Monthly, Vol. 50, No. 10 (Dec., 1943), pp. 622–626.

More on Newton’s diameter theorem

A few days ago I wrote a post on Newton’s diameter theorem. The theorem says to plot the curve formed by the solutions to f(x, y) = 0 where f is a polynomial in x and y of degree n. Next plot several parallel lines that cross the curve at n points and find the centroid of the intersections on each line. Then the centroids will fall on a line.

The previous post contained an illustration using a cubic polynomial and three evenly spaced parallel lines. This post uses a fifth degree polynomial, and shows that the parallel lines need not be evenly spaced.

In this post

f(x, y) = y³ + y − x (x + 1) (x + 2) (x − 3) (x − 2).

Here’s an example of three lines that each cross the curve five times.

The lines are y = x + k where k = 0.5, −0.5, and −3. The coordinates of the centroids are (0.4, 0.9), (0.4, −0.1), and (0.4, −2.6).

And to show that the requirement that the lines cross five times is necessary, here’s a plot where one of the parallel lines only crosses three times. The top line is now y = x + 2 and the centroid on the top line moved to (0.0550019, 2.055).

Intersecting spheres and GPS

If you know the distance d to a satellite, you can compute a circle of points that passes through your location. That’s because you’re at the intersection of two spheres—the earth’s surface and a sphere of radius d centered on the satellite—and the intersection of two spheres is a circle. Said another way, one observation of a satellite determines a circle of possible locations.

If you know the distance to a second satellite as well, then you can find two circles that contain your location. The two circles intersect at two points, and you know that you’re at one of two possible positions. If you know your approximate position, you may be able to rule out one of the intersection points.

If you know the distance to three different satellites, now you know three circles that you’re standing on, and the third circle will only pass through one of the two points determined by the first two satellites. Now you know exactly where you are.

Knowing the distance to more satellites is even better. In theory additional observations are redundant but harmless. In practice, they let you partially cancel out inevitable measurement errors.

If you’re not on the earth’s surface, you’re still at the intersection of n spheres if you know the distance to n satellites. If you’re in an airplane, or en route to the moon, the same principles apply.

Errors and corrections

How do you know the distance to a satellite? The satellite can announce what time it is by its clock, then when you receive the announcement you compare it to the time by your clock. The difference between the two times tells you how long the radio signal traveled. Multiply by the speed of light and you have the distance.

However, your clock will probably not be exactly synchronized with the satellite clock. Observing a fourth satellite can fix the problem of your clock not being synchronized with the satellite clocks. But it doesn’t fix the more subtle problems of special relativity and general relativity. See this post by Shri Khalpada for an accessible discussion of the physics.

Numerical computation

Each distance measurement gives you an equation:

|| x − si || = di

where si is the location of the ith satellite and di is your distance to that satellite. If you square both sides of the equation, you have a quadratic equation. You have to solve a system of nonlinear equations, and yet there is a way to transform the problem into solving linear equations, i.e. using linear algebra. See this article for details.
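Here’s a sketch of a standard linearization trick with made-up numbers: expanding ‖x − si‖² = di² and subtracting the first equation from the rest eliminates the ‖x‖² term, leaving a linear system. See the linked article for the full treatment.

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = np.array([1.0, 2.0, 3.0])          # unknown position (made up)
sats = rng.normal(size=(4, 3)) * 1e4        # made-up satellite positions
d = np.linalg.norm(sats - x_true, axis=1)   # exact distances, no noise

# ||x - s_i||^2 = d_i^2 expands to ||x||^2 - 2 s_i . x + ||s_i||^2 = d_i^2.
# Subtracting the i = 0 equation from the others removes ||x||^2.
A = 2 * (sats[1:] - sats[0])
b = (np.sum(sats[1:]**2, axis=1) - d[1:]**2) - (np.sum(sats[0]**2) - d[0]**2)
x = np.linalg.lstsq(A, b, rcond=None)[0]
print(x)  # recovers approximately [1, 2, 3]
```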


Finding a parabola through two points with given slopes

The Wikipedia article on modern triangle geometry has an image labeled “Artzt parabolas” with no explanation.

A quick search didn’t turn up anything about Artzt parabolas [1], but apparently the parabolas go through pairs of vertices with tangents parallel to the sides.

The general form of a conic section is

ax² + bxy + cy² + dx + ey + f = 0

and the constraint b² = 4ac means the conic will be a parabola.

We have 6 parameters, each determined only up to a scaling factor; you can multiply both sides by any non-zero constant and still have the same conic. So a general conic has 5 degrees of freedom, and the parabola condition b² = 4ac takes us down to 4. Specifying two points that the parabola passes through takes up 2 more degrees of freedom, and specifying the slopes takes up the last two. So it’s plausible that there is a unique solution to the problem.

There is indeed a solution, unique up to scaling the parameters. The following code finds parameters of a parabola that passes through (xi, yi) with slope mi for i = 1, 2.

def solve(x1, y1, m1, x2, y2, m2):
    "coefficients of a parabola through (x1, y1) and (x2, y2) with slopes m1 and m2"
    Δx = x2 - x1
    Δy = y2 - y1
    λ = 4*(Δx*m1 - Δy)*(Δx*m2 - Δy)/(m1 - m2)**2
    k = x2*y1 - x1*y2

    a = Δy**2 + λ*m1*m2
    b = -2*Δx*Δy - λ*(m1 + m2)
    c = Δx**2 + λ
    d =  2*k*Δy + λ*(m1*y2 + m2*y1 - m1*m2*(x1 + x2))
    e = -2*k*Δx + λ*(m1*x1 + m2*x2 - y1 - y2)
    f = k**2 + λ*(m1*x1 - y1)*(m2*x2 - y2)

    return (a, b, c, d, e, f)
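Here’s a hypothetical example of my own (the function is repeated so the snippet runs on its own). A parabola through (0, 0) with slope 1 and through (2, 0) with slope −1 should be y = x − x²/2, i.e. coefficients proportional to (1, 0, 0, −2, 2, 0).

```python
def solve(x1, y1, m1, x2, y2, m2):
    "coefficients of a parabola through (x1, y1) and (x2, y2) with slopes m1 and m2"
    Δx = x2 - x1
    Δy = y2 - y1
    λ = 4*(Δx*m1 - Δy)*(Δx*m2 - Δy)/(m1 - m2)**2
    k = x2*y1 - x1*y2

    a = Δy**2 + λ*m1*m2
    b = -2*Δx*Δy - λ*(m1 + m2)
    c = Δx**2 + λ
    d =  2*k*Δy + λ*(m1*y2 + m2*y1 - m1*m2*(x1 + x2))
    e = -2*k*Δx + λ*(m1*x1 + m2*x2 - y1 - y2)
    f = k**2 + λ*(m1*x1 - y1)*(m2*x2 - y2)

    return (a, b, c, d, e, f)

a, b, c, d, e, f = solve(0, 0, 1, 2, 0, -1)
print((a, b, c, d, e, f))  # proportional to (1, 0, 0, -2, 2, 0)
print(b**2 - 4*a*c)        # 0, confirming a parabola
```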

[1] The page said “Artz” when I first looked at it, but it has since been corrected to “Artzt”. Maybe I didn’t find anything because I was looking for the wrong spelling.

Mathematical minimalism

Andrzej Odrzywolek recently posted an article on arXiv showing that you can obtain all the elementary functions from just the function

\operatorname{eml}(x,y) = \exp(x) - \log(y)

and the constant 1. The following equations, taken from the paper’s supplement, show how to bootstrap addition, subtraction, multiplication, and division from the eml function.

\begin{align*} \exp(z) &\mapsto \operatorname{eml}(z,1) \\ \log(z) &\mapsto \operatorname{eml}(1,\exp(\operatorname{eml}(1,z))) \\ x - y &\mapsto \operatorname{eml}(\log(x),\exp(y)) \\ -z &\mapsto (\log 1) - z \\ x + y &\mapsto x - (-y) \\ 1/z &\mapsto \exp(-\log z) \\ x \cdot y &\mapsto \exp(\log x + \log y) \end{align*}

See the paper and supplement for how to obtain constants like π and functions like square and square root, as well as the standard circular and hyperbolic functions.
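The first few reductions are easy to spot check numerically. Here’s my own sanity check with arbitrary positive test values (the identities require arguments in the domains of exp and log):

```python
from math import exp, log

def eml(x, y):
    return exp(x) - log(y)

x, y, z = 2.0, 3.0, 1.5
print(eml(z, 1) - exp(z))               # exp via eml: exactly 0
print(eml(1, exp(eml(1, z))) - log(z))  # log via eml: ~0
print(eml(log(x), exp(y)) - (x - y))    # subtraction via eml: ~0
```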


Lunar period approximations

The date of Easter

The church fixed Easter to be the first Sunday after the first full moon after the Spring equinox. They were choosing a date in the Roman (Julian) calendar to commemorate an event whose date was known according to the Jewish lunisolar calendar, hence the reference to equinoxes and full moons.

The previous post explained why the Eastern and Western dates of Easter differ. The primary reason is that both churches use March 21 as the first day of Spring, but the Eastern church uses March 21 on the Julian calendar and the Western church uses March 21 on the Gregorian calendar.

But that’s not the only difference. The churches chose different algorithms for calculating when the first full moon would be. The date of Easter doesn’t depend on the date of the full moon per se, but on the methods used to predict full moons.

This post will show why determining the date of the full moon is messy.

Lunation length

The moon takes between 29 and 30 days between full moons (or between new moons, which are easier to objectively measure). This period is called a lunation. The average length of a lunation is L = 29.530588853 days. This is not a convenient number to work with, and so there’s no simple way of reconciling the orbital period of the moon with the rotation period of the earth [1]. Lunar calendars alternate months with 29 and 30 days, but they can’t be very accurate, so they have to have some fudge factor analogous to leap years.

The value of L was known from ancient times. Meton of Athens calculated in 432 BC that 235 lunar cycles equaled 19 tropical years or 6940 days. This corresponds to L ≈ 29.5319. Around a century later the Greek scholar Callippus refined this to 940 cycles in 76 years or 27,759 days. This corresponds to L ≈ 29.53085.

The problem wasn’t knowing L but devising a convenient way of working with L. There is no way to work with lunations that is as easy as the way the Julian (or even the more complicated Gregorian) calendar reconciles days with years.

Approximations

Let’s look at the accuracy of several approximations for L. We’d like an approximation that is not only accurate in an absolute sense, but also accurate relative to its complexity. The complexity of a fraction is measured by a height function. We’ll use what’s called the “classic” height function: log( max(n, d) ) where n and d are the numerator and denominator of a fraction. Since we’re approximating a number bigger than 1, this will be simply log(n).

We will compare the first five convergents, approximations that come from the continued fraction form of L, and the approximations of Meton and Callippus. Here’s a plot.

And here’s the code that produced the plot, showing the fractions used.

from numpy import log
import matplotlib.pyplot as plt

fracs = [
    (30, 1), 
    (59, 2),
    (443, 15),
    (502, 17),
    (1447, 49),
    (6940, 235),
    (27759, 940)
]

def error(n, d):
    L = 29.530588853    
    return abs(n/d - L)

for f in fracs:
    plt.plot(log(f[0]), log(error(*f)), 'o')
plt.xlabel("log numerator")
plt.ylabel("log error")
plt.show()

The approximation 1447/49 is the best by far, both in absolute terms and relative to the size of the numerator. But it’s not very useful for calendar design because 1447 is not nicely related to the number of days in a year.
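The fractions in the script above can be generated rather than typed in, using the standard continued fraction recurrence. Here’s a sketch of my own:

```python
def convergents(x, n):
    "first n continued fraction convergents of x as (numerator, denominator) pairs"
    h0, k0 = 1, 0          # previous numerator and denominator
    h1, k1 = int(x), 1     # current numerator and denominator
    cs = [(h1, k1)]
    frac = x - int(x)
    for _ in range(n - 1):
        x = 1 / frac
        a = int(x)         # next partial quotient
        frac = x - a
        h0, h1 = h1, a*h1 + h0
        k0, k1 = k1, a*k1 + k0
        cs.append((h1, k1))
    return cs

print(convergents(29.530588853, 6))
# [(29, 1), (30, 1), (59, 2), (443, 15), (502, 17), (1447, 49)]
```

The trivial convergent 29/1 comes out first; the list in the script above starts with 30/1.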

 

[1] The time between full moons is a synodic month, the time it takes for the moon to return to the same position relative to the sun. This is longer than a sidereal month, the time it takes the moon to complete one orbit relative to the fixed stars.

The gap between Eastern and Western Easter

Today is Orthodox Easter. Western churches celebrated Easter last week. Why are the Eastern and Western dates of Easter different? Is Eastern Easter always later than Western Easter? How far apart can the two dates be?

Why the dates differ

Easter is on the first Sunday after the first full moon in Spring [1]. East and West agree on this. What they disagree on is the details of “full moon” and “Spring.” The dates are not based on precise astronomical measurements but rather on astronomical approximations codified long ago.

Spring begins on March 21 for the purposes of calculating Easter. But the Western church uses March 21 on the Gregorian calendar and the Eastern church uses March 21 on the Julian calendar. This mostly accounts for the difference between Eastern and Western dates for Easter. East and West also use slightly different methods of approximating when the moon will be full. More on that in the next post.

Pascha never comes before Easter

The Eastern name for Easter is Pascha. Eastern Pascha and Western Easter can occur on the same day, but otherwise Pascha is always later, never earlier. This is because the Julian year is longer than the Gregorian year, causing fixed dates on the former calendar to occur after those on the latter. Also, the Eastern method of approximating the date of the Paschal full moon gives a later date than the Western method.

The Julian calendar year averages exactly 365 1/4 days. The Gregorian calendar year averages 365 97/400 days; century years are not leap years unless they’re divisible by 400. This complication in the Gregorian calendar was necessary to better match the solar year. The date March 21 on the Julian calendar is drifting later in the year from the perspective of the Gregorian calendar, moving further past the astronomical equinox [2].

Size of the gap

Eastern and Western dates of Easter can coincide. They were the same last year, and will be the same again in 2028. The gap is always a whole number of weeks because Easter is always on a Sunday.

The gap is usually 1 week. It can be 0, 4, or 5 weeks, but never 2 or 3 weeks.

This is the pattern for now. Sometime in the distant future the Julian and Gregorian calendars will diverge further and the gaps will increase. Presumably Orthodox churches will make some sort of adjustment before the Julian date March 21 drifts into summer or fall.


[1] The reason for this definition is that Christ was crucified at the time of the Passover. Due to the lunisolar design of the Jewish calendar, this would have been during the first full moon after the Spring equinox. Christ rose from the dead the Sunday following the crucifixion, so Easter is on the first Sunday after the first full moon of Spring.

[2] The Julian and Gregorian calendars currently differ by 13 days, and they’re drifting apart at the rate of 3 days every 400 years. Somewhere around 47,000 years from now the two calendars will agree again, sorta, because the Julian calendar will be a full year behind the Gregorian calendar.

Distribution of digits in fractions

There’s a lot of mathematics just off the beaten path. You can spend a career in math and yet not know all there is to know about even the most basic areas of math. For example, this post will demonstrate something you may not have seen about decimal forms of fractions.

Let p > 5 be a prime number and 0 < k < p. Then the digits in k/p might be the same for all k, varying only by cyclic permutations. This is the case, for example, when p = 7 or p = 17. More on these kinds of fractions here.

The digits in k/p repeat for every k, but different values of k might have sequences of digits that vary by more than cyclic permutations. For example, let’s look at the values of k/13.

>>> for i in range(1, 13):
...     print(f"{i:2d} {i/13:.16f}")
...
 1 0.0769230769230769
 2 0.1538461538461538
 3 0.2307692307692307
 4 0.3076923076923077
 5 0.3846153846153846
 6 0.4615384615384615
 7 0.5384615384615384
 8 0.6153846153846154
 9 0.6923076923076923
10 0.7692307692307693
11 0.8461538461538461
12 0.9230769230769231

One cycle goes through the digits 076923. You’ll see this when k = 1, 3, 4, 9, 10, or 11. The other cycle goes through 153846 for the rest of the values of k. The cycles 076923 and 153846 are called the distinct repeating sets of 13 in [1].

If we look at fractions with denominator 41, there are eight distinct repeating sets.

02439
04878
07317
09756
12195
14634
26829
36585

You could find these by modifying the Python code above. However, in general you’ll need more than default precision to see the full periods. You might want to shift over to bc, for example.
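Alternatively, integer long division sidesteps floating point precision entirely. Here’s a sketch of my own that finds the distinct repeating sets directly:

```python
def cycle(k, p):
    "repeating digits of k/p, for p coprime to 10, via long division"
    digits = []
    r = k
    while True:
        r *= 10
        digits.append(str(r // p))
        r %= p
        if r == k:
            return "".join(digits)

def distinct_cycles(p):
    "distinct repeating sets: cycles identified up to cyclic permutation"
    seen = set()
    for k in range(1, p):
        s = cycle(k, p)
        # canonical representative: lexicographically smallest rotation
        seen.add(min(s[i:] + s[:i] for i in range(len(s))))
    return sorted(seen)

print(distinct_cycles(13))       # ['076923', '153846']
print(len(distinct_cycles(41)))  # 8
```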

When you look at all the distinct repeating sets of a prime number, all digits appear almost the same number of times. Some digits may appear one more time than others, but that’s as uneven as you can get. A corollary in [1] states that if p = 10q + r, with 0 < r < 10, then 11 − r digits appear q times, and r − 1 digits appear q + 1 times.

Looking back at the example with p = 13, we have q = 1 and r = 3. The corollary says we should expect 8 digits to appear once and 2 digits to appear twice. And that’s what we see: in the sets 076923 and 153846 we have 3 and 6 repeated twice and the remaining 8 digits appear once.

In the example with p = 41, we have q = 4 and r = 1. So we expect all 10 digits to appear 4 times, which is the case.


[1] James K. Schiller. A Theorem in the Decimal Representation of Rationals. The American Mathematical Monthly, Vol. 66, No. 9 (Nov., 1959), pp. 797–798.