Approximating a spiral by rings

An Archimedian spiral has the polar equation

r = b θ1/n

This post will look at the case n = 1. I may look at more general values of n in a future post. The case n = 1 is the simplest case, and it’s the case I needed for the client project that motivated this post.

In this case the spacing between points where the spiral crosses an axis is constant. Call this constant h. Then

h = 2πb.

For example, when rolling up a carpet, h corresponds to the thickness of the carpet.

Suppose θ runs from 0 to 2πm, wrapping around the origin m times. We could approximate the spiral by m concentric circles of radius h, 2h, 3h, …, mh. To visualize this, we’re approximating the length of the red spiral on the left with that of the blue circles on the right.

Comparing Archimedes spiral and concentric circles

We could approximate this further by saying we have m/2 circles whose average radius is πmb. This suggests the length of the spiral should be approximately


How good is this approximation? What happens to the relative error as θ increases? Intuitively, each wrap around the origin is more like a circle as θ increases, so we’d expect the approximation to improve for large θ.

According to Mathworld, the exact length of the spiral is

πbm √(1 + (2πm)²) + b arcsinh(2πm) /2

When m is so large that we can ignore the 1 in √(1 + (2πm)²) then the first term is the same as the circle approximation, and all that’s left is the arcsinh term, which is on the order of log m because

arcsinh(x) = log(x + (1 + x²)1/2).

So for large m, the arc length is on the order of m² while the error is on the order of log m. This means the relative error is O( log(m) / m² ). [1]

We’ve assumed m was an integer because that makes it easier to visual approximating the spiral by circles, but that assumption is not necessary. We could restate the problem in terms of the final value of θ. Say θ runs from 0 to T. Then we could solve

T = 2πm

for m and say that the approximate arc length is

½ bT²

and the exact length is

½ bT(1 + T²)1/2 + ½ b arcsinh(T).

The relative approximation error is O( log(T) / T² ).

Here’s a plot of the error as a function of T assuming b = 1.

Related posts

[1] The error in approximating √(1 + (2πm)²) with 2πm is on the order of 1/(4πm) and so is smaller than the logarithmic term.

Distance from a point to a line

Eric Lengyel’s new book Projective Geometric Algebra Illuminated arrived yesterday and I’m enjoying reading it. Imagine if someone started with ideas like dot products, cross products, and determinants that you might see in your first year of college, then thought deeply about those things for years. That’s kinda what the book is.

Early in the book is the example of finding the distance from a point q to a line of the form p + tv.

If you define u = q − p then a straightforward derivation shows that the distance d from q to the line is given by

d = \sqrt{{\bold u}^2 - \frac{({\bold u}\cdot {\bold v})^2}{{\bold v}^2}}

But as the author explains, it is better to calculate d by

d = \sqrt{\frac{({\bold u}\times {\bold v})^2}{{\bold v}^2}}

Why is that? The two expressions are algebraically equal, but the latter is better suited for numerical calculation.

The cardinal rule of numerical calculation is to avoid subtracting nearly equal floating point numbers. If two numbers agree to b bits, you may lose up to b bits of significance when computing their difference.

If u and v are vectors with large magnitude, but q is close to the line, then the first equation subtracts two large, nearly equal numbers under the square root.

The second equation involves subtraction too, but it’s less obvious. This is a common theme in numerical computing. Imagine this dialog.

[Student produces first equation.]

Mentor: Avoid subtracting nearly equal numbers.

[Student produces section equation.]

Student: OK, did it.

Mentor: That’s much better, though it could still have problems.

Where is there a subtraction in the second equation? We started with a subtraction in defining u. More subtly, the definition of cross product involves subtractions. But these subtractions involve smaller numbers than the first formula, because the first formula subtracts squared values. Eric Lengyel points this out in his book.

None of this may matter in practice, until it does matter, which is a common pattern in numerical computing. You implement something like the first formula, something that can be derived directly. You implicitly have in mind vectors whose magnitude is comparable to d and this guides your choice of unit tests, which all pass.

Some time goes by and a colleague tells you your code is failing. Impossible! You checked your derivation by hand and in Mathematica. Your unit tests all pass. Must be your colleague’s fault. But it’s not. Your code would be correct in infinite precision, but in an actual computer it fails on inputs that violate your implicit assumptions.

This can be frustrating, but it can also be fun. Implementing equations from a freshman textbook accurately, efficiently, and robustly is not a freshman-level exercise.

Related posts

Bounding the perimeter of a triangle between circles

Suppose you have a triangle and you know the size of the largest circle that can fit inside (the incircle) and the size of the smallest circle that can fit outside (the circumcircle). How would you estimate the perimeter of the triangle?

In terms of the figure below, if you know the circumference of the red and blue circles, how could you estimate the perimeter of the black triangle?

Triangle with inscribed and circumscribed circles

A crude estimate is that the triangle perimeter must be greater than the incircle circumference and less than the circumcircle circumference. But we can do better.

Gerretsen’s inequalities

It is conventional in this kind of problem to work with the semiperimeter of the triangle, half the perimeter, rather than the perimeter. Let r be the radius of the incircle, R the radius of the circumcircle, and s the semiperimeter of the triangle. Then Gerretsen’s inequalities say

16Rr − 5r² ≤ s² ≤ 4R² + 4Rr + 3r²

In the figure above,

r = 1.058, R = 2.5074, s = 6.1427

and Gerretsen’s inequalities give

36.8532  ≤ s² = 37.7327 ≤ 39.1200.

In the case of an equilateral triangle, Gerretsen’s inequalities are in fact equations.

Note that Gerretsen’s inequalities say nothing about the centers of the circles. An incircle must be inside a circumcircle, but for a variety of triangles with the centers of the two circles in different relations you have the same bounds.

Kooi’s inequality

Kooi’s inequality [2] gives another upper bound for the perimeter of the triangle:

s² ≤ ½ R(4R + r)² / (2Rr)

which in the example above gives a tighter upper bound, 38.9515.

Kooi’s upper bound is uniformly better than the upper bound half of Gerretsen’s inequalities. But further refinements are possible [3].

Related posts

[1] J. C. H. Gerretsen, Ongelijkheden in de driehoek. Nieuw Tijdschr 41 (1953), 1–7.

[2] O. Kooi, Inequalities for the triangle, Simon Stevin, 32 (1958), 97–101.

[3] Martin Lukarevski and Dan Stefan Marinescu. A refinement of the Kooi’s inequality, Mittenpunkt and applications. Journal of Mathematical Applications. Volume 13, Number 3 (2019), 827–832

Area of quadrilateral as a determinant

I’ve written several posts about how determinants come up in geometry. These determinants often look similar, having columns related to coordinates and a column of ones. You can find several examples here along with an explanation for this pattern.

If you have three points z1, z2, and z3 in the complex plane, you can find the area of a triangle with these points as vertices

A(z_1, z_2, z_3) = \frac{i}{4} \, \left| \begin{matrix} z_1 & \bar{z}_1 & 1 \\ z_2 & \bar{z}_2 & 1 \\ z_3 & \bar{z}_3 & 1 \\ \end{matrix} \right|

You can read more about this here.

If you add the second column to the first, and subtract the first column from the second, you can get the equation for the area of a triangle in the real plane.

A = \frac{1}{2} \, \left| \begin{matrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \\ \end{matrix} \right|

Presumably the former equation was first derived from the latter, but the two are equivalent.

Now suppose you have a quadrilateral whose vertices are numbered in clockwise or counterclockwise order. Then the area is given by

A = \frac{1}{2} \, \left| \begin{matrix} x_1 & y_1 & 1 & 0\\ x_2 & y_2 & 1 & 1\\ x_3 & y_3 & 1 & 0\\ x_4 & y_4 & 1 & 1\\ \end{matrix} \right|

The proof is easy. If you expand the determinant by minors on the last column, you get the sum of two 3 × 3 determinants corresponding to the areas of the two triangles formed by splitting the quadrilateral along the diagonal connecting the first and third points.

Kepler triangle

A Kepler triangle is a right triangle whose sides are in geometric progression. That is, if the sides have length a < b < c, then b/a = c/b = k.

All Kepler triangles are similar because the proportionality constant k can only take on one value. To see this, we first pick our units so that a = 1. Then b = k and c = k². By the Pythagorean theorem

a² + b² = c²

and so

1 + k2 = k4

which means k² equals the golden ratio φ.

Here’s a nice geometric property of the Kepler triangle proved in [1].

Go around the triangle counterclockwise placing a point on each side dividing the side into pieces that are in golden proportion. Connect these three points with the opposite vertex. Then the triangle formed by the intersections of these line segments is also a Kepler triangle.

On each side, the ratio of the length of the green segment to the blue segment is φ. The grey triangle in the middle is another Kepler triangle.

The rest of this post will present the code that was used to create the image above.

We’ll need the following imports.

import matplotlib.pyplot as plt
from numpy import array
from numpy.linalg import solve

We’ll also need to define the golden ratio, a function to split a line segment into golden proportions, and a function to draw a line between two points.

φ = (1 + 5**0.5)/2

def golden_split(pt1, pt2):
    return (1/φ)*pt1 + (1 - 1/φ)*pt2

def connect(pt1, pt2, style):
    plt.plot([pt1[0], pt2[0]], [pt1[1], pt2[1]], style)

Now we can draw the figure.

A = array([0, 1])
B = array([0, 0])
C = array([φ**0.5, 0])

X = golden_split(A, B)
Y = golden_split(B, C)
Z = golden_split(C, A)

connect(A, X, "b")
connect(X, B, "g")
connect(B, Y, "b")
connect(Y, C, "g")
connect(C, Z, "b")
connect(Z, A, "g")
connect(A, Y, "grey")
connect(B, Z, "grey")
connect(C, X, "grey")


We can show algebraically that the golden_split works as claimed, but here is a numerical illustration.

assert(abs( (C[0] - Y[0]) / (Y[0] - A[0]) - φ) < 1e-14)

Similarly, we can show numerically what [1] proves exactly, i.e. that the triangle in the middle is a Kepler triangle.

from numpy.linalg import solve, norm

def intersect(pt1, pt2, pt3, pt4):
    # Find the intersection of the line joining pt1 and pt2
    # with the line joining pt3 and pt4.
    m1 = (pt2[1] - pt1[1])/(pt2[0] - pt1[0])
    m3 = (pt4[1] - pt3[1])/(pt4[0] - pt3[0])
    A = array([[m1, -1], [m3, -1]])
    rhs = array([m1*pt1[0]-pt1[1], m3*pt3[0]-pt3[1]])
    x = solve(A, rhs)
    return x

I = intersect(A, Y, C, X)
J = intersect(A, Y, B, Z)
K = intersect(B, Z, C, X)

assert( abs( norm(I - J)/norm(J - K) - φ**0.5 ) < 1e-14 )
assert( abs( norm(I - K)/norm(I - J) - φ**0.5 ) < 1e-14 )

Related posts

[1] Jun Li. Some properties of the Kepler triangle. The Mathematical Gazette, November 2017, Vol. 101, No. 552, pp. 494–495

Schwarz lemma, Schwarz-Pick theorem, and Poincare metric

Let D be the open unit disk in the complex plane. The Schwarz lemma says that if f is an analytic function from D to D with f(0) = 0, then

|f(z)| \leq |z|

for all z in D. The lemma also says more, but this post will focus on just this portion of the theorem.

The Schwarz-Pick theorem generalizes the Schwarz lemma by not requiring the origin to be fixed. That is, it says that if f is an analytic function from D to D then

\left| \frac{f(z) - f(w)}{1 - f(z)\,\overline{f(w)}} \right| \leq \left| \frac{z - w}{1 - z\,\overline{w}}\right|

The Schwarz-Pick theorem also concludes more, but again we’re focusing on part of the theorem here. Note that if f(0) = 0 then the Schwarz-Pick theorem reduces to the Schwarz lemma.

The Schwarz lemma is a sort of contraction theorem. Assuming f(0) = 0, the lemma says

|f(z) - f(0)| \leq |z - 0|

This says applying f to a point cannot move the point further from 0. That’s interesting, but it would be more interesting if we could say f is a contraction in general, not just with respect to 0. That is indeed what the Schwarz-Pick theorem does, though with respect to a new metric.

For any two points z and w in the open unit disk D, define the Poincaré distance between z and w by

d(z,w) = \tanh^{-1}\left( \left| \frac{z - w}{1 - z\overline{w}}\right| \right)

It’s not obvious that this is a metric, but it really is. As is often the case, most of the properties of a metric are simple to confirm, but the proving the triangle inequality is the hard part.

If we apply the monotone function tanh-1 to both sides of the Schwarz-Pick theorem, then we have that any analytic function f from D to D is a contraction on D with respect to the Poincaré metric.

Here we’re using “contraction” in the lose sense. It would be more explicit to say that f is a non-expansive map. Applying f to a pair of points may not bring the points closer together, but it cannot move them any further apart (with respect to the Poincaré metric).

By using the Poincaré metric, we turn the unit disk into a hyperbolic space. That is D with the metric d is a model of the hyperbolic plane.

Related posts

Stability of a superegg

Three weeks ago I wrote about supereggs, a shape popularized by Piet Hein.

Brass superegg by Piet Hein

One aspect of supereggs that I did not address is their stability. Looking at the photo above, you could imagine that if you gave the object a slight nudge it would not fall over. Your intuition would be right: supereggs are stable.

More interesting than asking whether supereggs are stable is to aks how stable they are. An object is stable if for some ε > 0, a perturbation of size less than ε will not cause the object to fall over.

All supereggs are stable, but the degree of stability decreases as the ratio of height to width increases: the taller the superegg, the easier it is to knock over.

How can we quantify stability? An object is stable if its center of curvature, measured at the point of contact with a horizontal surface, is above its center of mass.

For a sphere, the center of curvature is exactly the center of mass. If we modify the sphere to make it slightly flatter at the point of contact, the center of curvature will move above the center of mass and the modified sphere will be stable. On the other hand, if we made the sphere slightly more curved at the point of contact, the center of curvature would move below the center of mass and the object would be unstable.

The center of curvature is the center of the sphere that best approximates a surface at a point. If our object is a sphere, the hypothetical sphere defining the curvature is the object itself. For an object flatter on bottom than a sphere, the sphere defining center of curvature is larger than the object. The flatter the object, the larger the approximating sphere.

The superegg has zero curvature at the bottom, and so its center of curvature is at infinity. But if you push the superegg slightly, the part touching the horizontal surface is no longer the exact center, the curvature is slightly positive, and so the radius of curvature is finite. The center of mass also moves up slightly as the superegg rocks to the side.

So the center of curvature moves down and the center of mass moves up. At what point do they cross and the object becomes unstable?

Solution outline

The equation of the surface of the superegg is

\left(\sqrt{x^2 + y^2}\right)^p + |z/h|^p = 1

where p > 2 (a common choice is p = 5/2) and h > 1 (a common choice is h = 4/3).

If we tilt the superegg so that its axis of symmetry now makes a small angle θ with the z-axis, we have to answer several questions.

  1. Where does the center of mass go?
  2. What is the new point of contact with the horizontal surface?
  3. Where is the center of curvature?
  4. Is the center of curvature above or below the center of mass?

It’s easier to imagine lifting the superegg up from the surface, rotating it in the air, then lowering it to the surface. That way you don’t have to describe how the superegg rocks.

The superegg is radially symmetric about the vertical axis, so without loss of generality you can imagine the problem is limited to the xz plane.

It’s messy to work out the details, but in principle you could work out how much the superegg can be perturbed and return to its original position. You know a priori that the result will be a decreasing function of h and p.

Related posts

[1] Photo by Malene Thyssen, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

Johnson circle theorem

Draw three circles of radius r that intersect at a single point. Then draw a triangle connecting the remaining three points of intersection.

(Each pair of circles intersects in two points, one of which is the point where all three circles intersect, so there are three other intersection points.)

Then the circumcircle of the triangle, the circle through the three vertices, also has radius r.

I’ve seen this theorem referred to as Johnson’s theorem, as well as the Johnson–Tzitzeica or Tzitzeica-Johnson theorem. Apparently Roger Johnson and George Tzitzeica (Gheorghe Țițeica) both proved the same theorem around the same time. Johnson’s publication [1] dates to 1916.

It’s remarkable that a theorem in Euclidean geometry this easy to state was discovered 2200 years after Euclid. Johnson says in [1]

Singularly enough, this remarkable theorem appears to be new. A rather cursory search in several of the treatises on modern elementary geometry fails to disclose it, and the author has not yet found any person to whom it was known. On the other hand, the figure is so simple … that it seems almost out of the question that the fact can have escaped detection. Even if geometers have overlooked it, someone must have noticed it in casually drawing circles. But if this were the case, it seems like a theorem of sufficient interest to receive some prominence in the literature, and therefore ought to be well known.

Related posts

[1] Roger Johnson. A circle theorem. The American Mathematical Monthly, May, 1916, Vol. 23, No. 5, pp. 161-162.

Newton line

Let Q be a convex quadrilateral with at most two parallel sides. Draw the two diagonals then draw a line through their midpoints. This line is called the Newton line.

(The requirement that at most two sides are parallel insures that the midpoints are distinct and so there is a unique line joining them.)

In the figure above, the diagonals are blue, their midpoints are indicated by black dots, and the red line joining them is the Newton line.

Now join the midpoints of the sides. These are draw with dotted gray lines above. Then the intersection of these two lines lies on the Newton line.

Now suppose further that our quadrilateral is a tangential quadrilateral, i.e. that all four sides are tangent to a circle C. Then the center of C also lies on the Newton line.

In the image above, it appears that the lines joining the midpoints of the sides also passes intersect at the center of the circle. That’s not true in general, and its not true in the example above but you’d have to zoom in to see it. But it is true that the intersection of these lines and the center of the circle both lie on the Newton line.

Related posts

Geometric derivation of hyperbolic trig functions

This is the third post in a series on generalizing sine and cosine.

The previous post looked at a generalization of the sine and cosine functions that come from replacing a circle with a lemniscate, a curve that looks like a figure eight. This post looks at replacing the circle with a hyperbola.

On the unit circle, an arc of length θ starting at (1, 0) and running counterclockwise ends at (cos θ, sin θ), and this can be used to define the two trig functions. The lemniscate functions use an analogous approach, transferring arc length to a new curve.

Hyperbolic functions do not do this. Instead, they generalize a different property of the circle. Again we start out at (1, 0) and move counterclockwise around the unit circle, but this time we look at area of the sector rather than the length of the arc. The sector that ends at (cos θ, sin θ) has area θ/2. We could define sine and cosine by this relation: cos α and sin α are the x and y coordinates of where we stop when we’ve swept out an area of θ/2. This would be an awkward way to define sine and cosine, but it generalizes.

Start at (1, 0) and move along the hyperbola x² – y² = 1 in the first quadrant until the area bounded by the hyperbola, the x-axis, and a line from the origin to your location (x, y) is α/2. Then the x and y coordinates of the place where you stop are cosh x and sinh x respectively.

This is interesting in its own right, but it is also useful to tie together the first post in this series and the next because it is the area generalization rather than the arc length generalization that gives a geometric interpretation to the functions sinp and cosp.