Limit of a doodle

Suppose you’re in a boring meeting and you start doodling. You draw a circle, and then you draw a triangle on the outside of that circle.

Next you draw a circle through the vertices of the triangle, and draw a square outside that.

Then you draw a circle through the vertices of the square, and draw a pentagon outside that.

Then you think “Will this ever stop?!”, meaning the meeting, but you could ask a similar question about your doodling: does your sequence of doodles converge to a circle of finite radius, or do they grow without bound?

An n-gon circumscribed on the outside of a circle of radius r is inscribed in a circle of radius


So if you start with a unit circle, the radius of the circle through the vertices of the N-gon is

\prod_{n=3}^N \frac{1}{\cos(\pi/n)}

and the limit as N → ∞ exists. The limit is known as the polygon circumscribing constant, and it equals 8.7000366252….

We can visualize the limit by making a plot with large N. The plot is less cluttered if we leave out the circles and just show the polygons. N = 30 in the plot below.

The reciprocal of the polygon circumscribing constant is known as the Kepler-Bouwkamp constant. The Kepler-Bouwkamp constant is the limiting radius if you were to reverse the process above, inscribing polygons at each step rather than circumscribing them. It would make sense to call the Kepler-Bouwkamp constant the polygon inscribing constant, but for historical reasons it is named after Johannes Kepler and Christoffel Bouwkamp.

Kepler’s ellipse perimeter approximations

In 1609, Kepler remarked that the perimeter of an ellipse with semiaxes a and b could be approximated either as

P ≈ 2π(ab)½


P ≈ π(a + b).

In other words, you can approximate the perimeter of an ellipse by the circumference of a circle of radius r where r is either the geometric mean or arithmetic mean of the semi-major and semi-minor axes.

Comparing ellipse with circles of approximately the same perimeter

How good are these approximations, particularly when a and b are roughly equal? Which one is better?

When can choose our unit of measurement so that the semi-minor axis b equals 1, then plot the error in the two approximations as a increases.

Exact and approximation ellipse perimeter

We see from this plot that both approximations give lower bounds, and that arithmetic mean is more accurate than geometric mean.

Incidentally, if we used the geometric mean of the semi-axes as the radius of a circle when approximating the area then the results would be exactly correct. But for perimeter, the arithmetic mean is better.

Ellipse approximation relative error

Next, if we just consider ellipses in which the semi-major axis is no more than twice as long as the semi-minor axis, the arithmetic approximation is within 2% of the exact value and the geometric approximation is within 8%. Both approximations are good when ab.

The next post goes into more mathematical detail, explaining why Kepler’s approximation behaves as it does and giving ways to improve on it.

More ellipse posts

Approximating a spiral by rings

An Archimedean spiral has the polar equation

r = b θ1/n

This post will look at the case n = 1. I may look at more general values of n in a future post. (Update: See here.) The case n = 1 is the simplest case, and it’s the case I needed for the client project that motivated this post.

In this case the spacing between points where the spiral crosses an axis is constant. Call this constant h. Then

h = 2πb.

For example, when rolling up a carpet, h corresponds to the thickness of the carpet.

Suppose θ runs from 0 to 2πm, wrapping around the origin m times. We could approximate the spiral by m concentric circles of radius h, 2h, 3h, …, mh. To visualize this, we’re approximating the length of the red spiral on the left with that of the blue circles on the right.

Comparing Archimedes spiral and concentric circles

We could approximate this further by saying we have m/2 circles whose average radius is πmb. This suggests the length of the spiral should be approximately


How good is this approximation? What happens to the relative error as θ increases? Intuitively, each wrap around the origin is more like a circle as θ increases, so we’d expect the approximation to improve for large θ.

According to Mathworld, the exact length of the spiral is

πbm √(1 + (2πm)²) + b arcsinh(2πm) /2

When m is so large that we can ignore the 1 in √(1 + (2πm)²) then the first term is the same as the circle approximation, and all that’s left is the arcsinh term, which is on the order of log m because

arcsinh(x) = log(x + (1 + x²)1/2).

So for large m, the arc length is on the order of m² while the error is on the order of log m. This means the relative error is O( log(m) / m² ). [1]

We’ve assumed m was an integer because that makes it easier to visual approximating the spiral by circles, but that assumption is not necessary. We could restate the problem in terms of the final value of θ. Say θ runs from 0 to T. Then we could solve

T = 2πm

for m and say that the approximate arc length is

½ bT²

and the exact length is

½ bT(1 + T²)1/2 + ½ b arcsinh(T).

The relative approximation error is O( log(T) / T² ).

Here’s a plot of the error as a function of T assuming b = 1.

Related posts

[1] The error in approximating √(1 + (2πm)²) with 2πm is on the order of 1/(4πm) and so is smaller than the logarithmic term.

Distance from a point to a line

Eric Lengyel’s new book Projective Geometric Algebra Illuminated arrived yesterday and I’m enjoying reading it. Imagine if someone started with ideas like dot products, cross products, and determinants that you might see in your first year of college, then thought deeply about those things for years. That’s kinda what the book is.

Early in the book is the example of finding the distance from a point q to a line of the form p + tv.

If you define u = q − p then a straightforward derivation shows that the distance d from q to the line is given by

d = \sqrt{{\bold u}^2 - \frac{({\bold u}\cdot {\bold v})^2}{{\bold v}^2}}

But as the author explains, it is better to calculate d by

d = \sqrt{\frac{({\bold u}\times {\bold v})^2}{{\bold v}^2}}

Why is that? The two expressions are algebraically equal, but the latter is better suited for numerical calculation.

The cardinal rule of numerical calculation is to avoid subtracting nearly equal floating point numbers. If two numbers agree to b bits, you may lose up to b bits of significance when computing their difference.

If u and v are vectors with large magnitude, but q is close to the line, then the first equation subtracts two large, nearly equal numbers under the square root.

The second equation involves subtraction too, but it’s less obvious. This is a common theme in numerical computing. Imagine this dialog.

[Student produces first equation.]

Mentor: Avoid subtracting nearly equal numbers.

[Student produces section equation.]

Student: OK, did it.

Mentor: That’s much better, though it could still have problems.

Where is there a subtraction in the second equation? We started with a subtraction in defining u. More subtly, the definition of cross product involves subtractions. But these subtractions involve smaller numbers than the first formula, because the first formula subtracts squared values. Eric Lengyel points this out in his book.

None of this may matter in practice, until it does matter, which is a common pattern in numerical computing. You implement something like the first formula, something that can be derived directly. You implicitly have in mind vectors whose magnitude is comparable to d and this guides your choice of unit tests, which all pass.

Some time goes by and a colleague tells you your code is failing. Impossible! You checked your derivation by hand and in Mathematica. Your unit tests all pass. Must be your colleague’s fault. But it’s not. Your code would be correct in infinite precision, but in an actual computer it fails on inputs that violate your implicit assumptions.

This can be frustrating, but it can also be fun. Implementing equations from a freshman textbook accurately, efficiently, and robustly is not a freshman-level exercise.

Related posts

Bounding the perimeter of a triangle between circles

Suppose you have a triangle and you know the size of the largest circle that can fit inside (the incircle) and the size of the smallest circle that can fit outside (the circumcircle). How would you estimate the perimeter of the triangle?

In terms of the figure below, if you know the circumference of the red and blue circles, how could you estimate the perimeter of the black triangle?

Triangle with inscribed and circumscribed circles

A crude estimate is that the triangle perimeter must be greater than the incircle circumference and less than the circumcircle circumference. But we can do better.

Gerretsen’s inequalities

It is conventional in this kind of problem to work with the semiperimeter of the triangle, half the perimeter, rather than the perimeter. Let r be the radius of the incircle, R the radius of the circumcircle, and s the semiperimeter of the triangle. Then Gerretsen’s inequalities say

16Rr − 5r² ≤ s² ≤ 4R² + 4Rr + 3r²

In the figure above,

r = 1.058, R = 2.5074, s = 6.1427

and Gerretsen’s inequalities give

36.8532  ≤ s² = 37.7327 ≤ 39.1200.

In the case of an equilateral triangle, Gerretsen’s inequalities are in fact equations.

Note that Gerretsen’s inequalities say nothing about the centers of the circles. An incircle must be inside a circumcircle, but for a variety of triangles with the centers of the two circles in different relations you have the same bounds.

Kooi’s inequality

Kooi’s inequality [2] gives another upper bound for the perimeter of the triangle:

s² ≤ ½ R(4R + r)² / (2Rr)

which in the example above gives a tighter upper bound, 38.9515.

Kooi’s upper bound is uniformly better than the upper bound half of Gerretsen’s inequalities. But further refinements are possible [3].

Related posts

[1] J. C. H. Gerretsen, Ongelijkheden in de driehoek. Nieuw Tijdschr 41 (1953), 1–7.

[2] O. Kooi, Inequalities for the triangle, Simon Stevin, 32 (1958), 97–101.

[3] Martin Lukarevski and Dan Stefan Marinescu. A refinement of the Kooi’s inequality, Mittenpunkt and applications. Journal of Mathematical Applications. Volume 13, Number 3 (2019), 827–832

Area of quadrilateral as a determinant

I’ve written several posts about how determinants come up in geometry. These determinants often look similar, having columns related to coordinates and a column of ones. You can find several examples here along with an explanation for this pattern.

If you have three points z1, z2, and z3 in the complex plane, you can find the area of a triangle with these points as vertices

A(z_1, z_2, z_3) = \frac{i}{4} \, \left| \begin{matrix} z_1 & \bar{z}_1 & 1 \\ z_2 & \bar{z}_2 & 1 \\ z_3 & \bar{z}_3 & 1 \\ \end{matrix} \right|

You can read more about this here.

If you add the second column to the first, and subtract the first column from the second, you can get the equation for the area of a triangle in the real plane.

A = \frac{1}{2} \, \left| \begin{matrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \\ \end{matrix} \right|

Presumably the former equation was first derived from the latter, but the two are equivalent.

Now suppose you have a quadrilateral whose vertices are numbered in clockwise or counterclockwise order. Then the area is given by

A = \frac{1}{2} \, \left| \begin{matrix} x_1 & y_1 & 1 & 0\\ x_2 & y_2 & 1 & 1\\ x_3 & y_3 & 1 & 0\\ x_4 & y_4 & 1 & 1\\ \end{matrix} \right|

The proof is easy. If you expand the determinant by minors on the last column, you get the sum of two 3 × 3 determinants corresponding to the areas of the two triangles formed by splitting the quadrilateral along the diagonal connecting the first and third points.

Kepler triangle

A Kepler triangle is a right triangle whose sides are in geometric progression. That is, if the sides have length a < b < c, then b/a = c/b = k.

All Kepler triangles are similar because the proportionality constant k can only take on one value. To see this, we first pick our units so that a = 1. Then b = k and c = k². By the Pythagorean theorem

a² + b² = c²

and so

1 + k2 = k4

which means k² equals the golden ratio φ.

Here’s a nice geometric property of the Kepler triangle proved in [1].

Go around the triangle counterclockwise placing a point on each side dividing the side into pieces that are in golden proportion. Connect these three points with the opposite vertex. Then the triangle formed by the intersections of these line segments is also a Kepler triangle.

On each side, the ratio of the length of the green segment to the blue segment is φ. The grey triangle in the middle is another Kepler triangle.

The rest of this post will present the code that was used to create the image above.

We’ll need the following imports.

import matplotlib.pyplot as plt
from numpy import array
from numpy.linalg import solve

We’ll also need to define the golden ratio, a function to split a line segment into golden proportions, and a function to draw a line between two points.

φ = (1 + 5**0.5)/2

def golden_split(pt1, pt2):
    return (1/φ)*pt1 + (1 - 1/φ)*pt2

def connect(pt1, pt2, style):
    plt.plot([pt1[0], pt2[0]], [pt1[1], pt2[1]], style)

Now we can draw the figure.

A = array([0, 1])
B = array([0, 0])
C = array([φ**0.5, 0])

X = golden_split(A, B)
Y = golden_split(B, C)
Z = golden_split(C, A)

connect(A, X, "b")
connect(X, B, "g")
connect(B, Y, "b")
connect(Y, C, "g")
connect(C, Z, "b")
connect(Z, A, "g")
connect(A, Y, "grey")
connect(B, Z, "grey")
connect(C, X, "grey")


We can show algebraically that the golden_split works as claimed, but here is a numerical illustration.

assert(abs( (C[0] - Y[0]) / (Y[0] - A[0]) - φ) < 1e-14)

Similarly, we can show numerically what [1] proves exactly, i.e. that the triangle in the middle is a Kepler triangle.

from numpy.linalg import solve, norm

def intersect(pt1, pt2, pt3, pt4):
    # Find the intersection of the line joining pt1 and pt2
    # with the line joining pt3 and pt4.
    m1 = (pt2[1] - pt1[1])/(pt2[0] - pt1[0])
    m3 = (pt4[1] - pt3[1])/(pt4[0] - pt3[0])
    A = array([[m1, -1], [m3, -1]])
    rhs = array([m1*pt1[0]-pt1[1], m3*pt3[0]-pt3[1]])
    x = solve(A, rhs)
    return x

I = intersect(A, Y, C, X)
J = intersect(A, Y, B, Z)
K = intersect(B, Z, C, X)

assert( abs( norm(I - J)/norm(J - K) - φ**0.5 ) < 1e-14 )
assert( abs( norm(I - K)/norm(I - J) - φ**0.5 ) < 1e-14 )

Related posts

[1] Jun Li. Some properties of the Kepler triangle. The Mathematical Gazette, November 2017, Vol. 101, No. 552, pp. 494–495

Schwarz lemma, Schwarz-Pick theorem, and Poincare metric

Let D be the open unit disk in the complex plane. The Schwarz lemma says that if f is an analytic function from D to D with f(0) = 0, then

|f(z)| \leq |z|

for all z in D. The lemma also says more, but this post will focus on just this portion of the theorem.

The Schwarz-Pick theorem generalizes the Schwarz lemma by not requiring the origin to be fixed. That is, it says that if f is an analytic function from D to D then

\left| \frac{f(z) - f(w)}{1 - f(z)\,\overline{f(w)}} \right| \leq \left| \frac{z - w}{1 - z\,\overline{w}}\right|

The Schwarz-Pick theorem also concludes more, but again we’re focusing on part of the theorem here. Note that if f(0) = 0 then the Schwarz-Pick theorem reduces to the Schwarz lemma.

The Schwarz lemma is a sort of contraction theorem. Assuming f(0) = 0, the lemma says

|f(z) - f(0)| \leq |z - 0|

This says applying f to a point cannot move the point further from 0. That’s interesting, but it would be more interesting if we could say f is a contraction in general, not just with respect to 0. That is indeed what the Schwarz-Pick theorem does, though with respect to a new metric.

For any two points z and w in the open unit disk D, define the Poincaré distance between z and w by

d(z,w) = \tanh^{-1}\left( \left| \frac{z - w}{1 - z\overline{w}}\right| \right)

It’s not obvious that this is a metric, but it really is. As is often the case, most of the properties of a metric are simple to confirm, but the proving the triangle inequality is the hard part.

If we apply the monotone function tanh−1 to both sides of the Schwarz-Pick theorem, then we have that any analytic function f from D to D is a contraction on D with respect to the Poincaré metric.

Here we’re using “contraction” in the lose sense. It would be more explicit to say that f is a non-expansive map. Applying f to a pair of points may not bring the points closer together, but it cannot move them any further apart (with respect to the Poincaré metric).

By using the Poincaré metric, we turn the unit disk into a hyperbolic space. That is D with the metric d is a model of the hyperbolic plane.

Related posts

Stability of a superegg

Three weeks ago I wrote about supereggs, a shape popularized by Piet Hein.

Brass superegg by Piet Hein

One aspect of supereggs that I did not address is their stability. Looking at the photo above, you could imagine that if you gave the object a slight nudge it would not fall over. Your intuition would be right: supereggs are stable.

More interesting than asking whether supereggs are stable is to aks how stable they are. An object is stable if for some ε > 0, a perturbation of size less than ε will not cause the object to fall over.

All supereggs are stable, but the degree of stability decreases as the ratio of height to width increases: the taller the superegg, the easier it is to knock over.

How can we quantify stability? An object is stable if its center of curvature, measured at the point of contact with a horizontal surface, is above its center of mass.

For a sphere, the center of curvature is exactly the center of mass. If we modify the sphere to make it slightly flatter at the point of contact, the center of curvature will move above the center of mass and the modified sphere will be stable. On the other hand, if we made the sphere slightly more curved at the point of contact, the center of curvature would move below the center of mass and the object would be unstable.

The center of curvature is the center of the sphere that best approximates a surface at a point. If our object is a sphere, the hypothetical sphere defining the curvature is the object itself. For an object flatter on bottom than a sphere, the sphere defining center of curvature is larger than the object. The flatter the object, the larger the approximating sphere.

The superegg has zero curvature at the bottom, and so its center of curvature is at infinity. But if you push the superegg slightly, the part touching the horizontal surface is no longer the exact center, the curvature is slightly positive, and so the radius of curvature is finite. The center of mass also moves up slightly as the superegg rocks to the side.

So the center of curvature moves down and the center of mass moves up. At what point do they cross and the object becomes unstable?

Solution outline

The equation of the surface of the superegg is

\left(\sqrt{x^2 + y^2}\right)^p + |z/h|^p = 1

where p > 2 (a common choice is p = 5/2) and h > 1 (a common choice is h = 4/3).

If we tilt the superegg so that its axis of symmetry now makes a small angle θ with the z-axis, we have to answer several questions.

  1. Where does the center of mass go?
  2. What is the new point of contact with the horizontal surface?
  3. Where is the center of curvature?
  4. Is the center of curvature above or below the center of mass?

It’s easier to imagine lifting the superegg up from the surface, rotating it in the air, then lowering it to the surface. That way you don’t have to describe how the superegg rocks.

The superegg is radially symmetric about the vertical axis, so without loss of generality you can imagine the problem is limited to the xz plane.

It’s messy to work out the details, but in principle you could work out how much the superegg can be perturbed and return to its original position. You know a priori that the result will be a decreasing function of h and p.

Related posts

[1] Photo by Malene Thyssen, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

Johnson circle theorem

Draw three circles of radius r that intersect at a single point. Then draw a triangle connecting the remaining three points of intersection.

(Each pair of circles intersects in two points, one of which is the point where all three circles intersect, so there are three other intersection points.)

Then the circumcircle of the triangle, the circle through the three vertices, also has radius r.

I’ve seen this theorem referred to as Johnson’s theorem, as well as the Johnson–Tzitzeica or Tzitzeica-Johnson theorem. Apparently Roger Johnson and George Tzitzeica (Gheorghe Țițeica) both proved the same theorem around the same time. Johnson’s publication [1] dates to 1916.

It’s remarkable that a theorem in Euclidean geometry this easy to state was discovered 2200 years after Euclid. Johnson says in [1]

Singularly enough, this remarkable theorem appears to be new. A rather cursory search in several of the treatises on modern elementary geometry fails to disclose it, and the author has not yet found any person to whom it was known. On the other hand, the figure is so simple … that it seems almost out of the question that the fact can have escaped detection. Even if geometers have overlooked it, someone must have noticed it in casually drawing circles. But if this were the case, it seems like a theorem of sufficient interest to receive some prominence in the literature, and therefore ought to be well known.

Related posts

[1] Roger Johnson. A circle theorem. The American Mathematical Monthly, May, 1916, Vol. 23, No. 5, pp. 161-162.