Drawing with a compass on a globe

Take a compass and draw a circle on a globe. Then take the same compass, opened to the same width, and draw a circle on a flat piece of paper. Which circle has more area?

If the circle is small compared to the radius of the globe, then the two circles will be approximately equal because a small area on a globe is approximately flat.

To get an idea what happens for larger circles, let’s draw a circle on the globe as large as possible, i.e. the equator. If the globe has radius r, then to draw the equator we need our compass to be opened to a width of √2 r, the distance from the north pole to the equator along a straight line cutting through the globe.

The area of a hemisphere is 2πr². If we take our compass and draw a circle of radius √2 r on a flat surface we also get an area of 2πr². And by continuity we should expect that if we draw a circle that is nearly as big as the equator then the corresponding circle on a flat surface should have approximately the same area.

Interesting. This says that our compass will draw a circle with the same area whether on a globe or on a flat surface, at least approximately, if the width of the compass is sufficiently small or sufficiently large. In fact, we get exactly the same area, regardless of how wide the compass is opened up. We haven’t proven this, only given a plausibility argument, but you can find a proof in [1].

Note that the width w of the compass is the radius of the circle drawn on a flat surface, but it is not the radius of the circle drawn on the globe. The width w is greater than the radius of the circle, but less than the distance along the sphere from the center of the circle to the circle itself. In the case of the equator, the radius of the circle is r, the width of the compass is √2 r, and the distance along the sphere from the north pole to the equator is πr/2.
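As a sanity check, here’s a small numerical experiment, a sketch rather than a proof (the proof is in [1]). It computes the area of the spherical cap by integrating the surface area element directly and compares it with the area of a flat circle whose radius is the compass width. SciPy’s quad is assumed to be available.

# Numerical check that a compass opened to width w encloses area pi*w^2
# on a sphere, the same as on a plane. Not a proof, just an illustration.
import numpy as np
from scipy.integrate import quad

r = 3.0                                   # sphere radius (arbitrary)
for theta in [0.1, 1.0, np.pi/2, 3.0]:    # polar angle of the drawn circle
    w = r*np.sqrt(2*(1 - np.cos(theta)))  # compass width = chord from the pole
    # surface area of the cap, integrating the area element 2 pi r^2 sin(phi)
    cap_area, _ = quad(lambda phi: 2*np.pi*r**2*np.sin(phi), 0, theta)
    flat_area = np.pi*w**2                # circle of radius w in the plane
    assert abs(cap_area - flat_area) < 1e-6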


[1] Nick Lord. On an alternative formula for the area of a spherical cap. The Mathematical Gazette, Vol. 102, No. 554 (July 2018), pp. 314–316.

Ptolemy’s theorem

Draw a quadrilateral by picking four arbitrary points on a circle and connecting them cyclically.

[Figure: inscribed quadrilateral]

Now multiply the lengths of the pairs of opposite sides. In the diagram below this means multiplying the lengths of the two horizontal-ish blue sides and the two vertical-ish orange sides.

[Figure: quadrilateral with opposite sides colored]

Ptolemy’s theorem says that the sum of the two products described above equals the product of the diagonals.

[Figure: inscribed quadrilateral with diagonals]

To put it in colorful terms, the product of the blue sides plus the product of the orange sides equals the product of the green diagonals.

The converse of Ptolemy’s theorem also holds. If the relationship above holds for a quadrilateral, then the quadrilateral can be inscribed in a circle.

Note that if the quadrilateral in Ptolemy’s theorem is a rectangle, then the theorem reduces to the Pythagorean theorem: the opposite sides come in equal pairs a and b, both diagonals have the same length c, and so a² + b² = c².
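Here is a quick numerical illustration of the theorem, a sketch with four arbitrarily chosen points on the unit circle.

# Verify Ptolemy's theorem numerically for four points on the unit circle.
import numpy as np

angles = [0.3, 1.4, 2.9, 5.0]    # increasing, so the vertices are in cyclic order
P = [np.array([np.cos(t), np.sin(t)]) for t in angles]
d = lambda i, j: np.linalg.norm(P[i] - P[j])

opposite  = d(0, 1)*d(2, 3) + d(1, 2)*d(3, 0)   # sum of products of opposite sides
diagonals = d(0, 2)*d(1, 3)                     # product of the diagonals
assert abs(opposite - diagonals) < 1e-12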


Limit of a doodle

Suppose you’re in a boring meeting and you start doodling. You draw a circle, and then you draw a triangle on the outside of that circle.

Next you draw a circle through the vertices of the triangle, and draw a square outside that.

Then you draw a circle through the vertices of the square, and draw a pentagon outside that.

Then you think “Will this ever stop?!”, meaning the meeting, but you could ask a similar question about your doodling: does your sequence of doodles converge to a circle of finite radius, or do they grow without bound?

An n-gon circumscribed on the outside of a circle of radius r is inscribed in a circle of radius

\frac{r}{\cos(\pi/n)}

So if you start with a unit circle, the radius of the circle through the vertices of the N-gon is

\prod_{n=3}^N \frac{1}{\cos(\pi/n)}

and the limit as N → ∞ exists. The limit is known as the polygon circumscribing constant, and it equals 8.7000366252….
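As a quick numerical sanity check, not part of the argument above, here are a few partial products; they increase toward the constant quoted above.

# Partial products of 1/cos(pi/n), which increase toward the
# polygon circumscribing constant 8.7000366252...
import numpy as np

for N in [10, 100, 10_000, 1_000_000]:
    n = np.arange(3, N + 1)
    print(N, np.prod(1.0/np.cos(np.pi/n)))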

We can visualize the limit by making a plot with large N. The plot is less cluttered if we leave out the circles and just show the polygons. N = 30 in the plot below.

The reciprocal of the polygon circumscribing constant is known as the Kepler-Bouwkamp constant. The Kepler-Bouwkamp constant is the limiting radius if you were to reverse the process above, inscribing polygons at each step rather than circumscribing them. It would make sense to call the Kepler-Bouwkamp constant the polygon inscribing constant, but for historical reasons it is named after Johannes Kepler and Christoffel Bouwkamp.

Kepler’s ellipse perimeter approximations

In 1609, Kepler remarked that the perimeter of an ellipse with semiaxes a and b could be approximated either as

P ≈ 2π√(ab)

or

P ≈ π(a + b).

In other words, you can approximate the perimeter of an ellipse by the circumference of a circle of radius r where r is either the geometric mean or arithmetic mean of the semi-major and semi-minor axes.

[Figure: comparing an ellipse with circles of approximately the same perimeter]

How good are these approximations, particularly when a and b are roughly equal? Which one is better?

We can choose our unit of measurement so that the semi-minor axis b equals 1, then plot the error in the two approximations as a increases.

[Figure: exact and approximate ellipse perimeter]

We see from this plot that both approximations give lower bounds, and that the arithmetic mean is more accurate than the geometric mean.

Incidentally, if we used the geometric mean of the semi-axes as the radius of a circle when approximating the area then the results would be exactly correct. But for perimeter, the arithmetic mean is better.

[Figure: ellipse approximation relative error]

Next, if we just consider ellipses in which the semi-major axis is no more than twice as long as the semi-minor axis, the arithmetic mean approximation is within 2% of the exact value and the geometric mean approximation is within 8%. Both approximations are good when a ≈ b.
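Here is a short sketch that reproduces this comparison. It assumes SciPy is available and computes the exact perimeter as 4aE(e²), where E is the complete elliptic integral of the second kind (scipy.special.ellipe takes the parameter m = e²).

# Relative error of Kepler's two approximations to the ellipse perimeter.
import numpy as np
from scipy.special import ellipe

b = 1.0
for a in [1.1, 1.5, 2.0, 3.0]:
    exact = 4*a*ellipe(1 - (b/a)**2)     # perimeter = 4 a E(e^2)
    geometric  = 2*np.pi*np.sqrt(a*b)    # circle with geometric-mean radius
    arithmetic = np.pi*(a + b)           # circle with arithmetic-mean radius
    print(a, (geometric - exact)/exact, (arithmetic - exact)/exact)

Both relative errors print as negative, consistent with the approximations being lower bounds.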

The next post goes into more mathematical detail, explaining why Kepler’s approximation behaves as it does and giving ways to improve on it.


Approximating a spiral by rings

An Archimedean spiral has the polar equation

r = b θ^(1/n)

This post will look at the case n = 1. I may look at more general values of n in a future post. (Update: See here.) The case n = 1 is the simplest case, and it’s the case I needed for the client project that motivated this post.

In this case the spacing between points where the spiral crosses an axis is constant. Call this constant h. Then

h = 2πb.

For example, when rolling up a carpet, h corresponds to the thickness of the carpet.

Suppose θ runs from 0 to 2πm, wrapping around the origin m times. We could approximate the spiral by m concentric circles of radius h, 2h, 3h, …, mh. To visualize this, we’re approximating the length of the red spiral on the left with that of the blue circles on the right.

[Figure: comparing the Archimedean spiral and concentric circles]

We could approximate this further by saying we have m circles whose average radius is πmb, since the average of h, 2h, 3h, …, mh is approximately mh/2 = πmb. This suggests the length of the spiral should be approximately

2π²m²b

How good is this approximation? What happens to the relative error as θ increases? Intuitively, each wrap around the origin is more like a circle as θ increases, so we’d expect the approximation to improve for large θ.

According to MathWorld, the exact length of the spiral is

πbm √(1 + (2πm)²) + ½ b arcsinh(2πm)

When m is so large that we can ignore the 1 in √(1 + (2πm)²) then the first term is the same as the circle approximation, and all that’s left is the arcsinh term, which is on the order of log m because

arcsinh(x) = log(x + √(1 + x²)).

So for large m, the arc length is on the order of m² while the error is on the order of log m. This means the relative error is O( log(m) / m² ). [1]

We’ve assumed m was an integer because that makes it easier to visualize approximating the spiral by circles, but that assumption is not necessary. We could restate the problem in terms of the final value of θ. Say θ runs from 0 to T. Then we could solve

T = 2πm

for m and say that the approximate arc length is

½ bT²

and the exact length is

½ bT √(1 + T²) + ½ b arcsinh(T).

The relative approximation error is O( log(T) / T² ).

Here’s a plot of the error as a function of T assuming b = 1.
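The plot can be reproduced in a few lines. Here is a minimal sketch of the comparison with b = 1, printing the relative error at a few values of T rather than plotting it.

# Exact vs. approximate arc length of r = b*theta for theta in [0, T],
# and the relative error of the approximation (1/2) b T^2.
import numpy as np

b = 1.0
for T in [10.0, 100.0, 1000.0]:
    exact  = 0.5*b*T*np.sqrt(1 + T**2) + 0.5*b*np.arcsinh(T)
    approx = 0.5*b*T**2
    print(T, (exact - approx)/exact)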


[1] The error in approximating √(1 + (2πm)²) with 2πm is on the order of 1/(4πm) and so is smaller than the logarithmic term.

Distance from a point to a line

Eric Lengyel’s new book Projective Geometric Algebra Illuminated arrived yesterday and I’m enjoying reading it. Imagine if someone started with ideas like dot products, cross products, and determinants that you might see in your first year of college, then thought deeply about those things for years. That’s kinda what the book is.

Early in the book is the example of finding the distance from a point q to a line of the form p + tv.

If you define u = q − p then a straightforward derivation shows that the distance d from q to the line is given by

d = \sqrt{{\bold u}^2 - \frac{({\bold u}\cdot {\bold v})^2}{{\bold v}^2}}

But as the author explains, it is better to calculate d by

d = \sqrt{\frac{({\bold u}\times {\bold v})^2}{{\bold v}^2}}

Why is that? The two expressions are algebraically equal, but the latter is better suited for numerical calculation.

The cardinal rule of numerical calculation is to avoid subtracting nearly equal floating point numbers. If two numbers agree to b bits, you may lose up to b bits of significance when computing their difference.

If u and v are vectors with large magnitude, but q is close to the line, then the first equation subtracts two large, nearly equal numbers under the square root.

The second equation involves subtraction too, but it’s less obvious. This is a common theme in numerical computing. Imagine this dialog.

[Student produces first equation.]

Mentor: Avoid subtracting nearly equal numbers.

[Student produces second equation.]

Student: OK, did it.

Mentor: That’s much better, though it could still have problems.

Where is there a subtraction in the second equation? We started with a subtraction in defining u. More subtly, the definition of cross product involves subtractions. But these subtractions involve smaller numbers than the first formula, because the first formula subtracts squared values. Eric Lengyel points this out in his book.
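Here is a small demonstration of the difference, a sketch with made-up numbers rather than an example from the book. The point q is one unit from the line but about 10⁸ units from p, so the first formula ends up subtracting two numbers near 10¹⁶.

# The true distance from q to the line p + t*v below is exactly 1.
import numpy as np

p = np.array([0.0, 0.0, 0.0])
v = np.array([1.0, 0.0, 0.0])
q = np.array([1e8, 1.0, 0.0])   # far along the line, but only 1 unit away from it
u = q - p

d1 = np.sqrt(u @ u - (u @ v)**2 / (v @ v))   # subtracts two numbers near 1e16
uxv = np.cross(u, v)
d2 = np.sqrt((uxv @ uxv) / (v @ v))          # cross product formula

print(d1, d2)   # d1 loses essentially all accuracy; d2 comes out exactly 1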

None of this may matter in practice, until it does matter, which is a common pattern in numerical computing. You implement something like the first formula, something that can be derived directly. You implicitly have in mind vectors whose magnitude is comparable to d and this guides your choice of unit tests, which all pass.

Some time goes by and a colleague tells you your code is failing. Impossible! You checked your derivation by hand and in Mathematica. Your unit tests all pass. Must be your colleague’s fault. But it’s not. Your code would be correct in infinite precision, but in an actual computer it fails on inputs that violate your implicit assumptions.

This can be frustrating, but it can also be fun. Implementing equations from a freshman textbook accurately, efficiently, and robustly is not a freshman-level exercise.


Bounding the perimeter of a triangle between circles

Suppose you have a triangle and you know the size of the largest circle that can fit inside (the incircle) and the size of the smallest circle that can fit outside (the circumcircle). How would you estimate the perimeter of the triangle?

In terms of the figure below, if you know the circumference of the red and blue circles, how could you estimate the perimeter of the black triangle?

[Figure: triangle with inscribed and circumscribed circles]

A crude estimate is that the triangle perimeter must be greater than the incircle circumference and less than the circumcircle circumference. But we can do better.

Gerretsen’s inequalities

It is conventional in this kind of problem to work with the semiperimeter of the triangle, half the perimeter, rather than the perimeter. Let r be the radius of the incircle, R the radius of the circumcircle, and s the semiperimeter of the triangle. Then Gerretsen’s inequalities [1] say

16Rr − 5r² ≤ s² ≤ 4R² + 4Rr + 3r²

In the figure above,

r = 1.058, R = 2.5074, s = 6.1427

and Gerretsen’s inequalities give

36.8532  ≤ s² = 37.7327 ≤ 39.1200.

In the case of an equilateral triangle, Gerretsen’s inequalities are in fact equations.

Note that Gerretsen’s inequalities say nothing about the centers of the circles. The incircle must lie inside the circumcircle, but the bounds depend only on the two radii: triangles whose incircle and circumcircle centers sit in different positions relative to each other satisfy the same bounds.

Kooi’s inequality

Kooi’s inequality [2] gives another upper bound for the perimeter of the triangle:

s² ≤ R(4R + r)² / (2(2R − r))

which in the example above gives a tighter upper bound, 38.9515.

Kooi’s upper bound is uniformly better than the upper bound half of Gerretsen’s inequalities. But further refinements are possible [3].
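Here is a small script that checks both sets of bounds on one arbitrary triangle, computing the inradius and circumradius from the standard formulas r = area/s and R = abc/(4 · area).

# Check Gerretsen's and Kooi's bounds for one (arbitrary) triangle.
import numpy as np

A, B, C = np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([1.0, 3.0])
a, b, c = np.linalg.norm(B - C), np.linalg.norm(C - A), np.linalg.norm(A - B)
s = (a + b + c)/2
area = np.sqrt(s*(s - a)*(s - b)*(s - c))   # Heron's formula
r, R = area/s, a*b*c/(4*area)               # inradius, circumradius

gerretsen_lo = 16*R*r - 5*r**2
gerretsen_hi = 4*R**2 + 4*R*r + 3*r**2
kooi_hi = R*(4*R + r)**2 / (2*(2*R - r))

assert gerretsen_lo <= s**2 <= kooi_hi <= gerretsen_hi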


[1] J. C. H. Gerretsen, Ongelijkheden in de driehoek. Nieuw Tijdschr 41 (1953), 1–7.

[2] O. Kooi, Inequalities for the triangle, Simon Stevin, 32 (1958), 97–101.

[3] Martin Lukarevski and Dan Stefan Marinescu. A refinement of the Kooi’s inequality, Mittenpunkt and applications. Journal of Mathematical Inequalities, Volume 13, Number 3 (2019), 827–832.

Area of quadrilateral as a determinant

I’ve written several posts about how determinants come up in geometry. These determinants often look similar, having columns related to coordinates and a column of ones. You can find several examples here along with an explanation for this pattern.

If you have three points z1, z2, and z3 in the complex plane, you can find the area of a triangle with these points as vertices

A(z_1, z_2, z_3) = \frac{i}{4} \, \left| \begin{matrix} z_1 & \bar{z}_1 & 1 \\ z_2 & \bar{z}_2 & 1 \\ z_3 & \bar{z}_3 & 1 \\ \end{matrix} \right|

You can read more about this here.

If you add the second column to the first, and subtract the first column from the second, you can get the equation for the area of a triangle in the real plane.

A = \frac{1}{2} \, \left| \begin{matrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \\ \end{matrix} \right|

Presumably the former equation was first derived from the latter, but the two are equivalent.

Now suppose you have a quadrilateral whose vertices are numbered in clockwise or counterclockwise order. Then the area is given by

A = \frac{1}{2} \, \left| \begin{matrix} x_1 & y_1 & 1 & 0\\ x_2 & y_2 & 1 & 1\\ x_3 & y_3 & 1 & 0\\ x_4 & y_4 & 1 & 1\\ \end{matrix} \right|

The proof is easy. If you expand the determinant by minors on the last column, you get the sum of two 3 × 3 determinants corresponding to the areas of the two triangles formed by splitting the quadrilateral along the diagonal connecting the first and third points.
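Here is a quick numerical check of the 4 × 4 determinant formula against the shoelace formula, using an arbitrary convex quadrilateral with vertices listed counterclockwise.

# Compare the 4x4 determinant formula with the shoelace formula.
import numpy as np

pts = np.array([[0.0, 0.0], [3.0, 0.0], [4.0, 2.0], [1.0, 3.0]])  # counterclockwise
x, y = pts[:, 0], pts[:, 1]

M = np.column_stack([x, y, np.ones(4), [0.0, 1.0, 0.0, 1.0]])
det_area = 0.5*np.linalg.det(M)

shoelace = 0.5*np.sum(x*np.roll(y, -1) - np.roll(x, -1)*y)
assert abs(det_area - shoelace) < 1e-12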

Kepler triangle

A Kepler triangle is a right triangle whose sides are in geometric progression. That is, if the sides have length a < b < c, then b/a = c/b = k.

All Kepler triangles are similar because the proportionality constant k can only take on one value. To see this, we first pick our units so that a = 1. Then b = k and c = k². By the Pythagorean theorem

a² + b² = c²

and so

1 + k² = k⁴

Setting x = k² turns this into x² = x + 1, which means k² equals the golden ratio φ.

Here’s a nice geometric property of the Kepler triangle proved in [1].

Go around the triangle counterclockwise, placing a point on each side that divides the side into two pieces in golden proportion. Connect each of these three points with the opposite vertex. Then the triangle formed by the intersections of these line segments is also a Kepler triangle.

On each side, the ratio of the length of the green segment to the blue segment is φ. The grey triangle in the middle is another Kepler triangle.

The rest of this post will present the code that was used to create the image above.

We’ll need the following imports.

import matplotlib.pyplot as plt
from numpy import array
from numpy.linalg import solve

We’ll also need to define the golden ratio, a function to split a line segment into golden proportions, and a function to draw a line between two points.

φ = (1 + 5**0.5)/2

def golden_split(pt1, pt2):
    # Return the point dividing the segment from pt1 to pt2 in golden proportion,
    # the piece adjacent to pt2 being φ times as long as the piece adjacent to pt1.
    return (1/φ)*pt1 + (1 - 1/φ)*pt2

def connect(pt1, pt2, style):
    # Draw a line segment from pt1 to pt2 in the given matplotlib style.
    plt.plot([pt1[0], pt2[0]], [pt1[1], pt2[1]], style)

Now we can draw the figure.

# Vertices of a Kepler triangle with legs 1 and √φ and hypotenuse φ
A = array([0, 1])
B = array([0, 0])
C = array([φ**0.5, 0])

X = golden_split(A, B)
Y = golden_split(B, C)
Z = golden_split(C, A)

connect(A, X, "b")
connect(X, B, "g")
connect(B, Y, "b")
connect(Y, C, "g")
connect(C, Z, "b")
connect(Z, A, "g")
connect(A, Y, "grey")
connect(B, Z, "grey")
connect(C, X, "grey")

plt.gca().set_aspect("equal")
plt.axis("off")
plt.show()

We can show algebraically that golden_split works as claimed, but here is a numerical illustration.

assert(abs( (C[0] - Y[0]) / (Y[0] - A[0]) - φ) < 1e-14)

Similarly, we can show numerically what [1] proves exactly, i.e. that the triangle in the middle is a Kepler triangle.

from numpy.linalg import solve, norm

def intersect(pt1, pt2, pt3, pt4):
    # Find the intersection of the line joining pt1 and pt2
    # with the line joining pt3 and pt4.
    # The lines are represented by their slopes, so neither line may be
    # vertical; that is the case for the lines used below.
    m1 = (pt2[1] - pt1[1])/(pt2[0] - pt1[0])
    m3 = (pt4[1] - pt3[1])/(pt4[0] - pt3[0])
    # Solve m*x - y = m*x0 - y0 for the two lines simultaneously.
    A = array([[m1, -1], [m3, -1]])
    rhs = array([m1*pt1[0]-pt1[1], m3*pt3[0]-pt3[1]])
    x = solve(A, rhs)
    return x

I = intersect(A, Y, C, X)
J = intersect(A, Y, B, Z)
K = intersect(B, Z, C, X)

assert( abs( norm(I - J)/norm(J - K) - φ**0.5 ) < 1e-14 )
assert( abs( norm(I - K)/norm(I - J) - φ**0.5 ) < 1e-14 )


[1] Jun Li. Some properties of the Kepler triangle. The Mathematical Gazette, November 2017, Vol. 101, No. 552, pp. 494–495.

Schwarz lemma, Schwarz-Pick theorem, and Poincaré metric

Let D be the open unit disk in the complex plane. The Schwarz lemma says that if f is an analytic function from D to D with f(0) = 0, then

|f(z)| \leq |z|

for all z in D. The lemma also says more, but this post will focus on just this portion of the theorem.

The Schwarz-Pick theorem generalizes the Schwarz lemma by not requiring the origin to be fixed. That is, it says that if f is an analytic function from D to D then

\left| \frac{f(z) - f(w)}{1 - f(z)\,\overline{f(w)}} \right| \leq \left| \frac{z - w}{1 - z\,\overline{w}}\right|

The Schwarz-Pick theorem also concludes more, but again we’re focusing on part of the theorem here. Note that if f(0) = 0 then the Schwarz-Pick theorem reduces to the Schwarz lemma.

The Schwarz lemma is a sort of contraction theorem. Assuming f(0) = 0, the lemma says

|f(z) - f(0)| \leq |z - 0|

This says applying f to a point cannot move the point further from 0. That’s interesting, but it would be more interesting if we could say f is a contraction in general, not just with respect to 0. That is indeed what the Schwarz-Pick theorem does, though with respect to a new metric.

For any two points z and w in the open unit disk D, define the Poincaré distance between z and w by

d(z,w) = \tanh^{-1}\left( \left| \frac{z - w}{1 - z\overline{w}}\right| \right)

It’s not obvious that this is a metric, but it really is. As is often the case, most of the properties of a metric are simple to confirm, but proving the triangle inequality is the hard part.

If we apply the monotone function tanh⁻¹ to both sides of the Schwarz-Pick theorem, then we have that any analytic function f from D to D is a contraction on D with respect to the Poincaré metric.

Here we’re using “contraction” in the loose sense. It would be more explicit to say that f is a non-expansive map. Applying f to a pair of points may not bring the points closer together, but it cannot move them any further apart (with respect to the Poincaré metric).
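Here is a small numerical illustration, not a proof. The function f below is z ↦ z² followed by a disk automorphism, so it is an analytic map from D to D; for randomly chosen pairs of points the Poincaré distance never increases.

# Check numerically that an analytic self-map of the disk does not increase
# the Poincare distance. Here f is z -> z^2 followed by a disk automorphism.
import numpy as np

def poincare_dist(z, w):
    return np.arctanh(abs((z - w)/(1 - z*np.conj(w))))

def f(z):
    return (z**2 + 0.3)/(1 + 0.3*z**2)

rng = np.random.default_rng(1)
for _ in range(1000):
    r1, r2 = rng.random(2)*0.99       # radii strictly inside the unit disk
    t1, t2 = rng.random(2)*2*np.pi    # angles
    z, w = r1*np.exp(1j*t1), r2*np.exp(1j*t2)
    assert poincare_dist(f(z), f(w)) <= poincare_dist(z, w) + 1e-12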

By using the Poincaré metric, we turn the unit disk into a hyperbolic space. That is, D with the metric d is a model of the hyperbolic plane.
