Kepler triangle

A Kepler triangle is a right triangle whose sides are in geometric progression. That is, if the sides have length a < b < c, then b/a = c/b = k.

All Kepler triangles are similar because the proportionality constant k can only take on one value. To see this, we first pick our units so that a = 1. Then b = k and c = k². By the Pythagorean theorem

a² + b² = c²

and so

1 + k² = k⁴

which means k² equals the golden ratio φ.
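To fill in the step: substituting x = k² turns 1 + k² = k⁴ into the quadratic x² − x − 1 = 0, whose positive root is φ. Here is a quick numerical sanity check of that claim, a sketch with variable names chosen only for illustration.

```python
from math import sqrt, isclose

phi = (1 + sqrt(5)) / 2      # golden ratio
k = sqrt(phi)                # proportionality constant of the Kepler triangle

# With x = k², the equation 1 + k² = k⁴ becomes x² - x - 1 = 0.
assert isclose(1 + k**2, k**4)

# The sides 1, k, k² satisfy the Pythagorean theorem.
a, b, c = 1, k, k**2
assert isclose(a**2 + b**2, c**2)
```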

Here’s a nice geometric property of the Kepler triangle proved in [1].

Go around the triangle counterclockwise placing a point on each side dividing the side into pieces that are in golden proportion. Connect these three points with the opposite vertex. Then the triangle formed by the intersections of these line segments is also a Kepler triangle.

On each side, the ratio of the length of the green segment to the blue segment is φ. The grey triangle in the middle is another Kepler triangle.

The rest of this post will present the code that was used to create the image above.

We’ll need the following imports.

import matplotlib.pyplot as plt
from numpy import array
from numpy.linalg import solve

We’ll also need to define the golden ratio, a function to split a line segment into golden proportions, and a function to draw a line between two points.

φ = (1 + 5**0.5)/2

def golden_split(pt1, pt2):
    return (1/φ)*pt1 + (1 - 1/φ)*pt2

def connect(pt1, pt2, style):
    plt.plot([pt1[0], pt2[0]], [pt1[1], pt2[1]], style)

Now we can draw the figure.

A = array([0, 1])
B = array([0, 0])
C = array([φ**0.5, 0])

X = golden_split(A, B)
Y = golden_split(B, C)
Z = golden_split(C, A)

connect(A, X, "b")
connect(X, B, "g")
connect(B, Y, "b")
connect(Y, C, "g")
connect(C, Z, "b")
connect(Z, A, "g")
connect(A, Y, "grey")
connect(B, Z, "grey")
connect(C, X, "grey")

plt.gca().set_aspect("equal")
plt.axis("off")
plt.show()

We can show algebraically that the golden_split works as claimed, but here is a numerical illustration.

assert(abs( (C[0] - Y[0]) / (Y[0] - B[0]) - φ) < 1e-14)

Similarly, we can show numerically what [1] proves exactly, i.e. that the triangle in the middle is a Kepler triangle.

from numpy.linalg import solve, norm

def intersect(pt1, pt2, pt3, pt4):
    # Find the intersection of the line joining pt1 and pt2
    # with the line joining pt3 and pt4.
    # Assumes neither line is vertical, since it uses slopes.
    m1 = (pt2[1] - pt1[1])/(pt2[0] - pt1[0])
    m3 = (pt4[1] - pt3[1])/(pt4[0] - pt3[0])
    # Each line satisfies m*x - y = m*x0 - y0.
    # The matrix is named M to avoid shadowing the vertex A.
    M = array([[m1, -1], [m3, -1]])
    rhs = array([m1*pt1[0]-pt1[1], m3*pt3[0]-pt3[1]])
    return solve(M, rhs)

I = intersect(A, Y, C, X)
J = intersect(A, Y, B, Z)
K = intersect(B, Z, C, X)

assert( abs( norm(I - J)/norm(J - K) - φ**0.5 ) < 1e-14 )
assert( abs( norm(I - K)/norm(I - J) - φ**0.5 ) < 1e-14 )


[1] Jun Li. Some properties of the Kepler triangle. The Mathematical Gazette, November 2017, Vol. 101, No. 552, pp. 494–495

Schwarz lemma, Schwarz-Pick theorem, and Poincare metric

Let D be the open unit disk in the complex plane. The Schwarz lemma says that if f is an analytic function from D to D with f(0) = 0, then

|f(z)| \leq |z|

for all z in D. The lemma also says more, but this post will focus on just this portion of the theorem.

The Schwarz-Pick theorem generalizes the Schwarz lemma by not requiring the origin to be fixed. That is, it says that if f is an analytic function from D to D then

\left| \frac{f(z) - f(w)}{1 - f(z)\,\overline{f(w)}} \right| \leq \left| \frac{z - w}{1 - z\,\overline{w}}\right|

The Schwarz-Pick theorem also concludes more, but again we’re focusing on part of the theorem here. Note that if f(0) = 0 then the Schwarz-Pick theorem reduces to the Schwarz lemma.

The Schwarz lemma is a sort of contraction theorem. Assuming f(0) = 0, the lemma says

|f(z) - f(0)| \leq |z - 0|

This says applying f to a point cannot move the point further from 0. That’s interesting, but it would be more interesting if we could say f is a contraction in general, not just with respect to 0. That is indeed what the Schwarz-Pick theorem does, though with respect to a new metric.

For any two points z and w in the open unit disk D, define the Poincaré distance between z and w by

d(z,w) = \tanh^{-1}\left( \left| \frac{z - w}{1 - z\overline{w}}\right| \right)

It’s not obvious that this is a metric, but it really is. As is often the case, most of the properties of a metric are simple to confirm, but proving the triangle inequality is the hard part.

If we apply the monotone function tanh⁻¹ to both sides of the Schwarz-Pick theorem, then we have that any analytic function f from D to D is a contraction on D with respect to the Poincaré metric.

Here we’re using “contraction” in the loose sense. It would be more explicit to say that f is a non-expansive map. Applying f to a pair of points may not bring the points closer together, but it cannot move them any further apart (with respect to the Poincaré metric).
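Here is a small numerical illustration of the non-expansive property, offered as a sketch rather than a proof. The map f below is a hypothetical example chosen for this illustration; any analytic function from D to D would do.

```python
import numpy as np

def poincare_dist(z, w):
    """Poincaré distance between two points z, w of the open unit disk."""
    return np.arctanh(abs((z - w) / (1 - z * np.conj(w))))

def f(z):
    # Hypothetical example map: analytic on the disk and maps D into D,
    # since |(z**2 + 0.5) / 2| <= 0.75 whenever |z| < 1.
    return (z**2 + 0.5) / 2

rng = np.random.default_rng(0)
# 1000 random pairs of points in the open unit disk.
r = 0.999 * np.sqrt(rng.uniform(0, 1, size=(1000, 2)))
theta = rng.uniform(0, 2 * np.pi, size=(1000, 2))
pts = r * np.exp(1j * theta)

# Largest increase in Poincaré distance over all pairs.
# Schwarz-Pick says this should never be positive.
worst = max(poincare_dist(f(z), f(w)) - poincare_dist(z, w) for z, w in pts)
assert worst <= 1e-12
```

The same check fails in general if you replace the Poincaré distance with the ordinary Euclidean distance, which is why the new metric is needed.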

By using the Poincaré metric, we turn the unit disk into a hyperbolic space. That is, D with the metric d is a model of the hyperbolic plane.


Stability of a superegg

Three weeks ago I wrote about supereggs, a shape popularized by Piet Hein.

Brass superegg by Piet Hein

One aspect of supereggs that I did not address is their stability. Looking at the photo above, you could imagine that if you gave the object a slight nudge it would not fall over. Your intuition would be right: supereggs are stable.

More interesting than asking whether supereggs are stable is to ask how stable they are. An object is stable if for some ε > 0, a perturbation of size less than ε will not cause the object to fall over.

All supereggs are stable, but the degree of stability decreases as the ratio of height to width increases: the taller the superegg, the easier it is to knock over.

How can we quantify stability? An object is stable if its center of curvature, measured at the point of contact with a horizontal surface, is above its center of mass.

For a sphere, the center of curvature is exactly the center of mass. If we modify the sphere to make it slightly flatter at the point of contact, the center of curvature will move above the center of mass and the modified sphere will be stable. On the other hand, if we made the sphere slightly more curved at the point of contact, the center of curvature would move below the center of mass and the object would be unstable.

The center of curvature is the center of the sphere that best approximates a surface at a point. If our object is a sphere, the hypothetical sphere defining the curvature is the object itself. For an object flatter on bottom than a sphere, the sphere defining center of curvature is larger than the object. The flatter the object, the larger the approximating sphere.

The superegg has zero curvature at the bottom, and so its center of curvature is at infinity. But if you push the superegg slightly, the part touching the horizontal surface is no longer the exact center, the curvature is slightly positive, and so the radius of curvature is finite. The center of mass also moves up slightly as the superegg rocks to the side.

So the center of curvature moves down and the center of mass moves up. At what point do they cross and the object becomes unstable?

Solution outline

The equation of the surface of the superegg is

\left(\sqrt{x^2 + y^2}\right)^p + |z/h|^p = 1

where p > 2 (a common choice is p = 5/2) and h > 1 (a common choice is h = 4/3).
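To see why the curvature vanishes at the bottom of the superegg, here is a short derivation sketch (not part of the original argument). Work in the xz plane, write r = |x|, and take the lower half of the curve defined by the equation above.

```latex
z(r) = -h\,(1 - r^p)^{1/p} \approx -h + \frac{h}{p}\, r^p \quad (r \to 0),
\qquad
z''(r) \approx h\,(p-1)\, r^{p-2} \to 0 \ \text{ as } r \to 0 \ \text{ when } p > 2.
```

When p = 2 the same expansion gives z''(0) = h, which is not zero: an ellipsoid has positive curvature at the bottom, while any p > 2 flattens the bottom enough to make the curvature zero there.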

If we tilt the superegg so that its axis of symmetry now makes a small angle θ with the z-axis, we have to answer several questions.

  1. Where does the center of mass go?
  2. What is the new point of contact with the horizontal surface?
  3. Where is the center of curvature?
  4. Is the center of curvature above or below the center of mass?

It’s easier to imagine lifting the superegg up from the surface, rotating it in the air, then lowering it to the surface. That way you don’t have to describe how the superegg rocks.

The superegg is radially symmetric about the vertical axis, so without loss of generality you can imagine the problem is limited to the xz plane.

It’s messy to work out the details, but in principle you could work out how much the superegg can be perturbed and return to its original position. You know a priori that the result will be a decreasing function of h and p.


[1] Photo by Malene Thyssen, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

Johnson circle theorem

Draw three circles of radius r that intersect at a single point. Then draw a triangle connecting the remaining three points of intersection.

(Each pair of circles intersects in two points, one of which is the point where all three circles intersect, so there are three other intersection points.)

Then the circumcircle of the triangle, the circle through the three vertices, also has radius r.
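Here is a numerical illustration of the theorem, a sketch under a convenient assumption: the common point is placed at the origin, so each center lies at distance r from the origin. The radius and the angles are arbitrary choices. With this setup, circles i and j meet again at the sum of their centers.

```python
import numpy as np

r = 1.7                                  # common radius (arbitrary choice)
angles = np.array([0.3, 2.0, 4.1])       # arbitrary directions of the centers

# Three circles of radius r through the origin have centers at distance r.
centers = r * np.column_stack([np.cos(angles), np.sin(angles)])

# Circles i and j meet at the origin and at centers[i] + centers[j].
P = centers[0] + centers[1]
Q = centers[1] + centers[2]
R = centers[2] + centers[0]

def circumradius(P, Q, R):
    """Circumradius from side lengths and area: abc / (4 * area)."""
    a, b, c = np.linalg.norm(Q - R), np.linalg.norm(P - R), np.linalg.norm(P - Q)
    u, v = Q - P, R - P
    area = 0.5 * abs(u[0] * v[1] - u[1] * v[0])
    return a * b * c / (4 * area)

radius = circumradius(P, Q, R)
assert abs(radius - r) < 1e-12
```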

I’ve seen this theorem referred to as Johnson’s theorem, as well as the Johnson–Tzitzeica or Tzitzeica-Johnson theorem. Apparently Roger Johnson and George Tzitzeica (Gheorghe Țițeica) both proved the same theorem around the same time. Johnson’s publication [1] dates to 1916.

It’s remarkable that a theorem in Euclidean geometry this easy to state was discovered 2200 years after Euclid. Johnson says in [1]

Singularly enough, this remarkable theorem appears to be new. A rather cursory search in several of the treatises on modern elementary geometry fails to disclose it, and the author has not yet found any person to whom it was known. On the other hand, the figure is so simple … that it seems almost out of the question that the fact can have escaped detection. Even if geometers have overlooked it, someone must have noticed it in casually drawing circles. But if this were the case, it seems like a theorem of sufficient interest to receive some prominence in the literature, and therefore ought to be well known.


[1] Roger Johnson. A circle theorem. The American Mathematical Monthly, May, 1916, Vol. 23, No. 5, pp. 161-162.

Newton line

Let Q be a convex quadrilateral with at most two parallel sides. Draw the two diagonals then draw a line through their midpoints. This line is called the Newton line.

(The requirement that at most two sides are parallel ensures that the midpoints are distinct and so there is a unique line joining them.)

In the figure above, the diagonals are blue, their midpoints are indicated by black dots, and the red line joining them is the Newton line.

Now join the midpoints of opposite sides. These are drawn with dotted gray lines above. Then the intersection of these two lines lies on the Newton line.
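A numerical check of this claim might look like the following sketch. The quadrilateral is a hypothetical example, not the one in the figure. It computes the intersection of the two lines joining midpoints of opposite sides and tests whether that point is collinear with the midpoints of the diagonals.

```python
import numpy as np

# Hypothetical convex quadrilateral, vertices in counterclockwise order.
P = np.array([[0.0, 0.0], [4.0, 0.0], [5.0, 3.0], [1.0, 4.0]])

# Midpoints of the two diagonals define the Newton line.
M1 = (P[0] + P[2]) / 2
M2 = (P[1] + P[3]) / 2

# Midpoints of the four sides.
S = [(P[i] + P[(i + 1) % 4]) / 2 for i in range(4)]

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4,
    found by solving p1 + s*(p2 - p1) = p3 + t*(p4 - p3)."""
    A = np.column_stack([p2 - p1, p3 - p4])
    s, _ = np.linalg.solve(A, p3 - p1)
    return p1 + s * (p2 - p1)

# Intersection of the lines joining midpoints of opposite sides.
G = intersect(S[0], S[2], S[1], S[3])

def cross2(u, v):
    """z-component of the cross product of two plane vectors."""
    return u[0] * v[1] - u[1] * v[0]

# G is on the Newton line exactly when M1, M2, and G are collinear.
collinearity = cross2(M2 - M1, G - M1)
assert abs(collinearity) < 1e-12
```

In this example G is the average of the four vertices, which is also the midpoint of the segment joining M1 and M2.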

Now suppose further that our quadrilateral is a tangential quadrilateral, i.e. that all four sides are tangent to a circle C. Then the center of C also lies on the Newton line.

In the image above, it appears that the lines joining the midpoints of the sides also intersect at the center of the circle. That’s not true in general, and it’s not true in the example above, but you’d have to zoom in to see it. But it is true that the intersection of these lines and the center of the circle both lie on the Newton line.


Geometric derivation of hyperbolic trig functions

This is the third post in a series on generalizing sine and cosine.

The previous post looked at a generalization of the sine and cosine functions that come from replacing a circle with a lemniscate, a curve that looks like a figure eight. This post looks at replacing the circle with a hyperbola.

On the unit circle, an arc of length θ starting at (1, 0) and running counterclockwise ends at (cos θ, sin θ), and this can be used to define the two trig functions. The lemniscate functions use an analogous approach, transferring arc length to a new curve.

Hyperbolic functions do not do this. Instead, they generalize a different property of the circle. Again we start out at (1, 0) and move counterclockwise around the unit circle, but this time we look at area of the sector rather than the length of the arc. The sector that ends at (cos θ, sin θ) has area θ/2. We could define sine and cosine by this relation: cos α and sin α are the x and y coordinates of where we stop when we’ve swept out an area of α/2. This would be an awkward way to define sine and cosine, but it generalizes.

Start at (1, 0) and move along the hyperbola x² − y² = 1 in the first quadrant until the area bounded by the hyperbola, the x-axis, and a line from the origin to your location (x, y) is α/2. Then the x and y coordinates of the place where you stop are cosh α and sinh α respectively.
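Here is a numerical check of the area interpretation, a sketch rather than a derivation. It slices the region into thin horizontal strips at heights s between 0 and sinh α; each strip runs from the line through the origin to the hyperbola. The function name and the midpoint-rule integration are choices made for this illustration.

```python
import numpy as np

def hyperbolic_sector_area(alpha, n=100_000):
    """Area bounded by the x-axis, the hyperbola x² - y² = 1, and the segment
    from the origin to (cosh(alpha), sinh(alpha)), by the midpoint rule."""
    x, y = np.cosh(alpha), np.sinh(alpha)
    h = y / n
    s = (np.arange(n) + 0.5) * h            # midpoints of the horizontal strips
    width = np.sqrt(1 + s**2) - s * x / y   # hyperbola minus line from the origin
    return float(np.sum(width) * h)

alpha = 1.3
area = hyperbolic_sector_area(alpha)
assert abs(area - alpha / 2) < 1e-8
```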

This is interesting in its own right, but it is also useful to tie together the first post in this series and the next because it is the area generalization rather than the arc length generalization that gives a geometric interpretation to the functions sinₚ and cosₚ.

Supereggs, squigonometry, and squircles

The Depths of Wikipedia twitter account posted a screenshot about supereggs that’s popular at the moment. It says

there’s no way this is real. they must be making these words up

above a screenshot from the Wikipedia article on supereggs saying

The definition can be changed to have an equality rather than an inequality; this changes the superegg to being a surface of revolution rather than a solid.

I assume the Twitter account is having fun, not seriously suggesting that the terms are made up.

The terms “superegg” and “squircle” are whimsical but have been around for decades and have precise meanings. I hadn’t heard of “squigonometry,” but there are many variations on trigonometry that replace a circle with another curve, the best known example being hyperbolic trigonometry.

The equation for the volume of the superegg looked familiar but not quite right. It turns out the definition of superegg is not quite what I thought it was.

Brass superegg by Piet Hein

Piet Hein coined the terms superellipse and superegg. The photo above is a brass superegg made by Piet Hein [1].

A superellipse is what mathematicians more commonly call a p-norm ball in two dimensions. I assumed that a superegg was a p-norm ball in three dimensions, but it’s not quite.

A unit p-norm ball in 3 dimensions has equation

|x|^p + |y|^p + |z|^p = 1

A superegg, however, has equation

\left(\sqrt{x^2 + y^2}\right)^p + |z|^p = 1

If you slice a p-norm ball horizontally or vertically you get another p-norm ball. So in three dimensions, either a vertical or horizontal slice gives you a superellipse.

But a horizontal slice of a superegg is a circle while a vertical slice is a superellipse, which is not a circle unless p = 2. Said another way, supereggs are rotationally symmetric about the z-axis but p-norm balls are not, unless p = 2.
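The difference in symmetry is easy to see numerically. The sketch below evaluates the left side of each equation at a point and at the same point rotated 45° about the z-axis. The function names, the test point, and the choice p = 2.5 are assumptions made for illustration.

```python
import numpy as np

def superegg(x, y, z, p=2.5):
    """Left side of the superegg equation (no axis stretching)."""
    return np.hypot(x, y)**p + abs(z)**p

def pnorm_ball(x, y, z, p=2.5):
    """Left side of the unit p-norm ball equation."""
    return abs(x)**p + abs(y)**p + abs(z)**p

x, y, z = 0.6, 0.0, 0.3
t = np.pi / 4                                    # rotate 45 degrees about the z-axis
xr = x * np.cos(t) - y * np.sin(t)
yr = x * np.sin(t) + y * np.cos(t)

superegg_change = superegg(xr, yr, z) - superegg(x, y, z)    # zero up to rounding
pnorm_change = pnorm_ball(xr, yr, z) - pnorm_ball(x, y, z)   # nonzero when p != 2
```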

I’ve left out one detail: superellipses and supereggs typically stretch one of the axes. So you’d replace x with x/k in the definition of a superellipse or replace z with z/k in the definition of a superegg. A squircle is a superellipse with the two axes equal, and typically p is set to 4 or a value near 4.


[1] Photo by Malene Thyssen, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

Regular solids and Monte Carlo integration

Monte Carlo integration is not as simple in practice as it is often introduced. A homework problem might ask you to integrate a function of two variables by selecting random points from a cube and counting how many of the points fall below the graph of the function. This would indeed give you an estimate of the volume bounded by the surface and hence the value of the integral.

But Monte Carlo integration is most often used in high dimensions, and the region of interest takes up a tiny proportion of a bounding box. In practice you’d rarely sample uniformly from a high-dimensional box. This post will look at sampling points on a (possibly high-dimensional) sphere.

The rate of convergence of Monte Carlo integration depends on the variance of the samples, and so people look for ways to reduce variance. Antipodal sampling is one such approach. The idea is that a function on a sphere is likely to take on larger values on one side of the sphere and smaller values on the other. So for every point x where the function is sampled, it is also sampled at the diametrically opposite point −x on the assumption/hope that the values of the function at the two points are negatively correlated.
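Here is a minimal sketch of antipodal sampling on the ordinary sphere in three dimensions. The integrand f is a hypothetical example chosen because it is larger on one side of the sphere than the other; for a different integrand the benefit could be smaller or absent. Both estimates use the same number of function evaluations.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_sphere(n, dim=3):
    """n points uniformly distributed on the unit sphere in R^dim."""
    x = rng.standard_normal((n, dim))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def f(x):
    # Hypothetical integrand, larger on one side of the sphere than the other.
    return np.exp(x[:, 0])

n = 10_000

# Plain Monte Carlo with 2n function evaluations.
plain = f(uniform_sphere(2 * n))

# Antipodal sampling: n random points, each paired with its opposite point.
x = uniform_sphere(n)
paired = 0.5 * (f(x) + f(-x))

est_plain, est_anti = plain.mean(), paired.mean()

# Estimated standard errors of the two estimates of the average of f.
se_plain = plain.std(ddof=1) / np.sqrt(2 * n)
se_anti = paired.std(ddof=1) / np.sqrt(n)
```

For this particular integrand the exact average over the sphere is sinh(1), and the antipodal estimate has a noticeably smaller standard error than the plain one.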

Antipodal sampling is a first step in the direction of a hybrid of regular and random sampling, sampling by random choices of regularly spaced points, such as antipodal points. When this works well, you get a sort of synergy, an integration method that converges faster than either purely systematic or purely random sampling.

If a little is good, then more is better, right? Not necessarily, but maybe, so it’s worth exploring. If I remember correctly, Alan Genz explored this. Instead of just taking antipodal points, you could sample at the points of a regular solid, like a tetrahedron. Randomly select an initial point, create a tetrahedron on the sphere with this as one of the vertices, and sample your integrand at each of the vertices. Or you could think of having a tetrahedron fixed in space and randomly rotating the sphere so that the sphere remains in contact with the vertices of the tetrahedron.

If you’re going to sample at the vertices of a regular solid, you’d like to know what regular solids are possible. In three dimensions, there are five: tetrahedron, hexahedron (cube), octahedron, dodecahedron, and icosahedron. Only the first three of these generalize to dimensions 5 and higher, so you only have three choices in high dimension if you want to sample at the vertices of a regular solid.

Here’s more about the cross polytope, the generalization of the octahedron.

If you want more regularly-spaced points on a sphere than regular solids will permit, you could compromise and use points whose spacing is approximately regular, such as the Fibonacci lattice. You could randomly rotate your Fibonacci lattice to create a randomized quasi-Monte Carlo (RQMC) method.
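Here is a sketch of a randomly rotated Fibonacci lattice in three dimensions. There are several variants of the lattice; the one below, with evenly spaced heights and longitudes advancing by a golden-ratio fraction of a turn, is one common construction and is an assumption of this example. The function names are made up for illustration.

```python
import numpy as np

def fibonacci_lattice(n):
    """n approximately evenly spaced points on the unit sphere in R^3."""
    golden = (1 + 5**0.5) / 2
    i = np.arange(n)
    z = 1 - (2 * i + 1) / n            # evenly spaced heights in (-1, 1)
    theta = 2 * np.pi * i / golden     # longitude advances by 1/golden of a turn
    rho = np.sqrt(1 - z**2)            # radius of the circle at height z
    return np.column_stack([rho * np.cos(theta), rho * np.sin(theta), z])

def random_rotation(rng):
    """Random 3x3 rotation matrix from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))        # fix column signs so q is uniformly distributed
    if np.linalg.det(q) < 0:           # flip one column to get a proper rotation
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
rotation = random_rotation(rng)
points = fibonacci_lattice(500) @ rotation.T
```

Averaging the integrand over `points` for several independent rotations gives a set of estimates whose spread indicates the error.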

You have a knob you can turn determining the amount of regularity and the amount of randomness. At one extreme is purely random sampling. As you turn the knob you go from antipodes to tetrahedra and up to cross polytopes. Then there’s a warning line, but you can keep going with things like the Fibonacci lattice, introducing some distortion, sorta like turning the volume of a speaker up past the point of distortion.

In my experience, I’ve gotten the best results near the random sampling end of the spectrum. Antipodal sampling sped up convergence, but other methods not so much. But your mileage may vary; the results depend on the kind of function you’re integrating.

Solve for ellipse axes given perimeter

I posted some notes this morning on how to find the perimeter of an ellipse given its axes. The notes include a simple approximation, a better but more complicated approximation, and the exact value. So given the semi axes a and b, the notes give three ways to compute the perimeter p.

If you are given the perimeter and one of the axes, you can solve for the other axis, though this involves a nonlinear equation with an elliptic integral. Not an insurmountable obstacle, but not trivial either.

However, the simple approximation for the perimeter is easy to invert. Since

p \approx 2 \pi \left(\frac{a^{3/2} + b^{3/2}}{2} \right )^{2/3}

we have

a \approx \left( 2\left( \frac{p}{2\pi} \right)^{3/2} -\, b^{3/2}\right)^{2/3}

The same equation holds if you reverse the roles of a and b.
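In code, the approximation and its inverse look like the following sketch. The function names are made up for this example, and the round trip only confirms that the inversion is consistent with the approximation, not that either is close to the exact perimeter.

```python
from math import pi

def perimeter_approx(a, b):
    """Simple approximation to the perimeter of an ellipse with semi-axes a, b."""
    return 2 * pi * ((a**1.5 + b**1.5) / 2) ** (2 / 3)

def other_axis_approx(p, b):
    """Invert the approximation: semi-axis a given perimeter p and semi-axis b."""
    return (2 * (p / (2 * pi)) ** 1.5 - b**1.5) ** (2 / 3)

a, b = 3.0, 2.0
p = perimeter_approx(a, b)
a_recovered = other_axis_approx(p, b)
assert abs(a_recovered - a) < 1e-12
```

Note that the expression inside the outer power must be positive, so the given perimeter has to be large enough relative to b for the inversion to make sense.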

If this solution is not accurate enough, it at least gives you a good starting point for solving the exact equation numerically.

If you’re not given either a or b, then you might as well assume a = b and so both equal p/2π.

How you define center matters a lot

Earlier I wrote a post showing what happens when you start with an equilateral triangle, then repeatedly subdivide it into smaller and smaller triangles by drawing lines from the centroid (barycenter) to each of the vertices.

I mentioned in that post that I moved the code for finding the center to its own function because in the future I might want to see what happens when you look at different choices of center. There are thousands of ways to define the center of a triangle.

This post will look at 4 levels of recursive division, using the barycenter, incenter, and circumcenter.

Barycenter

The barycenter of a set of points is the point that would be the center of mass if each point had the same weight. (The name comes from the Greek baros for weight. Think barium or bariatric surgery.)

This is the method used in the earlier post.

Incenter

The incenter of a triangle is the center of the largest circle that can be drawn inside the triangle. When we use this definition of center and repeatedly divide our triangle, we get a substantially different image.

Circumcenter

The circumcenter of a triangle is the center of the unique circle that passes through each of the three vertices. This results in a very different image because the circumcenter of a triangle may be outside of the triangle.

By recursively dividing our triangle, we get a hexagon!
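The three centers used in this post can be computed with a few lines of NumPy. This is a sketch using standard formulas, not necessarily the code behind the images: the incenter is the average of the vertices weighted by the lengths of the opposite sides, and the circumcenter solves a 2 × 2 linear system expressing that it is equidistant from the vertices.

```python
import numpy as np

def barycenter(A, B, C):
    """Average of the three vertices."""
    return (A + B + C) / 3

def incenter(A, B, C):
    """Vertices weighted by the lengths of the opposite sides."""
    a, b, c = np.linalg.norm(B - C), np.linalg.norm(C - A), np.linalg.norm(A - B)
    return (a * A + b * B + c * C) / (a + b + c)

def circumcenter(A, B, C):
    """Point P equidistant from the vertices, which satisfies
    2 (B - A) . P = |B|^2 - |A|^2  and  2 (C - A) . P = |C|^2 - |A|^2."""
    M = 2 * np.array([B - A, C - A])
    rhs = np.array([B @ B - A @ A, C @ C - A @ A])
    return np.linalg.solve(M, rhs)

# Example: a 3-4-5 right triangle.
A, B, C = np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([0.0, 3.0])
centers = barycenter(A, B, C), incenter(A, B, C), circumcenter(A, B, C)
```

For a right triangle the circumcenter is the midpoint of the hypotenuse, and for an obtuse triangle it falls outside the triangle.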
