Supereggs, squigonometry, and squircles

The Depths of Wikipedia twitter account posted a screenshot about supereggs that’s popular at the moment. It says

there’s no way this is real. they must be making these words up

above a screenshot from the Wikipedia article on supereggs saying

The definition can be changed to have an equality rather than an inequality; this changes the superegg to being a surface of revolution rather than a solid.

I assume the Twitter account is having fun, not seriously suggesting that the terms are made up.

The terms “superegg” and “squircles” are whimsical but have been around for decades and have precise meanings. I hadn’t heard of “squigonometry,” but there are many variations on trigonometry that replace a circle with another curve, the best known example being hyperbolic trigonometry.

The equation for the volume of the superegg looked familiar but not quite right. It turns out the definition of superegg is not quite what I thought it was.

Brass superegg by Piet Hein

Piet Hein coined the terms superellipse and superegg. The photo above is a brass superegg made by Piet Hein [1].

A superellipse is what mathematicians more commonly call a p-norm ball in two dimensions. I assumed that a superegg was a p-norm ball in three dimensions, but it’s not quite.

A unit p-norm ball in 3 dimensions has equation

|x|^p + |y|^p + |z|^p = 1

A superegg, however, has equation

\left(\sqrt{x^2 + y^2}\right)^p + |z|^p = 1

If you slice a p-norm ball horizontally or vertically you get another p-norm ball. So in three dimensions, either a vertical or horizontal slice gives you a superellipse.

But a horizontal slice of a superegg is a circle while a vertical slice is a superellipse, which is not a circle unless p = 2. Said another way, supereggs are symmetric about the z-axis but p-norm balls are not.

I’ve left out one detail: superellipses and supereggs typically stretch one of the axes. So you’d replace x with x/k in the definition of a superellipse or replace z with z/k in the definition of a superegg. A squircle is a superellipse with the two axes equally, and typically p is set to 4 or a value near 4.

Related posts

[1] Photo by Malene Thyssen, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

Regular solids and Monte Carlo integration

Monte Carlo integration is not as simple in practice as it is often introduced. A homework problem might as you to integrate a function of two variables by selecting random points from a cube and counting how many of the points fall below the graph of the function. This would indeed give you an estimate of the volume bounded by the surface and hence the value of the integral.

But Monte Carlo integration is most often used in high dimensions, and the region of interest takes up a tiny proportion of a bounding box. In practice you’d rarely sample uniformly from a high-dimensional box. This post will look at sampling points on a (possibly high-dimensional) sphere.

The rate of convergence of Monte Carlo integration depends on the variance of the samples, and so people look for ways to reduce variance. Antipodal sampling is one such approach. The idea is that a function on a sphere is likely to take on larger values on one side of the sphere and smaller values on the other. So for every point x where the function is sampled, it is also sampled at the diametrically opposite point −x on the assumption/hope that the values of the function at the two points are negatively correlated.

Antipodal sampling is a first step in the direction of a hybrid of regular and random sampling, sampling by random choices of regularly spaced points, such as antipodal points. When this works well, you get a sort of synergy, an integration method that converges faster than either purely systematic or purely random sampling.

If a little is good, then more is better, right? Not necessarily, but maybe, so it’s worth exploring. If I remember correctly, Alan Genz explored this. Instead of just taking antipodal points, you could sample at the points of a regular solid, like a tetrahedron. Randomly select and initial point, create a tetrahedron on the sphere with this as one of the vertices, and sample your integrand at each of the vertices. Or you could think of having a tetrahedron fixed in space and randomly rotating the sphere so that the sphere remains in contact with the vertices of the tetrahedron.

If you’re going to sample at the vertices of a regular solid, you’d like to know what regular solids are possible. In three dimensions, there are five: tetrahedron, hexahedron (cube), octahedron, dodecahedon, and icosahedron. Only the first three of these generalize to dimensions 5 and higher, so you only have three choices in high dimension if you want to sample at the vertices of a regular solid.

Here’s more about the cross polytope, the generalization of the octahedron.

If you want more regularly-spaced points on a sphere than regular solids will permit, you could compromise and use points whose spacing is approximately regular, such as the Fibonacci lattice. You could randomly rotate your Fibonacci lattice to create a randomized quasi-Monte Carlo (RQMC) method.

You have a knob you can turn determining the amount of regularity and the amount of randomness. At one extreme is purely random sampling. As you turn the knob you go from antipodes to tetrahedra and up to cross polytopes. Then there’s a warning line, but you can keep going with things like the Fibonacci lattice, introducing some distortion, sorta like turning the volume of a speaker up past the point of distortion.

In my experience, I’ve gotten the best results near the random sampling end of the spectrum. Antipodal sampling sped up convergence, but other methods not so much. But your mileage may vary; the results depend on the kind of function you’re integrating.

Solve for ellipse axes given perimeter

I posted some notes this morning on how to find the perimeter of an ellipse given its axes. The notes include a simple approximation, a better but more complicated approximation, and the exact value. So given the semi axes a and b, the notes give three ways to compute the perimeter p.

If you are given the perimeter and one of the axes, you can solve for the other axis, though this involves a nonlinear equation with an elliptic integral. Not an insurmountable obstacle, but not trivial either.

However, the simple approximation for the perimeter is easy to invert. Since

p \approx 2 \pi \left(\frac{a^{3/2} + b^{3/2}}{2} \right )^{2/3}

we have

a \approx \left( 2\left( \frac{p}{2\pi} \right)^{3/2} -\, b^{3/2}\right)^{2/3}

The same equation holds if you reverse the roles of a and b.

If this solution is not accurate enough, it at least gives you a good starting point for solving the exact equation numerically.

If you’re not given either a or b, then you might as well assume a = b and so both equal p/2π.

How you define center matters a lot

Earlier I wrote a post showing what happens when you start with an equilateral triangle, then repeatedly subdivide it into smaller and smaller triangles by drawing lines from the centroid (barycenter) to each of the vertices.

I mentioned in that post that I moved the code for finding the center to its own function because in the future I might want to see what happens when you look at different choices of center. There are thousands of ways to define the center of a triangle.

This post will look at 4 levels of recursive division, using the barycenter, incenter, and circumcenter.

Barycenter

The barycenter of a set of points is the point that would be the center of mass if each point had the same weight. (The name comes from the Greek baros for weight. Think barium or bariatric surgery.)

This is the method used in the earlier post.

Incenter

The incenter of a triangle is the center of the largest circle that can be drawn inside the triangle. When we use this definition of center and repeatedly divide our triangle, we get a substantially different image.

Circumcenter

The circumcenter of a triangle is the center of the unique circle that passes through each of the three vertices. This results in a very different image because the circumcenter of a triangle may be outside of the triangle.

By recursively dividing our triangle, we get a hexagon!

Related posts

Recursive triangle subdivision

The other day I saw where Cliff Pickover tweeted some images of triangles recursively subdivided by adding a point to the barycenter of each triangle. The images were not what I expected, so I wanted to reproduce the images to look into this further.

Here are the first three steps:

I set the alpha value of the lines to 0.1 so that lines that get drawn repeatedly would appear darker.

The size of the images grows quickly as the number of subdivisions increases. Here are links to the images after five steps: SVG, PNG

Update: Plots using incenter and circumcenter look very different than the plots in this post. See here.

Python code

The code below can be used to subdivide any triangle, not just an equilateral triangle, to any desired depth.

I pulled the code to find the center of the triangle out into its own function because there are many ways to define the center of a triangle—more on that here—and I may want to come back and experiment with other centers.

    import matplotlib.pyplot as plt
    import numpy as np
    from itertools import combinations
    
    def center(points):
        return sum(points)/len(points)
    
    def draw_triangle(points):
        for ps in combinations(points, 2):
            xs = [ps[0][0], ps[1][0]]
            ys = [ps[0][1], ps[1][1]]        
            plt.plot(xs, ys, 'b-', alpha=0.1)
    
    def mesh(points, depth):
        if depth > 0:
            c = center(points)
            for pair in combinations(points, 2):
                pts = [pair[0], pair[1], c]
                draw_triangle(pts)
                mesh(pts, depth-1)
                
    points = [
        np.array([0, 1]),
        np.array([-0.866, -0.5]),
        np.array([ 0.866, -0.5]) ]
    
    mesh(points, 3)
    plt.axis("off")
    plt.gca().set_aspect("equal")
    plt.show()

Related posts

Curvature at Cairo

I was flipping through Gravitation [1] this weekend and was curious about an illustration on page 309. This post reproduces that graph.

The graph is centered at Cairo, Egypt and includes triangles whose side lengths are the distances between cities. The triangles are calculated using only distances, not by measuring angles per se.

The geometry of each triangle is Euclidean: giving the three edge lengths fixes all the features of the figure, including the indicated angle. … The triangles that belong to a given vertex [i.e. Cairo], laid out on a flat surface, fail to meet.

I will reproduce the plot in Python because I’m more familiar with making plots there. But I’ll get the geographic data out of Mathematica, because I know how to do that there.

Geographic information from Mathematica

I found the distances between the various cities using the GeoDistance function in Mathematica. The arguments to GeoDistance are “entities” which are a bit opaque. When using Mathematica interactively, you can use ctrl + = to enter the name of an entity. There’s some guesswork, e.g. whether I meant New York City or the state of New York when I entered “New York”, but Mathematica guessed correctly. The following code lists the city entities explicitly.

    cities = {
        Entity["City", {"Cairo", "Cairo", "Egypt"}], 
        Entity["City", {"Delhi", "Delhi", "India"}], 
        Entity["City", {"Moscow", "Moscow", "Russia"}], 
        Entity["City", {"Brussels", "Brussels", "Belgium"}], 
        Entity["City", {"Reykjavik", "Hofudhborgarsvaedhi", "Iceland"}], 
        Entity["City", {"NewYork", "NewYork", "UnitedStates"}], 
        Entity["City", {"CapeTown", "WesternCape", "SouthAfrica"}], 
        Entity["City", {"PortLouis", "PortLouis", "Mauritius"}] }

Most of these are predictable, but I would not have guessed the code for Reykjavik or Cape Town. I found these by using the command InputForm and entering the entities as above.

I found the distance from Cairo to each of the other cities with

    Table[GeoDistance[cities[[1]], cities[[i]]], {i, 2, 8}]

and the distances from the cities to their neighbors with

    Table[GeoDistance[cities[[i]], cities[[i + 1]]], {i, 2, 7}]
    GeoDistance[cities[[8]], cities[[2]]]

Drawing the plot

Now that we’ve got the data, how do we draw the plot?

Let’s put Cairo at the origin. First we draw a line from Cairo to Delhi. We might as well put Delhi on the x-axis to make things simple.

Next we need to plot Moscow. We know the distance R1 from Cairo to Moscow, and the distance R2 from Delhi to Moscow. So imagine drawing a circle of radius R1 centered at Cairo and a circle of radius R1 centered at Delhi. Moscow is located where the two circles intersect. The previous post shows how to find the intersection of circles.

The two circles intersect in at two points, so which do we choose? We choose the intersection point that preserves the orientation of the original graph (and the globe). As we go through the cities in counterclockwise order, the cross product of the previous line to the next line should have positive z component.

This shows that the original graph was not to scale, though the gap between triangles was approximately to scale. In hindsight this should have been obvious: Brussels and Reykjavik are much closer to each other than Capetown and New York are.

The gap

Why the gap? Because the earth is curved at Cairo (and everywhere else). If the earth were flat, the triangles would fit together without any gaps.

There’s no gap when you take spherical triangles on the globe. But even though the triangles preserve length when projected to the plane, they cannot preserve angles too. The sum of the angles in a spherical triangle adds up to more than 180°, and the amount by which the sum exceeds 180° is proportional to the size of the spherical triangle. Since the angles of triangles in the plane do add up to 180°, each flat triangle fails to capture a bit of the corresponding spherical triangles, and the failures add up to the gap we see in the image.

[1] Gravitation by Misner, Thorne, and Wheeler. 1973.

Calculating the intersection of two circles

Given the equations for two circles, how can you tell whether they intersect? And if they do intersect, how do you find the point(s) of intersection?

MathWorld gives a derivation, but I’d like to add the derivation there in two ways. First, I’d like to be more explicit about the number of solutions. Second, I’d like to make the solution more general.

The derivation begins with the simplifying assumption that one circle is centered at the origin and the other circle is centered somewhere along the x-axis. You can always change coordinates so that this is the case, and doing so simplifies the presentation. Undoing this simplification is implicitly left as an exercise to the reader. I will go through this exercise here because I want a solution I can use in software.

Finding the x coordinate

Suppose the first circle, the one centered at the origin, has radius R. The other circle is centered at (d, 0) for some d, and has radius r. The  x-coordinate of the intersection is shown to satisfy the following equation.

x = \frac{d^2 -r^2 + R^2}{2d}

When I got to this point in the derivation I was wondering what assumption was made that guaranteed there is a solution. Clearly if you increase d enough, moving the second circle to the right, the circles won’t intersect. And yet the derivation for x always succeeds, unless d = 0. If d does equal 0, the two circles are concentric. In that case they’re the same circle if R = r; otherwise they never intersect.

Finding the y coordinate

Why does the derivation for x always succeed even though the intersection might be empty? The resolution depends on the solution for y. What we’ve found is that if the circles intersect, the x coordinate of the point(s) of intersection is given by the equation above.

The y coordinate of the point(s) of intersection satisfies

 \begin{align*} y^2 &= \frac{4d^2R^2 - (d^2 - r^2 + R^2)^2}{4d^2} \\ &= \frac{(-d+r-R)(-d-r+R)(-d+r+R)(d+r+R)}{4d^2} \end{align*}

If the numerator is negative, there is no real solution, no intersection. If the numerator is zero, there is one solution, and the two circles are tangent. if the numerator is positive, there are two solutions.

The distance between the two intersection points, if there are two intersection points, is a = 2|y|. This will be needed below.

a = \frac{\sqrt{(-d+r-R)(-d-r+R)(-d+r+R)(d+r+R)}}{d}

General position

Now suppose we’re first circle is centered at (x0, y0) and the second circle is centered at (x1, y1). Again we let d be the distance between the centers of the circles. The circles intersect twice if d < R + r, once if d = R + r, and never if d > R + r.

Imagine for a moment shifting and rotating the plane so that (x0, y0) goes to the origin and goes to (d, 0). The length of the line segment between the two intersection points is still given by a above. And the distance from the center of the first circle to that line segment is given by the equation for x above.

So to find the points of intersection, we first form a unit vector in the direction of the center of the first circle headed toward the center of the second circle. This is the black line at the top of the post. We move a distance x along this line, with x as in the equation above, and then move perpendicularly a distance a/2 in either direction. This is the dashed gray line.

Python code

The following code implements the algorithm described above.

    def circle_intersect(x0, y0, r0, x1, y1, r1):
        c0 = np.array([x0, y0])
        c1 = np.array([x1, y1])
        v = c1 - c0
        d = np.linalg.norm(v)
    
        if d > r0 + r1 or d == 0:
            return None
        
        u = v/np.linalg.norm(v)
        xvec = c0 + (d**2 - r1**2 + r0**2)*u/(2*d)
    
        uperp = np.array([u[1], -u[0]])
        a = ((-d+r1-r0)*(-d-r1+r0)*(-d+r1+r0)*(d+r1+r0))**0.5/d
        return (xvec + a*uperp/2, xvec - a*uperp/2)

Application

This post started out to be part of the next post, but it turned out to be big enough to make its own post. The next post looks carefully at an example that illustrates how you could discover that you’re living on a curved surface just by measuring the distances to points around you. I needed the code in this post to make the image in the next post.

The numerical range ellipse

Let A be an n × n complex matrix. The numerical range of A is the image of x*Ax over the unit sphere. That is, the numerical range of A is the set W(A) in defined by

W(A) = {x*Ax | x ∈ ℂn and ||x|| = 1}

where x* is the conjugate transpose of the column vector x.

The product of a row vector and a column vector is a scalar, so W(A) is a subset of the complex plane ℂ.

When A is a 2 × 2 matrix, W(A) is an ellipse, and the two foci are the eigenvalues of A.

Denote the two eigenvalues by λ1 and λ2 . Denote the major and minor axes of the ellipse by a and b respectively. Then

b² = tr(A*A) − |λ1|² − |λ2|².

where tr is the trace operator. This is the elliptical range theorem [1]. The theorem doesn’t state the major axis a explicitly, but it can be found from the two foci and the minor axis b.

Demonstration

We will create a visualization the numerical range of a matrix directly from the definition, generating x‘s at random and plotting x*Ax. Then we draw the ellipse described in the elliptical range theorem and see that it forms the boundary of the random points.

The following code plots 500 random points in the numerical range of a matrix formed from the numbers in today’s date. The eigenvalues are plotted in black.

import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

A = np.array([[8, 16], [20, 23j]])

for _ in range(5000):
    v = norm(0, 1).rvs(size=4)
    v /= np.linalg.norm(v)
    x = np.array([v[0] + 1j*v[1], v[2] + 1j*v[3]])
    z = np.conj(x).dot(A.dot(x))
    plt.plot(z.real, z.imag, 'o', color='0.9')

eigenvalues, eigenvectors = np.linalg.eig(A)    
plt.plot(eigenvalues[0].real, eigenvalues[0].imag, 'ko')
plt.plot(eigenvalues[1].real, eigenvalues[1].imag, 'ko')

The following code draws the ellipse guaranteed by the elliptical range theorem.

A_star_A = np.dot(np.conj(A).T, A)
b = (np.trace(A_star_A) - abs(eigenvalues[0])**2 - abs(eigenvalues[1])**2)**0.5
c = 0.5*abs(eigenvalues[0] - eigenvalues[1])
a = (2/3)*(2*(2*b**2/4 + c**2)**0.5 + c)

t = np.linspace(0, np.pi*2)
z = a*np.cos(t)/2 + 1j*b*np.sin(t)/2
u = (eigenvalues[1] - eigenvalues[0])/2
u /= np.linalg.norm(u)
m = eigenvalues.mean()
z = z*u + m

plt.plot(z.real, z.imag, 'b-')

Here is the result.

Here is the result of running the same code with 10,000 random points.

Related posts

[1] Ulrich Daepp, Pamela Gorkin, Andrew Shaffer, and Karl Voss. Finding Ellipses: What Blaschke Products, Poncelet’s Theorem, and the Numerical Range Know about Each Other. MAA Press 2018.

Random slices of a sphube

Ben Grimmer posted something yesterday on Twitter:

A nice mathematical puzzle🧐

If you take a 4-norm ball and cut it carefully, you will find a two-norm ball. 3D printed visual evidence below.
The puzzle: Why does this happen and how much more generally does it happen?

(This question was first posed to me by Pablo Parrilo)

This is a surprising result, and it touches on two things I’ve written about before.

Several years ago I wrote about the surprising result that if you slice open a Menger sponge a certain way then you can see six-pointed stars. I was just as surprised to see the result above. I also wrote several posts about “squircles,” and the 4-norm ball is a 3D version of the squircle, sometimes called a “sphube.”

I modified the code I’d written for the Menger sponge post to create random slides of the 4-norm ball. Here are some of the results.

One of these happens to be a circle, or at least close enough that it looks like a circle.

My code generated a random point and a random normal vector to determine a plane through that point. That’s how I produced the square-ish and triangle-ish plots.

For the last image, I set the normal vector to (1, 1, 1) because the 3D printed image above suggested that would be the right direction. I generated random points inside the cube, and one time the random point was approximately (0.2, 0, -0.2) and noticed the plot was approximately circular. It seems that when z = −x and the normal vector is (1, 1, 1) you get a circular cross section. By symmetry you can make many more examples from this example.

I have not proved anything yet; just played with plots.

Spherical coordinate Rosetta Stone

If you’ve only seen one definition of spherical coordinates, you may be shocked to discover that there are multiple conventions. In particular, mathematicians and geoscientists have different conventions. As Volker Michel put it in book on constructive approximation,

Many mathematicians have faced weird jigsaw puzzles with misplaced continents after using a data set from a geoscientist.

Nor are there only two conventions. This post will cover three conventions, which I will call physics, math, and geography. This is not to imply that everyone in a profession uses the conventions named after the profession.

As with Fourier transforms there are many conventions. Presumably the reason for the proliferation of conventions is the same: a useful basic concept will be discovered multiple times in different areas of application. When there are trade-offs between conventions, people working in each area will choose the convention that is simplest for the things they care most about.

Questions and definitions

Given a triple of numbers (c1, c2, c3), that represent spherical coordinates, there are two basic questions:

  1. What do the numbers mean?
  2. What symbol should be used to represent them?

Part of the meaning question is whether the coordinate system is left-handed or right-handed. There is also a minor question of the standard range of equivalent angles.

Polar angle is the angle down from the positive z-axis, the north pole. The north pole has polar angle 0 and the south pole has polar angle π.

Latitude is the angle up from the xy plane or the equator. The latitude of the north pole is π/2 and the latitude of the south pole is −π/2.

Azimuth is the angle from an initial meridian (e.g. the Prime Meridian in geography). In mathematical terms it is the angle between the x-axis and a line connecting the projection of a point to the xy to the origin.

Physics convention

The physics convention is also the ISO 80000-2 convention. The three coordinates are

(radial distance to origin, polar angle, azimuthal angle)

and are conventionally denoted

(r, θ, ϕ).

The point (r, θ, ϕ) corresponds to

(r sin θ cos ϕ, r sin θ sin ϕ, r cos θ)

in Cartesian coordinates.

The coordinate system is right-handed: ϕ increases in the counterclockwise direction when looking down on the xy plane.

The polar angle typically runs from 0 to π and the azimuthal angle typically runs from 0 to 2π.

Math convention

In this convention the three coordinates are

(radial distance to origin, azimuthal angle, polar angle)

and are denoted

(ρ, θ, ϕ).

When compared to the physics notation we have something of a Greek letter paradox. Both conventions agree that the second coordinate should be written θ and the third coordinate should be written ϕ, but the two conventions mean opposite things by the two symbols.

The advantage of using θ to denote the azimuthal angle is that then θ has the same meaning as it has in polar coordinates. Similarly, the advantage of using ρ rather than r is to preserve the meaning of r in polar and cylindrical coordinates: ρ is the distance in 3 dimensions from a point to the origin, and r is the distance in 2 dimensions from the projection of a point to the xy plane and the origin.

Sometimes you’ll see r rather than ρ for the first coordinate.

The point (ρ, θ, ϕ) corresponds to

(ρ cos θ sin ϕ, ρ sin θ sin ϕ, ρ cos ϕ).

The math convention uses the same orientation and typical range coordinate ranges as the physics convention.

Geography and geoscience

Geographic coordinate systems are complicated by the fact that the earth is not perfectly spherical. Geographic latitude and geocentric latitude are not exactly the same as the following highly exaggerated diagram shows.

However, for this post we will assume a perfectly spherical earth in which geographic and geocentric latitudes are the same.

A point on the earth’s surface is described by two numbers, latitude and longitude. This corresponds to spherical coordinates in which the first coordinate is implicitly the earth’s radius. You’ll see latitude and longitude listed in either order.

The polar angle is sometimes called colatitude because latitude is the complementary angle of latitude. Longitude corresponds to azimuth. There are multiple symbols used for latitude, though λ is common for longitude. If you see one coordinate denoted λ, the other is latitude.

The safest, most explicit notation is to use the words longitude and latitude, not depending on symbols or order. Longitude is implicitly longitude east of the Prime Meridian, but to be absolutely clear, put an E at the end. Latitude typically ranges from −π/2 to π/2 (−90° to 90°). Longitude either runs from 0 to 2π or from −π to π (−180° to 180°).

To convert to Cartesian coordinates, set polar angle to π/2 − latitude, azimuth to longitude (east), and use the formulas above for either the physics or mathematics notation. If you have elevation, this is the radial distance above the earth’s surface, so ρ is the earth’s radius plus elevation.

Related posts