John Conway and mental exercise rituals

Drawing of John Conway with horned sphere

John Horton Conway (1937–2020) came up with an algorithm in 1973 for mentally calculating what day of the week a date falls on. His method, which he called the “Doomsday rule,” starts from the observation that every year, the dates 4/4, 6/6, 8/8, 10/10, 12/12, 5/9, 9/5, 7/11, and 11/7 fall on the same day of the week [1], what Conway called the “doomsday” of that year. That’s Monday this year.

Once you know the doomsday for a year you can bootstrap your way to finding the day of the week for any date that year. Finding the doomsday is a little complicated.
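As a quick sanity check, here’s a minimal Mathematica sketch verifying that the anchor dates all land on the same weekday; I’m assuming the year in question is 2022, a year whose doomsday is Monday.

    anchors = {{4, 4}, {6, 6}, {8, 8}, {10, 10}, {12, 12},
               {5, 9}, {9, 5}, {7, 11}, {11, 7}};
    (* day of the week of each anchor date in the assumed year 2022 *)
    DayName[DateObject[{2022, #[[1]], #[[2]]}]] & /@ anchors
    (* a list of nine Mondays *)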

Conway had his computer set up so that it would quiz him with random dates every time he logged in.

Mental exercises

Recently I’ve been thinking about mental exercise rituals, similar to Conway having his computer quiz him on dates. Some people play video games or solve Rubik’s cubes or recite poetry.

Curiously, what some people do for a mental warm-up others do for a mental cool-down, such as mentally reviewing something they’ve memorized as a way to fall asleep.

What are some mental exercise rituals that you’ve done or heard of other people doing? Please leave examples in the comments.

More on Conway

The drawing at the top of the page is a sketch of Conway by Simon Frazer. The strange thing coming out of Conway’s head in the sketch is Alexander’s horned sphere, a famous example from topology. Despite appearances, the boundary of Alexander’s horned sphere is topologically equivalent to a sphere.

Conway was a free-range mathematician, working in a wide variety of areas, ranging from the classification of finite simple groups to his popular Game of Life. Of the 26 sporadic groups, three are named after Conway.

Here are some more posts where I’ve written about John Conway or cited something he wrote.

[1] Because this list of dates is symmetric in month and day, the rule works on both sides of the Atlantic.

Curiously simple approximations

As I’ve written about here and elsewhere, the following simple approximations are fairly accurate.

log10 x ≈ (x-1)/(x+1)

loge x ≈ 2 (x – 1)/(x + 1)

log2 x ≈ 3(x – 1)/(x + 1)

It’s a little surprising that each is as accurate as it is, but it’s also surprising that the approximations for loge and log2 are small integer multiples of the approximation for log10.

Logarithms in all bases are proportional: If b and β are two bases, then for all x,

logβ(x) = logb(x) / logb(β).

So it’s not surprising that the approximations above are proportional. But the proportionality constants are not what the equation above would predict.

There are a couple things going on. First, these approximations are intended for rough mental calculations, so round numbers are desirable. Second, we want to minimize the error over some range. The approximation for logb(x) needs to work over the interval from 1/√b to √b. The approximations don’t work outside this range, but they don’t need to since you can always reduce the problem to computing logs in this interval. These two goals work together.

We could make the log10 x approximation more accurate near 1 if we multiplied it by 0.9. That would make the approximation a little harder to use, but it wouldn’t improve the accuracy over the interval 1/√10 to √10. The constant 1 works just as well as 0.9, and it’s easier to multiply by 1.

Here’s a plot of the approximation errors. Notice that different bases are plotted over different ranges since the log for each base b needs to work from 1/√b to √b.
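Here’s a minimal Mathematica sketch of that kind of error plot; the helper err and the plotting choices are mine, not necessarily what was used for the original figure.

    (* error of the approximation k (x-1)/(x+1) to log base b,
       plotted only over the interval 1/Sqrt[b] <= x <= Sqrt[b] *)
    err[b_, k_][x_] := k (x - 1)/(x + 1) - Log[b, x]
    Show[
        Plot[err[10, 1][x], {x, 1/Sqrt[10], Sqrt[10]}, PlotStyle -> Blue],
        Plot[err[E, 2][x], {x, 1/Sqrt[E], Sqrt[E]}, PlotStyle -> Orange],
        Plot[err[2, 3][x], {x, 1/Sqrt[2], Sqrt[2]}, PlotStyle -> Green],
        PlotRange -> All]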


Calculating where projective lines intersect

A couple days ago I wrote about homogeneous coordinates and projective planes. I said that the lines y = 5 and y = 6 intersect in a point “at infinity.”

In projective geometry any two distinct lines intersect in exactly one point, and you can compute that intersection point the same way, whether the intersection is at a finite point or an infinite point [1].

Definitions and notation

In my post on projective duality I explained how lines are defined. A line is simply a set of three coordinates, not all zero, just like a point. We used parentheses for triples that represent points, and square brackets for triples that represent lines. All variables come from a field F; you can think of F as the real numbers but everything here works in general fields.

As we noted in that post, a point (a, b, c) is defined to be on the line [x, y, z] if

ax + by + cz = 0.

Points and lines are equivalence classes of triples; multiplying all three components by the same non-zero value gives an equivalent point or equivalent line.

When we want to think of a triple as a vector in a vector space, not a point in a projective plane and not an equivalence class, we use double parentheses. So (a, b, c) would be a representative of a point in the projective plane, and ((a, b, c)) is a vector in R³, or more generally F³ for a field F.

Note that point (a, b, c) is on the line [x, y, z] if the dot product of the vectors ((a, b, c)) and ((x, y, z)) is zero.

Calculating intersections

Two lines [a, b, c] and [d, e, f] intersect at a point (g, h, i) where ((g, h, i)) is the cross product of ((a, b, c)) and ((d, e, f)).

So to find the intersection of two lines, we take representations of each line, reinterpret the three coordinates as components of a vector rather than a line in projective space, take the cross product of these two vectors, then reinterpret the components as homogeneous coordinates of a point. It’s interesting that this algorithm requires interpreting triples three different ways.

Note that using different representatives of the same lines just results in a different representative of the same intersection point, so the intersection calculation gives a well-defined result.

Next we’ll give two examples: one where the lines intersect at a finite point and one where they intersect at a point at infinity.

Finite example

Consider the lines

y = 3x + 2

and

y = x + 4

The lines intersect at the point where x = 1 and y = 5 in the ordinary plane. If we embed each line in the projective plane and take the intersection there, we get the embedding of the point (1, 5) in the projective plane. Recall that we associate the point (x, y) in the ordinary plane with (x, y, 1) in the projective plane.

The line y = 3x + 2 becomes the line [3, -1, 2] in the projective plane because if

y = 3x + 2

then the dot product of ((x, y, 1)) with ((3, -1, 2)) is zero.

Similarly, the line y = x + 4 becomes the line [1, -1, 4] in the projective plane.

We have the cross product

((3, -1, 2)) × ((1, -1, 4)) = ((-2, -10, -2))

which is associated with the point (-2, -10, -2), which is in the same equivalence class as (1, 5, 1), which is the embedding of the point (1, 5) in the projective plane.
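As a quick check, here’s the same cross product in Mathematica:

    Cross[{3, -1, 2}, {1, -1, 4}]
    (* {-2, -10, -2}, equivalent to (1, 5, 1) after dividing by -2 *)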

Infinite example

Now consider the parallel lines

y = 5

and

y = 6

as in previous posts.

These lines have projective representations [0, -1, 5] and [0, -1, 6].

We take the cross product to find

((0, -1, 5)) × ((0, -1, 6)) = ((-1, 0, 0)).

Now we said in this post that the two lines intersected at (1, 0, 0). But (-1, 0, 0) is equivalent to (1, 0, 0) in homogeneous coordinates, so we get the same result using a cross product that we got before using more direct reasoning.
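And the same check for the parallel lines:

    Cross[{0, -1, 5}, {0, -1, 6}]
    (* {-1, 0, 0}, a point at infinity because the third coordinate is 0 *)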

The important thing to note is that we didn’t have to do anything special for lines that intersect “at infinity.” We execute the same algorithm in any case. If we didn’t realize our lines were parallel (in the ordinary plane: there are no parallel lines in the projective plane) we could turn the crank and then notice a 0 in the third coordinate when we’re done. Then we’d know that our intersection point is at infinity, and so the original lines must have been parallel (when considered as lines in the ordinary plane).

[1] Distinguishing some points as infinite and some as finite is an artifact of how we constructed the projective plane and how we think of an ordinary plane sitting inside. There’s nothing special about “points at infinity” other than the way we think of them.

Random Blaschke products and Mathematica binding

A Blaschke product is a function that is the product of Blaschke factors, functions of the form

b(z; a) = \frac{|a|}{a} \frac{a - z}{1 - \bar{a}z}

where the complex number a lies inside the unit circle and \bar{a} is the complex conjugate of a.

I wanted to plot Blaschke products with random values of a using Mathematica, and I ran into a couple items of interest.

First, although Mathematica has a function RandomComplex for returning complex numbers chosen by a pseudorandom number generator, this function selects complex numbers uniformly over a rectangle and I wanted to select uniformly over a disk. This is easy enough to get around. I wrote my own function by selecting a magnitude and phase at random.

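    (* Sqrt of a uniform variable gives a radius that makes points uniform by area over the disk *)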
    rc[] := Sqrt[RandomReal[]] Exp[-2 Pi I RandomReal[]]

Now if I reuse the definition of a Blaschke factor from an earlier post

    b[z_, a_] := (Abs[a]/a) (a - z)/(1 - Conjugate[a] z)

I can define a product of two random Blaschke factors as follows.

    f[z_] := b[z, rc[]]  b[z, rc[]]

However, this may not do what you expect. If you plot the function twice, you’ll get different results! It’s a matter of binding order. At what point in the process are the two random values of a chosen and fixed? The answer is some time between the definition of f and executing the plotting function. If the plotting function generated new values of a every time it needed to evaluate f the result would be a hot mess.

On the other hand, if we leave out the colon then the function f behaves as expected.

    f[z_] = b[z, rc[]]  b[z, rc[]]

Now random values are generated by each call to rc, and these values are frozen in the definition of f.

Here’s a Blaschke product with 10 random parameters.

    f[z_] = Product[b[z, rc[]], {i, 1, 10}]

The code

    ComplexPlot[f[z], {z, -1 - I, 1 + I}]

produces the following plot.

If I plot this function again, say changing the plot range, I’ll plot the same function. But if I execute the line of code defining f again, and call ComplexPlot again, I’ll generate a new function.

Incidentally, if I plot this same function using ComplexPlot3D I get the following.

Why does it look like a bowl? As explained in the post on Blaschke factors, each factor is zero at its parameter a and has a pole at the reflection of a in the unit circle.

The function plotted above has zeros inside the unit disk near the boundary. That means it also has poles outside the unit disk near the boundary on the other side.

Projective duality

The previous post explained how to define a projective plane over a field F. Now let’s look at how we do geometry in a projective plane.

Definitions

We have a definition of points from the other post: a point is a triple (a, b, c) of elements of F, with not all elements equal to zero, and two points (a, b, c) and (a′, b′, c′) are defined to be equivalent if there is a λ ≠ 0 such that

(a, b, c) = (λa′, λb′, λc′).

OK, so that defines points. How do we define lines? A line is a triple [x, y, z] of elements of F, with not all elements equal to zero, and two lines [x, y, z] and [x′, y′, z′] are defined to be equivalent if there is a λ ≠ 0 such that

[x, y, z] = [λx′, λy′, λz′].

Now you might object that the definitions of point and line are identical. Not at all: one uses parentheses and one uses brackets! :)

A point (a, b, c) is defined to be on the line [x, y, z] if

ax + by + cz = 0.

The definition is symmetric in the way it treats points and lines. A point (a, b, c) lies on the line [x, y, z] if and only if the point (x, y, z) lies on the line [a, b, c].

Points and lines are interchangeable in the sense that if you completely reverse your notion of points and lines all at once, you’d get the same geometry. That doesn’t mean you can ignore the difference between a point and a line; you can’t just swap some points and some lines.

Relation to vector spaces

Points and lines in a projective plane over a field F are defined as equivalence classes. Let’s use double parentheses to denote elements of the vector space F³. So ((a, b, c)) is a point in F³ and not an equivalence class.

The projective point (a, b, c) is on the projective line [x, y, z] if and only if the vector ((a, b, c)) is perpendicular to the vector ((x, y, z)). Points and lines in a projective plane are analogous to vectors and dual vectors (or covectors as a physicist might say).

What about equivalence classes? If you multiply a vector by a scalar, you multiply its inner product with another vector by the same scalar, so the notion of when a point belongs to a line is well defined.

The duality between projective points and lines is analogous to the duality between vectors and planes in F³, as long as vectors are based at the origin and planes go through the origin. You can define a plane as the set of points orthogonal to a vector, or define a vector as the normal to a plane.

Synthetic definition

You can define projective planes without using fields. You can define a projective plane as a set of points P, a set of lines L, and a set of incidence rules satisfying three axioms:

  1. Any two distinct points are incident with exactly one line.
  2. Any two distinct lines are incident with exactly one point.
  3. There exist four points such that no three of these points are collinear.

Classical finite projective planes are those isomorphic to a projective plane over a finite field, but there are non-classical possibilities. However, so far all non-classical examples have the same number of points as a classical example.

The Fano plane

The Fano plane is a set of seven points and seven lines as illustrated below.

The projective “points” are the black dots and projective “lines” are the Euclidean lines and the circle in the middle.

By duality, you could reverse these definitions, calling the black dots “lines” and the Euclidean lines and the circle the “points.”

Either way, any two points determine exactly one line, and two lines intersect at exactly one point.

It’s easy to see that the Fano plane satisfies the synthetic definition of a projective plane. It’s also true, but not obvious, that the Fano plane is isomorphic to the projective plane over the field with two elements, the unique finite field of that size.
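Here’s a small Mathematica sketch of that projective plane over the field with two elements; the names points, lines, and incident are mine. Over this field each equivalence class has a single representative, so the seven nonzero 0–1 triples are the seven points, the same triples serve as the seven lines, and each line contains exactly three points.

    points = DeleteCases[Tuples[{0, 1}, 3], {0, 0, 0}];  (* the 7 nonzero triples *)
    lines = points;                                      (* by duality, lines look the same *)
    incident[p_, l_] := Mod[p.l, 2] == 0
    Table[Count[points, q_ /; incident[q, l]], {l, lines}]
    (* {3, 3, 3, 3, 3, 3, 3} *)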

Incidentally, there’s a way to multiply octonions using a Fano plane. See footnote [2] here. If you label the points with the seven basis elements (besides 1) the right way, and turn the Fano plane into a directed graph, then you can encode the multiplication rules so that the product of two elements is the third element on the line the two elements determine. Direction matters because the order of multiplication matters with octonions.

Finite projective planes

Given a field F, finite or infinite, you can construct a projective plane over F by starting with pairs of elements of F and adding “points at infinity,” one point for each direction.

Motivation: Bézout’s theorem

A few days ago I mentioned Bézout’s theorem as an example of a simple theorem that rests on complex definitions. Bézout (1730–1783) stated that in general a curve of degree n and a curve of degree m intersect in nm points. There are a lot of special cases excluded by the phrase “in general” that go away when you state Bézout’s theorem in a sophisticated context.

To make Bézout’s theorem rigorous you have to work over an algebraically closed field (e.g. the complex numbers rather than the real numbers), you have to count intersections with multiplicity, and you have to add points at infinity, i.e. you have to work in a projective plane.

Motivation: Elliptic curves

Bézout’s theorem involves the infinite field ℂ, but this post is about finite projective planes. Elliptic curves over finite fields provide a motivating example of working in finite projective planes.

This blog is served over HTTPS and so serving up its pages involves public key cryptography. And depending on what protocol your browser negotiates with my server, you may be using elliptic curve cryptography to view this page.

An elliptic curve over a finite field is not an ellipse and not a curve, at least not a curve in the sense of a continuum of points. Elliptic curves over ℝ really are curves in the usual sense, but the definition is then abstracted in a way that extends to any field, including finite fields.

Homogeneous coordinates

In order to make this idea of “points at infinity” rigorous, we have to introduce a new coordinate system. We now describe points by (equivalence classes of) triples of field elements rather than pairs of field elements. The benefit of this added complexity is that points at infinity can be handled perfectly rigorously. Often the third coordinate can be ignored, and you pay attention to it only when you need to be careful.

In homogeneous coordinates, we consider two points (x, y, z) and (x′, y′, z′) equivalent if there is a λ ≠ 0 such that

(x, y, z) = (λx′, λy′, λz′).

Also, we require that not all three coordinates are 0, i.e. (0, 0, 0) is not an element of the projective plane.

We associate a point (x, y) with the triple (x, y, 1) and all its equivalents, i.e. all triples of the form (λx, λy, λ) with λ ≠ 0.

Points at infinity

A point at infinity is simply a point with third coordinate 0.

In the post mentioning Bézout’s theorem I said in passing that the lines y = 5 and y = 6 meet at infinity. Here’s how we can make this rigorous. The line y = 5 in the finite plane is the set of points of the form (x, 5), which embed in the projective plane as

(x, 5, 1)

and these points are equivalent to the set of points

(1, 5/x, 1/x)

Now take the limit as x → ∞ and we get (1, 0, 0). We consider this additional point to be part of the line y = 5 when the line is considered part of the projective plane. You can see that the line y = 6 contains the same point. So we can be very specific about where the lines y = 5 and y = 6 intersect: they intersect at (1, 0, 0).
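In Mathematica the limit can be checked coordinate by coordinate:

    Limit[#, x -> Infinity] & /@ {1, 5/x, 1/x}
    (* {1, 0, 0} *)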

You can go through a similar exercise to show that the lines y = x and y = x – 57 also intersect “at infinity” but at a different point at infinity, namely at (1, 1, 0) and its equivalents. This also shows that although the lines y = 5 and y = x both go off to infinity, they “reach” infinity at different points. The former goes to the point at infinity associated with horizontal lines and the latter goes to the point at infinity associated with 45 degree lines.

Note also that parallel lines meet at one point at infinity, not two. You might want to say, for example, that y = 5 and y = 6 meet twice, once at positive infinity and once at negative infinity. You could construct a system that works that way, but that’s not how projective planes work. Since non-zero multiples of a point are equivalent, (1, 0, 0) and (-1, 0, 0) are in the same equivalence class.

See this post for practical examples of when you might choose to have one or two points at infinity depending on your application.

Finite projective planes

We can construct projective planes containing a finite number of points by repeating the construction above using a finite field F. Finding finite projective planes not isomorphic to one constructed this way is hard [1].

Let F be a finite field with q elements. Then q is necessarily either a prime or a power of a prime, though that fact isn’t needed here.

The points of the finite projective plane over F are the points of the finite plane over F, i.e. pairs of elements of F, and the additional “points at infinity.” We can say exactly what these points at infinity are and count them.

A point (x, y) is embedded in the projective plane as (x, y, 1) and all points in its equivalence class. So for starters we have at least q² points in the finite projective plane, one for each pair (x, y).

First consider the points at infinity (x, y, 0) with x ≠ 0. Then (x, y, 0) is equivalent to (1, y/x, 0), and so without loss of generality we can assume x = 1 (the multiplicative identity of the field). So for each y, (1, y, 0) is a point at infinity. That gives us q more elements of the finite projective plane.

Next consider the case x = 0. Then (0, y, 0) is another point at infinity as long as y ≠ 0. It’s only one point, because all non-zero multiples of (0, 1, 0) are in the same equivalence class.

So altogether we have q² + q + 1 points, represented by the following elements of their equivalence classes (see the sketch after the list):

  • (x, y, 1) for all x, y in F,
  • (1, y, 0) for all y, and
  • (0, 1, 0).
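Here’s a minimal Mathematica sketch of this count for a small prime; the names p and pts are mine, and I take q to be a prime so the field is just the integers mod p.

    p = 7;  (* field size; any prime works this simply *)
    pts = Join[
        Flatten[Table[{x, y, 1}, {x, 0, p - 1}, {y, 0, p - 1}], 1],
        Table[{1, y, 0}, {y, 0, p - 1}],
        {{0, 1, 0}}];
    Length[pts] == p^2 + p + 1
    (* True *)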

Related posts

[1] There are four non-isomorphic finite projective planes of order 9, i.e. planes with 9² + 9 + 1 = 91 points. And there are other examples of finite projective planes not isomorphic to planes constructed as outlined here. However, so far all such planes have the same number of points as a finite projective plane constructed as above.

Fixed points of bilinear transformations

Introduction

I was puzzled the first time I saw bilinear transformations, also known as Möbius transformations. I was in a class where everything had been abstract and general, and suddenly things got very concrete and specific. I wondered why we had changed gears, and I wondered how there could be much to say about something so simple.

A bilinear transformation f has the form

f(z) = \frac{az + b}{cz + d}

where ad – bc ≠ 0.

The answer to my questions, which I did not realize at the time, was that bilinear transformations come up very often in applications, if not directly then indirectly as useful tools. For example, the electrical engineer’s Smith chart is a bilinear transformation. Also, most of the mental math tricks given here amount to bilinear approximations. And the Blaschke factors I wrote about yesterday are bilinear transformations.

Fixed points

For this post I want to look at fixed points of bilinear transformations, solutions to f(z) = z where f is as above. This amounts to solving a quadratic equation.

The locations of the fixed points are

\frac{a - d \pm \sqrt{\Delta}}{2c}

where

\Delta = (a + d)^2 - 4(ad-bc)

The trace of the transformation is a + d, with the coefficients normalized so that ad – bc = 1, and the trace classifies the transformation into one of three categories: parabolic, elliptic, or loxodromic.

The trace could be any number in the complex plane, and all values of the trace correspond to the loxodromic case except when the trace is a real number in the interval [-2, 2].

In the loxodromic case there are two distinct fixed points: one attractive and one repulsive.

Example

In this example I chose a = 1, b = i, c = 2, and d = 3. The two fixed points are at 0.136 + 0.393i and -1.136 – 0.393i.

I chose 200 starting points at random in the unit square and iterated the bilinear function. This produced the following graph.

So apparently 0.136 + 0.393i is the attracting fixed point.

Next I started from 200 random points in a tight neighborhood of -1.136 – 0.393i, all within 0.00001 of the fixed point. This produced the following graph.

If you start exactly on the repelling fixed point, you’ll stay there forever. But if you start a tiny distance away from this point you’ll end up at the attracting fixed point.
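Here’s a rough Mathematica sketch of the experiment, using the same parameters a = 1, b = i, c = 2, d = 3; the names f and fixedPoints are mine.

    f[z_] := (z + I)/(2 z + 3)
    (* fixed points from the quadratic formula above, with a = 1, b = I, c = 2, d = 3 *)
    fixedPoints = N[(1 - 3 + {1, -1} Sqrt[(1 + 3)^2 - 4 (1*3 - 2 I)])/(2*2)]
    (* {0.136 + 0.393 I, -1.136 - 0.393 I}, approximately *)
    (* iterating from a random point in the unit square approaches the attracting fixed point *)
    Nest[f, RandomReal[] + I RandomReal[], 50]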


Partitioning complexity

This post looks at how to partition complexity between definitions and theorems, and why it’s useful to be able to partition things more than one way.

Quadratic equations

Imagine the following dialog in an algebra class.

“Quadratic equations always have two roots.”

“But what about (x – 5)² = 0? That just has one root, x = 5.”

“Well, the 5 counts twice.”

Bézout’s theorem

Here’s a more advanced variation on the same theme.

“A curve of degree m and a curve of degree n intersect in mn places. That’s Bézout’s theorem.”

“What about the parabola y = (x – 5)² and the line y = 0? They intersect at one point, not two points.”

“The point of intersection has multiplicity two.”

“That sounds familiar. I think we talked about that before.”

“What about the parabola y = x² + 1 and the line y = 0? They don’t intersect at all.”

“You have to look at complex numbers. They intersect at x = i and x = –i.”

“Oh, OK. But what about the line y = 5 and the line y = 6? They don’t intersect, even for complex numbers.”

“They intersect at the point at infinity.”

In order to make the statement of Bézout’s theorem simple you have to establish a context that depends on complex definitions. Technically, you have to work in complex projective space.

Definitions and theorems

Michael Spivak says in the preface to his book Calculus on Manifolds

… the proof of [Stokes’] theorem is … an utter triviality. On the other hand, even the statement of this triviality cannot be understood without a horde of definitions … There are good reasons why the theorems should all be easy and the definitions hard.

There are good reasons, for the mathematician, to make the theorems easy and the definitions hard. But for students, there may be good reasons to do the opposite.

Here math faces a tension that programming languages (and spoken languages) face: how to strike a balance between the needs of novices and the needs of experts.

In my opinion, math should be taught bottom-up, starting with simple definitions and hard theorems, valuing transparency over elegance. Then, motivated by the complication of doing things the naive way, you go back and say “In light of what we now know, let’s go back and define things differently.”

It’s tempting to think you can save a lot of time by starting with the abstract final form of a theory rather than working up to it. While that’s logically true, it’s not pedagogically true. A few people with an unusually high abstraction tolerance can learn this way, accepting definitions without motivation or examples, but not many. And the people who do learn this way may have a hard time applying what they learn.

Applications

Application requires moving up and down levels of abstraction, generalizing and particularizing. And particularizing is harder than it sounds. This lesson was etched into my brain by an incident I relate here. Generalization can be formulaic, but recognizing specific instances of more general patterns often requires a flash of insight.

Spivak said there are good reasons why the theorems should all be easy and the definitions hard. But I’d add there are also good reasons to remember how things were formulated with hard theorems and easy definitions.

It’s good, for example, to understand analysis at a high level as in Spivak’s book, with all the machinery of differential forms etc. and also be able to swoop down and grind out a problem like a calculus student.

Going back to Bézout’s theorem, suppose you need to find the real solutions of a system of equations that amounts to finding where a quadratic and a cubic curve intersect. You have a concrete problem. Then you move up to the abstract setting of Bézout’s theorem and learn that there are at most six solutions. Then you go back down to the real world (literally, as in real numbers) and find two solutions. Are there any more solutions that you’ve overlooked? You zoom back up to the abstract world of Bézout’s theorem, and find four more by considering multiplicities, infinities, and complex solutions. Then you go back down to the real world, satisfied that you’ve found all the real solutions.

A pure mathematician might climb a mountain of abstraction and spend the rest of his career there, but applied mathematicians have to go up and down the mountain routinely.

Blaschke factors

Blaschke factors are complex functions with specified zeros inside the unit disk. Given a complex number a with |a| < 1, the Blaschke factor associated with a is the function

b(z; a) = \frac{|a|}{a} \frac{a-z}{1 -\bar{a}z}

Notice the semicolon in b(z; a). This is a convention that a few authors follow, and that I wish more would adopt. From a purely logical perspective the semicolon serves no purpose because the expression on the right is a function of two variables, z and a. But the semicolon conveys how we think about the function: we think of it as a function of z that depends on a parameter a.

So, for example, if we were to speak of the derivative of b, we would mean the derivative with respect to z. And we could say that b is an analytic function inside the unit disk. It is not an analytic function if we think of it as a function of a, because the conjugate of a appears in the denominator.

The function b(z; a) has a zero at a and a pole at the reciprocal of the conjugate of a. So, as I wrote about yesterday, this means that the zero and the pole are inversions of each other in the unit circle. If you punch a hole in the complex plane at the origin and turn the plane inside-out, without moving the unit circle, the zero and pole will swap places.

Here’s a plot of b(z; 0.3 + 0.3i).

Notice there’s a zero at 0.3 + 0.3i and a pole at

1/(0.3 – 0.3i) = 5/3 + 5/3i

which is the inversion of 0.3 + 0.3i in the unit circle. You can tell that one is a zero and the other is a pole because the colors rotate in opposite directions.

The plot above was made with the following Mathematica code.

    b[z_, a_] := (Abs[a]/a) (a - z)/(1 - Conjugate[a] z)
    ComplexPlot[b[z, .3 + .3 I], {z, -1.2 - 1.2 I, 2 + 2 I}, 
        ColorFunction -> "CyclicLogAbs"]

Because the zero and pole locations are inversions of each other, as the zero moves closer to the unit circle from the inside, the pole moves closer to the unit circle from the outside. Here’s a plot with a = 0.5 + 0.5i.

And the closer the zero gets to the origin, the further out the pole moves. Here’s a plot with a = 0.2 + 0.2i, this time on a larger scale, with the real and imaginary axes going up to 3 rather than 2.

Blaschke factors are the building blocks of Blaschke products, something I intend to write about in the future. By multiplying Blaschke factors together, you can create a function that is analytic in the unit disk with specified zeros and with other nice properties.

Elias Wegert [1] says “In some sense, Blaschke products can be considered to be ‘hyperbolic counterparts’ of polynomials.” That’s something I’d like to explore further.

Related posts

[1] Elias Wegert. Visual Complex Functions: An Introduction with Phase Portraits. Birkhäuser.

Inversion in a circle

Inversion in the unit circle is a way of turning the circle inside-out. Everything that was inside the circle goes outside the circle, and everything that was outside the circle comes in.

Not only is the disk turned inside-out, the same thing happens along each ray going out from the origin. Points on that ray that are inside the circle go outside and vice versa. In polar coordinates, the point (r, θ) goes to (1/r, θ).

Complex numbers

In terms of complex numbers, inversion in the unit circle amounts to taking the reciprocal and the conjugate (in either order, because these operations commute). This is the same as dividing a complex number by the square of its magnitude. Proof:

z \bar{z} = |z|^2 \implies \frac{1}{\bar{z}} = \frac{z}{|z|^2}

There are two ways to deal with the case z = 0. One is to exclude it, and the other is to say that it maps to the point at infinity. This can be made rigorous by working on the Riemann sphere rather than the complex plane. More on that here.

Inverting a hyperbola

The other day Alex Kontorovich pointed out on Twitter that “the perfect figure 8 (or infinity) is simply the inversion through a circle of a hyperbola.” We’ll demonstrate this with Mathematica.

What happens to a point on the hyperbola

x^2 - y^2 = 1

when you invert it through a circle? If we think of x and y as the real and imaginary parts of a complex number, the discussion above shows that the point (x, y) goes to the same point divided by its length squared.

(x, y) \mapsto \left( \frac{x}{x^2 + y^2}, \frac{y}{x^2 + y^2} \right)

Let (u, v) be the image of the point (x, y).

\begin{align*} u &= \frac{x}{x^2 + y^2} \\ v &= \frac{y}{x^2 + y^2} \end{align*}

Then we have

u^2 + v^2 = \frac{x^2 + y^2}{(x^2 + y^2)^2} = \frac{1}{x^2 + y^2}

and

u^2 - v^2 = \frac{x^2 - y^2}{(x^2 + y^2)^2} = \frac{1}{(x^2 + y^2)^2}

because x² – y² = 1. Notice that the latter is the square of the former, i.e.

u^2 - v^2 = (u^2 + v^2)^2
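Here’s a quick numerical spot check of that identity in Mathematica; the helper invert is mine.

    invert[{x_, y_}] := {x, y}/(x^2 + y^2)
    {u, v} = invert[{Cosh[1.3], Sinh[1.3]}];  (* a point on the hyperbola x^2 - y^2 = 1 *)
    Chop[u^2 - v^2 - (u^2 + v^2)^2]
    (* 0 *)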

Now we have everything to make our plot. The code

    ContourPlot[{
        x^2 + y^2 == 1,
        x^2 - y^2 == 1,
        x^2 - y^2 == (x^2 + y^2)^2},
        {x, -3, 3}, {y, -3, 3}]

produces the following plot.

The blue circle is the first contour, the orange hyperbola is the second contour, and the green figure eight is the third contour.

Related posts