Rational Trigonometry

Rational trigonometry is a very different way of looking at geometry. At its core are two key ideas. First, instead of distance, do all your calculations in terms of quadrance, which is distance squared. Second, instead of using angles to measure the separation between lines, use spread, which turns out to be the square of the sine of the angle [1].

What’s the point of these two changes? Quadrance and spread can be computed purely algebraically in terms of point coordinates. No square roots, no sines, and no cosines.

The word “rational” in rational trigonometry enters in two ways. The spread of two lines is a rational function of their coordinate descriptions, i.e. a ratio of polynomials. Also, if the coordinates of a set of points are rational numbers, so are the quadrances and spreads.

I became interested in rational trigonometry when I was working on a project to formally verify drone collision avoidance software. Because all calculations are purely algebraic, involving no transcendental functions, theorems are easier to prove.

Rational trigonometry is such a departure from the conventional approach (trig without trig functions!) that you have to ask whether it has advantages rather than simply being novel for the sake of being novel. I mentioned one advantage above, namely easier formal theorem proving, but that is a fairly esoteric application.

Another advantage is accuracy. Because a computer can carry out rational arithmetic exactly, rational trigonometry applied to rational points yields exact results. No need to be concerned about the complications of floating point numbers.
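
Here is a minimal sketch in Python of what exact computation looks like, assuming the usual coordinate formula for quadrance and the cross-product form of the spread of two lines given by direction vectors. The function names are my own, not from any library.

    from fractions import Fraction

    def quadrance(p, q):
        # Squared distance between points p and q: no square root needed
        return (p[0] - q[0])**2 + (p[1] - q[1])**2

    def spread(u, v):
        # Spread of lines with direction vectors u and v:
        # (u x v)^2 / (Q(u) Q(v)), a rational function of the coordinates
        cross = u[0]*v[1] - u[1]*v[0]
        return Fraction(cross**2, (u[0]**2 + u[1]**2) * (v[0]**2 + v[1]**2))

    print(quadrance((0, 0), (3, 4)))   # 25, exactly
    print(spread((1, 0), (0, 1)))      # 1: perpendicular lines
    print(spread((1, 1), (1, 0)))      # 1/2, i.e. sin^2 of a 45 degree angle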

A less obvious advantage is that the theory of rational trigonometry is independent of the underlying field [2]. You can work over the rational numbers, but you don’t need to. You could work over real or complex numbers, or even finite fields. Because you don’t take square roots, you can work over fields that don’t necessarily have square roots. If you’re working with integers modulo a prime, half of your nonzero numbers have no square root and the other half have two square roots.

Why would you want to do geometry over finite fields? Finite fields are important in applications: error-correcting codes, cryptography, signal processing, combinatorics, etc. And by thinking of problems involving finite fields as geometry problems, you can carry your highly developed intuition for plane geometry into a less familiar setting. You can forget for a while that you’re working over a finite field and proceed as you ordinarily would if you were working over real numbers, then check your work to make sure you didn’t do anything that is only valid over real numbers.
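
To make that concrete, here is a hypothetical example of computing a spread over the integers mod 7. Division becomes multiplication by a modular inverse; this sketch assumes Python 3.8+ for the three-argument pow.

    def spread_mod_p(u, v, p):
        # Spread of direction vectors u and v over the field of integers mod p.
        # Assumes p is an odd prime and p does not divide either quadrance.
        cross = (u[0]*v[1] - u[1]*v[0]) % p
        denom = ((u[0]**2 + u[1]**2) * (v[0]**2 + v[1]**2)) % p
        return (cross**2 * pow(denom, -1, p)) % p

    # Over the rationals this spread would be 1/2; mod 7 the inverse of 2 is 4.
    print(spread_mod_p((1, 1), (1, 0), 7))   # 4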

[1] Expressing spread in terms of sine gives a correspondence between rational and classical trigonometry, but it is not a definition. Spread is defined as the ratio of certain quadrances. It is not necessary to first compute a sine, and indeed rational trigonometry extends to contexts where it is not possible to define a sine.

[2] Rational trig is not entirely independent of the field. It requires that the field not have characteristic 2, i.e. that 1 + 1 ≠ 0. This rules out finite fields of order 2^n.

Why determinants with columns of ones?

Geometric equations often involve a determinant with a column of 1s. For example, the equation of a line through two points:

\begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ \end{vmatrix} = 0

The equation of a circle through three points:

\begin{vmatrix} x^2 + y^2 & x & y & 1 \\ x_1^2 + y_1^2 & x_1 & y_1 & 1 \\ x_2^2 + y_2^2 & x_2 & y_2 & 1 \\ x_3^2 + y_3^2 & x_3 & y_3 & 1 \\ \end{vmatrix} = 0
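
Before moving on, here is a quick sketch checking the first two determinants with SymPy (assuming it is installed), using hypothetical sample points.

    from sympy import Matrix, symbols

    x, y = symbols("x y")

    # Line through (0, 1) and (2, 5)
    line = Matrix([[x, y, 1],
                   [0, 1, 1],
                   [2, 5, 1]]).det().expand()
    print(line)    # -4*x + 2*y - 2, i.e. y = 2x + 1

    # Circle through (0, 0), (2, 0), and (0, 2)
    circle = Matrix([[x**2 + y**2, x, y, 1],
                     [0,           0, 0, 1],
                     [4,           2, 0, 1],
                     [4,           0, 2, 1]]).det().expand()
    print(circle)  # 4*x**2 + 4*y**2 - 8*x - 8*y: the circle centered at (1, 1)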

The equation of a general conic section through five points:

\begin{vmatrix} x^2 & xy & y^2 & x & y & 1 \\ x_1^2 & x_1 y_1 & y_1^2 & x_1 & y_1 & 1 \\ x_2^2 & x_2 y_2 & y_2^2 & x_2 & y_2 & 1 \\ x_3^2 & x_3 y_3 & y_3^2 & x_3 & y_3 & 1 \\ x_4^2 & x_4 y_4 & y_4^2 & x_4 & y_4 & 1 \\ x_5^2 & x_5 y_5 & y_5^2 & x_5 & y_5 & 1 \\ \end{vmatrix} = 0

The equation of a Möbius (bilinear) transformation sending z₁, z₂, and z₃ to w₁, w₂, and w₃:

\begin{vmatrix} 
z & w & zw  & 1\\
z_1 & w_1 & z_1w_1 & 1 \\
z_2 & w_2 & z_2w_2 & 1 \\
z_3 & w_3 & z_3w_3 & 1 \\ 
\end{vmatrix} = 0

Why all the determinants and why all the 1s?

When you see a determinant equal to zero, you immediately think of matrix rows or columns being linearly dependent. But in the examples above it isn’t the Cartesian coordinates that are linearly dependent; it’s the projective coordinates.

The 1s need not be in the last column: you could permute the rows and columns any way you like and the determinant would still be zero. The 1s appear in the last column as a clue to where they came from: you can take Cartesian coordinates to projective coordinates by appending a 1 at the end.

This 1 is sort of a silent partner, and can be ignored much of the time. But the last projective coordinate is critical when it’s necessary to be rigorous about points at infinity. The examples above are interesting because they are an application of homogeneous coordinates when there’s no concern about points at infinity.

When a cubic or quartic has a double root

Thanks to the quadratic formula, it’s easy to tell whether a quadratic equation has a double root. The equation

ax^2 + bx + c = 0

has a double root if and only if the discriminant

b^2 - 4ac

is zero.

The discriminant of a cubic is much less well known, and the analogs for higher-order polynomials are all but unheard of. There is a discriminant for polynomials of every degree, though the complexity of the discriminant grows quickly with the degree of the polynomial.

This post will derive the discriminant of a cubic and a quartic.

Resultants

The resultant of a pair of polynomials is zero if and only if the two polynomials have a common root. Resultants have come up in a couple of previous posts about solving trigonometric equations.

A polynomial p(x) has a double root if p and its derivative p′ are both zero at the same point. The discriminant of p is the resultant of p and p′.

The resultant of two polynomials is the determinant of their Sylvester matrix. This matrix is easier to describe by example than by equation: you fill the matrix with shifted copies of the coefficients of both polynomials and fill in the gaps with zeros.

MathWorld gives the following Mathematica code for the Sylvester matrix of two inputs.

    SylvesterMatrix1[poly1_, poly2_,  var_] :=
      Function[{coeffs1, coeffs2}, With[
        {l1 = Length[coeffs1], l2 = Length[coeffs2]},
          Join[
            NestList[RotateRight, PadRight[coeffs1,
              l1 + l2 -  2], l2 - 2],
            NestList[RotateRight, PadRight[coeffs2,
              l1 + l2 - 2], l1 - 2]
          ]
        ]
      ][
        Reverse[CoefficientList[poly1, var]],
        Reverse[CoefficientList[poly2, var]]
    ]
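
For those who would rather work in Python, here is a hypothetical equivalent sketch using SymPy. It builds the matrix directly from coefficient lists given in descending order; the function name is my own.

    from sympy import zeros

    def sylvester(p_coeffs, q_coeffs):
        # Sylvester matrix of two polynomials from their coefficient lists
        # (descending powers): shifted copies of each list, zeros elsewhere.
        m, n = len(p_coeffs) - 1, len(q_coeffs) - 1   # degrees
        M = zeros(m + n, m + n)
        for i in range(n):                 # n shifted rows for p
            for j, coeff in enumerate(p_coeffs):
                M[i, i + j] = coeff
        for i in range(m):                 # m shifted rows for q
            for j, coeff in enumerate(q_coeffs):
                M[n + i, i + j] = coeff
        return M

Calling sylvester with the coefficient lists of a cubic and its derivative reproduces the 5×5 matrix in the next section, and its .det() gives the resultant.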

Cubic discriminant

If we apply this to the cubic polynomial

ax^3 + bx^2 + cx + d

we get the following matrix.

\left( \begin{array}{ccccc} a & b & c & d & 0 \\ 0 & a & b & c & d \\ 3 a & 2 b & c & 0 & 0 \\ 0 & 3 a & 2 b & c & 0 \\ 0 & 0 & 3 a & 2 b & c \\ \end{array} \right)

We can compute the resultant by taking the determinant of the above matrix.

    g[x_] := a x^3 + b x^2 + c x + d
    Det[SylvesterMatrix1[g[x], D[g[x], x], x]]

We get the following result

-a b^2 c^2 + 4 a^2 c^3 + 4 a b^3 d - 18 a^2 b c d + 27 a^3 d^2

and we can verify that this is the same result we would get by calling the built-in Resultant function directly:

    Resultant[g[x], D[g[x], x], x]

Although the resultant is defined in terms of a determinant, that doesn’t mean that resultants are necessarily computed by computing determinants. The Sylvester matrix is a very special matrix, and there are clever ways to exploit its structure to create more efficient algorithms.

Each term in the resultant has a factor of a, and the discriminant is the resultant divided by −a.
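
If you want to double-check this relationship in Python, SymPy exposes both resultant and discriminant functions. A minimal sketch, assuming SymPy is available:

    from sympy import diff, discriminant, resultant, symbols

    a, b, c, d, x = symbols("a b c d x")
    p = a*x**3 + b*x**2 + c*x + d

    res = resultant(p, diff(p, x), x)

    # discriminant = resultant / (-a), so res + a*discriminant expands to zero
    assert (res + a*discriminant(p, x)).expand() == 0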

Quartic discriminant

Now let’s repeat our exercise for the quartic. The Sylvester matrix for the quartic polynomial

ax^4 + bx^3 + cx^2 + dx + e

and its derivative is

\left( \begin{array}{ccccccc} a & b & c & d & e & 0 & 0 \\ 0 & a & b & c & d & e & 0 \\ 0 & 0 & a & b & c & d & e \\ 4 a & 3 b & 2 c & d & 0 & 0 & 0 \\ 0 & 4 a & 3 b & 2 c & d & 0 & 0 \\ 0 & 0 & 4 a & 3 b & 2 c & d & 0 \\ 0 & 0 & 0 & 4 a & 3 b & 2 c & d \\ \end{array} \right)

I created the image above with the following Mathematica code.

    f[x_] := a x^4 + b x^3 + c x^2 + d x + e
    TeXForm[SylvesterMatrix1[f[x], D[f[x], x], x]]

If we take the determinant, we get the resultant, but it’s a mess.

256 a^4 e^3-192 a^3 b d e^2-128 a^3 c^2 e^2+144 a^3 c d^2 e \\ -27 a^3 d^4+144 a^2 b^2 c e^2-6 a^2 b^2 d^2 e-80 a^2 b c^2 d e \\ +18 a^2 b c d^3+16 a^2 c^4 e-4 a^2 c^3 d^2-27 a b^4 e^2\\ +18 a b^3 c d e -4 a b^3 d^3 -4 a b^2 c^3 e+a b^2 c^2 d^2

Again each term has a factor of a, so we can divide by a to get the discriminant.

If we want to use this in code, we can have Mathematica export the expression as C code using CForm. To generate Python code, it’s more convenient to use FortranForm, since Python, like Fortran, uses ** for exponents.

The following Python code was created by pasting the output of

    FortranForm[Resultant[f[x], D[f[x], x], x]]

and making it into a function.

    def quartic_resultant(a, b, c, d, e):
        return (a*b**2*c**2*d**2 - 4*a**2*c**3*d**2 - 4*a*b**3*d**3 + 18*a**2*b*c*d**3
                - 27*a**3*d**4 - 4*a*b**2*c**3*e + 16*a**2*c**4*e + 18*a*b**3*c*d*e
                - 80*a**2*b*c**2*d*e - 6*a**2*b**2*d**2*e + 144*a**3*c*d**2*e
                - 27*a*b**4*e**2 + 144*a**2*b**2*c*e**2 - 128*a**3*c**2*e**2
                - 192*a**3*b*d*e**2 + 256*a**4*e**3)

Let’s try this on a couple of examples. First

x^2(x-2)(x-3) = x^4 -5x^3 + 6x^2

which has a double root at 0.

As expected

    quartic_resultant(1, -5, 6, 0, 0)

returns 0.

Next let’s try

(x-1)(x-2)(x-3)(x-4) = x^4 -10x^3 +35x^2 - 50x + 24

The call

    quartic_resultant(1, -10, 35, -50, 24)

returns 144. We expected a non-zero result since our polynomial has distinct roots at 1, 2, 3, and 4.

Higher order discriminants

In general the discriminant of an nth degree polynomial is the resultant of the polynomial and its derivative, up to a constant. There’s no need to worry about this constant if your only concern is whether the discriminant is zero. To get the exact discriminant, you divide the resultant by the leading coefficient of the polynomial and adjust the sign.

The sign convention is a little strange. If you look back at the examples above, we divided by −a in the cubic case but by a in the quartic case. You might reasonably guess that you should divide by a and multiply by (-1)^n, but that won’t give you the right sign for the quadratic case. The conventional sign is

(-1)^{n(n-1)/2}

So when n equals 2 or 3 we get a negative sign, but when n equals 4 we don’t.
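
Putting the pieces together, here is a sketch of a general discriminant function using SymPy’s resultant and the sign convention above. The function name is my own, not from any library.

    from sympy import Poly, diff, resultant, symbols

    def poly_discriminant(p, x):
        # (-1)^(n(n-1)/2) * Res(p, p') / leading coefficient
        n = Poly(p, x).degree()
        sign = (-1)**(n*(n - 1)//2)
        return sign * resultant(p, diff(p, x), x) / Poly(p, x).LC()

    x = symbols("x")
    print(poly_discriminant(x**2 + 3*x + 2, x))   # 1, since b^2 - 4ac = 9 - 8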

Finding where two quadratic curves intersect

Suppose you have two quadratic polynomials in two real variables, f(x, y) and g(x, y), and you want to know whether the two curves

f(x, y) = 0

and

g(x, y) = 0

intersect, and if they do, find where they intersect.

David Eberly has a set of notes on solving systems of polynomial equations that give the conditions you need. The conditions are quite complicated, but they are explicit and do not involve any advanced math. He gives 14 equations that combine to form the coefficients in a 4th degree polynomial that will tell you whether the curves intersect, and will get you started on finding the points of intersection.

Algebraic geometry

There is a more elegant solution, but it requires much more background. It requires working in the complex projective plane ℙ² rather than the real plane ℝ². What does that mean? First, it means you work with complex numbers rather than real numbers. Second, it means adding “points at infinity.” This sounds awfully hand-wavy, but it can all be made rigorous using homogeneous coordinates with three complex numbers.

If you haven’t seen this before, you may be thinking “You want me to go from pairs of real numbers to triples of complex numbers? And this is supposed to make things easier?!” Yes, it actually does make things easier.

Once you move to ℙ², the question of whether the curves f(x, y) = 0 and g(x, y) = 0 intersect becomes trivial: yes, they intersect. Always. A special case of Bézout’s theorem says that two quadratic curves are either the same curve or they intersect in exactly 4 points, counting intersections with multiplicity.

Your original question was whether the curves intersect in ℝ², and now that question reduces to asking whether the 4 points guaranteed by Bézout’s theorem lie in ℝ². They might not be real, or they might be points at infinity.

Even if you go with David Eberly’s more elementary approach, you still face a similar question. You have a fourth-degree polynomial, and you need to know whether it has real roots. How would you do that? You’d probably find all four complex roots first and see whether any are in fact real. Moving from the real numbers to the complex numbers makes it easier to find the roots of polynomials, and moving to complex projective space carries this idea further.
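
For instance, here is a quick numerical sketch using NumPy, with the quartic from the previous post and an arbitrarily chosen tolerance:

    import numpy as np

    # (x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24
    coeffs = [1, -10, 35, -50, 24]
    roots = np.roots(coeffs)

    # Keep the roots whose imaginary part is negligible
    real_roots = roots[np.abs(roots.imag) < 1e-9].real
    print(real_roots)   # approximately [4. 3. 2. 1.]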

In the real plane ℝ² the zero set of a quadratic polynomial can be an ellipse, a parabola, or a hyperbola. (The circle is a special case of an ellipse.) But in ℙ² there is only one kind of conic section: there is a change of variables that turns the zero set of any quadratic polynomial into a circle. So now we’ve reduced the problem to finding where two circles intersect.

Finding the intersection of two conic sections illustrates the principles discussed in the preface to Algebraic Geometry: A Problem Solving Approach.

One way of doing mathematics is to ask bold questions about the concepts you are interested in studying. Usually this leads to fairly complicated answers having many special cases. An important advantage of this approach is that the questions are natural and easy to understand. A disadvantage is that proofs are hard and often involve clever tricks, the origins of which are very hard to see.

A second approach is to spend time carefully defining the basic terms, with the aim that the eventual theorems and proofs are straightforward. Here the difficulty is in understanding how the definitions, which often initially seem somewhat arbitrary, ever came to be. The payoff is that the deep theorems are more natural, their insights more accessible, and the theory made more aesthetically pleasing.


Solving systems of polynomial equations

In a high school algebra class, you learn how to solve polynomial equations in one variable, and systems of linear equations. You might reasonably ask “So when do we combine these and learn to solve systems of polynomial equations?” The answer would be “Maybe years from now, but most likely never.” There are systematic ways to solve systems of polynomial equations, but you’re unlikely to ever see them unless you study algebraic geometry.

Here’s an example from [1]. Suppose you want to find the extreme values of

x^3 + 2xyz - z^2

on the unit sphere using Lagrange multipliers. This leads to the following system of polynomial equations where λ is the Lagrange multiplier.

\begin{eqnarray*} 3x^2 + 2yz - 2x\lambda &=& 0 \\ 2xz - 2y\lambda &=& 0 \\ 2xy - 2z - 2z\lambda &=& 0 \\ x^2 + y^2 + z^2 - 1 &=& 0 \end{eqnarray*}

There’s no obvious way to go about solving this system of equations. However, there is a systematic way to approach this problem using a “lexicographic Gröbner basis.” This transforms the problem into something that looks much worse but is actually easier to work with. Most importantly, the transformation is algorithmic: it requires some computation, and there are numerous software packages for doing it, but it doesn’t require a flash of insight.
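
Computing such a basis is a one-liner in several systems. Here is a sketch using SymPy’s groebner function with lexicographic order and λ ordered first; it should reproduce the system displayed below, up to the order in which the polynomials are listed.

    from sympy import groebner, symbols

    # "lam" stands in for the Lagrange multiplier, since lambda is a
    # reserved word in Python
    x, y, z, lam = symbols("x y z lambda")

    eqs = [3*x**2 + 2*y*z - 2*x*lam,
           2*x*z - 2*y*lam,
           2*x*y - 2*z - 2*z*lam,
           x**2 + y**2 + z**2 - 1]

    # Lexicographic Groebner basis with lambda > x > y > z
    G = groebner(eqs, lam, x, y, z, order="lex")
    for g in G.exprs:
        print(g)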

The transformed system looks intimidating compared to the original:

\begin{eqnarray*} \lambda - \frac{3}{2}x - \frac{3}{2}yz - \frac{167616}{3835}z^6 - \frac{36717}{590}z^4 - \frac{134419}{7670}z^2 &=& 0 \\ x^2 + y^2 + z^2 - 1 &=& 0 \\ xy - \frac{19584}{3835}z^5 + \frac{1999}{295}z^3 - \frac{6403}{3835}z &=& 0 \\ xz + yz^2 - \frac{1152}{3835}z^5 - \frac{108}{295}z^3 + \frac{2556}{3835}z &=& 0 \\ y^3 + yz^2 - y -\frac{9216}{3835}z^5 + \frac{906}{295}z^3 - \frac{2562}{3835}z &=& 0 \\ y^2z - \frac{6912}{3835}z^5 -\frac{827}{295}z^3 -\frac{3839}{3835}z &=& 0 \\ yz^3 - yz - \frac{576}{59}z^6 + \frac{1605}{118}z^4 - \frac{453}{118}z^2 &=& 0 \\ z^7 - \frac{1763}{1152}z^5 + \frac{655}{1152}z^3 - \frac{11}{288}z &=& 0 \end{eqnarray*}

We’ve gone from four equations to eight, from small integer coefficients to large fractional coefficients, from squares to seventh powers. And yet we’ve made progress because the four variables are less entangled in the new system.

The last equation involves only z and factors nicely:

z(z^2 - 1)\left(z^2 - \frac{4}{9}\right) \left(z^2 - \frac{11}{128}\right) = 0

This cracks the problem wide open. We can easily find all the possible values of z, and once we substitute values for z, the rest of the equations simplify greatly and can be solved easily.
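
For example, SymPy can read the candidate values of z straight off the factorization. A small sketch; note that sqrt(22)/16 is the same number as sqrt(11/128).

    from sympy import Rational, solve, symbols

    z = symbols("z")
    p = z*(z**2 - 1)*(z**2 - Rational(4, 9))*(z**2 - Rational(11, 128))
    print(solve(p, z))   # 0, ±1, ±2/3, ±sqrt(22)/16 (order may vary)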

The key is that Gröbner bases transform our problem into a form that, although it appears more difficult, is easier to work with because the variables are somewhat separated. Solving for one variable, z, is like pulling out a thread that makes the rest of the threads easier to separate.

Related: The great reformulation of algebraic geometry

* * *

[1] David Cox et al. Applications of Computational Algebraic Geometry. American Mathematical Society Short Course, January 6–7, 1997, San Diego, California. Proceedings of Symposia in Applied Mathematics.

The great reformulation of algebraic geometry

“Tate helped shape the great reformulation of arithmetic and geometry which has taken place since the 1950’s.” — Andrew Wiles

At the Heidelberg Laureate Forum I had a chance to interview John Tate. In his remarks below, Tate briefly comments on his early work on number theory and cohomology. Most of the post consists of his comments on the work of Alexander Grothendieck.

* * *

JT: My first significant work after my thesis was to determine the cohomology groups of class field theory. The creators of the theory, including my thesis advisor Emil Artin, didn’t think in terms of cohomology, but their work could be interpreted as finding the cohomology groups H⁰, H¹, and H².

I was invited to give a series of three talks at MIT on class field theory. I’d been at a party, and I came home and thought about what I’d talk about. And I got this great idea: I realized I could say what all the higher groups are. In a sense it was a disappointing answer, though it didn’t occur to me then, that there’s nothing new in them; they were determined by the great work that had already been done. For that I got the Cole prize in number theory.

Later when I gave a talk on this work people would say “This is number theory?!” because it was all about cohomology groups.

JC: Can you explain what the great reformulation was that Andrew Wiles spoke of? Was it this greater emphasis on cohomology?

JT: Well, in the class field theory situation it would have been. And there I played a relatively minor part. The big reformulation of algebraic geometry was done by Grothendieck, the theory of schemes. That was really such a great thing, that unified number theory and algebraic geometry. Before Grothendieck, going between characteristic 0, finite characteristic 2, 3, etc. was a mess.

Grothendieck’s system just gave the right framework. We now speak of arithmetic algebraic geometry, which means studying problems in number theory by using your geometric intuition. The perfect background for that is the theory of schemes. ….

Grothendieck’s ideas [about sheaves] were so simple. People had looked at such things in particular cases: Dedekind rings, Noetherian rings, Krull rings, …. Grothendieck said take any ring. … He just had an instinct for the right degree of generality. Some people make things too general, and they’re not of any use. But he just had an instinct to put whatever theory he thought about in the most general setting that was still useful. Not generalization for generalization’s sake but the right generalization. He was unbelievable.

He started schemes about the time I got serious about algebraic geometry, as opposed to number theory. But the algebraic geometers classically had affine varieties, projective varieties, … It seemed kinda weird to me. But with schemes you had a category, and that immediately appealed to me. In the classical algebraic geometry there are all these birational maps, or rational maps, and they’re not defined everywhere because they have singularities. All of that was cleared up immediately from the outset with schemes. ….

There’s a classical algebraic geometer at Harvard, Joe Harris, who works mostly over the complex numbers. I asked him whether Grothendieck made much of a difference in the classical case — I knew for number theorists he had made a tremendous difference — and Joe Harris said yes indeed. It was a revolution for classical algebraic geometry too.

Related: Applied number theory