Solving systems of polynomial equations

In a high school algebra class, you learn how to solve polynomial equations in one variable, and systems of linear equations. You might reasonably ask “So when do we combine these and learn to solve systems of polynomial equations?” The answer would be “Maybe years from now, but most likely never.” There are systematic ways to solve systems of polynomial equations, but you’re unlikely to ever see them unless you study algebraic geometry.

Here’s an example from [1]. Suppose you want to find the extreme values of

x^3 + 2xyz - z^2

on the unit sphere using Lagrange multipliers. This leads to the following system of polynomial equations, where λ is the Lagrange multiplier.

\begin{eqnarray*}
3x^2 + 2yz - 2x\lambda &=& 0 \\
2xz - 2y\lambda &=& 0 \\
2xy - 2z - 2z\lambda &=& 0 \\
x^2 + y^2 + z^2 - 1 &=& 0
\end{eqnarray*}

There’s no obvious way to go about solving this system of equations. However, there is a systematic way to approach the problem using a “lexicographic Gröbner basis.” This transforms the problem into something that looks much worse but is actually easier to work with. Most importantly, the transformation is algorithmic: it requires some computation (there are numerous software packages for doing this) but no flash of insight.
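Here, for instance, is a minimal sketch of the computation in SymPy, one of several packages that can compute Gröbner bases. The lexicographic order λ > x > y > z is the choice that produces the basis shown below.

```python
# Minimal sketch: a lexicographic Groebner basis via SymPy.
# The ordering lam > x > y > z is chosen so that the last
# basis element involves z alone.
from sympy import symbols, groebner

x, y, z, lam = symbols('x y z lam')

# The Lagrange multiplier system for x^3 + 2xyz - z^2 on the unit sphere
polys = [
    3*x**2 + 2*y*z - 2*x*lam,   # partial derivative in x
    2*x*z - 2*y*lam,            # partial derivative in y
    2*x*y - 2*z - 2*z*lam,      # partial derivative in z
    x**2 + y**2 + z**2 - 1,     # the sphere constraint
]

G = groebner(polys, lam, x, y, z, order='lex')
for g in G:
    print(g)
```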

The transformed system looks intimidating compared to the original:

\begin{eqnarray*}
\lambda - \frac{3}{2}x - \frac{3}{2}yz - \frac{167616}{3835}z^6 - \frac{36717}{590}z^4 - \frac{134419}{7670}z^2 &=& 0 \\
x^2 + y^2 + z^2 - 1 &=& 0 \\
xy - \frac{19584}{3835}z^5 + \frac{1999}{295}z^3 - \frac{6403}{3835}z &=& 0 \\
xz + yz^2 - \frac{1152}{3835}z^5 - \frac{108}{295}z^3 + \frac{2556}{3835}z &=& 0 \\
y^3 + yz^2 - y - \frac{9216}{3835}z^5 + \frac{906}{295}z^3 - \frac{2562}{3835}z &=& 0 \\
y^2z - \frac{6912}{3835}z^5 - \frac{827}{295}z^3 - \frac{3839}{3835}z &=& 0 \\
yz^3 - yz - \frac{576}{59}z^6 + \frac{1605}{118}z^4 - \frac{453}{118}z^2 &=& 0 \\
z^7 - \frac{1763}{1152}z^5 + \frac{655}{1152}z^3 - \frac{11}{288}z &=& 0
\end{eqnarray*}

We’ve gone from four equations to eight, from small integer coefficients to large fraction coefficients, from squares to seventh powers. And yet we’ve made progress because the four variables are less entangled in the new system.

The last equation involves only z and factors nicely:

z(z^2 - 1)\left(z^2 - \frac{4}{9}\right) \left(z^2 - \frac{11}{128}\right) = 0

This cracks the problem wide open. We can find all the possible values of z, and once we substitute a value for z, the rest of the equations simplify greatly and are straightforward to solve.
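Continuing the SymPy sketch from above, here is one hedged way to finish: solve the z-only polynomial, then back-substitute each root and solve what remains.

```python
from sympy import solve

# The last element of the lex basis is the polynomial in z alone.
z_values = solve(list(G)[-1], z)

# Back-substitute each root; the remaining basis elements collapse
# to simple conditions on lam, x, and y.
solutions = []
for zv in z_values:
    for s in solve([g.subs(z, zv) for g in G], [lam, x, y], dict=True):
        s[z] = zv
        solutions.append(s)

print(len(solutions), "critical points (with multipliers) found")
```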

The key is that Gröbner bases transform our problem into a form that, although it appears more difficult, is easier to work with because the variables are somewhat separated. Solving one variable, z, is like pulling out a thread that then makes the rest of the threads easier to separate.

Related: The great reformulation of algebraic geometry

* * *

[1] David Cox et al. Applications of Computational Algebraic Geometry. Proceedings of Symposia in Applied Mathematics. American Mathematical Society Short Course, January 6–7, 1997, San Diego, California.

9 thoughts on “Solving systems of polynomial equations”

  1. So *that’s* what it was!

    In the late ’80’s and early ’90’s I wrote software for R&D instrumentation, where my main job was algorithm extraction from scientists and their FORTRAN, then porting/re-implementing it in C running on bare metal.

    One algorithm required solving a system of polynomials that I couldn’t handle in the minimal C domain, as the original relied on a special FORTRAN library from a US national lab. One of the scientists transformed it into a basis like the one above and simplified it, and after recovering from my initial disbelief that it was equivalent to the original, I was off and running.

    That was about 30 years ago, and I’m amazed I recognized it after all this time.

    Better late than never, I suppose…

  2. Sets of equations like this arise often in engineering contexts. When I was in college, we learned to solve them using an iterative method that is so simple it’s a wonder it even works.

    Basically, given a set of n algebraic equations in n unknowns, you solve each equation for a single variable, so that you end up with a transformed set of equations x1 = f1(x2,x3,…,xn); x2 = f2(x1,x3,…,xn); etc.

    You begin with an initial guess for the value of each variable (usually easy to do, since there are physical constraints on most problems) and use the equations to calculate new values for each variable; iterate this way until the values converge to within some predefined tolerance. A sketch of this iteration appears below.

    There are other approximate methods as well, such as Gauss–Seidel.
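    Here is a minimal sketch of that successive-substitution iteration; the two-equation system in it is a made-up illustration, not one from the post.

    ```python
    # Successive substitution for a nonlinear system: each equation has
    # been solved for one variable, and we iterate x_{k+1} = f(x_k).
    import math

    def successive_substitution(fs, x0, tol=1e-10, max_iter=1000):
        """Iterate until successive iterates agree to within tol."""
        x = list(x0)
        for _ in range(max_iter):
            x_new = [f(x) for f in fs]
            if all(abs(a - b) < tol for a, b in zip(x, x_new)):
                return x_new
            x = x_new
        raise RuntimeError("did not converge")

    # Toy system: x1 = cos(x2), x2 = sin(x1)/2, starting from (0, 0).
    fs = [lambda v: math.cos(v[1]), lambda v: math.sin(v[0]) / 2]
    print(successive_substitution(fs, [0.0, 0.0]))
    ```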

  3. One advantage of algebraic methods over numerical methods is that the former can tell you how many solutions there are. And of course it doesn’t have to be either-or.

    Maybe algebraic methods tell you how many solutions there are and give you some idea where they are, and then you compute them numerically. Maybe Gröbner bases partially untangle your equations, but you still need to solve the transformed equations numerically.

  4. Nice. But it’s interesting to append to the Gröbner basis each of the 4 factors of the z polynomial in turn and then apply the algorithm to that set. The answers are very simple. For example, appending 128z^2 - 11 yields the basis {128z^2 - 11, y + 3z, 3 + 8x, 8λ - 1}, as sketched below.
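    Here is a quick SymPy sketch of this suggestion, reusing the symbols from the post. SymPy normalizes coefficients over the rationals, so its output may differ from the basis above by scaling.

    ```python
    from sympy import symbols, groebner

    x, y, z, lam = symbols('x y z lam')

    # The original system plus one factor of the z polynomial appended
    polys = [
        3*x**2 + 2*y*z - 2*x*lam,
        2*x*z - 2*y*lam,
        2*x*y - 2*z - 2*z*lam,
        x**2 + y**2 + z**2 - 1,
        128*z**2 - 11,          # the appended factor
    ]
    print(groebner(polys, lam, x, y, z, order='lex'))
    # Expect the equivalent of {128z^2 - 11, y + 3z, 3 + 8x, 8*lam - 1}.
    ```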

  5. Extremely interesting! I love your writing. I can see a programming analogy in your last statement:
    “Solving one variable, z, is like pulling out a thread that then makes the rest of the threads easier to separate.”

    It’s similar to how I think of isolating concerns so it’s easier to deal with one thing at a time when keeping a program in my head.

  6. Can anyone suggest a book for self-studying the Gröbner basis method for solving systems of polynomials?
