Multiplying by quaternions on the left and right

The map that takes a quaternion x to the quaternion qx is linear, so it can be represented as multiplication by a matrix. The same is true of the map that takes x to xq, but the two matrices are not the same because quaternion multiplication does not commute.

Let qa + bi + cjdk and let qM be the matrix that represents multiplication on the left by q. Then

_qM = \begin{bmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \\ \end{bmatrix}

Now let Mq be the matrix that represents multiplication on the right by q. Then

M_q = \begin{bmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \\ \end{bmatrix}

Can prove both matrix representations are correct by showing that they do the right thing when q = 1, ij, and k. The rest follows by linearity.

You might speculate that the matrix representation for multiplying on the right by q might be the transpose of the matrix representation for multiplying on the left by q. You can look at the matrices above and see that’s not the case.

In this post I talk about how to represent rotations with quaterions, and in this post I give an equation for the equivalent rotation matrix for a rotation described by a quaternion. You can prove that the matrix representation is correct by multiplying out qM and Mq* . Keep in mind that q in that case is a unit quaterion, so the sum of the squares of its components add to 1.

Related posts

Embeddings, Projections, and Inverses

I just revised a post from a week ago about rotations. The revision makes explicit the process of embedding a 3D vector into the quaternions, then pulling it back out.

The 3D vector is embedded in the quaternions by making it the vector part of a quaternion with zero real part:

(p1p2p3)  → (0, p1p2p3)

and the quaternion is returned to 3D by cutting off the real part:

(p0, p1p2p3)  → (p1p2p3).

To give names to the the process of moving to and from quaterions, we have an embedding E : ℝ3 to ℝ4 and a projection P from ℝ4 to ℝ3.

We can represent E as a 4 × 3 matrix

E = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}

and P by a 3 × 4 matrix

P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}

We’d like to say E and P are inverses. Surely P undoes what E does, so they’re inverses in that sense. But E cannot undo P because you lose information projecting from ℝ4 to ℝ3 and E cannot recover information that was lost.

The rest of this post will look at three generalizations of inverses and how E and P relate to each.

Left and right inverse

Neither matrix is invertible, but PE equals the identity matrix on ℝ3 , and so P is a left inverse of E and E is a right inverse of P.

PE = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}

On the other hand, EP is not an identity matrix, and so E is not a left inverse of E, and neither is P a right inverse of E.

EP = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}

Adjoint

P is the transpose of E, which means it is the adjoint of E. Adjoints are another way to generalize the idea of an inverse. More on that here.

Pseudo-inverse

The Moore-Penrose pseudo-inverse acts a lot like an inverse, which is somewhat uncanny because all matrices have a pseudo-inverse, even rectangular matrices.

Pseudo-inverses are symmetric, i.e. if A+ is the pseudo-inverse of A, then A is the pseudo-inverse of A+.

Given an m by n matrix A, the Moore-Penrose pseudoinverse A+ is the unique n by m matrix satisfying four conditions:

  1. A A+ A = A
  2. A+ A A+ = A+
  3. (A A+)* = A A+
  4. (A+ A)* = A+ A

To show that A+ = P we have to establish

  1. EPE = E
  2. PEP = A+
  3. (EP)* = EP
  4. (PE)* = PE

We calculated EP and PE above, and both are real and symmetric, so properties 3 and 4 hold.

We can also compute

 EPE = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} = E

and

PEP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} =P

showing that properties 1 and 2 hold as well.

Related posts

Alternative exp and log notation

The other day I stumbled on an article [1] that advocated writing ab as ab and loga(b) as ab.

\begin{align*} a &\uparrow b \equiv a^b  \\ a &\downarrow b \equiv \log_a b \end{align*}

This is a special case of Knuth’s up arrow and down arrow notation. Knuth introduces his arrows with the intention of repeating them to represent hyperexponentiation and iterated logarithms. But the emphasis in [1] is more on the pedagogical advantages of using a single up or down arrow.

Advantages

One advantage is that the notation is more symmetric. Exponents and logs are inverses of each other, and up and down arrows are visual inverses of each other.

Another advantage is that the down arrow notation makes the base of the logarithm more prominent, which is sometimes useful.

Finally, the up and down arrow notation is more typographically linear:  ab and ab stay within a line, whereas ab and loga(b) extend above and below the line. LaTeX handles subscripts and superscripts well, but HTML doesn’t. That’s one reason I usually write exp(x) rather than ex here.

Comparison

Here are the basic properties of logs and exponents using conventional notation.

\begin{align} a^b = c &\iff \log_a c = b \\ \log_b 1 &= 0 \\ \log_b b &= 1 \\ \log_b(b^x) &= x \\ b^{\log_b x} &= x \\ \log_b xy &= \log_b x + \log_b y \\ \log_b \frac{x}{y} &= \log_b x - \log_by \\ a^{\log_b c} &= c^{\log_b a} \\ \log_a b^c &= c (\log_a b) \\ (\log_b a) (\log_a x) &= \log_b x \end{align}

Here are the same properties using up and down arrow notation.

\begin{align} a \uparrow b = c &\iff a \downarrow c = b \\ b \downarrow 1 &= 0 \\ b \downarrow b &= 1 \\ b \downarrow (b \uparrow x) &= x \\ b \uparrow (b \downarrow x) &= x \\ b \downarrow xy &= b \downarrow x + b \downarrow y \\ b \downarrow \frac{x}{y} &= b \downarrow x - b \downarrow y \\ a \uparrow (b \downarrow c) &= c \uparrow (b \downarrow a ) \\ a \downarrow (b \uparrow c) &= c (a \downarrow b) \\ (b \downarrow a) (a \downarrow x) &= b \downarrow x \end{align}

Related posts

[1] Margaret Brown. Some Thoughts on the Use of Computer Symbols in Mathematics. The Mathematical Gazette, Vol. 58, No. 404 (Jun., 1974), pp. 78-79

Decimal Separator and Internationalization

This morning I ran across the following tip from Joost Helberg on Mastodon:

TIL don’t report numbers with three digits after the decimal point. People may interpret the decimal point as a thousands separator. Using 2 or 4 digits, although wrong, avoids off by a factor thousand errors.

I usually report four decimal places, but I hadn’t thought about that in relation to the decimal separator problem.

In software development, it’s best to let a library handle numeric input and output, using local conventions. Failure to do so can lead to problems, as I’ll never forget.

Embarrassed in Bordeaux

In 2006, Peter Thall and I gave a week-long course on Bayesian clinical trial design in Bordeaux, France. Part of the course was presenting software that my team at MD Anderson Cancer Center had developed.

A few minutes into my first presentation I realized the software wasn’t working for the course attendees. The problem had to do with the US and France using opposite conventions for decimal separator and thousands separator. I had tested our software on a French version of Windows, but I had entered integers in my testing and decimals in my presentation.

I apologized and asked my French audience to enter decimals in the American style, such as 3.14 rather than 3,14. But that didn’t work either!

We were using a Windows API for parsing input, which correctly handles input and output per local conventions. But we had written custom validation code. We checked that the input fields contained only valid numeric characters, i.e. digits and periods. Oops!

Users were between a rock and hard place. The input validation would not accept French notation, and the parsing code would not accept American notation.

The solution was for the attendees to set their operating system locale to the US. They were gracious about having to apply the hack and said that it was a common problem. It was a humiliating way to start the course, but the rest of the week went well.

A crowded little chess puzzle

Here’s a puzzle by Martin Garnder [1].

Can a queen, king, rook, bishop, and knight be placed on a 4² board so no piece attacks another?

There are two solutions, plus symmetries.

Note that in all non-attacking chess puzzles, the colors of the pieces are irrelevant. In the solutions I chose the piece colors to be the opposite of the square colors strictly for aesthetic reasons.

More chess posts

More Martin Gardner posts

[1] Martin Gardner. Some New Results on Nonattacking Chess Tasks. Math Horizons. February 2001, pp 10–12.

Formulating eight queens as a SAT problem

The Boolean satisfiability problem is to determine whether there is a way to assign values to variables in a set of Boolean formulas to make the formulas hold [1]. If there is a solution, the next task would be to enumerate the solutions.

You can solve the famous eight queens problem, or its generalization to n-queens, by formulating the problem as a Boolean formula then using a SAT solver to find the solutions.

It’s pretty obvious how to start. A chessboard is an 8 by 8 grid, so you have a variable for each square on the board, representing whether or not that square holds a queen. Call the variables bij where i and j run from 1 to 8.

The requirement that every row contains exactly one queen can be turned into two subrequirements:

  1. Each row contains at least one queen.
  2. Each row contains at most one queen.

The first requirement is easy. For each row i, we have a clause

bi1bi2bi3 ∨ … ∨ bi8

The second requirement is harder. How do you express in terms of our boolean variables that there is no more than one queen in each row? This is the key difficulty. If we can solve this problem, then we’re essentially done. We just need to do the analogous thing for columns and diagonals. (We don’t require a queen on every diagonal, but we require that there be at most one queen on every diagonal.)

First approach

There are two ways to encode the requirement that every row contain at most one queen. The first is to use implication. If there’s a queen in the first column, then there is not a queen in the remaining columns. If there’s a queen in the second column, then there is not a queen in all but the second column, etc. We have an implication for each row in each column. Let’s just look at the first row and first column.

b11 ⇒ ¬ (b12b13 ∨ … ∨ b18)

We can turn an implication of the form a ⇒ b into the clause ¬a ∨ b.

Second approach

The second way to encode the requirement that every row contain at most one queen is to say that for every pair of squares in a row (ab) either a has no queen or b has no queen. So for the first row we would have 8C2 = 28 clauses because there are 28 ways to choose pairs from a set of 8 things.

b11 ∨ ¬b12) ∧ (¬b11 ∨ ¬b13) ∧ … ∧ (¬b17 ∨ ¬b18)

An advantage of this approach is that it directly puts the problem into conjunctive normal form (CNF). That is, our formula is a conjunction of terms that contain only disjunctions, an AND of ORs.

Related posts

[1] You’ll see the SAT problem described as finding the solution to a Boolean formula. If you have multiple formulas, then the first holds, and the second, etc. So you can AND them all together to make one formula.

Special solutions to the eight queens problem

There are 92 ways to place eight queens on a chessboard so that no queen is attacking any other. These fall into 12 equivalence classes. The 92 solutions are all rotations and reflections of these 12 basic solutions.

If you think about the previous numbers a minute, you might wonder why the total number of solutions is not a multiple of 12. There is one particular solution that is more symmetric than the others. So the total number of solutions 92 breaks down into

8 × 11 + 4 × 1.

To illustrate this, let’s look at two fundamental solutions: one that is particularly ordered and one that is particularly disordered in a sense that we’ll get to later on.

Most basic solutions, like the one below, are part of an equivalence class of eight solutions.

You can rotate the board 90° or reflect it about the middle[1], or some combination of both [2]. This amounts to eight possibilities.

This solution, however, is more symmetric.

You can rotate the basic solution, but flipping it over does not create a new solution: a flip produces the same result as two 90° rotations.

It’s curious that there is only one highly symmetric solution to the eight queens problem. When I first saw the problem as a child I expected all the solutions to be highly symmetric. That may be why I wasn’t able to find a solution.

Among the 11 basic solutions that are less ordered, the one shown above is uniquely disordered in the following sense: no three queens lie on a straight line.

The eight queens problem is a problem about restricted straight lines. It says no two queens lie on the same rank, file, or diagonal. But if we look at all straight lines, then of course there is a line through any two queens. In the most orderly solution, every queen is on a straight line with two others. In the least orderly solution, no queen is on a straight line with two others.

In 1900 Henry Dudenay introduced the no-three-in-line problem, looking at ways to place points on a lattice such that no line goes through three points, with no restriction on the slope of the lines. So one family of solutions to the eight queens problem is also a solution to the no-three-in-line problem.

Related posts

[1] It doesn’t matter whether you flip about the horizontal or vertical axis.

[2] In fancy terminology, the action of the dihedral group D8 applied to a solution yields another solution.

The non-attacking bishops problem

How many bishops can you place on a chessboard so that no bishop is attacking any other bishop?

For a standard 8 × 8 chessboard the answer is 14. In general, for an n × n chessboard the answer is 2n − 2.

Here’s one way to place the maximum number of non-attacking bishops.

To see that the bishops cannot attack each other, I think it’s helpful to imagine extending the chessboard so that each bishop attacks the same number of squares. Then we can see that they miss each other.

Related posts

Sorting Roman numerals

This morning I wrote about the frequencies of names for popes and kings. This involved sorting strings with Roman numerals since it’s common for popes and kings to have Roman numerals after their names.

Something that surprised me was that sorting Roman numerals alphabetically roughly sorts them in numerical order, especially for small numbers. It’s not perfect. For example, IX comes before V in alphabetical order.

Everyone who has done much work with data will have run into the problem of a column of numbers being sorted alphabetically rather than numerically. For example, “10” comes between “1” and “2” even though 10 comes after 1 and 2.

So you can’t sort numerals, Roman or Arabic, as strings and expect them to appear in numerical order. But Roman numbers come close when you’re sorting small numbers, such as I through XXIII for popes named John or I through VIII for kings of England named Henry.

To illustrate this, I plotted how well string sort order correlates with numeric order for Roman and Arabic numbers, for the sequence 1 … n for increasing values of n. I measured correlation using Spearman’s rank-order correlation. I tried Kendall’s tau and as well and got similar results.

Alphabetical order and numerical order for Roman numerals agree pretty well up to XXXVIII, with just a few numbers out of place, namely IX, XIX, and XXIX. But alphabetical order and numerical order diverge quite a bit for Arabic numerals when all the numbers between 10 and 19 come before 2.

As you go further out, alphabetical order and numerical order diverge for both writing systems, but especially for Roman numerals.