You can use quaternions to describe rotations and quaternion products to carry out these rotations. These products have the form

*q**pq**

where *q* represents a rotation, *q** is its conjugate, and *p* is the the vector being rotated. (That’s leaving out some details that we’ll get to shortly.)

The primary advantage of using quaternions to represent rotations is that it avoids gimbal lock, a kind of coordinate singularity.

## Saving multiplications

If you multiply quaternions directly from the definitions, the product takes 16 real multiplications. (See the function `mult`

in the Python code below.) So a naive implication of the product above, with two quaternion multiplications, would require 32 real multiplications.

You can do something analogous to Gauss’s trick for complex multiplication to do each quaternion product with fewer multiplications [1, 2], but there’s an even better way. It’s possible to compute the rotation in 15 multiplications [3].

## Details

Before we can say what the faster algorithm is, we have to fill in some details we left out at the top of the post. We’re rotating a vector in 3 dimensions using quaternions that have 4 dimensions. We do that by embedding our vector in the quaternions, carrying out the product above, and then pulling the rotated vector out.

Specifically, let

*v* = (*v*_{1}, *v*_{2}, *v*_{3})

be the vector we want to rotate. We turn *v* into a quaternion by defining

*p* = (o, *v*_{1}, v_{2}, v_{3}).

We encode our rotation in the unit quaternion

*q* = (*q*_{0}, *q*_{1}, *q*_{2}, *q*_{3})

where *q*_{0} = cos(θ/2) and (*q*_{1}, *q*_{2}, *q*_{3}) is the axis we want to rotate around and θ is the amount we rotate.

The rotated vector is the vector part of

*q**pq**

i.e. the vector formed by chopping off the first component of the quaternion product.

## Algorithm

Let

*t* = 2 (*q*_{1}, *q*_{2}, *q*_{3}) × (*v*_{1}, *v*_{2}, *v*_{3}) = (*t*_{1}, *t*_{2}, *t*_{3})

where × is cross product. Then the rotated vector is

*v*‘ = (*v*_{1}, *v*_{2}, *v*_{3}) + *q*_{0}(*t*_{1}, *t*_{2}, *t*_{3}) + (*q*_{1}, *q*_{2}, *q*_{3}) × (*t*_{1}, *t*_{2}, *t*_{3}).

The algorithm involves two cross products, with 6 real multiplications each, and one scalar-vector product requiring 3 real multiplications, for a total of 15 multiplications. (I’m not counting the multiplication by 2 as a multiplications because it could be done by an addition or by a bit-shift.)

## Code

Here’s some Python code to carry out the naive product and the faster algorithm.

import numpy as np
def mult(x, y):
return np.array([
x[0]*y[0] - x[1]*y[1] - x[2]*y[2] - x[3]*y[3],
x[0]*y[1] + x[1]*y[0] + x[2]*y[3] - x[3]*y[2],
x[0]*y[2] - x[1]*y[3] + x[2]*y[0] + x[3]*y[1],
x[0]*y[3] + x[1]*y[2] - x[2]*y[1] + x[3]*y[0]
])
def cross(x, y):
return np.array( (x[1]*y[2] - x[2]*y[1], x[2]*y[0] - x[0]*y[2], x[0]*y[1] - x[1]*y[0]) )
def slow_rot(q, v):
qbar = -q
qbar[0] = q[0]
return mult(mult(q, v), qbar)[1:]
def fast_rot(q, v):
t = 2*cross(q[1:], v[1:])
return v[1:] + q[0]*t + cross(q[1:], t)

And here’s some test code.

def random_quaternion():
x = np.random.normal(size=4)
return x
for n in range(10):
v = random_quaternion()
v[0] = 0
q = random_quaternion()
q /= np.linalg.norm(q)
vprime_slow = slow_rot(q, v)
vprime_fast = fast_rot(q, v)
print(vprime_fast - vprime_slow)

This prints 10 vectors that are essentially zero, indicating that `vprime_slow`

and `vprime_fast`

produced the same result.

It’s possible to compute the rotation in less time than two general quaternion products because we had three things we could exploit.

- The quaternions
*q* and *q** are very similar.
- The first component of
*p* is zero.
- We don’t need the first component of
*qpq**, only the last three components.

## Related posts

[1] I know you can multiply quaternions using 12 real multiplications, and I suspect you can get it down to 9. See this post.

[2] Note that I said “fewer multiplications” rather than “faster.” Gauss’ method eliminates a multiplication at the expense of 3 additions. Whether the re-arranged calculation is faster depends on the ratio of time it takes for a multiply and an add, which depends on hardware.

However, the rotation algorithm given here reduces multiplications and additions, and so it should be faster on most hardware.

[3] I found this algorithm here. The author cites two links, both of which are dead. I imagine the algorithm is well known in computer graphics.