Faster quaternion product rotations

You can use quaternions to describe rotations and quaternion products to carry out these rotations. These products have the form


where q represents a rotation, q* is its conjugate, and p is the the vector being rotated. (That’s leaving out some details that we’ll get to shortly.)

The primary advantage of using quaternions to represent rotations is that it avoids gimbal lock, a kind of coordinate singularity.

Saving multiplications

If you multiply quaternions directly from the definitions, the product takes 16 real multiplications. (See the function mult in the Python code below.) So a naive implication of the product above, with two quaternion multiplications, would require 32 real multiplications.

You can do something analogous to Gauss’s trick for complex multiplication to do each quaternion product with fewer multiplications [1, 2], but there’s an even better way. It’s possible to compute the rotation in 15 multiplications [3].


Before we can say what the faster algorithm is, we have to fill in some details we left out at the top of the post. We’re rotating a vector in 3 dimensions using quaternions that have 4 dimensions. We do that by embedding our vector in the quaternions, carrying out the product above, and then pulling the rotated vector out.

Specifically, let

v = (v1, v2, v3)

be the vector we want to rotate. We turn v into a quaternion by defining

p = (o, v1, v2, v3).

We encode our rotation in the unit quaternion

q = (q0, q1, q2, q3)

where q0 = cos(θ/2) and (q1, q2, q3) is the axis we want to rotate around and θ is the amount we rotate.

The rotated vector is the vector part of


i.e. the vector formed by chopping off the first component of the quaternion product.



t = 2 (q1, q2, q3) × (v1, v2, v3) = (t1, t2, t3)

where × is cross product. Then the rotated vector is

v‘ = (v1, v2, v3) + q0(t1, t2, t3) + (q1, q2, q3) × (t1, t2, t3).

The algorithm involves two cross products, with 6 real multiplications each, and one scalar-vector product requiring 3 real multiplications, for a total of 15 multiplications. (I’m not counting the multiplication by 2 as a multiplications because it could be done by an addition or by a bit-shift.)


Here’s some Python code to carry out the naive product and the faster algorithm.

    import numpy as np

    def mult(x, y):
        return np.array([
            x[0]*y[0] - x[1]*y[1] - x[2]*y[2] - x[3]*y[3],
            x[0]*y[1] + x[1]*y[0] + x[2]*y[3] - x[3]*y[2],
            x[0]*y[2] - x[1]*y[3] + x[2]*y[0] + x[3]*y[1],
            x[0]*y[3] + x[1]*y[2] - x[2]*y[1] + x[3]*y[0]

    def cross(x, y):
        return np.array( (x[1]*y[2] - x[2]*y[1], x[2]*y[0] - x[0]*y[2], x[0]*y[1] - x[1]*y[0]) )

    def slow_rot(q, v):
        qbar = -q
        qbar[0] = q[0]
        return mult(mult(q, v), qbar)[1:]

    def fast_rot(q, v):
        t = 2*cross(q[1:], v[1:])
        return v[1:] + q[0]*t + cross(q[1:], t)

And here’s some test code.

    def random_quaternion():
        x = np.random.normal(size=4)
        return x 

    for n in range(10):
        v = random_quaternion()
        v[0] = 0
        q = random_quaternion()
        q /= np.linalg.norm(q)
        vprime_slow = slow_rot(q, v)
        vprime_fast = fast_rot(q, v)
        print(vprime_fast - vprime_slow)

This prints 10 vectors that are essentially zero, indicating that vprime_slow and vprime_fast produced the same result.

It’s possible to compute the rotation in less time than two general quaternion products because we had three things we could exploit.

  1. The quaternions q and q* are very similar.
  2. The first component of p is zero.
  3. We don’t need the first component of qpq*, only the last three components.

Related posts

[1] I know you can multiply quaternions using 12 real multiplications, and I suspect you can get it down to 9. See this post.

[2] Note that I said “fewer multiplications” rather than “faster.” Gauss’ method eliminates a multiplication at the expense of 3 additions. Whether the re-arranged calculation is faster depends on the ratio of time it takes for a multiply and an add, which depends on hardware.

However, the rotation algorithm given here reduces multiplications and additions, and so it should be faster on most hardware.

[3] I found this algorithm here. The author cites two links, both of which are dead. I imagine the algorithm is well known in computer graphics.

2 thoughts on “Faster quaternion product rotations

  1. I very seldom use quaternion multiplication to rotate a vector; I typically compute the direction cosine matrix from the quaternion and then do the matrix-vector multiplication to compute the rotated vector. Faster, right? Well, not necessarily; I have to compute the DCM after all.

    But wait – I need the same DCM for more than one vector rotation, plus I need to extract Euler angles from the DCM, so it pays off in the end, right? Then why am I using a quaternion at all? Well, because I’m computing the changing angular displacement of two coordinate systems by numerically integrating the measured angular rate of one with respect to the other, and by doing that integration with the quaternion instead of the DCM, I can keep the quaternion normalized much more straightforwardly than normalizing the DCM, plus I only have to integrate four quaternion components instead of six DCM elements, plus I can compute the quaternion rate with fewer operations than compute the DCM rate.

    Moral: in algorithm design, there are often more considerations beyond just number of operations that must be considered; one usually needs to consider the totality of the computations. In the case described above, strap down navigation.

Leave a Reply

Your email address will not be published. Required fields are marked *