Euler’s formula for dual numbers and double numbers

Posted on 28 January 2025 by John

The complex numbers are formed by adding an element i to the real numbers such that i² = − 1. We can create other number systems by adding other elements to the reals.

One example is dual numbers. Here we add a number ε ≠ 0 with the property ε² = 0. Dual numbers have been used in numerous applications, most recently in automatic differentiation.

Another example is double numbers [1]. Here we add a number j ≠ ±1 such that j² = 1. (Apologies to electrical engineers and Python programmers. For this post, j is not the imaginary unit from complex numbers.)

(If adding special numbers to the reals makes you uneasy, see the next post for an alternative approach to defining these numbers.)

We can find analogs of Euler’s formula

$\exp(i\theta) = \cos(\theta) + i \sin(\theta)$

for dual numbers and double numbers by using the power series for the exponential function

$\exp(z) = \sum_{k=0}^\infty \frac{z^k}{k!}$

to define exp(z) in these number systems.

For dual numbers, the analog of Euler’s theorem is

$\exp(\varepsilon x) = 1 + \varepsilon x$

because all the terms in the power series after the first two involve powers of ε that evaluate to 0. Although this equation only holds for dual numbers, not real numbers, it is approximately true of ε is a small real number. This is the motivation for using ε as the symbol for the special number added to the reals: Dual numbers can formalize calculations over the reals that are not formally correct.

For double numbers, the analog of Euler’s theorem is

$\exp(j x) = \cosh(x) + j \sinh(x)$

and the proof is entirely analogous to the proof of Euler’s theorem for complex numbers: Write out the power series, then separate the terms involving even exponents from the terms involving odd exponents.

[1] Double numbers have also been called motors, hyperbolic numbers, split-complex numbers, spacetime numbers, …

A magical land where rounding equals truncation

Posted on 23 January 2025 by John

Rounding numbers has a surprising amount of detail. It may seem trivial but, as with most things, there is a lot more to consider than is immediately obvious. I expect there have been hundreds if not thousands of pages devoted to rounding in IEEE journals.

An example of the complexity of rounding is what William Kahan called The Tablemaker’s Dilemma: there is no way in general to know in advance how accurately you’ll need to compute a number in order to round it correctly.

Rounding can be subtle in any number system, but there is an alternative number system in which it is a little simpler than in base 10. It’s base 3, but with a twist. Instead of using 0, 1, and 2 as “digits”, we use −1, 0, and 1. This is known as the balanced ternary system: ternary because of base 3, and balanced because the digits are symmetrical about 0.

We need a symbol for −1. A common and convenient choice is to use T. Think of moving the minus sign from in front of a 1 to on top of it. Now we could denote the number of hours in a day as 10T0 because

$1 \times 3^3 + 0 \times 3^2 + (-1)\times 3 + 0 = 24$

A more formal way of a describing balanced ternary representation of a number x is a set of coefficients t_k such that

$x = \sum_{k=-\infty}^\infty t_k 3^k$

with the restriction that each t_k is in the set {−1, 0, 1}.

Balanced ternary representation has many interesting properties. For example, positive and negative numbers can all be represented without a minus sign. See, for example, Brain Hayes’ excellent article Third Base. The property we’re interested in here is that to round a balanced ternary number to the nearest integer, you simply lop off the fractional part. Rounding is the same as truncation. To see this, note that the largest possible fractional part is a sequence of all 1s, which represents ½:

$\frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \cdots = \frac{1}{2}$

Similarly, the most negative possible fractional part is a sequence of all Ts, which represents −½. So unless the fractional part is exactly equal to ½, truncating the fractional part rounds to the nearest integer. If the fractional part is exactly ½ then there is no nearest integer but two integers that are equally near.

Non-associative multiplication

Posted on 1 February 2022 by John

There are five ways to parenthesize a product of four things:

((ab)c)d
(ab)(cd)
(a(b(cd))
(a(bc))d
(a((bc)d)

In a context where multiplication is not associative, the five products above are not necessarily the same. Maybe all five are different.

This post will give two examples where the products above are all different: octonions and matrix multiplication.

If you’re thinking “But wait: matrix multiplication is associative!” then read further and see in what sense I’m saying it’s not.

Octonions

Octonion multiplication is not associative. I wrote a blog post a while back asking how close the octonions come to being associative. That is, if we randomly generate unit-length octonions a, b, and c, we can calculate the norm of

(ab)c − a(bc)

and ask about its expected value. Sometimes for a triple of octonions this value is zero, but on average this expression has norm greater than 1. I estimated the average value via simulation, and later Greg Egan worked out the exact value. His write-up had gone down with the Google+ ship, but recently Greg posted a new version of his notes.

In this post I gave Python code for multiplying octonions using a function called CayleyDickson named after the originators of the algorithm. Let’s rename it m to have something shorter to work with and compute all five ways of associating a product of four octonions.

    import numpy as np

    def m(x, y):
        return CayleyDickson(x, y)

    def take_five(a, b, c, d):
        return [
            m(a, (m(b, m(c, d)))),
            m(m(a, b), m(c, d)),
            m(m(m(a, b), c), d),
            m(a, m(m(b, c), d)),
            m(m(a, m(b, c)), d)
    ]

I first tried products of basis elements, and I only got two different products out of the five ways of associating multiplication, but I only tried a few examples. However, when I tried using four random octonions, I got five different products.

    np.random.seed(20220201)

    a = np.random.rand(8)
    b = np.random.rand(8)
    c = np.random.rand(8)
    d = np.random.rand(8)

    for t in take_five(a, b, c, d):
        print(t)

This gave a very interesting result: I got five different results, but only two different real (first) components. The five vectors all differed in the last seven components, but only produced two distinct first components.

    [ 2.5180856  -2.61618184  ...]
    [ 2.5180856   0.32031027  ...]
    [ 2.5180856   1.13177500  ...]
    [ 3.0280984  -0.30169446  ...]
    [ 3.0280984  -2.36523580  ...]

I repeated this experiment a few times, and the first three results always had the same real component, and the last two results had another real component.

I suspect there’s a theorem that says

Re( ((ab)c)d ) = Re( (ab)(cd) ) = Re( a(b(cd)) )

and

Re( (a(bc))d ) = Re( a((bc)d) )

but I haven’t tried to prove it. If you come with a proof, or a counterexample, please post a link in the comments.

Matrix multiplication

Matrix multiplication is indeed associative, but the efficiency of matrix multiplication is not. That is, any two ways of parenthesizing a matrix product will give the same final matrix, but the cost of the various products are not the same. I first wrote about this here.

This is a very important result in practice. Changing the parentheses in a matrix product can make the difference between a computation being practical or impractical. This comes up, for example, in automatic differentiation and in backpropagation for neural networks.

Suppose A is an m by n matrix and B is an n by p matrix. Then AB is an m by p matrix, and forming the product AB requires mnp scalar multiplications. If C is a p by q matrix, then (AB)C takes

mnp + mpq = mp(n + q)

scalar multiplications, but computing A(BC) takes

npq + mnq = nq(m + p)

scalar multiplications, and in general these are not equal.

Let’s rewrite our multiplication function m and our take_five function to compute the cost of multiplying four matrices of random size.

We’ve got an interesting programming problem in that our multiplication function needs to do two different things. First of all, we need to know the size of the resulting matrix. But we also want to keep track of the number of scalar multiplications the product would require. We have a sort of main channel and a side channel. Having our multiplication function return both the dimension and the cost would make composition awkward.

This is kind of a fork in the road. There are two ways to solving this problem, one high-status and one low-status. The high-status approach would be to use a monad. The low-status approach would be to use a global variable. I’m going to take the low road and use a global variable. What’s one little global variable among friends?

    mults = 0

    def M(x, y):
        global mults
        mults += x[0]*x[1]*y[0]
        return (x[0], y[1])

    def take_five2(a, b, c, d):

        global mults
        costs = []

        mults = 0; M(a, (M(b, M(c, d)))); costs.append(mults)
        mults = 0; M(M(a, b), M(c, d));   costs.append(mults)
        mults = 0; M(M(M(a, b), c), d);   costs.append(mults)
        mults = 0; M(a, M(M(b, c), d));   costs.append(mults)
        mults = 0; M(M(a, M(b, c)), d);   costs.append(mults)

        return costs

Next, I’ll generate five random integers, and group them in pairs as sizes of matrices conformable for multiplication.

    dims = np.random.random_integers(10, size=5)
    dim_pairs = zip(dims[:4], dims[1:])
    c = take_five2(*[p for p in dim_pairs])

When I ran this dims was set to

[3 9 7 10 6]

and so my matrices were of size 3×9, 9×7, 7×10, and 10×6.

The number of scalar multiplications required by each way of multiplying the four matrices, computed by take_five2 was

[1384, 1090, 690, 1584, 984]

So each took a different number of operations. The slowest approach would take more than twice as long as the fastest approach. In applications, matrices can be very long and skinny, or very wide and thin, in which case one approach may take orders of magnitude more operations than another.

Conjugate theorem for octonions

Posted on 19 December 2021 by John

Yesterday I wrote about the fact that quaternions, unlike complex numbers, can form conjugates via a series of multiplications and additions. This post will show that you can do something similar with octonions.

If x is an octonion

x = r₀ e₀ + r₁ e₁ + … + r₇ e₇

where all the r‘s are real numbers. The conjugate flips the signs of all the components except the real component:

x* = r₀ e₀ − r₁ e₁ − … − r₇ e₇

The conjugate theorem says

x* = − (x + (e₁ x) e₁ + … + (e₇ x) e₇) / 6

which is analogous to the theorem

q* = − (q + iqi + jqj + kqk) /2

for quaternions.

The internal parentheses are necessary because multiplication in octonions is not associative:

xyz

is ambiguous because it could mean

(xy)z

x(yz)

and the two are not necessarily equal.

Update: It is true in general that (xy)z ≠ x(yz) for octonions. However, the subalgebra of octonions generated by any two elements is associative. In particular, (xy)x = x(yz) and so we can write the equation for x* above as simply

x* = − (x + e₁ x e₁ + … + e₇ x e₇) / 6.

Thanks to Martin Erik Horn for pointing this out.

The proof is also analogous to the proof given in the earlier post for quaternions. First work out what the effect is of multiplying on the left and right by one of the imaginary units, then add everything up. You’ll find that the real component is multiplied by -6 and the rest of the components are multiplied by 6.

How to multiply octonions

Posted on 19 December 2021 by John

This post will present a way of multiplying octonions that’s easy to remember.

Please note that there are varying conventions for how to define multiplication for octonions [1].

Octonions

The complex numbers have one imaginary unit i, and the quaternions have three: i, j, and k. The octonions have seven, and so it makes sense to switch over to subscripts rather than venturing further out into the alphabet. Besides, numerical indices will be useful for reasons we’ll see shortly.

Let e₀ = 1 and let e₁ through e₇ be the imaginary units for octonions. Then the e‘s form a basis for ℝ⁸. The octonions are ℝ⁸ with a product that distributes over addition, but the product is not commutative or associative.

Multiplication is determined by how the basis elements multiply, and there are multiple ways of presenting these multiplication rules, the simplest being to write down a multiplication table. Here I’ll present the way I find easiest to remember [2].

Multiplication rules

The squares of each of the imaginary units e₁ through e₇ are equal to −1, as you’d expect based on experience with complex numbers and quaternions. You can derive the rest of the multiplication facts from

e₁ e₂ = e₄

and two simple rules.

The first rule is that the equation above holds when you shift the indices mod 7.

e₁ e₂ = e₄e₂ e₃ = e₅
e₃ e₄ = e₆
e₄ e₅ = e₇
e₅ e₆ = e₁
e₆ e₇ = e₂
e₇ e₁ = e₃

The second rule is that each of the triplets of units above behave analogous to i, j, and k in quaternions. That is, when you take a triple like (e₂, e₃, e₅), rotations don’t flip the sign but swapping adjacent elements does. That is,

e₂ e₃ = e₅
e₅ e₂ = e₃
e₃ e₅ = e₂

and

e₃ e₂ = −e₅
e₂ e₅ = −e₃
e₅ e₃ = −e₂

So we have seven rotations of the equation

e₁ e₂ = e₄

and six permutations of each rotation, for a total of 42 rules. We already had 7 rules, saying each unit squares to -1, so we have the 49 rules we need to multiply all the non-real units by each other.

Complex Conjugates versus Quaternion Conjugates

Posted on 18 December 2021 by John

The conjugate of a complex number

is the complex number

Taking the conjugate flips over a complex number, taking its reflection in the real axis.

Multiplication stretches and rotates complex numbers, and addition translates complex numbers. You can’t flip the complex plane over by any series of dilatations, rotations, and translations.

The situation is different for quaternions. The conjugate of a quaternion

You can flip four dimensional space over by a series of dilations, rotations, and translations. Namely

$q^* = -\frac{1}{2}(q + iqi + jqj + kqk)$

To prove this equation, let’s first see what happens when you multiply q on both sides by i:

That is, the effect of multiplying on both sides by i is to flip the sign of the real component and the i component.

Multiplying on both sizes by j or k works analogously: it flips the sign of the real component and its component, and leaves the other two alone.

It follows that

$\begin{align*} q +iqi + jqj + kqk &= \phantom{-}a + bi + cj +dk \\ & \,\phantom{=} - a - bi + cj + dk \\ & \,\phantom{=} - a + bi - cj + dk \\ & \,\phantom{=} - a + bi + cj - dk \\ &= -2a + 2bi + 2cj + 2dk\\ &= -2q^* \end{align*}$

and so the result follows from dividing by −2.

Update: There’s an analogous theorem for octonions.

How close is octonion multiplication to being associative?

Posted on 9 July 2018 by John

Quaternion multiplication is associative but not commutative. An earlier post looked at the average size of the commutator xy − yx as a measure of how far quaternion multiplication is from being commutative.

This post looks at an analogous question for octonions. Octonion multiplication is neither commutative nor associative. So in this post we look at the associator of three octonions (xy)z − x(yz) as a measure of how far octonion multiplication is from being associative.

(A post from yesterday looked at how close octonion multiplication comes to being associative in an algebraic sense, looking at weak forms of associativity. This post looks at how close multiplication comes to being associative in an analytical sense, looking at norm distances rather than algebraic identities.)

We will use a simulation to look at the average norm of the octonion associator over the octonions of unit length, analogous to the earlier post that looked at the commutator of the quaternions. We developed code for octonion multiplication in the previous post and will reuse that code here. We also developed code for generating random unit-length octonions in the same post.

Here’s code to find the average norm of the associator and plot a histogram of its values.

    
    import matplotlib.pyplot as plt

    # omult is octonion multiplication. 
    # See previous post.
    def associator(x, y, z):
        return omult(omult(x, y), z) - 
               omult(x, omult(y, z))

    N = 100000
    s = 0
    h = np.zeros(N)
    for n in range(N):
        x = random_unit_octonion()
        y = random_unit_octonion()
        z = random_unit_octonion()    
        t = norm(associator(x, y, z))
        s += t
        h[n] = t

    print(s/N)
    plt.hist(h, bins=50)
    plt.show()

This gives an average of 1.095 and the histogram below.

histogram of octonion associator norm values

Update: Greg Egan calculated the exact mean value for the norm of the associator to be 147456/(42875 π) ≈ 1.0947336. Here are the details.

Python code for octonion and sedenion multiplication

Posted on 9 July 2018 by John

The previous post discussed octonions. This post will implement octonion multiplication in Python, and then sedenion multiplication.

Cayley-Dickson construction

There’s a way to bootstrap quaternion multiplication into octonion multiplication, so we’ll reuse the quaternion multiplication code from an earlier post. It’s called the Cayley-Dickson construction. For more on this construction , see John Baez’s treatise on octonions.

If you represent an octonion as a pair of quaternions, then multiplication can be defined by

(a, b) (c, d) = (ac − d*b, da + bc*)

where a star superscript on a variable means (quaternion) conjugate.

Note that this isn’t the only way to define multiplication of octonions. There are many (480?) isomorphic ways to define octonion multiplication.

(You can take the Cayley-Dickson construction one step further, creating sedenions as pairs of octonions. We’ll also provide code for sedenion multiplication below.)

Code for octonion multiplication

    import numpy as np

    # quaternion multiplication
    def qmult(x, y):
        return np.array([
            x[0]*y[0] - x[1]*y[1] - x[2]*y[2] - x[3]*y[3],
            x[0]*y[1] + x[1]*y[0] + x[2]*y[3] - x[3]*y[2],
            x[0]*y[2] - x[1]*y[3] + x[2]*y[0] + x[3]*y[1],
            x[0]*y[3] + x[1]*y[2] - x[2]*y[1] + x[3]*y[0]
       ])

    # quaternion conjugate
    def qstar(x):
        return x*np.array([1, -1, -1, -1])

    # octonion multiplication
    def omult(x, y):
        # Split octonions into pairs of quaternions
        a, b = x[:4], x[4:]
        c, d = y[:4], y[4:]
    
        z = np.zeros(8)
        z[:4] = qmult(a, c) - qmult(qstar(d), b)
        z[4:] = qmult(d, a) + qmult(b, qstar(c))
        return z

Update: See this post for refactored code. Handles quaternions, octonions, sedenions, etc. all with one function.

Unit tests

Here are some unit tests to verify that our multiplication has the expected properties. We begin by generating three octonions with norm 1.

    from numpy.linalg import norm

    def random_unit_octonion():
        x = np.random.normal(size=8)
        return x / norm(x)

    x = random_unit_octonion()
    y = random_unit_octonion()
    z = random_unit_octonion()

We said in the previous post that octonions satisfy the “alternative” condition, a weak form of associativity. Here we verify this property as a unit test.

    eps = 1e-15

    # alternative identities

    a = omult(omult(x, x), y)
    b = omult(x, omult(x, y))
    assert( norm(a - b) < eps )

    a = omult(omult(x, y), y)
    b = omult(x, omult(y, y))
    assert( norm(a - b) < eps )

We also said in the previous post that the octonions satisfy the “Moufang” conditions.

    # Moufang identities
    
    a = omult(z, omult(x, omult(z, y)))
    b = omult(omult(omult(z, x), z), y)
    assert( norm(a - b) < eps )

    a = omult(x, omult(z, omult(y, z)))
    b = omult(omult(omult(x, z), y), z)
    assert( norm(a - b) < eps )

    a = omult(omult(z,x), omult(y, z))
    b = omult(omult(z, omult(x, y)), z)
    assert( norm(a  - b) < eps )

    a = omult(omult(z,x), omult(y, z))
    b = omult(z, omult(omult(x, y), z))
    assert( norm(a - b) < eps )

And finally, we verify that the product of unit length octonions has unit length.

    # norm condition
    n = norm(omult(omult(x, y), z))
    assert( abs(n - 1) < eps )

The next post uses the octonion multiplication code above to look at the distribution of the associator (xy)z – x(yz) to see how far multiplication is from being associative.

Sedenion multiplication

The only normed division algebras over the reals are the real numbers, complex numbers, quaternions, and octonions. These are real vector spaces of dimension 1, 2, 4, and 8.

You can proceed analogously and define a real algebra of dimension 16 called the sedenions. When we go from complex numbers to quaternions we lose commutativity. When we go from quaternions to octonions we lose full associativity, but retain a weak version of associativity. Even weak associativity is lost when we move from octonions to sedenions. Non-zero octonions form an alternative loop with respect to multiplication, but sedenions do not.

Sedenions have multiplicative inverses, but there are also some zero divisors, i.e. non-zero vectors whose product is zero. So sedenions do not form a division algebra. If you continue the Cayley-Dickson construction past the sedenions, you keep getting zero divisors, so no more division algebras.

Here’s Python code for sedenion multiplication, building on the code above.

    def ostar(x):
        mask = -np.ones(8)
        mask[0] = 1
        return x*mask
 
    # sedenion multiplication
    def smult(x, y):
        # Split sedenions into pairs of octonions
        a, b = x[:8], x[8:]
        c, d = y[:8], y[8:]
        z = np.zeros(16)
        z[:8] = omult(a, c) - omult(ostar(d), b)
        z[8:] = omult(d, a) + omult(b, ostar(c))
        return z

As noted above, see this post for more concise code that also generalizes further.

Fibonacci number system

Posted on 17 May 2015 by John

Every positive integer can be written as the sum of distinct Fibonacci numbers. For example, 10 = 8 + 2, the sum of the fifth Fibonacci number and the second.

This decomposition is unique if you impose the extra requirement that consecutive Fibonacci numbers are not allowed. [1] It’s easy to see that the rule against consecutive Fibonacci numbers is necessary for uniqueness. It’s not as easy to see that the rule is sufficient.

Every Fibonacci number is itself the sum of two consecutive Fibonacci numbers—that’s how they’re defined—so clearly there are at least two ways to write a Fibonacci number as the sum of Fibonacci numbers, either just itself or its two predecessors. In the example above, 8 = 5 + 3 and so you could write 10 as 5 + 3 + 2.

The nth Fibonacci number is approximately φⁿ/√5 where φ = 1.618… is the golden ratio. So you could think of a Fibonacci sum representation for x as roughly a base φ representation for √5x.

You can find the Fibonacci representation of a number x using a greedy algorithm: Subtract the largest Fibonacci number from x that you can, then subtract the largest Fibonacci number you can from the remainder, etc.

Programming exercise: How would you implement a function that finds the largest Fibonacci number less than or equal to its input? Once you have this it’s easy to write a program to find Fibonacci representations.

* * *

[1] This is known as Zeckendorf’s theorem, published by E. Zeckendorf in 1972. However, C. G. Lekkerkerker had published the same result 20 years earlier.

Number systems

Euler’s formula for dual numbers and double numbers

Related posts

A magical land where rounding equals truncation

Related posts

Non-associative multiplication

Octonions

Matrix multiplication

Related posts

Conjugate theorem for octonions

How to multiply octonions

Octonions

Multiplication rules

More octonion posts

Complex Conjugates versus Quaternion Conjugates

More on quaternions

How close is octonion multiplication to being associative?

Python code for octonion and sedenion multiplication

Cayley-Dickson construction

Code for octonion multiplication

Unit tests

Sedenion multiplication

Fibonacci number system