Gaussian distributed weights for LLMs

The previous post looked at the FP4 4-bit floating point format. This post will look at another 4-bit floating point format, NF4, and higher precision analogs. NF4 and FP4 are common bitsandbytes 4-bit data types. If you download LLM weights from Hugging Face quantized to four bits, the weights might be in NF4 or FP4 format. Or maybe some other format: there’s a surprising amount of variety in how 4-bit numbers are implemented.

Why NF4

LLM parameters have a roughly Gaussian distribution, and so evenly spaced numeric values are not ideal for parameters. Instead, you’d like numbers that are closer together near 0.

The FP4 floating point numbers, described in the previous post, are spaced 0.5 apart for small values, and the larger values are spaced 1 or 2 apart. That’s hardly a Gaussian distribution, but it’s closer to Gaussian than a uniform distribution would be. NF4 deliberately follows more of a Gaussian distribution.

QLoRA

The QLoRA formats [1], unlike FP4, are not analogs of IEEE numbers. The bits are not interpreted as sign, exponent, and mantissa, but rather as integers to be used as indexes. An NFn number is an index into a list of 2^n real numbers with Gaussian spacing. To put it another way, the numbers represented by NFn have uniformly distributed z-scores.
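To illustrate the idea, here is a sketch (not the exact NF4 construction, which is discussed below) of how uniformly spaced z-scores produce values that bunch up near zero, using the quantile function from Python's standard library:

```python
from statistics import NormalDist

Q = NormalDist().inv_cdf  # quantile function of the standard normal

n = 4
# probabilities strictly inside (0, 1), so no infinite quantiles
probs = [i / 2**n for i in range(1, 2**n)]
values = [Q(p) for p in probs]

# The z-scores are uniformly spaced, so the values cluster
# near 0 and spread out in the tails.
gaps = [b - a for a, b in zip(values, values[1:])]
```

The gap between adjacent values is smallest near the center and grows toward the tails, which is exactly what you want for Gaussian-distributed weights.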

That makes sense at a high level, but the paper [1] is hard to follow in detail. It says

More formally, we estimate the 2^k values q_i of the data type as follows:

q_i = \frac{1}{2}\left(Q_X\left(\frac{i}{2^k + 1}\right) + Q_X\left(\frac{i+1}{2^k + 1} \right) \right)

where Q_X(·) is the quantile function of the standard normal distribution N(0, 1).

The paper doesn’t give the range of i, but it says there are 2^k values, implying that i runs from 0 to 2^k − 1 or from 1 to 2^k. Either way runs into infinite values, since Q(0) = −∞ and Q(1) = ∞. We could avoid infinities by letting i run from 1 to 2^k − 1.

The next sentence is puzzling.

A problem for a symmetric k-bit quantization is that this approach does not have an exact representation of zero, which is an important property to quantize padding and other zero-valued elements with no error.

I understand the desire to represent 0 exactly, but the equation above has an exact representation of 0 when i = 2^(k−1). Perhaps the authors had in mind that i takes on the values ½, 1 + ½, 2 + ½, …, 2^k − ½. This would be reasonable, but a highly unusual use of notation. It seems that the real problem is not the lack of a representation of 0 but an unused index, with i running from 1 to 2^k − 1.

To be fair, the first sentence quoted above says “we estimate the 2^k values …” and so the equation above may not be intended as a definition but as motivation for the actual definition.

Reproducing NF4

The authors give a procedure for using 2n values of i and obtaining an exact representation of 0, and they give a list of NF4 values in Appendix E. I was not able to get the two to match. I implemented a few possible interpretations of the procedure described in the paper, and each approximates the list of values in the appendix, but not closely.

The following code, written with the help of ChatGPT, reverse engineers the NF4 values to 8 decimal places, i.e. to the precision of a 32-bit floating point number.

from scipy.stats import norm

Q = norm.ppf  # quantile function of the standard normal

α  = 0.9677083
Z  = Q(α)              # normalizing constant so the endpoint values are ±1
δ1 = (α - 0.5)/7       # step size for the 7 negative values
δ2 = (α - 0.5)/8       # step size for the 8 positive values

q = [0]*16             # q[7] stays 0, the exact representation of zero
for i in range(7):
    q[i] = -Q(α - i*δ1)/Z
for i in range(8):
    q[i+8] = Q(0.5 + (i+1)*δ2)/Z
    
# Values given in Appendix E
NF4 = [
    -1.0,
    -0.6961928009986877,
    -0.5250730514526367,
    -0.39491748809814453,
    -0.28444138169288635,
    -0.18477343022823334,
    -0.09105003625154495,
    0.0,
    0.07958029955625534,
    0.16093020141124725,
    0.24611230194568634,
    0.33791524171829224,
    0.44070982933044434,
    0.5626170039176941,
    0.7229568362236023,
    1.0
]

# Compare 
for i in range(16):
    print(i, NF4[i] - q[i])

The magic number α = 0.9677083 is a mystery. I asked ChatGPT to look into this further, and it said that bitsandbytes uses α = 929/960 = 0.9677083333333333. When I use this value for α the precision is about the same, which is fine. However, the values in the paper were given to 16 decimal places, so I had hoped to match them to more precision.

Quibbles over the exact values of NF4 aside, the NF4 format works well in practice. Models quantized to 4 bits using NF4 perform better on some benchmarks than models quantized to other 4-bit formats.


[1] QLoRA: Efficient Finetuning of Quantized LLMs by Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. https://arxiv.org/abs/2305.14314.

4-bit floating point FP4

In ancient times, floating point numbers were stored in 32 bits. Then somewhere along the way 64 bits became standard. The C programming language retains the ancient lore, using float to refer to a 32-bit floating point number and double to refer to a floating point number with double the number of bits. Python simply uses float to refer to the most common floating point format, which C calls double.

Programmers were grateful for the move from 32-bit floats to 64-bit floats. It doesn’t hurt to have more precision, and some numerical problems go away when you go from 32-bits to 64-bits. (Though not all. Something I’ve written about numerous times.)

Neural networks brought about something extraordinary: demand for floating point numbers with less precision. These networks have an enormous number of parameters, and it’s more important to fit more parameters into memory than it is to have higher precision parameters. Instead of double precision (64 bits), developers wanted half precision (16 bits), or even less, such as FP8 (8 bits) or FP4 (4 bits). This post will look at 4-bit floating point numbers.

Why even bother with floating point numbers when you don’t need much precision? Why not use integers? For example, with four bits you could represent the integers 0, 1, 2, …, 15. You could introduce a bias, subtracting say 7 from each value, so your four bits represent −7 through 8. It turns out to be useful to have a wider dynamic range.

FP4

Signed 4-bit floating point numbers in FP4 format use the first bit to represent the sign. The question is what to do with the remaining three bits. The notation ExMy denotes a format with x exponent bits and y mantissa bits. In the context of signed 4-bit numbers,

x + y = 3

but in other contexts the sum could be larger. For example, for an 8-bit signed float, x + y = 7.

For 4-bit signed floats we have four possibilities: E3M0, E2M1, E1M2, and E0M3. All are used somewhere, but E2M1 is the most common and is supported in Nvidia hardware.

A number with sign bit s, exponent e, and mantissa m has the value

(−1)^s 2^(e − b) (1 + m/2)

where b is the bias. The purpose of the bias is to allow positive and negative exponents without using signed numbers for e. So, for example, if b = 1 and e = 1, 2, or 3 then the exponent part 2^(e − b) can represent 1, 2, or 4.

The bias impacts the range of possible numbers but not their relative spacing. For any value of bias b, the E3M0 format is all exponent, no mantissa, and so its possible values are uniformly distributed on a log scale. The E0M3 format is all mantissa, so its values are uniformly distributed on a linear scale. The E1M2 and E2M1 formats are unevenly spaced on both log and linear scales.

There is an exception to the expression above for converting (s, e, m) into a real number when e = 0. In that case the value is (−1)^s m/2, so m = 0 represents 0 and m = 1 represents ½.

Table of values

Since there are only 16 possible FP4 numbers, it’s possible to list them all. Here is a table for the E2M1 format.

Bits s exp m  Value
-------------------
0000 0  00 0     +0
0001 0  00 1   +0.5
0010 0  01 0     +1
0011 0  01 1   +1.5
0100 0  10 0     +2
0101 0  10 1     +3
0110 0  11 0     +4
0111 0  11 1     +6
1000 1  00 0     -0
1001 1  00 1   -0.5
1010 1  01 0     -1
1011 1  01 1   -1.5
1100 1  10 0     -2
1101 1  10 1     -3
1110 1  11 0     -4
1111 1  11 1     -6

Note that even in this tiny floating point format, there are two zeros, +0 and −0, just like full precision floats. More on that here.

Pychop library

The Python library Pychop emulates a wide variety of reduced-precision floating point formats. Here is the code that uses Pychop to create the table above.

import pychop

# Pull the format metadata from Pychop.
spec = pychop.MX_FORMATS["mxfp4_e2m1"]
assert (spec.exp_bits, spec.sig_bits) == (2, 1)

def e2m1_value(s: int, e: int, m: int) -> float:
    sign = -1.0 if s else 1.0

    # Subnormal / zero
    if e == 0:
        return sign * (m / 2.0)

    # Normal
    return sign * (2.0 ** (e - 1)) * (1.0 + m / 2.0)

def display_value(bits: int, x: float) -> str:
    if bits == 0b0000:
        return "+0"
    if bits == 0b1000:
        return "-0"
    return f"{x:+g}"

rows = []
for bits in range(16):
    s = (bits >> 3) & 0b1
    e = (bits >> 1) & 0b11
    m = bits & 0b1
    x = e2m1_value(s, e, m)

    rows.append(
        {
            "Bits": f"{bits:04b}",
            "s": s,
            "exp_bits": f"{e:02b}",
            "m": m,
            "Value": display_value(bits, x),
        }
    )

# Pretty-print the table.
header = f"{'Bits':<4} {'s':>1} {'exp':>3} {'m':>1} {'Value':>6}"
print(header)
print("-" * len(header))
for row in rows:
    print(
        f"{row['Bits']:<4} " f"{row['s']:>1} "
        f"{row['exp_bits']:>3} "
        f"{row['m']:>1} "
        f"{row['Value']:>6}"
    )

Other formats

FP4 isn’t the only 4-bit floating point format. There’s a surprisingly large number of formats in use. I intend to address another format in my next post.

Update: See the next post for a discussion of NF4, a format whose representable numbers more closely matches the distribution of LLM weights.

Knuth’s Twindragon

A few days ago I wrote about a random process that creates a fractal known as the Twin Dragon. This post gives a deterministic approach to create the same figure.

As far as I can tell, the first reference to this fractal is in a paper by Davis and Knuth in the Journal of Recreational Mathematics from 1970. Unfortunately this journal is out of print and hard or impossible to find online [1]. Knuth presents the twindragon (one word, lowercase) fractal in TAOCP Vol 2, page 206.

Knuth defines the twindragon via numbers base b = 1 − i. Every complex number can be written in the form

z = \sum_{k=-\infty}^\infty a_k (1 - i)^k

where the “digits” a_k are either 0 or 1.

The twindragon fractal is the set of numbers that only have non-zero digits to the right of the decimal point, i.e. numbers of the form

z = \sum_{k=1}^\infty a_k (1 - i)^{-k}

I implemented this in Python as follows.

import matplotlib.pyplot as plt
from itertools import product

# k runs from 1 to 15, matching the exponents in the sum above
points = [sum(a*(1 - 1j)**(-k) for k, a in enumerate(bits, start=1))
          for bits in product([0, 1], repeat=15)]
# plot all the points in one call rather than 2^15 separate calls
plt.plot([z.real for z in points], [z.imag for z in points], 'bo', markersize=1)
plt.show()

This produced the image below.


[1] If you can find an archive of Journal of Recreational Mathematics, please let me know.

Base58 versus Base85 encoding

Base58 encoding and Base85 encoding are used to represent binary data in a human-friendly way. Base58 uses a smaller character set and so is more conservative. Base85 uses a larger character set and so is more efficient.

There is a gotcha in that “base” means something different in Base58 compared to Base85. More on that below.

Base58

Base58 encoding is primarily used as part of the Bitcoin system. It is part of the Base58Check protocol used for encoding addresses and keys.

Base58 encoding is essentially the same as mathematical base 58 encoding, with a specific character set. The symbols for the “digits” 0 through 57 are chosen to avoid typographically similar letters. We’ll give that character set in the examples below.

There is only one version of Base58 in common use as far as I know, unlike Base85.

Base85

Base85 is a more compact alternative to Base64 encoding. The former encodes 4 bytes in 5 characters while the latter requires 6 characters. Base85 is used inside the PDF format. It is also used in the patch encoding for git.

Base85 encoding is analogous to binary-coded decimal (BCD). In some early computer systems, integers were not expressed in binary per se. Instead, each decimal digit was represented by four bits. So to represent a number like 427, you’d express 4, 2, and 7 in binary: 0100 0010 0111. If you were to express 427 directly in binary you’d get 110101011.
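We can verify the example in a couple of lines of Python:

```python
n = 427
bcd    = " ".join(f"{int(d):04b}" for d in str(n))  # one nibble per decimal digit
binary = f"{n:b}"                                   # the number itself in binary
print(bcd)     # 0100 0010 0111
print(binary)  # 110101011
```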

Base85 breaks the data into 32-bit words, then expresses each word in base 85. So you might say it’s base 85-encoded 32-bit words, by analogy with binary-coded decimal.

There are variations on Base85 encoding that use different alphabets, and so two software packages that say they do Base85 encoding might produce different results.

Base85 is more efficient than Base58 in the sense that it represents data using fewer symbols. It is also more computationally efficient because each 32-bit word is encoded independently.

Examples

We give four examples below: Base58 and Base85 applied to four bytes of data and eight bytes of data. The data length matters for Base85.

Base58, four bytes

Let n = CAFEBABE in hex, which is 3405691582 in decimal. This is the “magic number” at the beginning of Java class files, a pun on “java” as slang for coffee.

In base 58 this number would be

5:10:55:3:26:22

We can verify this as follows:

    >>> 5*58**5 + 10*58**4 + 55*58**3 + 3*58**2 + 26*58 + 22
    3405691582
    >>> hex(_)
    '0xcafebabe'

The Base58 alphabet is

    123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

and so the Base58 encoding of 0xCAFEBABE would be the 5th, 10th, 55th, … elements of this alphabet (with zero-based indexing), which results in 6Bx4TP.

Note that the Base58 alphabet contains the digit 1 but not the letter l. It contains the lower case letter o but not the capital letter O or the digit 0. Some of the remaining characters are visually similar, depending on your font. This post shows how one font makes the Base58 characters more distinct.
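As a sanity check, here is a minimal sketch of mathematical base 58 encoding with this alphabet. (It ignores the Base58Check convention of encoding leading zero bytes as extra 1 characters.)

```python
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(n: int) -> str:
    # mathematical base 58, most significant digit first
    if n == 0:
        return B58[0]
    digits = []
    while n:
        n, r = divmod(n, 58)
        digits.append(B58[r])
    return "".join(reversed(digits))

print(base58_encode(0xCAFEBABE))  # 6Bx4TP
```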

Base85, four bytes

Now suppose we want to encode n using Base85. Now we would get

65:20:50:84:67

If we use the alphabet

    !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu

then the “digits” above become b5Sud.

Note that the Base85 alphabet contains characters that could be confused, such as 0 (zero), O (capital letter), and o (lower case letter). The characters were chosen to be printable ASCII characters, not necessarily visually distinct ones.
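Here is a minimal sketch of word-by-word Base85 encoding with this alphabet. It assumes the input is a whole number of 4-byte words and ignores the padding rules and shortcuts that some Base85 variants add:

```python
def base85_words(data: bytes) -> str:
    # encode each 4-byte big-endian word as 5 base-85 characters,
    # using the alphabet '!' (ASCII 33) through 'u' (ASCII 117)
    out = []
    for i in range(0, len(data), 4):
        w = int.from_bytes(data[i:i+4], "big")
        digits = []
        for _ in range(5):
            w, r = divmod(w, 85)
            digits.append(chr(33 + r))
        out.extend(reversed(digits))
    return "".join(out)

print(base85_words(bytes.fromhex("CAFEBABE")))  # b5Sud
```

Because each word is encoded independently, encoding eight bytes simply concatenates the encodings of the two words.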

Base58, eight bytes

Now suppose n = CAFEBABECAFEBABE in hex, which is 14627333968358193854 in decimal.

We convert n to base 58 to get

33:55:17:43:49:44:3:47:49:44:26

which becomes axJkrm4prmT in the Base58 alphabet.

Base85, eight bytes

To encode CAFEBABECAFEBABE in Base85 we do not convert the number to base 85. Instead, we convert each 4-byte word to base 85. So we get two copies of the encoding of CAFEBABE, and the result is b5Sudb5Sud.

If we were to wrongly convert n to base 85, we’d get

63:13:1:27:77:35:57:62:38:49

which becomes `."<nDZ_GR which is not the correct encoding.


Golden ratio base numbers

It is possible to express every positive integer as a sum of powers of the golden ratio φ using each power at most once. This means it is possible to create a binary-like number system using φ as the base with coefficients of 0 and 1 in front of each power of φ.

This system is sometimes called phinary because of the analogy with binary. I’ll use that term here rather than more formal names such as base-φ or golden base number system.

An interesting feature of phinary is that in general you need to include negative powers of φ to represent positive integers. For example,

2 = \varphi + \varphi^{-2}

and so you could write 2 in this system as 10.01.

To state things more formally, every positive integer n satisfies the following equation, where a finite number of the coefficients a_k are equal to 1 and the rest are equal to 0.

n = \sum_{k=-\infty}^\infty a_k\varphi^k

The golden ratio satisfies φ² = φ + 1 and so phinary representations are not unique. But if you add the rule that number representations must not have consecutive 1s, then representations are unique, analogous to the Fibonacci number system.
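Here is a quick greedy sketch that computes phinary representations of small positive integers. The function name and the floating point tolerance are my own choices; greedily taking the largest power of φ that fits automatically avoids consecutive 1s, since two consecutive powers could be combined into the next larger one.

```python
phi = (1 + 5**0.5) / 2

def phinary(n, places=10):
    # greedy expansion of a positive integer n in powers of phi
    digits = {}
    x = float(n)
    k = 0
    while phi**(k + 1) <= x:   # largest power of phi not exceeding n
        k += 1
    for j in range(k, -places - 1, -1):
        if phi**j <= x + 1e-9:  # tolerance absorbs floating point drift
            digits[j] = 1
            x -= phi**j
    top = max(max(digits), 0)
    bottom = min(min(digits), 0)
    s = "".join(str(digits.get(j, 0)) for j in range(top, -1, -1))
    if bottom < 0:
        s += "." + "".join(str(digits.get(j, 0)) for j in range(-1, bottom - 1, -1))
    return s

print(phinary(2))  # 10.01
```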

The original paper describing the phinary system [1] is awkwardly written. It has the flavor of “Here are some examples. You can see how this generalizes.” rather than a more typical mathematical style.

The end of the article says “Jr. High School 246 Brooklyn, N.Y.” and so when I got to that point I thought the style was due to the paper having been written by a public school teacher rather than a professional mathematician. I later learned from [2] that the author was not a math teacher but a student: George Bergman was 12 years old when he discovered and published his number system.

Phinary is not as simple to develop as you might expect. Bergman’s discovery was impressive, and not only because he was 12 years old at the time. You can find more sophisticated developments in [2] and in [3], but both require a few preliminaries and are not simple.

***

[1] George Bergman. A Number System with an Irrational Base. Mathematics Magazine 31 (2): 98–110. 1957.

[2] Cecil Rousseau. The Phi Number System Revisited. Mathematics Magazine, Vol. 68, No. 4 (Oct. 1995), pp. 283–284.

[3] Donald Knuth. The Art of Computer Programming, volume 1.

Binomial number system

I just stumbled across the binomial number system in Exercise 5.38 of Concrete Mathematics. The exercise asks the reader to show that every non-negative integer n can be written as

n = \binom{a}{1} + \binom{b}{2} + \binom{c}{3}

and that the representation is unique if you require 0 ≤ a < b < c. The book calls this the binomial number system. I skimmed a paper that said this has some application in signal processing, but I haven’t looked at it closely [1].

You can find a, b, and c much as you would find the representation in many other number systems: first find the largest possible c, then the largest possible b for what’s left, and then the remainder is a.

In order to find c, we start with the observation that the binomial coefficient C(k, 3) is less than k³/6, so C(k, 3) ≤ n whenever k is no more than the cube root of 6n. That makes the cube root of 6n an initial lower bound on c, and we can search upward incrementally from there. If we wanted to be more efficient, we could do some sort of binary search.

Here’s Python code to find a, b, and c.

from math import comb, factorial

def lower(n, r):
    "Find largest k such that comb(k, r) <= n."
    k = int( (factorial(r)*n)**(1/r) ) # initial guess
    while comb(k, r) <= n: 
        k += 1 
    return k - 1 

def binomial_rep(n): 
    c = lower(n, 3) 
    cc = comb(c, 3) 
    b = lower(n - comb(c, 3), 2) 
    bb = comb(b, 2) 
    a = n - cc - bb 
    assert(c > b > a >= 0)
    return (a, b, c)

For example, here’s the binomial number system representation of today’s date.

>>> binomial_rep(20250605)
(79, 269, 496)
>>> comb(496, 3) + comb(269, 2) + comb(79, 1)
20250605

You could use any number of binomial terms, not just three.
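Sketching that generalization, with the same largest-term-first greedy idea as above (the function name binomial_rep_general is mine, not one from the book):

```python
from math import comb

def binomial_rep_general(n, terms):
    # Greedy: for r = terms down to 1, take the largest k with comb(k, r) <= n.
    rep = []
    for r in range(terms, 0, -1):
        k = r - 1                      # comb(r - 1, r) == 0 <= n always holds
        while comb(k + 1, r) <= n:
            k += 1
        rep.append(k)
        n -= comb(k, r)
    return rep[::-1]                   # arguments for r = 1, ..., terms

print(binomial_rep_general(20250605, 3))  # [79, 269, 496]
```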

[1] I looked back at the paper, and it is using a different kind of binomial number system, expressing numbers as sums of fixed binomial coefficients, not varying the binomial coefficient arguments. This representation has some advantages for error correction.

Representing octonions as matrices, sorta

It’s possible to represent complex numbers as a pair of real numbers or 2 × 2 matrices with real entries.

z \leftrightarrow (a, b) \leftrightarrow \begin{bmatrix}a & -b \\ b & a \end{bmatrix}

And it’s possible to represent quaternions as pairs of complex numbers or 2 × 2 matrices with complex entries

q \leftrightarrow (z_0, z_1) \leftrightarrow \begin{bmatrix} z_0 & z_1 \\ -z_1^* & z_0^* \end{bmatrix}

where z* is the complex conjugate of z.

And it’s also possible to represent octonions as pairs of quaternions or 2 × 2 matrices with quaternion entries, with a twist.

o \leftrightarrow (q_0, q_1) \leftrightarrow \begin{bmatrix} q_0 & q_1 \\ -q_1^* & q_0^* \end{bmatrix}

where q* is the quaternion conjugate of q.

Matrix multiplication is associative, but octonion multiplication is not, so something has to give. We have to change the definition of matrix multiplication slightly.

\begin{bmatrix} \alpha_0 & \alpha_1 \\ \alpha_2 & \alpha_3 \end{bmatrix}\circ\begin{bmatrix} \beta_0 & \beta_1 \\ \beta_2 & \beta_3 \end{bmatrix}=\begin{bmatrix} \alpha_0\beta_0+\beta_2\alpha_1 & \beta_1\alpha_0+\alpha_1\beta_3\\ \beta_0\alpha_2+\alpha_3\beta_2 & \alpha_2\beta_1+\alpha_3\beta_3 \end{bmatrix}

In half the products, the beta term comes before the alpha term. This wouldn’t matter if the alpha and beta terms commuted, e.g. if they were complex numbers this would be ordinary matrix multiplication. But the alphas and betas are quaternions, and so order matters, and the matrix product defined above is not the standard matrix product.

Going back to the idea of matrices of matrices that I wrote about a few days ago, we could represent the octonions as 2 × 2 matrices whose entries are 2 × 2 matrices of complex numbers, etc.

If you look closely at the matrix representations above, you’ll notice that the matrix representations of quaternions and octonions don’t quite match the pattern of the complex numbers. There should be a minus sign in the top right corner and not in the bottom left corner. You could do it that way, but there’s a sort of clash of conventions going on here.

Octonions sometimes associate

You can multiply pairs of real numbers using the rules of complex numbers. Complex numbers have all the algebraic structure of the real numbers, i.e. they form a field.

There is a general process, the Cayley-Dickson construction, that lets you bootstrap multiplication from 1 real number to 2, from 2 to 4, from 4 to 8, etc. You can repeat the process as many times as you like, defining multiplication on lists of 2^n numbers, but you lose structure as you go.

Quaternions

Multiplication for 4-tuples gives the quaternions. The quaternions retain most of the structure of the real and complex numbers. Multiplication is associative. Non-zero elements have a multiplicative inverse, i.e. you can divide. And multiplication plays well with the norm:

|| xy || = || x || · || y ||.

But multiplication is not commutative: in general, xy ≠ yx.

Octonions

Multiplication of 8-tuples produces the octonions. It’s still true that non-zero elements have a multiplicative inverse, and multiplication still plays well with the norm as above. But now, not only is multiplication not commutative, it’s not even associative: in general, (xy)z ≠ x(yz). It’s the “in general” part that this post wants to elaborate on.

The subalgebra generated by any two elements is associative. That means, for example, that (xy)x = x(yx). If you fix x and y, and look at all the octonions you can form by adding, multiplying, conjugating, and inverting these elements, as well as multiplying them by a real number, you get a set of octonions for which multiplication is associative.

In fact, the subalgebra generated by two octonions is isomorphic to either the real numbers, the complex numbers, or the quaternions, depending on the two octonions you start with.
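The Cayley-Dickson doubling formula, (a, b)(c, d) = (ac − d*b, da + bc*) in one common sign convention, is easy to implement recursively, and lets you check non-associativity and flexibility numerically. A sketch:

```python
def conj(x):
    # Cayley-Dickson conjugate: negate every imaginary component
    return [x[0]] + [-v for v in x[1:]]

def add(x, y): return [u + v for u, v in zip(x, y)]
def sub(x, y): return [u - v for u, v in zip(x, y)]

def mult(x, y):
    # (a, b)(c, d) = (ac - d*b, da + bc*), applied recursively
    if len(x) == 1:
        return [x[0] * y[0]]
    n = len(x) // 2
    a, b = x[:n], x[n:]
    c, d = y[:n], y[n:]
    return sub(mult(a, c), mult(conj(d), b)) + add(mult(d, a), mult(b, conj(c)))

def unit(i, dim=8):
    e = [0] * dim
    e[i] = 1
    return e

e1, e2, e4 = unit(1), unit(2), unit(4)

# Not associative: (e1 e2) e4 = e7 but e1 (e2 e4) = -e7
assert mult(mult(e1, e2), e4) == unit(7)
assert mult(e1, mult(e2, e4)) == [0, 0, 0, 0, 0, 0, 0, -1]

# But flexible: (xy)x = x(yx) for any x and y
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [8, 7, 6, 5, 4, 3, 2, 1]
assert mult(mult(x, y), x) == mult(x, mult(y, x))
```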

This was brought to my attention by a comment on a post on octonions from a few years ago. Someone pointed out that an equation I had written

x* = − (x + (e_1 x)e_1 + … + (e_7 x)e_7) / 6

could be written more simply as

x* = − (x + e_1 x e_1 + … + e_7 x e_7) / 6.

because each term only involves two distinct octonions.

Sedenions

The next step, multiplying 16-tuples of real numbers, gives the sedenions [1]. Now we lose even more structure. Multiplication is not commutative, not associative, and it’s possible for two non-zero numbers to have a zero product. That means the norm property

|| xy || = || x || · || y ||

goes out the window, since the left side can be zero when the right side is not.

Sedenions, and indeed all Cayley-Dickson algebras, are flexible, which means (xy)x = x(yx). But it’s not true more generally that the algebra generated by two sedenions is associative.

Trigintaduonions

The next rung in the Cayley-Dickson ladder is the family of 32-tuples known as the trigintaduonions [2]. The sedenions are a mess, and they’re a subset of the trigintaduonions, so the trigintaduonions are a mess. But at least they’re flexible.

Summary of properties

System            Dim  Commutative  Associative  Flexible  ||xy|| = ||x|| ||y||
-------------------------------------------------------------------------------
Reals               1  yes          yes          yes       yes
Complex numbers     2  yes          yes          yes       yes
Quaternions         4  no           yes          yes       yes
Octonions           8  no           no           yes       yes
Sedenions          16  no           no           yes       no
Trigintaduonions   32  no           no           yes       no


[1] From the Latin word sedecim for 16.

[2] From the Latin triginta for 30 and duo for 2.

Multiplying by quaternions on the left and right

The map that takes a quaternion x to the quaternion qx is linear, so it can be represented as multiplication by a matrix. The same is true of the map that takes x to xq, but the two matrices are not the same because quaternion multiplication does not commute.

Let q = a + bi + cj + dk and let _qM be the matrix that represents multiplication on the left by q. Then

_qM = \begin{bmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \\ \end{bmatrix}

Now let Mq be the matrix that represents multiplication on the right by q. Then

M_q = \begin{bmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \\ \end{bmatrix}

One can prove both matrix representations are correct by showing that they do the right thing when q = 1, i, j, and k. The rest follows by linearity.
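Here is a quick numerical check of both matrix representations, a sketch with quaternions written as 4-tuples (a, b, c, d):

```python
def qmul(p, q):
    # Hamilton product of quaternions (a, b, c, d) = a + bi + cj + dk
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return [a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2]

def matvec(M, x):
    return [sum(M[i][j]*x[j] for j in range(4)) for i in range(4)]

def left_mat(q):   # _qM: x -> qx
    a, b, c, d = q
    return [[a, -b, -c, -d],
            [b,  a, -d,  c],
            [c,  d,  a, -b],
            [d, -c,  b,  a]]

def right_mat(q):  # M_q: x -> xq
    a, b, c, d = q
    return [[a, -b, -c, -d],
            [b,  a,  d, -c],
            [c, -d,  a,  b],
            [d,  c, -b,  a]]

q, x = [1, 2, 3, 4], [5, 6, 7, 8]
assert matvec(left_mat(q), x) == qmul(q, x)   # qx
assert matvec(right_mat(q), x) == qmul(x, q)  # xq
```

Note also that left_mat(q) and right_mat(q) are visibly not transposes of each other, confirming the remark below.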

You might speculate that the matrix representation for multiplying on the right by q might be the transpose of the matrix representation for multiplying on the left by q. You can look at the matrices above and see that’s not the case.

In this post I talk about how to represent rotations with quaternions, and in this post I give an equation for the equivalent rotation matrix for a rotation described by a quaternion. You can prove that the matrix representation is correct by multiplying out _qM and M_{q*}. Keep in mind that q in that case is a unit quaternion, so the squares of its components sum to 1.


Matrix representations of number systems

The previous post discussed complex numbers, dual numbers, and double numbers. All three systems are constructed by adding some element to the real numbers that has some special algebraic property. The complex numbers are constructed by adding an element i such that i² = −1. The dual numbers add an element ε ≠ 0 with ε² = 0, and the double numbers are constructed by adding j ≠ 1 with j² = 1.

If adding special elements seems somehow illegitimate, there is an alternative way to define these number systems that may seem more concrete using 2 × 2 matrices. (A reader from 150 years ago would probably be more comfortable with appending special numbers than with matrices, but now we’re accustomed to matrices.)

The following mappings provide isomorphisms between complex, dual, and double numbers and their embeddings in the ring of 2 × 2 matrices.

\begin{align*} a + ib &\leftrightarrow \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \\ a + \varepsilon b &\leftrightarrow \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \\ a + jb &\leftrightarrow \begin{pmatrix} a & b \\ b & a \end{pmatrix} \\ \end{align*}

Because the mappings are isomorphisms, you can translate a calculation in one of these number systems into a calculation involving real matrices, then translate the result back to the original number system. This is conceptually interesting, but it could also be useful if you’re using software that supports matrices but does not directly support alternative number systems.

You can also apply the correspondences from right to left. If you need to carry out calculations on matrices of the special forms above, you could move over to complex (or dual, or double) numbers, do your algebra, then convert the result back to matrices.
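A quick check of all three correspondences, a sketch that multiplies two numbers both ways and compares:

```python
def mat(a, b, kind):
    # 2x2 representation: "complex" (i^2 = -1), "dual" (eps^2 = 0), "double" (j^2 = 1)
    if kind == "complex":
        return [[a, -b], [b, a]]
    if kind == "dual":
        return [[a, b], [0, a]]
    return [[a, b], [b, a]]

def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# (1 + 2i)(3 + 4i) = -5 + 10i
assert matmul(mat(1, 2, "complex"), mat(3, 4, "complex")) == mat(-5, 10, "complex")
# (1 + 2eps)(3 + 4eps) = 3 + 10 eps
assert matmul(mat(1, 2, "dual"), mat(3, 4, "dual")) == mat(3, 10, "dual")
# (1 + 2j)(3 + 4j) = 11 + 10j
assert matmul(mat(1, 2, "double"), mat(3, 4, "double")) == mat(11, 10, "double")
```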

Functions of a matrix

The previous post looked at variations on Euler’s theorem in complex, dual, and double numbers. You could verify these three theorems by applying exp, sin, cos, sinh, and cosh to matrices. In each case you define the function in terms of its power series and stick in matrices. You should be a little concerned about convergence, but it all works out.
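For example, summing the power series directly (a sketch with a hand-rolled expm2 helper, since these 2 × 2 examples converge quickly) recovers Euler's formula from the complex-number matrix and its dual-number analog:

```python
from math import cos, sin, factorial

def expm2(M, terms=25):
    # exp of a 2x2 matrix by summing its power series
    S = [[0.0, 0.0], [0.0, 0.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]  # current power M^k, starting with M^0 = I
    for k in range(terms):
        S = [[S[i][j] + P[i][j] / factorial(k) for j in range(2)] for i in range(2)]
        P = [[sum(P[i][t] * M[t][j] for t in range(2)) for j in range(2)] for i in range(2)]
    return S

theta = 0.7

# matrix representing i*theta: exp(i theta) = cos(theta) + i sin(theta)
E = expm2([[0.0, -theta], [theta, 0.0]])
assert abs(E[0][0] - cos(theta)) < 1e-9
assert abs(E[1][0] - sin(theta)) < 1e-9

# matrix representing eps*theta: exp(eps theta) = 1 + eps theta exactly,
# since the square of the dual matrix is zero
D = expm2([[0.0, theta], [0.0, 0.0]])
assert abs(D[0][0] - 1.0) < 1e-12 and abs(D[0][1] - theta) < 1e-12
```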

You should also be concerned about commutativity. Multiplication of real numbers is commutative, but multiplication of matrices is not, so you can’t just stick matrices into any equation derived for real numbers and expect it to hold. For example, it’s not true in general that exp(A + B) equals exp(A) exp(B). But it is true if the matrices A and B commute, and the special matrices that represent complex (or dual, or double) numbers do commute.
