Matrix representations of number systems

The previous post discussed complex numbers, dual numbers, and double numbers. All three systems are constructed by adding some element with a special algebraic property to the real numbers. The complex numbers are constructed by adding an element i such that i² = −1. The dual numbers add an element ε ≠ 0 with ε² = 0, and the double numbers are constructed by adding j ≠ ±1 with j² = 1.

If adding special elements seems somehow illegitimate, there is an alternative, perhaps more concrete, way to define these number systems using 2 × 2 matrices. (A reader from 150 years ago would probably be more comfortable with appending special numbers than with matrices, but now we’re accustomed to matrices.)

The following mappings provide isomorphisms between complex, dual, and double numbers and their embeddings in the ring of 2 × 2 matrices.

\begin{align*} a + ib &\leftrightarrow \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \\ a + \varepsilon b &\leftrightarrow \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \\ a + jb &\leftrightarrow \begin{pmatrix} a & b \\ b & a \end{pmatrix} \\ \end{align*}

Because the mappings are isomorphisms, you can translate a calculation in one of these number systems into a calculation involving real matrices, then translate the result back to the original number system. This is conceptually interesting, but it could also be useful if you’re using software that supports matrices but does not directly support alternative number systems.

You can also apply the correspondences from right to left. If you need to carry out calculations on matrices of the special forms above, you could move over to complex (or dual, or double) numbers, do your algebra, then convert the result back to matrices.
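As a quick sanity check (a sketch of my own, not from the post), we can multiply two complex numbers both directly and through their matrix representations and confirm the results agree:

```python
import numpy as np

def complex_matrix(a, b):
    # Matrix representation of the complex number a + ib
    return np.array([[a, -b], [b, a]])

# (1 + 2i)(3 + 4i) via matrix multiplication ...
M = complex_matrix(1, 2) @ complex_matrix(3, 4)
a, b = M[0, 0], M[1, 0]

# ... and directly as complex numbers
w = (1 + 2j) * (3 + 4j)

print(a, b, w)  # -5 10 (-5+10j)
```

The product matrix has the same special form, with the real part on the diagonal, so the result translates straight back to −5 + 10i.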

Functions of a matrix

The previous post looked at variations on Euler’s theorem in complex, dual, and double numbers. You could verify these three theorems by applying exp, sin, cos, sinh, and cosh to matrices. In each case you define the function in terms of its power series and stick in matrices. You should be a little concerned about convergence, but it all works out.

You should also be concerned about commutativity. Multiplication of real numbers is commutative, but multiplication of matrices is not, so you can’t just stick matrices into any equation derived for real numbers and expect it to hold. For example, it’s not true in general that exp(A + B) equals exp(A) exp(B). But it is true if the matrices A and B commute, and the special matrices that represent complex (or dual, or double) numbers do commute.
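To make this concrete, here is a small sketch (my own, using SciPy’s `expm`): exponentiating the matrix that represents iθ should give the matrix that represents cos θ + i sin θ, in line with Euler’s formula.

```python
import numpy as np
from scipy.linalg import expm

theta = 0.7

# Matrix representation of i*theta
M = np.array([[0.0, -theta], [theta, 0.0]])

# exp(i theta) = cos(theta) + i sin(theta), so exp(M) should be the
# matrix representing that complex number
E = expm(M)
expected = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

print(np.allclose(E, expected))  # True
```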

Related posts

Euler’s formula for dual numbers and double numbers

The complex numbers are formed by adding an element i to the real numbers such that i² = − 1. We can create other number systems by adding other elements to the reals.

One example is dual numbers. Here we add a number ε ≠ 0 with the property ε² = 0. Dual numbers have been used in numerous applications, most recently in automatic differentiation.

Another example is double numbers [1]. Here we add a number j ≠ ±1 such that j² = 1. (Apologies to electrical engineers and Python programmers. For this post, j is not the imaginary unit from complex numbers.)

(If adding special numbers to the reals makes you uneasy, see the next post for an alternative approach to defining these numbers.)

We can find analogs of Euler’s formula

\exp(i\theta) = \cos(\theta) + i \sin(\theta)

for dual numbers and double numbers by using the power series for the exponential function

\exp(z) = \sum_{k=0}^\infty \frac{z^k}{k!}

to define exp(z) in these number systems.

For dual numbers, the analog of Euler’s theorem is

\exp(\varepsilon x) = 1 + \varepsilon x

because all the terms in the power series after the first two involve powers of ε that evaluate to 0. Although this equation only holds for dual numbers, not real numbers, it is approximately true if ε is a small real number. This is the motivation for using ε as the symbol for the special number added to the reals: dual numbers can formalize calculations over the reals that are not formally correct.
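To see the computation in code, here is a minimal sketch (the `Dual` class and `exp_dual` function are hypothetical helpers of my own, not a standard library): applying the power series for exp to εx kills every term past the first two.

```python
class Dual:
    """A dual number a + εb."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)
    def __mul__(self, other):
        # (a + εb)(c + εd) = ac + ε(ad + bc), since ε² = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

def exp_dual(z, terms=20):
    # Plug a dual number into the power series for exp
    result, power, factorial = Dual(1.0), Dual(1.0), 1.0
    for k in range(1, terms):
        power = power * z
        factorial *= k
        result = result + Dual(power.a / factorial, power.b / factorial)
    return result

z = exp_dual(Dual(0.0, 2.0))  # exp(2ε)
print(z.a, z.b)               # 1.0 2.0, i.e. exp(2ε) = 1 + 2ε
```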

For double numbers, the analog of Euler’s theorem is

\exp(j x) = \cosh(x) + j \sinh(x)

and the proof is entirely analogous to the proof of Euler’s theorem for complex numbers: Write out the power series, then separate the terms involving even exponents from the terms involving odd exponents.
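The even/odd split is easy to check numerically. In this sketch (my own, not from the post), the even-exponent terms of the series sum to cosh(x) and the odd-exponent terms to sinh(x), since even powers of j equal 1 and odd powers equal j:

```python
from math import cosh, sinh, factorial

def exp_double(x, terms=25):
    # exp(jx): even powers of j are 1 (real part), odd powers are j
    real  = sum(x**k / factorial(k) for k in range(0, terms, 2))
    jpart = sum(x**k / factorial(k) for k in range(1, terms, 2))
    return real, jpart

r, j = exp_double(0.9)
print(abs(r - cosh(0.9)) < 1e-12, abs(j - sinh(0.9)) < 1e-12)  # True True
```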

Related posts

[1] Double numbers have also been called motors, hyperbolic numbers, split-complex numbers, spacetime numbers, …

Tricks for radix conversion by hand

The simplest trick for converting from one base to another is grouping. To convert between base b and base bk, group numbers in sets of k and convert one group at a time. To convert from binary to octal, for instance, group bits in sets of three, starting from the right end, and convert each group to octal.

11110010two → (11)(110)(010) → 362eight

For an example going the other direction, let’s convert 476 in base nine to base three.

476nine → (11)(21)(20) → 112120three
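The grouping trick is mechanical enough to sketch in a few lines of Python (a helper of my own; it assumes the target base is at most 10 so each group maps to a single digit character):

```python
def group_convert(digits, base, k):
    # Convert a digit string from base `base` to base `base**k` by
    # grouping k digits at a time, starting from the right
    padded = digits.zfill(-(-len(digits) // k) * k)   # pad on the left
    groups = [padded[i:i+k] for i in range(0, len(padded), k)]
    return "".join(str(int(g, base)) for g in groups)

print(group_convert("11110010", 2, 3))  # 362  (binary -> octal)
print(group_convert("112120", 3, 2))    # 476  (base three -> base nine)
```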

In general, conversion between bases is too tedious to do by hand, but one important case where it’s a little easier than it could be is converting between decimal and octal. In combination with the grouping trick above, this means you could, for example, convert between decimal and binary by first converting decimal to octal. Then the conversion from octal to binary is trivial.

The key to converting between decimal and octal is to exploit the fact that 10 = 8 + 2, so powers of 10 become powers of (8 + 2), or powers of 8 become powers of (10 − 2). These tricks are easier to carry out than to explain. You can find descriptions and examples in Knuth’s TAOCP, volume 2, section 4.4.
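For the octal-to-decimal direction, the 8 = 10 − 2 idea can be sketched as follows (my own toy version of the hand method): each step multiplies the running total by 10 and subtracts twice the total, which is multiplication by 8 done with decimal-friendly operations.

```python
def octal_to_decimal(octal_digits):
    n = 0
    for d in octal_digits:
        # 8*n + d, computed as 10*n - 2*n + d
        n = 10 * n - 2 * n + int(d)
    return n

print(octal_to_decimal("362"))  # 242
```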

Knuth cites a note by Walter Soden from 1953 as the first description of the trick for converting octal to decimal.

The trick for moving between base 9 and base 10 (or by grouping, between base 3 and base 10) is simpler and left as an exercise by Knuth. (Problem 12 in section 4.4, with a solution given at the back of the volume.)

Related posts

Double rounding

The previous post started by saying that rounding has a surprising amount of detail. An example of this is double rounding: if you round a number twice, you might not get the same result as if you rounded directly to the final precision.

For example, let’s say we’ll round numbers ending in 0, 1, 2, 3, or 4 down, and numbers ending in 5, 6, 7, 8, or 9 up. Then if we have a number like 123.45 and round it to one decimal place we have 123.5, and if we round that to an integer we have 124. But if we had rounded 123.45 directly to an integer we would have gotten 123. This is not a mere curiosity; it comes up fairly often and has been an issue in lawsuits.
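Python’s `decimal` module makes the example easy to reproduce. This sketch uses `ROUND_HALF_UP`, which matches the round-5-through-9-up rule above:

```python
from decimal import Decimal, ROUND_HALF_UP

x = Decimal("123.45")

# Round directly to an integer
once = x.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Round to one decimal place, then to an integer
twice = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
twice = twice.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

print(once, twice)  # 123 124
```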

The double rounding problem cannot happen in odd bases. So, for example, if you have some fraction represented in base 7 and you round it first from three figures past the radix point to two, then from two to one, you’ll get the same result as if you directly rounded from three figures to one. Say we start with 4231.243seven. If we round it to two places we get 4231.24seven, and if we round again to one place we get 4231.3seven, the same result we would get by rounding directly from three places to one.

The reason this works is that you cannot represent ½ by a finite expression in an odd base.

A magical land where rounding equals truncation

Rounding numbers has a surprising amount of detail. It may seem trivial but, as with most things, there is a lot more to consider than is immediately obvious. I expect there have been hundreds if not thousands of pages devoted to rounding in IEEE journals.

An example of the complexity of rounding is what William Kahan called The Tablemaker’s Dilemma: there is no way in general to know in advance how accurately you’ll need to compute a number in order to round it correctly.

Rounding can be subtle in any number system, but there is an alternative number system in which it is a little simpler than in base 10. It’s base 3, but with a twist. Instead of using 0, 1, and 2 as “digits”, we use −1, 0, and 1. This is known as the balanced ternary system: ternary because of base 3, and balanced because the digits are symmetrical about 0.

We need a symbol for −1. A common and convenient choice is to use T. Think of moving the minus sign from in front of a 1 to on top of it. Now we could denote the number of hours in a day as 10T0 because

1 \times 3^3 + 0 \times 3^2 + (-1)\times 3 + 0 = 24
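A short script (my own sketch; the function name is made up) converts integers to balanced ternary and confirms the representation of 24:

```python
def balanced_ternary(n):
    # Balanced ternary representation of an integer, with T for -1
    if n == 0:
        return "0"
    digits = ""
    while n != 0:
        r = (n + 1) % 3 - 1            # remainder in {-1, 0, 1}
        digits = {1: "1", 0: "0", -1: "T"}[r] + digits
        n = (n - r) // 3
    return digits

print(balanced_ternary(24))  # 10T0
```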

A more formal way of describing the balanced ternary representation of a number x is as a set of coefficients tk such that

x = \sum_{k=-\infty}^\infty t_k 3^k

with the restriction that each tk is in the set {−1, 0, 1}.

Balanced ternary representation has many interesting properties. For example, positive and negative numbers can all be represented without a minus sign. See, for example, Brian Hayes’ excellent article Third Base. The property we’re interested in here is that to round a balanced ternary number to the nearest integer, you simply lop off the fractional part. Rounding is the same as truncation. To see this, note that the largest possible fractional part is a sequence of all 1s, which represents ½:

\frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \cdots = \frac{1}{2}

Similarly, the most negative possible fractional part is a sequence of all Ts, which represents −½. So unless the fractional part is exactly equal to ½, truncating the fractional part rounds to the nearest integer. If the fractional part is exactly ½ then there is no nearest integer but two integers that are equally near.
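Here is a numerical sketch of that claim (my own code, with made-up function names): extract the balanced ternary digits of a number, drop everything past the radix point, and compare with ordinary round-to-nearest.

```python
def bt_digits(n):
    # Balanced ternary digits of an integer, most significant first
    if n == 0:
        return [0]
    d = []
    while n != 0:
        r = (n + 1) % 3 - 1            # digit in {-1, 0, 1}
        d.append(r)
        n = (n - r) // 3
    return d[::-1]

def truncate_bt(x, p=12):
    # Represent x with p balanced ternary places after the point,
    # then drop the fractional digits and evaluate what remains
    n = round(x * 3**p)
    d = bt_digits(n)
    int_digits = d[:-p] if len(d) > p else [0]
    val = 0
    for t in int_digits:
        val = 3 * val + t
    return val

for x in [2.4, -2.4, 0.49, 0.51, 10.7]:
    assert truncate_bt(x) == round(x)
print("truncation = round to nearest")
```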

Related posts

Duplicating Hankel plot from A&S

Abramowitz and Stegun has quite a few intriguing plots. This post will focus on the following plot, Figure 9.4, available here.

A&S figure 9.4

We will explain what the plot is and approximately reproduce it.

The plot comes from the chapter on Bessel functions, but the caption says it is a plot of the Hankel function H0(1). Why a plot of a Hankel function and not a Bessel function? The Hankel functions are linear combinations of the Bessel functions of the first and second kind:

H0(1) = J0 + i Y0

More on Hankel functions and their relation to Bessel functions here.
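We can verify this relation numerically with SciPy (a quick check of my own):

```python
import numpy as np
from scipy.special import hankel1, jv, yv

# H0^(1)(x) = J0(x) + i Y0(x)
x = 2.0
lhs = hankel1(0, x)
rhs = jv(0, x) + 1j * yv(0, x)
print(np.isclose(lhs, rhs))  # True
```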

The plot is the overlay of two kinds of contour plots: one for lines of constant magnitude and one for lines of constant phase. That is, if the function values are written in the form reiθ then one plot shows lines of constant r and one plot shows lines of constant θ.

We can roughly reproduce the plot of magnitude contours with the following Mathematica command:

ContourPlot[Abs[HankelH1[0, x + I y]], {x, -4, 2 }, {y, -1.5 , 1.5 }, 
 Contours -> 20, ContourShading -> None, AspectRatio -> 1/2]

This produces the following plot.

Absolute value contour

Similarly, we can replace Abs with Arg in the Mathematica command and increase Contours to 30 to obtain the following phase contour plot.

Phase contour

Finally, we can stack the two plots on top of each other using Mathematica’s Show command.

Magnitude and phase contours
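If you’d rather stay in Python, a rough analog of the combined plot can be sketched with SciPy and Matplotlib (my own translation, not the post’s code; the contour levels and colors are chosen by eye):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")                  # render off-screen
import matplotlib.pyplot as plt
from scipy.special import hankel1

x = np.linspace(-4, 2, 400)
y = np.linspace(-1.5, 1.5, 200)
X, Y = np.meshgrid(x, y)
H = hankel1(0, X + 1j * Y)             # Hankel function on a complex grid

fig, ax = plt.subplots(figsize=(8, 4))
ax.contour(X, Y, np.abs(H), levels=20, colors="black")   # constant magnitude
ax.contour(X, Y, np.angle(H), levels=30, colors="blue")  # constant phase
ax.set_aspect("equal")
fig.savefig("hankel_contours.png")
```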

By the way, you can clearly see the branch cut in the middle. The Hankel function is continuous (even analytic) as you move clockwise from the second quadrant around to the third, but it is discontinuous across the negative real axis because of the branch cut.

Related posts

Area of a quadrilateral from the lengths of its sides

Last week Heron’s formula came up in the post An Unexpected Triangle. Given the lengths of the sides of a triangle, there is a simple expression for the area of the triangle.

A = \sqrt{s(s-a)(s-b)(s-c)}

where the sides are a, b, and c and s is the semiperimeter, half the perimeter.

Is there an analogous formula for the area of a quadrilateral? Yes and no. If the quadrilateral is cyclic, meaning there exists a circle going through all four of its vertices, then Brahmagupta’s formula for the area of a quadrilateral is a direct generalization of Heron’s formula for the area of a triangle. If the sides of the cyclic quadrilateral are a, b, c, and d, then the area of the quadrilateral is

A = \sqrt{(s-a)(s-b)(s-c)(s-d)}

where again s is the semiperimeter.

But in general, the area of a quadrilateral is not determined by the lengths of its sides alone. There is a more general expression, Bretschneider’s formula, that expresses the area of a general quadrilateral in terms of the lengths of its sides and the sum of two opposite angles. (Either pair of opposite angles leads to the same value.)

A = \sqrt {(s-a)(s-b)(s-c)(s-d) - abcd \, \cos^2 \left(\frac{\alpha + \gamma}{2}\right)}

In a cyclic quadrilateral, the opposite angles α and γ add up to π, and so the cosine term drops out.
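Here is a quick numerical sketch of Bretschneider’s formula (the function and its `alpha_plus_gamma` parameter, the sum of a pair of opposite angles in radians, are my own naming). For a rectangle, which is cyclic, the cosine term vanishes and we recover length × width:

```python
import numpy as np

def bretschneider(a, b, c, d, alpha_plus_gamma):
    s = (a + b + c + d) / 2
    return np.sqrt((s - a)*(s - b)*(s - c)*(s - d)
                   - a*b*c*d*np.cos(alpha_plus_gamma / 2)**2)

# A 2-by-1 rectangle: opposite angles sum to pi, area is 2
print(bretschneider(2, 1, 2, 1, np.pi))  # 2.0
```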

The contrast between the triangle and the quadrilateral touches on an area of math called distance geometry. At first the term may sound redundant. Isn’t geometry all about distances? Well, no. It is also about angles. Distance geometry seeks results, like Heron’s formula, that depend only on distances.

Related posts

Entering Russian characters in Vim with digraphs

The purpose of this post is to expand on the following sentence [1]:

Russian letters are created by entering [Ctrl-k followed by] a corresponding Latin letter followed by an equals sign =, or, in a few places, a percent sign %.

The Russian alphabet has 33 letters, so there couldn’t be a Latin letter for every Russian letter. Also, there are Latin letters that don’t have a Russian counterpart and vice versa. So the mapping can’t be simple. But still, the above summary is nice: try control-k followed by the English analog and the equal sign. If that doesn’t work, try a percent sign instead.

Which Latin letters does Vim choose as corresponding to Russian letters? Does it go by sound or appearance? For example, the Russian letter Н looks like a Latin H but sounds like a Latin N. Vim goes by sound. You would enter the Russian letter Н by typing Ctrl-k N =.

For full details, see the Vim documentation :h digraph-table. I give a simplified excerpt from the documentation below. I just look at capital letters because the lower case letters are analogous. All the official Unicode names begin with CYRILLIC CAPITAL LETTER and so I cut that part out.

char    digraph hex     official name 
А       A=      0410    A
Б       B=      0411    BE
В       V=      0412    VE
Г       G=      0413    GHE
Д       D=      0414    DE
Е       E=      0415    IE
Ё       IO      0401    IO
Ж       Z%      0416    ZHE
З       Z=      0417    ZE
И       I=      0418    I
Й       J=      0419    SHORT I
К       K=      041A    KA
Л       L=      041B    EL
М       M=      041C    EM
Н       N=      041D    EN
О       O=      041E    O
П       P=      041F    PE
Р       R=      0420    ER
С       S=      0421    ES
Т       T=      0422    TE
У       U=      0423    U
Ф       F=      0424    EF
Х       H=      0425    HA
Ц       C=      0426    TSE
Ч       C%      0427    CHE
Ш       S%      0428    SHA
Щ       Sc      0429    SHCHA
Ъ       ="      042A    HARD SIGN
Ы       Y=      042B    YERU
Ь       %"      042C    SOFT SIGN
Э       JE      042D    E
Ю       JU      042E    YU
Я       JA      042F    YA

Note that the end of the alphabet is more complicated than simply using a Latin letter and either an equal or percent sign. Also, the table is in alphabetical order, which doesn’t quite correspond to Unicode numerical order because of a quirk with the letter Ё (U+0401) explained here.

[1] Arnold Robbins and Elbert Hannah. Learning the vi & Vim Editors, 8th edition

Chebyshev and Russian transliteration

It’s not simple to transliterate Russian names to English. Sometimes there is a unique mapping, or at least a standard mapping, of a particular name, but often there is not.

An example that comes up frequently in mathematics is Pafnuty Lvovich Chebyshev (1821–1894). This Russian mathematician’s name Пафну́тий Льво́вич Чебышёв has been transliterated as Tchebichef, Tchebychev, Tchebycheff, Tschebyschev, Tschebyschef, Tschebyscheff, Čebyčev, Čebyšev, Chebysheff, Chebychov, Chebyshov, etc.

The American Mathematical Society has settled on “Chebyshev” as its standard, and this is now common in English mathematical writing. But things named after Chebyshev, such as Chebyshev polynomials, are often denoted with a T because the French prefer “Tchebyshev.”

There is an ISO standard, ISO 9, for transliterating Cyrillic characters into Latin characters. Under this standard, Чебышёв becomes Čebyšëv. This maps Cyrillic into Latin characters with diacritical marks but not into ASCII. The AMS realized that the vast majority of Americans would not type Čebyšëv into a search bar, for example, and chose Chebyshev instead.

Related posts

An unexpected triangle

Let J(x) be the function plotted below.

This is the Bessel function J1, but we drop the subscript because it’s the only Bessel function we’re interested in for this post. You can think of J as a sort of damped sine.

We can create versions of J with different frequencies by multiplying the argument x by different constants. Suppose we create versions with three different frequencies — J(ax), J(bx), and J(cx) — and integrate their product. This defines a function f of the frequencies.

f(a, b, c) = \int_0^\infty J(ax) \, J(bx) \, J(cx)\, dx

We can evaluate the integral defining f using Sonine’s formula [1]

f(a, b, c) = \frac{2\Delta(a, b, c)}{\pi abc}

where Δ(a, b, c) is the area of a triangle with sides a, b, c, if such a triangle exists, and zero otherwise.

It’s amazing that this formula takes three parameters with no intrinsic geometric meaning and out pops the area of a triangle with such sides.

Numerical (mis)application

It would be ill-advised, but possible, to invert Sonine’s formula and use it to find the area of a triangle; Heron’s formula would be a far better choice. But just for fun, we’ll take the ill-advised route.

from numpy import pi, sqrt, inf
from scipy.special import j1
from scipy.integrate import quad

def heron(a, b, c):
    s = (a + b + c)/2
    return sqrt(s*(s-a)*(s-b)*(s-c))

def g(a, b, c):
    integrand = lambda x: j1(a*x) * j1(b*x) * j1(c*x)
    i, _ = quad(integrand, 0, inf, limit=500)
    return i

def area(a, b, c):
    return g(a, b, c)*pi*a*b*c/2

print(area(3,4,5), heron(3,4,5))

SciPy’s quad function has difficulty with the integration, and rightfully issues a warning. The code increases the limit parameter from the default value of 50 to 500, improving the accuracy but not eliminating the warning. The area function computes the area of a 3-4-5 triangle to be 5.9984 and the heron function computes the exact value 6.

Update: I tried the numerical integration above in Mathematica and it correctly computed the integral to 10 decimal places with no help. I suspect it is detecting the oscillatory nature of the integrand and using Levin’s integration method; when I explicitly set the integration method to Levin’s method, I get the same result.

Impossible triangles

You could calculate the area of a triangle from Heron’s theorem or from Sonine’s theorem. The results are identical when the parameters a, b, and c form a valid triangle. But the two approaches diverge when a, b, and c do not form a triangle. If you pass in parameters like (3, 1, 1) then Heron’s theorem gives a complex number and Sonine’s theorem yields zero.
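A quick sketch of the contrast, using `cmath.sqrt` so the impossible case returns a complex number instead of a NaN (the function name is my own):

```python
from cmath import sqrt

def heron_c(a, b, c):
    s = (a + b + c) / 2
    return sqrt(s * (s - a) * (s - b) * (s - c))

print(heron_c(3, 4, 5))  # (6+0j): a valid triangle
print(heron_c(3, 1, 1))  # purely imaginary: no such triangle exists
```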

Related posts

[1] Discovered by Nikolay Yakovlevich Sonin (1849–1915). The formula is usually referred to as Sonine’s formula rather than Sonin’s formula, as far as I’ve seen. This variation is minor compared to what transliteration does to other Russian names.

Sonine’s formula is more general, extending to Bessel functions Jν with Re(ν) > ½. I chose ν = 1 for this post because Sonine’s formula is simplest in this case.