Inverse Gray code

The previous post looked at Gray code, a way of encoding integers so that the encodings of consecutive integers differ in only one bit. This post will look at how to compute the inverse of Gray code.

The Gray code of a non-negative integer n is given by

    def gray(n):
        return n ^ (n >> 1)

Bit-level operations

In case you’re not familiar with the notation, the >> operator shifts the bits of its argument. The code above shifts the bits of n one place to the right. In the process, the least significant bit falls off the end. We could replace n >> 1 with n // 2 if we like, i.e. integer division by 2, rounding down if n is odd. The ^ operator stands for XOR, exclusive OR. A bit of x ^ y is 0 if both corresponding bits in x and y are the same, and 1 if they are different.
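For example, here's a quick illustration of these operators at the Python prompt, using the gray function above:

    >>> n = 6
    >>> bin(n), bin(n >> 1)    # shifting 110 right one place gives 11
    ('0b110', '0b11')
    >>> bin(n ^ (n >> 1))      # XOR: 110 ^ 011 = 101
    '0b101'
    >>> gray(6)
    5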

Computing the inverse

The inverse of Gray code is a little more complicated. If we assume n < 2³², then we can compute the inverse Gray code of n by

    def inverse_gray32(n):
        assert 0 <= n < 2**32
        n = n ^ (n >> 1)
        n = n ^ (n >> 2)
        n = n ^ (n >> 4)
        n = n ^ (n >> 8)
        n = n ^ (n >> 16)
        return n

For n of any size, we can compute its inverse Gray code by

    def inverse_gray(n):
        x = n
        e = 1
        while x:
            x = n >> e
            e *= 2
            n = n ^ x
        return n

If n is a 32-bit integer, inverse_gray32 is potentially faster than inverse_gray because of the loop unrolling.
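As a quick sanity check (a small sketch, separate from the proof in the next section), we can verify that gray and inverse_gray undo each other for small inputs:

    for n in range(16):
        assert inverse_gray(gray(n)) == n
        assert gray(inverse_gray(n)) == n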

Plots

Here’s a plot of the Gray code function and its inverse.

Proof via linear algebra

How do we know that what we’re calling the inverse Gray code really is the inverse? Here’s a proof for 32-bit integers n.

    def test_inverse32():
        for i in range(32):
            x = 2**i
            assert(inverse_gray32(gray(x)) == x)
            assert(gray(inverse_gray32(x)) == x)

How is that a proof? Wouldn’t you need to try all possible 32-bit integers if you wanted a proof by brute force?

If we think of 32-bit numbers as vectors in a 32-dimensional vector space over the binary field, addition is defined by XOR. So XOR is linear by definition. It’s easy to see that shifts are also linear, and the composition of linear functions is linear. This means that gray and inverse_gray32 are linear transformations. If the two linear transformations are inverses of each other on the elements of a basis, they are inverses everywhere. The unit basis vectors in our vector space are simply the powers of 2.

Matrix representation

Because Gray code and its inverse are linear transformations, they can be defined by matrix multiplication (over the binary field). So we could come up with 32 × 32 binary matrices for Gray code and its inverse. Matrix multiplication would give us a possible, but inefficient, way to implement these functions. Alternatively, you could think of the code above as clever ways to implement multiplication by special matrices very efficiently!

OK, so what are the matrices? For n-bit numbers, the matrix giving the Gray code transformation has dimension n by n. It has 1’s on the main diagonal, and on the diagonal just below the main diagonal, and 0s everywhere else. The inverse of this matrix, the matrix for the inverse Gray code transformation, has 1s on the main diagonal and everywhere below.

Here are the matrices for n = 4.

\begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}

The matrix on the left is for Gray code, the next matrix is for inverse Gray code, and the last matrix is the identity. NB: The equation above only holds when you’re working over the binary field, i.e. addition is carried out mod 2, so 1 + 1 = 0.

To transform a number, represent it as a vector of length n, with the most significant bit in the first component, and multiply by the appropriate matrix.
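Here's a small sketch of that, assuming NumPy, using the most-significant-bit-first convention above and the gray and inverse_gray functions defined earlier. It builds the two matrices, checks that their product is the identity mod 2, and checks that multiplying by them agrees with the bit-twiddling code.

    import numpy as np

    def gray_matrix(n):
        # 1s on the main diagonal and the diagonal just below it
        return (np.eye(n, dtype=int) + np.eye(n, k=-1, dtype=int)) % 2

    def inverse_gray_matrix(n):
        # 1s on the main diagonal and everywhere below
        return np.tril(np.ones((n, n), dtype=int))

    def to_bits(x, n):
        # most significant bit in the first component
        return np.array([(x >> (n - 1 - i)) & 1 for i in range(n)])

    def from_bits(b):
        return int("".join(str(bit) for bit in b), 2)

    n = 8
    G = gray_matrix(n)
    Ginv = inverse_gray_matrix(n)
    assert np.array_equal(G @ Ginv % 2, np.eye(n, dtype=int))

    for x in range(2**n):
        assert from_bits(G @ to_bits(x, n) % 2) == gray(x)
        assert from_bits(Ginv @ to_bits(x, n) % 2) == inverse_gray(x)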

Relation to popcount

It’s easy to see by induction that a number is odd if and only if its Gray code has an odd number of 1s. The number 1 is its own Gray code, and as we move from the Gray code of n to the Gray code of n+1 we change one bit, so we change the parity of the number of 1s.

There’s a standard C function popcount that counts the number of 1s in a number’s binary representation, and the last bit of the popcount is the parity of the number of 1s. I blogged about this before here. If you look at the code at the bottom of that post, you’ll see that it’s the same as inverse_gray32.

The code in that post works because you can compute whether a word has an odd or even number of 1s by testing whether it is the Gray code of an odd or even number.
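For example, here’s a sketch of computing the parity of the number of 1 bits via the inverse Gray code function defined above, checked against Python’s bin:

    def parity(n):
        # last bit of the inverse Gray code is the parity of popcount(n)
        return inverse_gray(n) & 1

    for n in range(1000):
        assert parity(n) == bin(n).count("1") % 2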

Permutable polynomials

Two polynomials p(x) and q(x) are said to be permutable if

p(q(x)) = q(p(x))

for all x. It’s not hard to see that Chebyshev polynomials are permutable.

First,

T_n(x) = \cos(n \arccos x)

where T_n is the nth Chebyshev polynomial. You can take this as a definition, or if you prefer another approach to defining the Chebyshev polynomials it’s a theorem.

Then it’s easy to show that

T_m(T_n(x)) = T_{mn}(x).

because

\cos(m \arccos(\cos(n \arccos x))) = \cos(mn \arccos x).

Then the polynomials T_m and T_n must be permutable because

T_m(T_n(x)) = T_{mn}(x) = T_n(T_m(x))

for all x.
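Here’s a quick numerical check of this (a sketch, assuming NumPy’s Chebyshev class; it verifies the identity at many points rather than proving it):

    import numpy as np
    from numpy.polynomial.chebyshev import Chebyshev

    m, n = 3, 5
    Tm, Tn, Tmn = Chebyshev.basis(m), Chebyshev.basis(n), Chebyshev.basis(m*n)

    x = np.linspace(-1, 1, 1001)
    assert np.allclose(Tm(Tn(x)), Tn(Tm(x)))   # T_m and T_n commute
    assert np.allclose(Tm(Tn(x)), Tmn(x))      # and both compositions equal T_mn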

There’s one more family of polynomials that are permutable, and that’s the power polynomials x^k. They are trivially permutable because

(x^m)^n = (x^n)^m.

It turns out that the Chebyshev polynomials and the power polynomials are essentially [1] the only sequences of permutable polynomials.

[1] Here’s what “essentially” means. A set of polynomials, at least one of each positive degree, that all permute with each other is called a chain. Two polynomials p and q are similar if there is an affine polynomial

λ(x) = ax + b

such that

p(x) = \lambda^{-1}(q(\lambda(x))).

Then any permutable chain is similar to either the power polynomials or the Chebyshev polynomials. For a proof, see Chebyshev Polynomials by Theodore Rivlin.

Conceptual vs Numerical

One of the things that makes numerical computation interesting is that it often reverses the usual conceptual order of things, using advanced math to compute things that are introduced earlier.

Here’s an example I recently ran across [1]. The hyperbolic functions are defined in terms of the exponential function:

\begin{align*} \sinh(x) &= \frac{\exp(x) - \exp(-x)}{2} \\ \cosh(x) &= \frac{\exp(x) + \exp(-x)}{2} \\ \tanh(x) &= \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)} \\ \end{align*}

But it may be more efficient to compute the exponential function in terms of hyperbolic functions. Specifically,

\exp(x) = \sinh(x) + \sqrt{\sinh(x)^2 + 1}

Why would you want to compute the simple thing on the left in terms of the more complicated thing on the right? Because hyperbolic sine is an odd function. This means that half its power series coefficients are zero: an odd function only has odd powers in its power series.

Suppose you need to compute the exponential function in an environment where you only have a limited number of registers to store constants. You can get more accuracy out of the same number of registers by using them to store coefficients in the power series for hyperbolic sine than for exp.

If we store n coefficients for sinh, we can include powers of x up to 2n − 1. And since the coefficient of x^{2n} is zero, the error in our Taylor approximation is O(x^{2n+1}). See this post for more on this trick.

If we stored n coefficients for exp, we could include powers of x up to n − 1, and our error would be O(x^n).

To make things concrete, suppose n = 3 and x = 0.01. The error in the direct exp approximation would be on the order of 10⁻⁶, but the error in the approximation via sinh would be on the order of 10⁻¹⁴. That is, computing exp directly gives almost single precision accuracy, while computing it via sinh gives almost double precision accuracy.

(I’m glossing over a detail here. Would you really need to store, for example, the 1 coefficient in front of x in either series? For simplicity in the argument above I’ve implicitly said yes. Whether you’d actually need to store it depends on the details of your implementation.)

The error estimate above uses big-oh arguments. Let’s do an actual calculation to see what we get.

    from math import exp, sinh, sqrt
    
    def exp_taylor(x):
        return 1 + x + 0.5*x**2
    
    def sinh_taylor(x):
        return x + x**3/6 + x**5/120
    
    def exp_via_sinh(x):
        s = sinh_taylor(x)
        return s + sqrt(s*s + 1)
    
    def print_error(approx, exact):
        print(f"Computed: {approx}")
        print(f"Exact:    {exact}")
        print(f"Error:    {approx - exact}")
    
    x = 0.01
    approx1 = exp_taylor(x)
    approx2 = exp_via_sinh(x)
    exact   = exp(x)
    
    print("Computing exp directly:\n")
    print_error(approx1, exact)
    
    print()
    
    print("Computing exp via sinh:\n")
    print_error(approx2, exact)

This produces

    Computing exp directly:

    Computed: 1.0100500000000001
    Exact:    1.010050167084168
    Error:    -1.6708416783473012e-07

    Computing exp via sinh:

    Computed: 1.0100501670841682
    Exact:    1.010050167084168
    Error:    2.220446049250313e-16

Our errors are roughly what we expected from our big-oh arguments.

[1] I saw this identity in Matters Computational: Ideas, Algorithms, Source Code by Jörg Arndt.

Banned math book

Courant & Hilbert is a classic applied math textbook, still in print nearly a century after the first edition came out. The actual title of the book is Methods of Mathematical Physics, but everyone calls it Courant & Hilbert after the authors, Richard Courant and David Hilbert. I was surprised to find out recently that this was once a banned book. How could there be anything controversial about a math book? It doesn’t get into any controversial applications of math; it applies math to physics problems, but doesn’t apply the physics to anything in particular.

The book was first published in Germany in 1924 under the title Methoden der mathematischen Physik. Courant says in the preface

… I had been forced to leave Germany and was fortunate and grateful to be given the opportunities open in the United States. During the Second World War the German book became unavailable and was even suppressed by the National Socialist rulers of Germany. The survival of the book was secured when the United States Government seized the copyright and licensed a reprint issued by Interscience Publishers.

Courant’s language is remarkably restrained under the circumstances.

I wondered why the book was banned. Was Courant Jewish? I’d never considered this before, because I couldn’t care less about the ethnicity of authors. Jew or Greek, bond or free, male or female, I just care about their content. The Nazis, however, did care. According to his Wikipedia biography, Courant fled Germany not because of his Jewish ancestry but because of his affiliation with the wrong political party.

***

I never had Courant & Hilbert as a textbook, but I was familiar with it as a student. I vaguely remember that the library copy was in high demand and that I considered buying a copy, though it was too expensive for my means at the time. I recently bought a copy now that the book is cheaper and my means have improved.

I covered most of the material in Courant & Hilbert in graduate school, albeit in a more abstract form. As I mentioned the other day, my education was somewhat top-down; I learned about things first in an abstract setting and got down to particulars later, moving from soft analysis to hard analysis.

One quick anecdote along these lines. I read somewhere that David Hilbert was at a conference where someone referred to a Hilbert space and he asked the speaker what such a thing was. Hilbert’s work had motivated the definition of a Hilbert space, but Mr. Hilbert thought in more concrete terms.

Sinc approximation

If a function is smooth and has thin tails, it can be well approximated by sinc functions. These approximations are frequently used in applications, such as signal processing and numerical integration. This post will illustrate sinc approximation with the function exp(-x²).

The sinc approximation for a function f(x) is given by

f(x) \approx \sum_{j=-n}^n f(jh) \, \text{sinc}\left(\frac{x - jh}{h}\right)

where sinc(x) = sin(πx)/πx.

Do you get more accuracy from sampling more densely or by sampling over a wider range? You need to do both. As the number of sample points n increases, you want h to decrease something like 1/√n and the range to increase something like √n.

According to [1], the best trade-off between smaller h and larger n depends on the norm you use to measure approximation error. If you’re measuring error with the Lebesgue p-norm [2], choose h to be

h = (\pi/2) \sqrt{q/(2n)}

where q is the conjugate exponent to p, i.e.

\frac{1}{p} + \frac{1}{q} = 1

When p = 2, q is also 2. For the sup norm, p = ∞ and q = 1.

So the range between the smallest and largest sample will be

\pi \sqrt{nq/2}

Here’s a few plots to show how quickly the sinc approximations converge for exp(-x²). We’ll look at n = 1 (i.e. three sample points), n = 2 (five sample points), and n = 4 (nine sample points). For n > 1, the sinc approximations are so good that the plots are hard to distinguish. So we’ll show the plot of exp(-x²) and its approximation for n = 1, then show the error curves.

And now the error plots.

Note that the vertical scales are different in each subplot. The error for n = 4 is two orders of magnitude smaller than the error for n = 1.
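If you’d like to reproduce these plots, here’s a minimal sketch in Python (assuming NumPy, and using the sup-norm step size above, i.e. q = 1):

    import numpy as np

    def f(x):
        return np.exp(-x**2)

    def sinc_approx(f, x, n, q=1):
        h = (np.pi / 2) * np.sqrt(q / (2 * n))
        # np.sinc(t) is sin(pi t)/(pi t), matching the definition of sinc above
        return sum(f(j * h) * np.sinc((x - j * h) / h) for j in range(-n, n + 1))

    x = np.linspace(-4, 4, 2001)
    for n in (1, 2, 4):
        err = np.max(np.abs(f(x) - sinc_approx(f, x, n)))
        print(f"n = {n}: max error {err:.2e}")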

[1] Masaaki Sugihara. Near Optimality of the Sinc Approximation. Mathematics of Computation, Vol. 72, No. 242 (Apr., 2003), pp. 767-786.

[2] The reference above doesn’t use the p-norms on the real line but on a strip in the complex plane containing the real line, and with a weight function that penalizes errors exponentially in the tails.

Manipulating sums

This post is a list of five spinoffs of my previous post. Except for the last point it doesn’t build on the previous post per se, but I’ll use a sum from that post to illustrate five things:

  1. Putting multiple things under a summation sign in LaTeX
  2. Simplifying sums by generalizing binomial coefficients
  3. A bit of notation from Iverson
  4. Changing variables in a sum
  5. Chebyshev polynomials come up again.

Let’s get started. The equation I want to focus on is the following.

\cos n\theta + i\sin n\theta = \sum_{j=0}^n {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

Putting things under a sum in LaTeX

I said in the previous post that you could equate the real parts of the left and right side to show that cos nθ can be written as sums and products of cos θ and sin θ. To write this in LaTeX we’d say

\cos n\theta = \sum_{\substack{0 \leq j \leq n \\ j \text{ even}}} {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

The command that makes it possible to put two lines of stuff under the summation is \substack. Here’s the LaTeX code that produces the summation sign and the text below it.

    \sum_{\substack{0 \leq j \leq n \\ j \text{ even}}}

Binomial coefficients

We can simplify the sum by removing the limits on j and implicitly letting j run over all integers:

\cos n\theta = \sum_{ j \text{ even}} {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

This is because the binomial coefficient term is zero when j > n or j < 0. (See this post for an explanation of more general binomial coefficients.)

Indicator functions

It’s often helpful to turn a sum over a complicated region into a more complicated sum over a simpler region. That is, it’s often easier to deal with complications in the summand than in the domain of summation.

In our case, instead of summing over even integers, we could sum over all integers, if we multiply the summand by a function that is 0 on odd numbers and 1 on even numbers. That is, we multiply by the indicator function of the even integers. The indicator function of a set is 1 on that set and 0 everywhere else.

Kenneth Iverson’s notation uses a Boolean expression in brackets to indicate the function that is 1 if the condition is true and 0 otherwise. So [j even] means the function that is 1 when j is even and 0 when j is odd. So we could write our sum as follows.

\cos n\theta = \sum {n\choose j} i^j [j \text{ even}](\cos\theta)^{n-j} (\sin\theta)^j

Change of variables

We could get rid of the requirement that j is even by replacing j with 2k for a new variable k. Now our sum is

\cos n\theta = \sum {n\choose 2k} (-1)^k (\cos\theta)^{n-2k} (\sin\theta)^{2k}

Notice a couple of things. For one thing, we were able to write (−1)^k rather than i^{2k}.

More importantly, the change of variables was trivial because the sum runs over all integers. If we had explicit limits on j, we would have to change them to different explicit limits on k.

Changing limits of summation is error-prone. This happens a lot, for example, when computing power series solutions for differential equations, and there are mnemonics for reducing errors such as “limits and exponents move in opposite directions.” These complications go away when you sum over all integers.

Chebyshev strikes again

GlennF left a comment on the previous post to the effect that the sum we’ve been talking about reduces to a Chebyshev polynomial.

Since the powers of sin θ are all even, we can replace sin²θ with 1 – cos²θ and get the following.

\cos n\theta = \sum {n\choose 2k} (-1)^k (\cos\theta)^{n-2k} (1 - \cos^2\theta)^k

Now the right side is a polynomial in cos θ, call it P(cos θ). Then P = T_n, the nth Chebyshev polynomial, because as explained here, one way to define the Chebyshev polynomials is by

\cos n\theta = T_n(\cos\theta)

If you don’t like that definition, you could use another definition and the equation becomes a theorem.
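Here’s a quick numerical sanity check of this identity (a sketch using Python’s math module; it spot-checks the sum for one value of θ and several n):

    from math import comb, cos, isclose

    def cos_n_theta(n, theta):
        # the sum above; only terms with 0 <= 2k <= n are nonzero
        c = cos(theta)
        return sum(comb(n, 2*k) * (-1)**k * c**(n - 2*k) * (1 - c*c)**k
                   for k in range(n//2 + 1))

    theta = 0.7
    for n in range(1, 10):
        assert isclose(cos_n_theta(n, theta), cos(n*theta))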

Analogy between Fibonacci and Chebyshev

Quick observation: I recently noticed that Chebyshev polynomials and Fibonacci numbers have analogous formulas.

The nth Chebyshev polynomial satisfies

T_n(x) = \frac{(x + \sqrt{x^2-1})^n + (x - \sqrt{x^2-1})^n }{2}

for |x| ≥ 1, and the nth Fibonacci number is given by

F_n = \frac{(1 + \sqrt{5})^n - (1 - \sqrt{5})^n }{2^n\sqrt 5}

There’s probably a way to explain the similarity in terms of the recurrence relations that both sequences satisfy.
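Here’s a quick check of both closed forms against the recurrences they satisfy (a sketch; the numbers and polynomials involved are small enough that floating point is adequate):

    from math import sqrt, isclose

    def T(n, x):
        # closed form above, valid for |x| >= 1
        s = sqrt(x*x - 1)
        return ((x + s)**n + (x - s)**n) / 2

    def F(n):
        r5 = sqrt(5)
        return ((1 + r5)**n - (1 - r5)**n) / (2**n * r5)

    # Fibonacci: F_0 = 0, F_1 = 1, F_{n+1} = F_n + F_{n-1}
    fib = [0, 1]
    for _ in range(20):
        fib.append(fib[-1] + fib[-2])
    assert all(isclose(F(n), fib[n], abs_tol=1e-9) for n in range(len(fib)))

    # Chebyshev: T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)
    x = 1.5
    for n in range(1, 10):
        assert isclose(T(n + 1, x), 2*x*T(n, x) - T(n - 1, x))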

When is round-trip floating point radix conversion exact?

Suppose you store a floating point number in memory, print it out in human-readable base 10, and read it back in. When can the original number be recovered exactly?

D. W. Matula answered this question more generally in 1968 [1].

Suppose we start with base β with p places of precision and convert to base γ with q places of precision, rounding to nearest, then convert back to the original base β. Matula’s theorem says that if there are no positive integers i and j such that

β^i = γ^j

then a necessary and sufficient condition for the round-trip to be exact (assuming no overflow or underflow) is that

γ^(q-1) > β^p.

In the case of floating point numbers (type double in C) we have β = 2 and p = 53. (See Anatomy of a floating point number.) We’re printing to base γ = 10. No positive power of 10 is also a power of 2, so Matula’s condition on the two bases holds.

If we print out q = 17 decimal places, then

10^16 > 2^53

and so round-trip conversion will be exact if both conversions round to nearest. If q is any smaller, some round-trip conversions will not be exact.
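Here’s a quick empirical illustration of this in Python (not a proof; it just samples random doubles in [0, 1) and shows that 17 significant figures always round-trip while 16 sometimes don’t):

    import random

    random.seed(42)
    failures = 0
    for _ in range(100_000):
        x = random.random()
        assert float(f"{x:.17g}") == x     # q = 17: always recovers x
        if float(f"{x:.16g}") != x:        # q = 16: sometimes doesn't
            failures += 1
    print(failures, "of 100000 values failed to round-trip with 16 figures")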

You can also verify that for a single precision floating point number (p = 24 bits precision) you need q = 9 decimal digits, and for a quad precision number (p = 113 bits precision) you need q = 36 decimal digits [2].

Looking back at Matula’s theorem, clearly we need

γ^q ≥ β^p.

Why? Because the right side is the number of base β fractions and the left side is the number of base γ fractions. You can’t have a one-to-one map from a larger space into a smaller space. So the inequality above is necessary, but not sufficient. However, it’s almost sufficient. We just need one more base γ figure, i.e. Matula tells us

γ^(q-1) > β^p

is sufficient. In terms of base 2 and base 10, we need at least 16 decimals to represent 53 bits. The surprising thing is that one more decimal is enough to guarantee that round-trip conversions are exact. It’s not obvious a priori that any finite number of extra decimals is always enough, but in fact just one more is enough; there’s no “table maker’s dilemma” here.

Here’s an example to show the extra decimal is necessary. Suppose p = 5. There are more 2-digit numbers than 5-bit numbers, but if we only use two digits then round-trip radix conversion will not always be exact. For example, the number 17/16 written in binary is 1.0001₂, and has five significant bits. The decimal equivalent is 1.0625₁₀, which rounded to two significant digits is 1.1₁₀. But the nearest binary number to 1.1₁₀ with 5 significant bits is 1.0010₂ = 1.125₁₀. In short, rounding to nearest gives

1.0001₂ → 1.1₁₀ → 1.0010₂

and so we don’t end up back where we started.

[1] D. W. Matula. In-and-out conversions. Communications of the ACM, 11(1):47–50. January 1968. Cited in Handbook of Floating-point Arithmetic by Jean-Michel Muller et al.

[2] The number of bits allocated for the fractional part of a floating point number is 1 less than the precision: the leading figure is always 1, so IEEE formats save one bit by not storing the leading bit, leaving it implicit. So, for example, a C double has 53 bits precision, but 52 bits of the 64 bits in a double are allocated to storing the fraction.

Chebyshev approximation

In the previous post I mentioned that the Remez algorithm computes the best polynomial approximation to a given function f as measured by the maximum norm. That is, for a given n, it finds the polynomial p of order n that minimizes the absolute error

|| f – p ||.

The Mathematica function MiniMaxApproximation minimizes the relative error by minimizing

|| (f − p) / f ||.

As was pointed out in the comments to the previous post, Chebyshev approximation produces a nearly optimal approximation, coming close to minimizing the absolute error. The Chebyshev approximation can be computed more easily and the results are easier to understand.

To form a Chebyshev approximation, we expand a function in a series of Chebyshev polynomials, analogous to expanding a function in a Fourier series, and keep the first few terms. Like sines and cosines, Chebyshev polynomials are orthogonal functions, and so Chebyshev series are analogous to Fourier series. (If you find it puzzling to hear of functions being orthogonal to each other, see this post.)

Here is Mathematica code to find and plot the Chebyshev approximation to e^x over [-1, 1]. First, here are the coefficients.

    weight[x_] := 2/(Pi Sqrt[1 - x^2])
    c = Table[ 
        Integrate[ Exp[x] ChebyshevT[n, x] weight[x], {x, -1, 1}], 
        {n, 0, 5}]

The coefficients turn out to be exactly expressible in terms of Bessel functions, but typically you’d need to do a numerical integration with NIntegrate.

Now we use the Chebyshev coefficients to evaluate the Chebyshev approximation.

    p[x_] := Evaluate[c . Table[ChebyshevT[n - 1, x], {n, Length[c]}]] 
             - First[c]/2

You could see the approximating polynomial with

    Simplify[N[p[x]]]

which displays

    1.00004 + 1.00002 x + 0.499197 x^2 + 0.166489 x^3 + 0.0437939 x^4 + 
 0.00868682 x^5

The code

    Plot[Exp[x] - p[x], {x, -1, 1}]

shows the error in approximating the exponential function with the polynomial above.

Note that the plot has nearly equal ripples; the optimal approximation would have exactly equal ripples. The Chebyshev approximation is not optimal, but it is close. The absolute error is smaller than that of MiniMaxApproximation by a factor of about e.
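If you’d like to reproduce this outside Mathematica, here’s a rough Python analog (a sketch assuming NumPy and SciPy; the substitution x = cos θ turns the weighted integrals above into ordinary integrals over [0, π]):

    import numpy as np
    from scipy.integrate import quad

    # Chebyshev series coefficients of exp on [-1, 1]:
    # c_n = (2/pi) * integral over [0, pi] of exp(cos(theta)) cos(n theta) dtheta
    def cheb_coeff(f, n):
        integral, _ = quad(lambda t: f(np.cos(t)) * np.cos(n * t), 0, np.pi)
        return 2 * integral / np.pi

    c = [cheb_coeff(np.exp, n) for n in range(6)]

    def p(x):
        # truncated Chebyshev series; the n = 0 term gets half weight
        return c[0] / 2 + sum(c[n] * np.cos(n * np.arccos(x)) for n in range(1, 6))

    x = np.linspace(-1, 1, 1001)
    print(np.max(np.abs(np.exp(x) - p(x))))   # maximum absolute error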

There are bounds on how different the error can be between the best polynomial approximation and the Chebyshev series approximation. For polynomials of degree n, the Chebyshev series approximation error is never more than

4 + 4 log(n + 1)/π

times the error of the best polynomial approximation. See Theorem 16.1 in Approximation Theory and Approximation Practice by Lloyd N. Trefethen.

Generalization of power polynomials

A while back I wrote about the Mittag-Leffler function which is a sort of generalization of the exponential function. There are also Mittag-Leffler polynomials that are a sort of generalization of the powers of x; more on that shortly.

Recursive definition

The Mittag-Leffler polynomials can be defined recursively by M_0(x) = 1
and

M_{n+1}(x) = \frac{x}{2}\left(M_n(x+1) + 2M_n(x) + M_n(x-1) \right )

for n ≥ 0.

Contrast with orthogonal polynomials

This is an unusual recurrence if you’re used to orthogonal polynomials, which come up more often in application. For example, Chebyshev polynomials satisfy

T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)

and Hermite polynomials satisfy

 H_{n+1}(x) = x H_n(x) - n H_{n-1}(x)

as I used as an example here.

All orthogonal polynomials satisfy a three-term recurrence like this, where the value of each polynomial can be found from the values of the previous two polynomials.

Notice that with orthogonal polynomial recurrences the argument x doesn’t change, but the degrees of polynomials do. But with Mittag-Leffler polynomials the opposite is true: there’s only one polynomial on the right side, evaluated at three different points: x+1, x, and x-1.

Generalized binomial theorem

Here’s the sense in which the Mittag-Leffler polynomials generalize the power function. If we let p_n(x) = x^n be the power function, then the binomial theorem says

p_n(x+y) = \sum_{k=0}^n {n\choose k}\, p_{k}(x)\, p_{n-k}(y).

Something like the binomial theorem holds if we replace p_n with M_n:

M_n(x+y) = \sum_{k=0}^n {n\choose k}\, M_{k}(x)\, M_{n-k}(y).
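Here’s a sympy sketch (assuming sympy is available) that builds the first few Mittag-Leffler polynomials from the recurrence above and spot-checks this binomial-type formula:

    from sympy import symbols, expand, binomial, Integer

    x, y = symbols('x y')

    def M(n):
        # M_n(x) from the recurrence M_{n+1}(x) = (x/2)(M_n(x+1) + 2 M_n(x) + M_n(x-1))
        p = Integer(1)
        for _ in range(n):
            p = expand(x/2 * (p.subs(x, x + 1) + 2*p + p.subs(x, x - 1)))
        return p

    # Check M_n(x + y) = sum over k of C(n, k) M_k(x) M_{n-k}(y) for small n
    for n in range(6):
        lhs = expand(M(n).subs(x, x + y))
        rhs = expand(sum(binomial(n, k) * M(k) * M(n - k).subs(x, y) for k in range(n + 1)))
        assert (lhs - rhs).expand() == 0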
