Optimization, Dominoes, and Frankenstein

Robert Bosch has a new book coming out next week titled OPT ART: From Mathematical Optimization to Visual Design. This post will look at just one example from the book: creating images of Frankenstein’s monster [1] using dominoes.

Frankenstein made from 3 sets of double nine dominoes

The book includes two images, a low-resolution image made from 3 sets of dominoes and a high resolution image made from 48 sets. These are double nine dominoes rather than the more common double six dominoes because the former allow a more symmetric arrangement of dots. There are 55 dominoes in a double nine set. (Explanation and generalization here.)

Image of Frankenstein's monster made from 48 sets of double nine dominoes

The images are created by solving a constrained optimization problem. You basically want to use dominoes whose brightness is proportional to the brightness of the corresponding section of a photograph. But you can only use the dominoes you have, and you have to use them all.

You would naturally want to use double nines for the brightest areas since they have the most white spots and double blanks for the darkest areas. But you only have three of each in the first image, forty eight of each in the second, and you have to make do with the rest of the available pieces. So you solve an optimization problem to do the best you can.
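
To get a feel for the kind of problem being solved, here is a minimal sketch in Python. It is not Bosch's formulation: it ignores domino orientation and the contrast considerations discussed below, and the target brightnesses are made up. It simply assigns the 55 dominoes of one double nine set to 55 image cells so that total brightness mismatch is as small as possible, using SciPy's assignment solver.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # The 55 dominoes in a double nine set; brightness = total pips, scaled to [0, 1].
    dominoes = [(i, j) for i in range(10) for j in range(i, 10)]
    brightness = np.array([(i + j) / 18 for i, j in dominoes])

    # Stand-in for the brightness of 55 cells taken from a photograph.
    rng = np.random.default_rng(0)
    target = rng.random(55)

    # cost[d, c] = squared mismatch if domino d is assigned to cell c.
    cost = (brightness[:, None] - target[None, :])**2
    rows, cols = linear_sum_assignment(cost)
    print("total mismatch:", cost[rows, cols].sum())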

Each domino can be placed horizontally or vertically, as long as it fits the overall shape of the image. In the first image, you can see the orientation of the pieces if you look closely. Here’s a little piece from the top left corner.

Domino orientation

One double six is placed vertically in the top left. Next to it are another double six and a double five placed horizontally, etc.

There are additional constraints on the optimization problem related to contrast. You don’t just want to match the brightness of the photograph as well as you can, you also want to maintain contrast. For example, maybe a five-two domino would match the brightness of a couple adjacent squares in the photograph, but a seven-one would do a better job of making a distinction between contrasting regions of the image.

Bosch explains that the problem statement for the image using three sets of dominoes involved 34,265 variables and 385 domino constraints. It was solved in 29,242 iterations of a simplex algorithm, taking a little less than two seconds.

The optimization problem for the image created from 48 sets of dominoes required 572,660 variables and 5,335 constraints. The solution required 30,861 iterations of a simplex algorithm and just under one minute of computation.

Related posts

[1] In Mary Shelley’s novel Frankenstein, the doctor who creates the monster is Victor Frankenstein. The creature is commonly called Frankenstein, but strictly speaking that’s not his name. There’s a quote—I haven’t been able to track down its source—that says “Knowledge is knowing that Frankenstein wasn’t the monster. Wisdom is knowing that Frankenstein was the monster.” That is, the monster’s name wasn’t Frankenstein, but what Victor Frankenstein did was monstrous.

More on attacking combination locks

A couple weeks ago I wrote about how De Bruijn sequences can be used to attack locks where there is no “enter” key, i.e. the lock will open once the right symbols have been entered.

Here’s a variation on this theme: what about locks that let you press more than one button at a time? [1]

Lock that lets you push more than one button at a time

You could just treat this as if it were a keypad with more buttons. For example, suppose you can push either one or two buttons at a time on the lock pictured above. Then you could treat this as a lock with 15 buttons: the five actual buttons and 10 virtual buttons corresponding to the ten ways you could choose 2 buttons out of 5 to press at the same time.

Maybe a lock would let you press even more buttons at once. I don’t think I’ve ever used a combination that required more than two buttons at once, but in principle you could push up to five buttons at once in the photo above, for a total of 31 possible button combinations. And since 31 is prime, you could use the algorithm here to generate the corresponding De Bruijn sequence.
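
Here’s a quick sketch of the idea in Python. The button labels are arbitrary; the point is just that each chord, i.e. each nonempty subset of buttons pressed together, becomes one symbol in the alphabet a De Bruijn sequence would be built over.

    from itertools import combinations

    buttons = "12345"
    chords = [frozenset(c) for r in range(1, len(buttons) + 1)
                           for c in combinations(buttons, r)]
    print(len(chords))  # 31 = 2^5 - 1 possible button combinations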

If you know how long the password is, you can try the De Bruijn sequence for passwords of that length. But what if you don’t know the length of the password a priori? You could try the De Bruijn sequence for passwords of length 1, then length 2, then length 3, etc. But is there a more efficient way?

If there are k button combinations and the password has length n, then the optimal solution is to start entering a De Bruijn sequence B(k, n) until the lock opens. If you know the password has length no more than 4, then you could try a B(k, 4) sequence, and if the password is actually shorter than 4, say length 3, you’ll still open it.
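
Here’s a small check of that claim, using the standard Lyndon word (FKM) construction of a De Bruijn sequence; this is not necessarily the algorithm linked above, just one common way to generate B(k, n). With k = 2 and n = 4, every binary string of length 3 also shows up along the way.

    def de_bruijn(k, n):
        """De Bruijn sequence B(k, n) via the Lyndon word (FKM) construction."""
        a = [0] * k * n
        seq = []
        def db(t, p):
            if t > n:
                if n % p == 0:
                    seq.extend(a[1:p+1])
            else:
                a[t] = a[t-p]
                db(t+1, p)
                for j in range(a[t-p]+1, k):
                    a[t] = j
                    db(t+1, t)
        db(1, 1)
        return seq

    s = de_bruijn(2, 4)
    s = s + s[:3]  # unroll the cyclic sequence
    print(len({tuple(s[i:i+3]) for i in range(len(s) - 2)}))  # 8: every length-3 string appears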

But what if you’ve tried a B(k, 4) sequence and it turns out the password has length 5? You could do better than starting over with a B(k, 5) sequence, because some of the substrings in that B(k, 5) sequence will have already been tried. But how could you do this systematically? If you don’t know the length of the password, how could you do better than trying B(k, 1), then B(k, 2), then B(k, 3) etc.?

Related posts

[1] For this post, I’m assuming the lock will open as soon as you enter the right sequence of buttons. For example, if the pass code is 345 and you enter 12345 the lock will open. I don’t know whether these locks work that way. Maybe you have to turn the handle, which would effectively act as an enter key. But maybe there’s a way to listen to the lock so that you’ll know when the combination has been entered, before you twist the handle.

Update: According to the first comment below, locks of the kind featured in the photo only allow each button to be used once. That puts severe limits on the number of possible combinations, and the approach outlined here would be unnecessary. However, the post brings up more general questions that could be useful in another setting.

Summing an array of floating point numbers

Adding up an array of numbers is simple: just loop over the array, adding each element to an accumulator.

Usually that’s good enough, but sometimes you need more precision, either because your application needs a high-precision answer, or because your problem is poorly conditioned and you need a moderate-precision answer [1].

There are many algorithms for summing an array, which may be surprising if you haven’t thought about this before. This post will look at Kahan’s algorithm, one of the first non-obvious algorithms for summing a list of floating point numbers.

We will sum a list of numbers a couple ways using C’s float type, a 32-bit floating point type. Some might object that this is artificial. Does anyone use floats any more? Didn’t moving from float to double get rid of all our precision problems? No and no.

In fact, the use of 32-bit float types is on the rise. Moving data in and out of memory is now the bottleneck, not arithmetic operations per se, and float lets you pack more data into a given volume of memory. Not only that, there is growing interest in floating point types even smaller than 32-bit floats. See the following posts for examples:

And while sometimes you need less precision than 32 bits, sometimes you need more precision than 64 bits. See, for example, this post.

Now for some code. Here is the obvious way to sum an array in C:

    float naive(float x[], int N) {
        float s = 0.0;
        for (int i = 0; i < N; i++) {
            s += x[i];
        }
        return s;
    }

Kahan’s algorithm is not much more complicated, but it looks a little mysterious and provides more accuracy.

    float kahan(float x[], int N) {
        float s = x[0];
        float c = 0.0;              /* running compensation for lost low-order bits */
        for (int i = 1; i < N; i++) {
            float y = x[i] - c;     /* correct the next term by the compensation */
            float t = s + y;        /* low-order bits of y may be lost in this addition */
            c = (t - s) - y;        /* algebraically zero; captures (minus) what was lost */
            s = t;
        }
        return s;
    }

Now to compare the accuracy of these two methods, we’ll use a third method that uses a 64-bit accumulator. We will assume its value is accurate to at least 32 bits and use it as our exact result.

    double dsum(float x[], int N) {
        double s = 0.0;
        for (int i = 0; i < N; i++) {
            s += x[i];
        }
        return s;
    }

For our test problem, we’ll use the reciprocals of the first 100,000 positive integers rounded to 32 bits as our input. This problem was taken from [2].

    #include <stdio.h>
    int main() {
        int N = 100000;
        float x[N];
        for (int i = 1; i <= N; i++)
            x[i-1] = 1.0/i;
  
        printf("%.15g\n", naive(x, N));
        printf("%.15g\n", kahan(x, N));  
        printf("%.15g\n", dsum(x, N));
    }

This produces the following output.

    12.0908508300781
    12.0901460647583
    12.0901461953972

The naive method is correct to 6 significant figures while Kahan’s method is correct to 9 significant figures. The error in the former is 5394 times larger. In this example, Kahan’s method is as accurate as possible using 32-bit numbers.

Note that in our example we were not exactly trying to find the Nth harmonic number, the sum of the reciprocals of the first N positive numbers. Instead, we were using these integer reciprocals rounded to 32 bits as our input. Think of them as just data. The data happened to be produced by computing reciprocals and rounding the result to 32-bit floating point accuracy.

But what if we did want to compute the 100,000th harmonic number? There are methods for computing harmonic numbers directly, as described here.

Related posts

[1] The condition number of a problem is basically a measure of how much floating point errors are multiplied in the process of computing a result. A well-conditioned problem has a small condition number, and an ill-conditioned problem has a large condition number. With ill-conditioned problems, you may have to use extra precision in your intermediate calculations to get a result that’s accurate to ordinary precision.

[2] Handbook of Floating-Point Arithmetic by Jean-Michel Muller et al.

Any number can start a factorial

Any positive number can be found at the beginning of a factorial. That is, for every positive integer n, there is an integer m such that the leading digits of m! are the digits of n.

There’s a tradition in math to use the current year when you need an arbitrary number; you’ll see this in competition problems and recreational math articles. So let’s demonstrate the opening statement with n = 2019. Then the smallest solution is m = 3177, i.e.

3177! = 2019…000

The factorial of 3177 is a number with 9749 digits, the first of which are 2019, and the last 793 of which are zeros.
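
These claims are easy to check directly, for example in Python:

    from math import factorial

    s = str(factorial(3177))
    print(len(s))                        # 9749 digits
    print(s[:4])                         # 2019
    print(len(s) - len(s.rstrip("0")))   # 793 trailing zeros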

The solution m = 3177 was only the first. The next solution is 6878, and there are infinitely more.

Not only does every number appear at the beginning of a factorial, it appears at the beginning of infinitely many factorials.

We can say even more. Persi Diaconis proved that factorials obey Benford’s law, and so we can say how often a number n appears at the beginning of a factorial.

If a sequence of numbers, like factorials, obeys Benford’s law, then the leading digit d in base b appears with probability

log_b(1 + 1/d).

If we’re interested in 4-digit numbers like 2019, we can think of these as base 10,000 digits [1]. This means that the proportion of factorials that begin with 2019 equals

log_10000(1 + 1/2019)

or about 1 in every 18,600 factorials.
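
As a quick check of that figure:

    from math import log

    p = log(1 + 1/2019) / log(10000)
    print(1/p)   # about 18,600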

By the way, powers of 2 also obey Benford’s law, and so you can find any number at the beginning of a power of 2. For example,

2^2044 = 2019…

Related: Benford’s law blog posts

[1] As pointed out in the comments, this approach underestimates the frequency of factorials that start with 2019. It would count numbers of the form 2019 xxxx xxxx, but not numbers of the form 201 9xxx xxxx etc. The actual frequency would be 4x higher.

Arbitrary precision is not a panacea

Having access to arbitrary precision arithmetic does not necessarily make numerical calculation difficulties go away. You still need to know what you’re doing.

Before you extend precision, you have to know how far to extend it. If you don’t extend it enough, your answers won’t be accurate enough. If you extend it too far, you waste resources. Maybe you’re OK wasting resources, so you extend precision more than necessary. But how far is more than necessary?

As an example, consider the problem of finding values of n such that tan(n) > n discussed a couple days ago. After writing that post, someone pointed out that these values of n are one of the sequences on OEIS. One of the values on the OEIS site is

k = 1428599129020608582548671.

Let’s verify that tan(k) > k. We’ll use our old friend bc because it supports arbitrary precision. Our k is a 25-digit number, so let’s tell bc we want to work with 30 decimal places to give ourselves a little room. (bc automatically gives you a little more room than you ask for, but we’ll ask for even more.)

bc doesn’t have a tangent function, but it does have s() for sine and c() for cosine, so we’ll compute tangent as sine over cosine.

    $ bc -l
    scale = 30
    k = 1428599129020608582548671
    s(k)/c(k) - k

We expect this to return a positive value, but instead it returns

-1428599362980017942210629.31…

So is OEIS wrong? Or is bc ignoring our scale value? It turns out neither is true.

You can’t directly compute the tangent of a large number. You use range reduction to reduce it to the problem of computing the tangent of a small angle where your algorithms will work. Since tangent has period π, we reduce k mod π by computing k – ⌊k/π⌋π. That is, we subtract off as many multiples of π as we can until we get a number between 0 and π. Going back to bc, we compute

    pi = 4*a(1)
    k/pi

This returns

    454737226160812402842656.500000507033221370444866152761

and so we compute the tangent of

    t = 0.500000507033221370444866152761*pi
      = 1.570797919686740002588270178941

Since t is slightly larger than π/2, the tangent will be negative. We can’t have tan(t) greater than k because tan(t) isn’t even greater than 0. So where did things break down?

The calculation of pi was accurate to 30 significant figures, and the calculation of k/pi was accurate to 30 significant figures, given our value of pi. So far bc has performed as promised.

The computed value of k/pi is correct to 29 significant figures, 23 before the decimal place and 6 after. So when we take the fractional part, we only have six significant figures, and that’s not enough. That’s where things go wrong. We get a fractional part of k/π that’s greater than 0.5 in the seventh decimal place, while the exact value is less than 0.5, the difference not showing up until the 25th decimal place. We needed 25 – 6 = 19 more significant figures.

This is the core difficulty with floating point calculation: subtracting nearly equal numbers loses precision. If two numbers agree to m decimal places, you can expect to lose m significant figures in the subtraction. The error in the subtraction will be small relative to the inputs, but not small relative to the result.

Notice that we computed k/pi to 29 significant figures, and given that output we computed the fractional part exactly. We accurately computed ⌊k/π⌋π, but lost precision when we subtracted that value from k.

Our error in computing k – ⌊k/π⌋π was small relative to k, but not relative to the final result. Our k is on the order of 10^25, and the error in our subtraction is on the order of 10^-7, but the result is on the order of 1. There’s no bug in bc. It carried out every calculation to the advertised accuracy, but it didn’t have enough working precision to produce the result we needed.
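
To see the calculation succeed, here’s a sketch in Python using the mpmath library (my choice, not something from the original discussion; any arbitrary precision package with enough working digits would do). With 60 decimal digits, the fractional part of k/π survives the reduction and the comparison comes out as OEIS says.

    from mpmath import mp, mpf, pi, tan, frac

    mp.dps = 60   # plenty of working precision for a 25-digit k
    k = mpf("1428599129020608582548671")

    r = frac(k / pi)        # fractional part of k/pi, now accurate well past 6 places
    print(r)                # just under 0.5, as the exact value should be
    print(tan(r * pi) > k)  # True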

Related posts

Computed IDs and privacy implications

Thirty years ago, a lot of US states thought it would be a good idea to compute someone’s driver’s license number (DLN) from their personal information [1]. In 1991, fifteen states simply used your Social Security Number as your DLN. Eleven other states computed DLNs by applying a hash function to personal information such as name, birth date, and sex. A few other states based DLNs in part but not entirely on personal information.

Presumably things have changed a lot since then. If you know of any states that still do this, please let me know in the comments. Even if states have stopped computing DLNs from personal data, I’m sure many organizations still compute IDs this way.

The article I stumbled on from 1991 gave no hint that encoding personal information into an ID number could be a problem. And at the time it wasn’t as much of a problem as it would be now.

Why is it a problem if IDs are computed from personal data? People don’t realize what information they’re giving away. Maybe they would be willing to give someone their personal information, but not their DLN, or vice versa, not realizing that the two are equivalent. They also don’t realize what information about them someone may already have; a little bit more info may be all an attacker needs. And they don’t realize the potential consequences of their loss of privacy.

In some cases the hashing functions were complicated, but not too complicated to carry out by hand. And even if states were applying a cryptographic hash function, which they certainly were not, this would still be a problem for reasons explained here. If you have a database of personal information, say from voter registration records, you could compute the hash value of everyone in the state, or at least a large enough portion that you stand a good chance of being able to reverse a hashed value.
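
Here’s a toy illustration in Python of that attack. The ID scheme and the records are made up; the point is only that if IDs are a deterministic function of personal data, anyone with a list of people can precompute the IDs and invert them.

    import hashlib

    def dln(name, birthdate, sex):
        # hypothetical ID scheme: a hash of concatenated personal fields
        data = f"{name}|{birthdate}|{sex}".encode()
        return hashlib.sha256(data).hexdigest()[:12]

    # made-up stand-in for a voter registration database
    voters = [("JANE DOE", "1970-01-01", "F"), ("JOHN SMITH", "1980-05-17", "M")]
    table = {dln(*v): v for v in voters}   # precomputed ID -> person lookup

    mystery_id = dln("JOHN SMITH", "1980-05-17", "M")
    print(table[mystery_id])               # recovers the personal information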

Related posts

[1] Joseph A. Gallian. Assigning Driver’s License Numbers. Mathematics Magazine, Vol. 64, No. 1 (Feb., 1991), pp. 13-22.

Generating Python code from SymPy

Yesterday I wrote about Householder’s higher-order generalizations of Newton’s root finding method. For n at least 2, define

H_n(x) = x + (n-1) \frac{\left( \frac{1}{f(x)}\right)^{(n-2)}}{\left( \frac{1}{f(x)}\right)^{(n-1)}}

and iterate Hn to find a root of f(x). When n = 2, this is Newton’s method. In yesterday’s post I used Mathematica to find expressions for H3 and H4, then used Mathematica’s FortranForm[] function to export Python code. (Mathematica doesn’t have a function to export Python code per se, but the Fortran syntax was identical in this case.)

Aaron Meurer pointed out that it would have been easier to generate the Python code in Python using SymPy to do the calculus and lambdify() to generate the code. I hadn’t heard of lambdify before, so I tried out his suggestion. The resulting code is nice and compact.

    from sympy import diff, symbols, lambdify

    def f(x, a, b):
        return x**5 + a*x + b

    def H(x, a, b, n):
        x_, a_, b_ = x, a, b
        x, a, b = symbols('x a b')
        expr = diff(1/f(x,a,b), x, n-2) / \
               diff(1/f(x,a,b), x, n-1)
        g = lambdify([x,a,b], expr)
        return x_ + (n-1)*g(x_, a_, b_)

This implements all the Hn at once. The previous post implemented three of the Hn separately.

The first couple lines of H require a little explanation. I wanted to use the same names for the numbers that the function H takes and the symbols that SymPy operated on, so I saved the numbers to local variables.

This code is fine for a demo, but in production you’d want to generate the function g once (for each n) and save the result rather than generating it on every call to H.
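
One way to do that, sketched here with functools.lru_cache so the symbolic differentiation and the lambdify call happen only once per n:

    from functools import lru_cache
    from sympy import diff, symbols, lambdify

    @lru_cache(maxsize=None)
    def make_H(n):
        x, a, b = symbols('x a b')
        f = x**5 + a*x + b
        expr = diff(1/f, x, n-2) / diff(1/f, x, n-1)
        return lambdify([x, a, b], expr)

    def H(x, a, b, n):
        return x + (n-1)*make_H(n)(x, a, b)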

A simple unsolved problem

Are there infinitely many positive integers n such that tan(n) > n?

David P. Bellamy and Felix Lazebnik [1] asked in 1998 whether there were infinitely many solutions to |tan(n)| > n and tan(n) > n/4. In both cases the answer is yes. But at least as recently as 2014 the problem at the top of the page was still open [2].

It seems that tan(n) is not often greater than n. There are only five instances for n less than one billion:

1
260515
37362253
122925461
534483448

In fact, there are no more solutions if you search up to over two billion as the following C code shows.

    #include <math.h>
    #include <stdio.h>
    #include <limits.h>

    int main() {
       for (int n = 0; n < INT_MAX; n++) 
          if (tan(n) > n)
             printf("%d\n", n);
    }

If there are infinitely many solutions, it would be interesting to know how dense they are.

Update: The known numbers for which tan(n) > n are listed in OEIS sequence A249836. According to the OEIS site it is still unknown whether the complete list is finite or infinite.

***

[1] David P. Bellamy, Felix Lazebnik, and Jeffrey C. Lagarias, Problem E10656: On the number of positive integer solutions of tan(n) > n, Amer. Math. Monthly 105 (1998) 366.

[2] Felix Lazebnik. Surprises. Mathematics Magazine , Vol. 87, No. 3 (June 2014), pp. 212-22

Higher order Newton-like methods for root finding

There are variations on Newton’s root finding method that use higher derivatives and converge faster. Alston Householder developed a sequence of such methods of the form

H_n(x) = x + (n-1) \frac{\left( \frac{1}{f(x)}\right)^{(n-2)}}{\left( \frac{1}{f(x)}\right)^{(n-1)}}

where the superscripts in parentheses indicate derivatives. When n = 2, Householder’s method reduces to Newton’s method.

Once Newton’s method is close enough to a root, the error is squared at each iteration. Said another way, the number of correct decimal places roughly doubles. With the Householder method of order n, the number of correct decimal places roughly increases by a factor of n.

The Householder method Hn can be written as a rational function of f and its derivatives up to order n-1, though the formulas are complicated.

Maybe it’s time for Householder’s methods to be more widely used: with automatic differentiation [1], the complexity of computing higher order derivatives is not the drawback it was when these computations were done solely by hand.

Demonstration

We’ll illustrate the convergence of the Householder methods by finding roots of the quintic polynomial

f(x) = x^5 + ax + b.

I used Mathematica to calculate expressions for the Householder methods and export the results as code. Mathematica does not have a function to export Python code, but it does have a function CForm to export expressions as C code. I started to use this, converting exponents from C to Python form with search and replace. Then I tried using FortranForm instead, and that was much easier because Fortran and Python both represent powers with a ** operator.

    def f(x, a, b):
        return x**5 + a*x + b

    def H2(x, a, b):
        return x - (b + a*x + x**5)/(a + 5*x**4)

    def H3(x, a, b):
        num = (a + 5*x**4)*(b + a*x + x**5)
        den = a**2 - 10*b*x**3 + 15*x**8
        return x - num/den

    def H4(x, a, b):
        num = (b + a*x + x**5)*(-a**2 + 10*b*x**3 - 15*x**8)
        den = a**3 + 10*b**2*x**2 + 5*a**2*x**4 - 80*b*x**7 - 25*a*x**8 + 35*x**12
        return x + num/den

To set values of a and b, I arbitrarily chose a = 2 and solved for b so that π is a root. That way we have a known value of the root for comparison. I used x = 4 as my starting guess at the root.

I first used -log10|x – π| to measure the number of correct decimals. That didn’t quite work because the method would converge exactly (to the limits of machine precision) which led to taking the log of zero. So I modified my accuracy function to the following.

    from math import log10, pi

    def decimals(x):
        d = abs(x - pi)
        if d != 0:
            return round(-log10(d), 2)
        else:
            return "inf"

Then I tested each of the Householder methods with the following.

    epsilon = 1e-14
    a, b = 2, -(pi**5 + 2*pi)   # a = 2, with b chosen so that pi is a root of f
    for H in [H2, H3, H4]:
        x = 4
        for i in range(1, 10):
            if abs(f(x,a,b)) > epsilon:
                x = H(x,a,b)
                print(i, x, decimals(x))

Here are the results in tabular form. I replaced all the decimals after the first incorrect decimal with underscores to make it easier to see the convergence.

|---+-------------------+--------|
| i |                 x | digits |
|---+-------------------+--------|
| 1 | 3.4______________ |   0.53 |
| 2 | 3.18_____________ |   1.33 |
| 3 | 3.142____________ |   2.87 |
| 4 | 3.141593_________ |   5.93 |
| 5 | 3.141592653590___ |  12.07 |
| 6 | 3.141592653589793 |    inf |
|---+-------------------+--------|
| 1 | 3.2______________ |   1.11 |
| 2 | 3.1416___________ |   4.03 |
| 3 | 3.1415926535899__ |  12.79 |
| 4 | 3.141592653589793 |    inf |
|---+-------------------+--------|
| 1 | 3.15_____________ |   1.84 |
| 2 | 3.141592654______ |   8.85 |
| 3 | 3.141592653589793 |    inf |
|---+-------------------+--------|

Note that, as promised, the number of correct decimal places for Hn increases by roughly a factor of n with each iteration.

Update: See this post where the symbolic calculations are done in SymPy rather than Mathematica. Everything stays in Python, and there’s no need to edit the code for the Householder methods.

Related posts

[1] Note that in this post I used Mathematica to do symbolic differentiation, not automatic differentiation. Automatic differentiation differs from both symbolic and numerical differentiation. See an explanation here.

Formatting numbers at the command line

The utility numfmt, part of Gnu Coreutils, formats numbers. The main uses are grouping digits and converting to and from unit suffixes like k for kilo and M for mega. This is somewhat useful for individual invocations, but like most command line utilities, the real value comes from using it as part of a pipeline.

The --grouping option will separate digits according to the rules of your locale. So on my computer

    numfmt --grouping 123456789

returns 123,456,789. On a French computer, it would return 123.456.789 because the French use commas as decimal separators and use periods to group digits [1].

You can also use numfmt to convert between ordinary numbers and numbers with units. Unfortunately, there’s some ambiguity regarding what units like kilo and mega mean. A kilogram is 1,000 grams, but a kilobyte is 2^10 = 1,024 bytes. (Units like kibi and mebi were introduced to remove this ambiguity, but the previous usage is firmly established.)

If you want to convert 2M to an ordinary number, you have to specify whether you mean 2 × 10^6 or 2 × 2^20. For the former, use --from=si (for Système international d’unités) and for the latter use --from=iec (for International Electrotechnical Commission).

    $ numfmt --from=si 2M
    2000000
    $ numfmt --from=iec 2M
    2097152 

One possible gotcha is that the symbol for kilo is capital K rather than lower case k; all units from kilo to Yotta use a capital letter. Another is that there must be no space between the numerals and the suffix, e.g. 2G is legal but 2 G is not.

You can use Ki for kibi, Mi for mebi etc. if you use --from=iec-i.

    $ numfmt --from=iec-i 2Gi  
    2147483648   

To convert from ordinary numbers to numbers with units use the --to option.

    $ numfmt --to=iec 1000000 
    977K  

Related links

[1] I gave a presentation in France years ago. Much to my chagrin, the software I was demonstrating had a hard-coded assumption that the decimal separator was a period. I had tested the software on a French version of Windows, but had only entered integers and so I didn’t catch the problem.

To make matters worse, there was redundant input validation. Entering 3.14 rather than 3,14 would satisfy the code my team wrote, but the input was first validated by a Windows API which rejected 3.14 as invalid in the user’s locale.