John D. Cook
https://www.johndcook.com/blog
Applied Mathematics Consulting
Mon, 23 Nov 2020 14:21:15 +0000

Refinements to the prime number theorem
https://www.johndcook.com/blog/2020/11/23/refined-pnt-bound/
Mon, 23 Nov 2020 14:00:06 +0000

The post Refinements to the prime number theorem first appeared on John D. Cook.

Let π(x) be the number of primes less than x. The simplest form of the prime number theorem says that π(x) is asymptotically equal to x/log(x), where log means natural logarithm. That is,

$$\lim_{x\to\infty} \frac{\pi(x)}{x/\log(x)} = 1.$$

This means that in the limit as x goes to infinity, the relative error in approximating π(x) with x/log(x) goes to 0. However, there is room for improvement. The relative approximation error goes to 0 faster if we replace x/log(x) with li(x) where

$$\mathrm{li}(x) = \int_0^x \frac{dt}{\log t}.$$

The prime number theorem says that for large x, the error in approximating π(x) by li(x) is small relative to π(x) itself. It would appear that li(x) is not only an approximation for π(x), but it is also an upper bound. That is, it seems that li(x) > π(x). However, that’s not true for all x.

Littlewood proved in 1914 that there is some x for which π(x) > li(x). We still don’t know a specific number x for which this holds, though we know such numbers exist. The smallest such x is the definition of Skewes’ number. The number of digits in Skewes’ number is known to be between 20 and 317, and is believed to be close to the latter.

Littlewood not only proved that li(x) – π(x) is sometimes negative, he proved that it changes sign infinitely often. So naturally there is interest in estimating li(x) – π(x) for very large values of x.

A new result was published a few days ago [1] refining previous bounds to prove that

$$|\pi(x) - \mathrm{li}(x)| \le 235\, x\, (\log x)^{0.52} \exp\left(-0.8\sqrt{\log x}\right)$$

for all x > exp(2000).

When x = exp(2000), the right side is roughly 10^857 and π(x) is roughly 10^865, and so the relative error is roughly 10^-8. That is, the li(x) approximation to π(x) is accurate to 8 significant figures, and the accuracy increases as x gets larger.
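The orders of magnitude quoted here can be checked without ever forming exp(2000), which would overflow a floating point number, by working with logarithms. The following is a rough sketch, not code from [1]: the function names are made up for illustration, it assumes the bound has the form 235 x (log x)^0.52 exp(-0.8 sqrt(log x)), and it uses x/log(x) as a stand-in for π(x).

```python
import math

def log10_bound(log_x):
    """log10 of 235 * x * (log x)^0.52 * exp(-0.8 * sqrt(log x)), given log x."""
    ln_bound = (math.log(235) + log_x + 0.52 * math.log(log_x)
                - 0.8 * math.sqrt(log_x))
    return ln_bound / math.log(10)

def log10_pi_estimate(log_x):
    """log10 of x / log(x), used here as a rough stand-in for pi(x)."""
    return (log_x - math.log(log_x)) / math.log(10)

log_x = 2000  # corresponds to x = exp(2000)
b = log10_bound(log_x)
p = log10_pi_estimate(log_x)
print(f"bound ~ 10^{b:.1f}, pi(x) ~ 10^{p:.1f}, relative error ~ 10^{b - p:.1f}")
```

Working with log x rather than x is the only design choice here; everything else is direct arithmetic.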

***

[1] Platt and Trudgian. The error term in the prime number theorem. Mathematics of Computation. November 16, 2020. https://doi.org/10.1090/mcom/3583

Minimizing random Boolean expressions
https://www.johndcook.com/blog/2020/11/22/random-boolean-expressions/
Mon, 23 Nov 2020 01:53:01 +0000

The post Minimizing random Boolean expressions first appeared on John D. Cook.

The previous post looked at all Boolean expressions on three or four variables and how much they can be simplified. The number of Boolean expressions on n variables is

2^(2^n)

and so the possibilities explode as n increases. We could do n = 3 and 4, but 5 would be a lot of work, and 6 is out of the question.

So we do what we always do when a space is too big to explore exhaustively: we explore at random.

The Python module we’ve been using, qm, specifies a function of n Boolean variables in terms of the set of product terms on which the function evaluates to 1. These product terms can be encoded as integers, and so a Boolean function of n variables corresponds to a subset of the integers 0 through 2^n – 1.

We can generate a subset of these numbers by generating a random mask consisting of 0s and 1s, and keeping the numbers where the mask value is 1. We could do this with code like the following.

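    N = 2**n
    x = np.arange(N)
    mask = np.random.randint(2, size=N)  # random 0/1 mask
    ones = set(mask*x)                   # numbers whose mask value is 1 (plus 0)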


There’s a small problem with this approach: the set ones always contains 0. We want it to contain 0 if and only if the 0th mask value is a 1.
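One way to sidestep the problem is to select elements by their own mask bit rather than multiplying by the mask. Here is a minimal sketch of that idea using only the standard library instead of NumPy; random_ones is a hypothetical helper name, not part of qm.

```python
import random

def random_ones(n, rng=random):
    """Return a random subset of {0, ..., 2^n - 1}; each element is kept with probability 1/2."""
    N = 2**n
    mask = [rng.randint(0, 1) for _ in range(N)]
    # Keep k only when its own mask bit is 1, so 0 is included iff mask[0] == 1.
    return {k for k in range(N) if mask[k] == 1}

rng = random.Random(42)  # seeded only to make the demo repeatable
ones = random_ones(3, rng)
print(sorted(ones))
```

With this approach 0 shows up in about half of the samples rather than in all of them.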

The following code generates a Boolean expression on n variables, simplifies it, and returns the length of the simplified expression [1].

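    import numpy as np
    from qm import qm

    def random_sample(n):
        N = 2**n
        x = np.arange(N)
        mask = np.random.randint(2, size=N)
        ones = set(mask*x)
        if mask[0] == 0:
            ones.remove(0)  # keep 0 only if its mask value is 1
        return len(qm(ones=ones, dc={}))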


We can create several random samples and make a histogram with the following code.

    def histogram(n, reps):
        counts = np.zeros(2**n+1, dtype=int)
        for _ in range(reps):
            counts[random_sample(n)] += 1
        return counts


The data in the following graph comes from calling histogram(5, 1000).

Note that the length of the random expressions is distributed symmetrically around 16 (half of 2^5). So minimization turns a distribution centered around 16 into a distribution centered around 8.

The code is slow because the Quine-McCluskey algorithm is slow, and our Python implementation of the algorithm isn’t as fast as it could be. But Boolean minimization is an NP-hard problem, so no known exact algorithm is going to scale well. To get faster results, we could switch to something like the Espresso heuristic logic minimizer, which often gets close to a minimum expression.

***

[1] The code above will fail if the set of terms where the function is 1 is empty. However this is extremely unlikely: we’d expect it to happen once in every 2^(2^n) times and so when n = 5 this is less than one time in four billion. The fully correct approach would be to call qm with zeros=x when ones is empty.

How much can Boolean expressions be simplified?
https://www.johndcook.com/blog/2020/11/19/boolean-expression-compression/
Thu, 19 Nov 2020 14:15:29 +0000

The post How much can Boolean expressions be simplified? first appeared on John D. Cook.

In the previous post we looked at how to minimize Boolean expressions using a Python module qm. In this post we’d like to look at how much the minimization process shortens expressions.

With n Boolean variables, you can create 2^n terms that are a product of distinct variables. You can specify a Boolean function by specifying the subset of such terms on which it takes the value 1, and so there are 2^(2^n) Boolean functions on n variables. For very small values of n we can minimize every possible Boolean function.

To do this, we need a way to iterate through the power set (set of all subsets) of the integers 0 through 2^n – 1. Here’s a function to do that, borrowed from itertools recipes.

    from itertools import chain, combinations

    def powerset(iterable):
        xs = list(iterable)
        return chain.from_iterable(
            combinations(xs, n) for n in range(len(xs) + 1))


Next, we use this code to run all Boolean functions on 3 variables through the minimizer. We use a matrix to keep track of how long the input expressions are and how long the minimized expressions are.

    from numpy import zeros
    from qm import qm

    n = 3
    N = 2**n
    tally = zeros((N,N), dtype=int)
    for p in powerset(range(N)):
        if not p:
            continue # qm can't take an empty set
        i = len(p)
        j = len(qm(ones=p, dc={}))
        tally[i-1, j-1] += 1


Here’s a table summarizing the results [1].

The first column gives the number of product terms in the input expression and the subsequent columns give the number of product terms in the output expressions.

For example, of the expressions of length 2, there were 12 that could be reduced to expressions of length 1 but the remaining 16 could not be reduced. (There are 28 possible input expressions of length 2 because there are 28 ways to choose 2 items from a set of 8 things.)
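The counts in this example can be checked independently of qm. The sketch below assumes that a two-term expression collapses to a single term exactly when its two terms, encoded as integers, differ in one bit; the helper name differ_in_one_bit is made up for illustration.

```python
from itertools import combinations

n = 3
pairs = list(combinations(range(2**n), 2))

def differ_in_one_bit(a, b):
    """True if the binary representations of a and b differ in exactly one position."""
    return bin(a ^ b).count("1") == 1

reducible = [p for p in pairs if differ_in_one_bit(*p)]
print(len(pairs), len(reducible), len(pairs) - len(reducible))  # prints 28 12 16
```

The 12 reducible pairs correspond to the 12 edges of a cube whose corners are the 8 three-bit strings.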

There are no nonzero values above the main diagonal, i.e. no expression got longer in the process of minimization. Of course that’s to be expected, but it’s reassuring that nothing went obviously wrong.

We can repeat this exercise for expressions in 4 variables by setting n = 4 in the code above. This gives the following results.

We quickly run into a wall as n increases. Not only does the Quine-McCluskey algorithm take about twice as long every time we add a new variable, the number of possible Boolean functions grows even faster. There were 2^(2^3) = 256 possibilities to explore when n = 3, and 2^(2^4) = 65,536 when n = 4.

If we want to explore all Boolean functions on five variables, we need to look at 2^(2^5) = 4,294,967,296 possibilities. I estimate this would take over a year on my laptop. The qm module could be made more efficient, and in fact someone has done that. But even if you made the code a billion times faster, six variables would still be out of the question.

To explore functions of more variables, we need to switch from exhaustive enumeration to random sampling. I may do that in a future post. (Update: I did.)

***

[1] The raw data for the tables presented as images is available here.

Minimizing boolean expressions
https://www.johndcook.com/blog/2020/11/19/minimizing-boolean-expressions/
Thu, 19 Nov 2020 12:32:11 +0000

The post Minimizing boolean expressions first appeared on John D. Cook.

This post will look at how to take an expression for a Boolean function and look for a simpler expression that corresponds to the same function. We’ll show how to use a Python implementation of the Quine-McCluskey algorithm.

## Notation

We will write AND like multiplication, OR like addition, and use primes for negation. For example,

wx + z′

denotes

(w AND x) OR (NOT z).

## Minimizing expressions

You may notice that the expression

wx′z + wxz

can be simplified to wz, for example, but it’s not feasible to simplify complicated expressions without a systematic approach.

One such approach is the Quine-McCluskey algorithm. Its run time increases exponentially with the problem size, but for a small number of terms it’s quick enough [1]. We’ll show how to use the Python module qm which implements the algorithm.

## Specifying functions

How are you going to pass a Boolean expression to a Python function? You could pass it an expression as a string and expect the function to parse the string, but then you’d have to specify the grammar of the little language you’ve created. Or you could pass in an actual Python function, which is more work than necessary, especially if you’re going to be passing in a lot of expressions.

A simpler way is to pass in the set of places where the function evaluates to 1, encoded as numbers.

For example, suppose your function is

wxy′z + w′xyz′

This function evaluates to 1 when either the first term evaluates to 1 or the second term evaluates to 1. That is, when either

(w, x, y, z) = (1, 1, 0, 1)

or

(w, x, y, z) = (0, 1, 1, 0).

Interpreting the left sides as binary numbers, you could specify the expression with the set {13, 6} which describes where the function is 1.

If you prefer, you could express your numbers in binary to make the correspondence to terms more explicit, i.e. {0b1101,0b110}.
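To make the encoding concrete, here is a small sketch that turns an assignment (w, x, y, z) into the corresponding integer. The function name encode is hypothetical and is not part of qm.

```python
def encode(bits):
    """Interpret a tuple of 0s and 1s, most significant bit first, as an integer."""
    value = 0
    for b in bits:
        value = 2 * value + b
    return value

ones = {encode((1, 1, 0, 1)), encode((0, 1, 1, 0))}
print(ones == {13, 6})          # True
print(ones == {0b1101, 0b110})  # True
```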

## Using qm

One more thing before we use qm: your Boolean expression might not be fully specified. Maybe you want it to be 1 on some values, 0 on others, and you don’t care what it equals on the rest.

The qm module lets you specify these with arguments ones, zeros, and dc. If you specify two out of these three sets, qm will infer the third one.

For example, in the code below

    from qm import qm
    print(qm(ones={0b111, 0b110, 0b1101}, dc={}))


we’re asking qm to minimize the expression

w′xyz + w′xyz′ + wxy′z.

Since the don’t-care set is empty, we’re saying our function equals 0 everywhere we haven’t said that it equals 1. The function prints

    ['1101', '011X']

which corresponds to

wxy′z + w′xy,

the X meaning that the fourth variable, z, is not part of the second term.
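Reading the output strings by eye gets tedious, so here is a sketch of a translator from strings like '011X' back to product terms. It assumes the variables are named w, x, y, z in order from the most significant position, as in the example above, and it writes negation with an apostrophe. The helper names are made up for illustration and are not part of qm.

```python
def term_to_string(term, names="wxyz"):
    """Turn a string such as '011X' into a product term such as w'xy."""
    factors = []
    for name, ch in zip(names, term):
        if ch == "1":
            factors.append(name)
        elif ch == "0":
            factors.append(name + "'")
        # 'X' means the variable does not appear in this term
    return "".join(factors)

def expression(terms, names="wxyz"):
    """Join the product terms with + signs."""
    return " + ".join(term_to_string(t, names) for t in terms)

print(expression(['1101', '011X']))  # wxy'z + w'xy
```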

Note that a minimized expression is not unique in general: a function can have several different expressions with the same number of terms.

Also, our code defines a minimum expression to be one with the fewest product terms, and there are other ways to measure size. It’s tempting, for example, to shorten w′xy to xy, giving

xy + wxy′z,

which has one less literal than wxy′z + w′xy. But xy is also 1 when (w, x, y, z) is (1, 1, 1, 0) or (1, 1, 1, 1), so that expression matches our function only if those two inputs are in the don’t-care set. So there’s room for improvement, or at least discussion, as to how to quantify the complexity of an expression.

In the next post I use qm to explore how much minimization reduces the size of Boolean expressions.

***

[1] The Boolean expression minimization problem is NP-hard, and so no known algorithm that always produces an exact answer will scale well. But there are heuristic algorithms like Espresso and its variations that usually provide optimal or near-optimal results.

Rotating symbols in LaTeX
https://www.johndcook.com/blog/2020/11/18/rotating-symbols-in-latex/
Wed, 18 Nov 2020 16:45:16 +0000

The post Rotating symbols in LaTeX first appeared on John D. Cook.

Linear logic uses an unusual symbol, an ampersand rotated 180 degrees, for multiplicative disjunction.

The symbol is U+214B in Unicode.

I was looking into how to produce this character in LaTeX when I found that the package cmll has two commands that produce this character, one semantic and one descriptive: \parr and \invamp [1].

This got me to wondering how you might create a symbol like the one above if there wasn’t one built into a package. You can do that by using the graphicx package and the \rotatebox command. Here’s how you could roll your own par operator:

    \rotatebox[origin=c]{180}{\&}

There’s a backslash in front of the & because it’s a special character in LaTeX. If you wanted to rotate a K, for example, there would be no need for a backslash.

The \rotatebox command can rotate any number of degrees, and so you could rotate an ampersand 30° with

    \rotatebox[origin=c]{30}{\&}

to produce a tilted ampersand.

***

[1] The name \parr comes from the fact that the operator is sometimes pronounced “par” in linear logic. (It’s not simply \par because LaTeX already has a command \par for inserting a paragraph break.)

The name \invamp is short for “inverse ampersand.” Note however that the symbol is not an inverted ampersand in the sense of being a reflection; it is an ampersand rotated 180°.

The smallest number with a given number of divisors
https://www.johndcook.com/blog/2020/11/18/smallest-n-with-h-divisors/
Wed, 18 Nov 2020 14:54:49 +0000

The post The smallest number with a given number of divisors first appeared on John D. Cook.

Suppose you want to find the smallest number with 5 divisors. After thinking about it a little you might come up with 16, because

16 = 2^4

and the divisors of 16 are 2^k where k = 0, 1, 2, 3, or 4.

This approach generalizes: For any prime q, the smallest number with q divisors is 2^(q-1).

Now suppose you want to find the smallest number with 6 divisors. One candidate would be 32 = 2^5, but you could do better. Instead of just looking at numbers divisible by the smallest prime, you could consider numbers that are divisible by the two smallest primes. And in fact

12 = 2^2 × 3

is the smallest number with 6 divisors.

This approach also generalizes. If h is the product of 2 primes, say h = pq where p ≥ q, then the smallest number with h divisors is

2^(p-1) 3^(q-1).

The divisors come from letting the exponent on 2 range from 0 to p-1 and letting the exponent on 3 range from 0 to q-1.

For example, the smallest number with 35 divisors is

5184 = 2^(7-1) 3^(5-1).

Note that we did not require p and q to be different. We said p ≥ q, and not p > q. And so, for example, the smallest number with 25 divisors is

1296 = 2^(5-1) 3^(5-1).

Now, suppose we want to find the smallest number with 1001 divisors. The number 1001 factors as 7*11*13, which has some interesting consequences. It turns out that the smallest number with 1001 divisors is

2^(13-1) 3^(11-1) 5^(7-1).

Does this solution generalize? Usually, but not always.

Let h = pqr where p, q, and r are primes with p ≥ q ≥ r. Then the smallest number with h divisors is

2^(p-1) 3^(q-1) 5^(r-1)

with one exception. The smallest number with 8 divisors would be 30 = 2*3*5 if the theorem always held, but in fact the smallest number with 8 divisors is 24.

In [1] M. E. Grost examines the exceptions to the general pattern. We’ve looked at the smallest number with h divisors when h is the product of 1, or 2, or 3 (not necessarily distinct) primes. Grost considers values of h equal to the product of up to 6 primes.

We’ve said that the pattern above holds for all h the product of 1 or 2 primes, and for all but one value of h the product of 3 primes. There are two exceptions for h the product of 4 primes. That is, if h = pqrs where p ≥ q ≥ r ≥ s are primes, then the smallest number with h divisors is

2^(p-1) 3^(q-1) 5^(r-1) 7^(s-1)

with two exceptions. The smallest number with 2^4 divisors is 2^3 × 3 × 5, and the smallest number with 3 × 2^3 divisors is 2^3 × 3^2 × 5.
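The specific values mentioned in this post are small enough to confirm by brute force. The following sketch simply counts divisors by trial division and searches upward; the function names and the search limit are arbitrary choices for illustration and have nothing to do with the methods in [1]. It checks h = 5, 6, 8, 25, and 35 along with the two four-prime exceptions, h = 16 = 2^4 and h = 24 = 3 × 2^3.

```python
def num_divisors(n):
    """Count the divisors of n by trial division."""
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            # d and n // d are both divisors; count once if they coincide
            count += 1 if d * d == n else 2
        d += 1
    return count

def smallest_with_divisors(h, limit=10**5):
    """Smallest n <= limit with exactly h divisors, or None if there is none."""
    for n in range(1, limit + 1):
        if num_divisors(n) == h:
            return n
    return None

for h in [5, 6, 8, 16, 24, 25, 35]:
    print(h, smallest_with_divisors(h))
```

Brute force is hopeless for something like h = 1001, where the answer has more than a dozen digits, which is part of why a formula is nice to have.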

When h is the product of 5 or 6 primes, there are infinitely many exceptions, but they have a particular form given in [1].

The result discussed here came up recently in something I was working on, but I don’t remember now what. If memory serves, which it may not, I wanted to assume something like what is presented here but wasn’t sure it was true.

***

[1] M. E. Grost. The Smallest Number with a Given Number of Divisors. The American Mathematical Monthly, September 1968, pp. 725-729.

Good news from Pfizer and Moderna
https://www.johndcook.com/blog/2020/11/16/good-news-from-pfizer-and-moderna/
Mon, 16 Nov 2020 17:44:12 +0000

The post Good news from Pfizer and Moderna first appeared on John D. Cook.

Both Pfizer and Moderna have announced recently that their SARS-CoV-2 vaccine candidates reduce the rate of infection by over 90% in the active group compared to the control (placebo) group.

That’s great news. The vaccines may turn out to be less than 90% effective when all is said and done, but even so they’re likely to be far more effective than expected.

But there’s other good news that might be overlooked: the subjects in the control groups did well too, though not as well as in the active groups.

The infection rate was around 0.4% in the Pfizer control group and around 0.6% in the Moderna control group.

There were 11 severe cases of COVID in the Moderna trial, out of 30,000 subjects, all in the control group.

There were 0 severe cases of COVID in the Pfizer trial in either group, out of 43,000 subjects.

I think I’ll pass
https://www.johndcook.com/blog/2020/11/16/i-think-ill-pass/
Mon, 16 Nov 2020 15:07:10 +0000

The post I think I'll pass first appeared on John D. Cook.

The other day I saw an article about some math test and thought “I bet I’d blow that away now.”

Anyone who has spent a career using some skill ought to blow away an exam intended for people who have been learning that skill for a semester.

However, after thinking about it more, I’m pretty sure I’d pass the test in question, but I’m not at all sure I’d ace it. Academic exams often test unimportant material that is in the short term memory of both the instructor and the students.

## From Timbuktu to …

When I was in middle school, I remember a question that read

It is a long way from ________ to ________.

My teacher was looking for a direct quote from a photo caption in our textbook that said it was a long way from Timbuktu to some place I can’t remember.

That stuck in my mind as the canonical example of a question that doesn’t test subject matter knowledge but tests the incidental minutiae of the course itself [1]. A geography professor would stand no better chance of giving the expected answer than I did.

## The three reasons …

Almost any time you see a question asking for “the 3 reasons” for something or “the 5 consequences” of this or that, it’s likely a Timbuktu question. In open-world contexts [2], I’m suspicious whenever I see “the” followed by a specific number.

In some contexts you can make exhaustive lists—it makes sense to talk about the 3 branches of the US government or the 5 Platonic solids, but it doesn’t make sense to talk about the 4 causes of World War I. Surely historians could come up with more than 4 causes, and there’s probably no consensus regarding what the 4 most important causes are.

There’s a phrase, “teaching to the test,” for when the goal is not to teach the subject per se but to prepare the students to pass a standardized test related to the subject. The phenomenon discussed here is sort of the opposite: testing to the teaching.

When you ask students for the 4 causes of WWI, you’re asking for the 4 causes given in lecture or the 4 causes in the textbook. You’re not testing knowledge of WWI per se but knowledge of the course materials.

***

[1] Now that I’m in middle age rather than middle school, I could say that the real question was not geography but psychology. The task was to reverse-engineer from an ambiguous question what someone was thinking. That is an extremely valuable skill, but not one I possessed in middle school.

[2] A closed world is one in which the rules are explicitly known, finite, and exhaustive. Chess is a closed world. Sales is not. Academia often puts a box around some part of an open world so it can think of it as a closed world.

Probability of commuting
https://www.johndcook.com/blog/2020/11/11/probability-of-commuting/
Thu, 12 Nov 2020 00:11:56 +0000

The post Probability of commuting first appeared on John D. Cook.

A couple years ago I wrote a blog post looking at how close the quaternions come to commuting. That is, the post looked at the average norm of xy – yx.

A related question would be to ask how often quaternions do commute, i.e. the probability that xy – yx = 0 for randomly chosen x and y.

There’s a general theorem for this [1]. For a finite non-abelian group, the probability that two elements chosen uniformly at random commute is never more than 5/8.

To put it another way, in a finite group either all pairs of elements commute with each other or no more than 5/8 of all pairs commute, with no possibilities in between. You can’t have a group, for example, in which exactly 3 out of 4 pairs commute.
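For small groups the probability can be computed by brute force. The sketch below is only an illustration and is not taken from [1]: it counts commuting pairs in the symmetric group S3 and in the eight-element quaternion group Q8 = {±1, ±i, ±j, ±k}. All the function names are made up for this example.

```python
from fractions import Fraction
from itertools import permutations, product

def commuting_probability(elements, multiply):
    """Fraction of ordered pairs (x, y) with xy = yx."""
    elements = list(elements)
    commuting = sum(1 for x, y in product(elements, repeat=2)
                    if multiply(x, y) == multiply(y, x))
    return Fraction(commuting, len(elements) ** 2)

def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def qmul(a, b):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = a
    a2, b2, c2, d2 = b
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

S3 = list(permutations(range(3)))
Q8 = [tuple(s if i == j else 0 for j in range(4))
      for s in (1, -1) for i in range(4)]

print(commuting_probability(S3, compose))  # 1/2
print(commuting_probability(Q8, qmul))     # 5/8
```

So the quaternion group shows that the 5/8 bound is actually attained.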

What if we have an infinite group like the quaternions?

Before we can answer that, we’ve got to say how we’d compute probabilities. With a finite group, the natural thing to do is make every point have equal probability. For a (locally compact) infinite group the natural choice is Haar measure.

Subject to some technical conditions, Haar measure is the only measure that interacts as expected with the group structure. It’s unique up to a constant multiple, and so it’s unique when we specify that the measure of the whole group has to be 1.

For compact non-abelian groups with Haar measure, we again get the result that no more than 5/8 of pairs commute.

[1] W. H. Gustafson. What is the Probability that Two Group Elements Commute? The American Mathematical Monthly, Nov., 1973, Vol. 80, No. 9, pp. 1031-1034.

Test for divisibility by 13
https://www.johndcook.com/blog/2020/11/10/test-for-divisibility-by-13/
Tue, 10 Nov 2020 14:43:40 +0000

The post Test for divisibility by 13 first appeared on John D. Cook.

There are simple rules for telling whether a number is divisible by 2, 3, 4, 5, and 6.

• A number is divisible by 2 if its last digit is divisible by 2.
• A number is divisible by 3 if the sum of its digits is divisible by 3.
• A number is divisible by 4 if the number formed by its last two digits is divisible by 4.
• A number is divisible by 5 if its last digit is divisible by 5.
• A number is divisible by 6 if it is divisible by 2 and by 3.

There is a rule for divisibility by 7, but it’s a little wonky. Let’s keep going.

• A number is divisible by 8 if the number formed by its last three digits is divisible by 8.
• A number is divisible by 9 if the sum of its digits is divisible by 9.
• A number is divisible by 10 if its last digit is 0.

There’s a rule for divisibility by 11. It’s a little complicated, though not as complicated as the rule for 7. I describe the rule for 11 in the penultimate paragraph here.

A number is divisible by 12 if it’s divisible by 3 and 4. (It matters here that 3 and 4 are relatively prime. It’s not true, for example, that a number is divisible by 12 if it’s divisible by 2 and 6.)

But what do you do when you get to 13?

## Testing divisibility by 7, 11, and 13

We’re going to kill three birds with one stone by presenting a rule for testing divisibility by 13 that also gives new rules for testing divisibility by 7 and 11. So if you’re trying to factor a number by hand, this will give a way to test three primes at once.

To test divisibility by 7, 11, and 13, write your number with digits grouped into threes as usual. For example,

11,037,989

Then think of each group as a separate number — e.g. 11, 37, and 989 — and take the alternating sum, starting with a + sign on the last term.

989 – 37 + 11

The original number is divisible by 7 (or 11 or 13) if this alternating sum is divisible by 7 (or 11 or 13 respectively).

The alternating sum in our example is 963, which is clearly 9*107, and not divisible by 7, 11, or 13. Therefore 11,037,989 is not divisible by 7, 11, or 13.

For a second example, consider

4,894,498,518

The alternating sum is

518 – 498 + 894 – 4 = 910

The sum takes a bit of work, but less work than dividing a 10-digit number by 7, 11, and 13.

The sum 910 factors into 7*13*10, and so it is divisible by 7 and by 13, but not by 11. That tells us 4,894,498,518 is divisible by 7 and 13 but not by 11.

## Why this works

The heart of the method is that 7*11*13 = 1001. If I subtract a multiple of 1001 from a number, I don’t change its divisibility by 7, 11, or 13. More than that, I don’t change its remainder by 7, 11, or 13.

The steps in the method amount to adding or subtracting multiples of 1001 and dividing by 1000. The former doesn’t change the remainder by 7, 11, or 13, but the latter multiplies the remainder by -1, hence the alternating sum. (1000 is congruent to -1 mod 7, mod 11, and mod 13.) See a more formal argument in footnote [1].

So not only can we test for divisibility by 7, 11, and 13 with this method, we can also find the remainders by 7, 11, and 13. The original number and the alternating sum are congruent mod 1001, so they are congruent mod 7, mod 11, and mod 13.

In our first example, n = 11,037,989 and the alternating sum was m = 963. The remainder when m is divided by 7 is 4, so the remainder when n is divided by 7 is also 4. That is, m is congruent to 4 mod 7, and so n is congruent to 4 mod 7. Similarly, m is congruent to 6 mod 11, and so n is congruent to 6 mod 11. And finally m is congruent to 1 mod 13, so n is congruent to 1 mod 13.
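Here is a short sketch of the method in code, applied to the two examples from this post. The function name alternating_sum is just a label chosen for illustration.

```python
def alternating_sum(n):
    """Alternating sum of the three-digit groups of n, last group taken with a + sign."""
    total, sign = 0, 1
    while n > 0:
        n, group = divmod(n, 1000)  # peel off the last three digits
        total += sign * group
        sign = -sign
    return total

for n in [11_037_989, 4_894_498_518]:
    m = alternating_sum(n)
    print(n, m, [m % p == 0 for p in (7, 11, 13)])
    # The remainders agree as well, not just divisibility.
    assert all(n % p == m % p for p in (7, 11, 13))
```

The alternating sum can come out negative for some inputs. Python’s % operator returns a non-negative remainder for a positive modulus, so the comparison of remainders still works in that case.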

***

[1] The key calculation is

$$1000a + b = 1001a + (b - a) \equiv b - a \pmod{1001}.$$

Some mathematical art
https://www.johndcook.com/blog/2020/11/09/some-mathematical-art/
Tue, 10 Nov 2020 04:16:32 +0000

The post Some mathematical art first appeared on John D. Cook.

]]>
This evening I ran across a paper on an unusual coordinate system that creates interesting graphs based on simple functions. It’s called “circular coordinates,” but this doesn’t mean polar coordinates; it’s more complicated than that. [1]

Here’s a plot reproduced from [1], with some color added (the default colors matplotlib uses for multiple plots).

The plot above was based on the gamma function. Here are a few plots replacing the gamma function with another function.

Here’s x/sin(x):

Here’s x⁵:

And here’s tan(x):

Here’s how the plots were created. For a given function f, plot the parametric curves given by

See [1] for what this has to do with circles and coordinates.

The plots based on a function g(x) are given by setting f(x) = g(x) + c where c = -10, -9, -8, …, 10.

## Related posts

[1] Elliot Tanis and Lee Kuivinen, Circular Coordinates and Computer Drawn Designs. Mathematics Magazine. Vol 52 No 3. May, 1979.

The post Some mathematical art first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/11/09/some-mathematical-art/feed/ 2
Counting triangles with integer sides https://www.johndcook.com/blog/2020/11/08/integer-triangles/ https://www.johndcook.com/blog/2020/11/08/integer-triangles/#comments Sun, 08 Nov 2020 19:05:58 +0000 https://www.johndcook.com/blog/?p=64750 Let T(N) be the number of distinct (non-congruent) triangles with integer sides and perimeter N.  For example, T(12) = 3 because there are three distinct triangles with integer sides and perimeter 12. There’s the equilateral triangle with sides 4 : 4 : 4, and the Pythagorean triangle 3 : 4 : 5. With a little […]

The post Counting triangles with integer sides first appeared on John D. Cook.

]]>
Let T(N) be the number of distinct (non-congruent) triangles with integer sides and perimeter N.  For example, T(12) = 3 because there are three distinct triangles with integer sides and perimeter 12. There’s the equilateral triangle with sides 4 : 4 : 4, and the Pythagorean triangle 3 : 4 : 5. With a little more work we can find 2 : 5 : 5.

The authors in [1] developed an algorithm for finding T(N). The following Python code is a direct implementation of that algorithm.

    def T(N :int):
        if N < 3:
            return 0
        base_cases = {4:0, 6:1, 8:1, 10:2, 12:3, 14:4}
        if N in base_cases:
            return base_cases[N]
        if N % 2 == 0:
            R = N % 12
            if R < 4:
                R += 12
            return (N**2 - R**2)//48 + T(R)
        if N % 2 == 1:
            return T(N+3)


If you’re running a version of Python that doesn’t support type hinting, just delete the :int in the function signature.

Since this is a recursive algorithm, we should convince ourselves that it terminates. In the branch for even N, the number R is an even number between 4 and 14 inclusive, and so it’s in the base_cases dictionary.

In the odd branch, we recurse on N+3, which is a little unusual since typically recursive functions decrease their argument. But since N is odd, N+3 is even, and we’ve already shown that the even branch terminates.

The code (N**2 - R**2)//48 raises a couple questions. Is the numerator divisible by 48? And if so, why specify integer division (//) rather than simply division (/)?

First, the numerator is indeed divisible by 48. N is congruent to R mod 12 by construction, and so N – R is divisible by 12. Furthermore,

N² – R² = (N – R)(N + R).

The first factor on the right is divisible by 12, so if the second factor is divisible by 4, the product is divisible by 48. Since N and R are congruent mod 12, N + R is congruent to 2R mod 12, and since R is even, 2R is a multiple of 4. Adding a multiple of 12 to a multiple of 4 gives another multiple of 4, so N + R is a multiple of 4.

So if (N² – R²)/48 is an integer, why did I write Python code that implies that I’m taking the integer part of the result? Because otherwise the code would sometimes return a floating point value. For example, T(13) would return 5.0 rather than 5.
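One way to gain confidence in both the recursion and the divisibility argument is to compare against a brute-force count. The sketch below repeats the function `T` from above so that it runs on its own. The helper `brute_force_T` is my own addition: it simply enumerates sides a ≤ b ≤ c and applies the triangle inequality.

```python
def T(N: int) -> int:
    # Same recursion as in the post, repeated so this block is self-contained.
    if N < 3:
        return 0
    base_cases = {4: 0, 6: 1, 8: 1, 10: 2, 12: 3, 14: 4}
    if N in base_cases:
        return base_cases[N]
    if N % 2 == 0:
        R = N % 12
        if R < 4:
            R += 12
        return (N**2 - R**2)//48 + T(R)
    return T(N + 3)

def brute_force_T(N: int) -> int:
    """Count triples a <= b <= c with a + b + c = N and a + b > c."""
    count = 0
    for a in range(1, N//3 + 1):
        for b in range(a, (N - a)//2 + 1):  # this range keeps c >= b
            c = N - a - b
            if a + b > c:
                count += 1
    return count

for N in range(3, 200):
    assert T(N) == brute_force_T(N)
    if N % 2 == 0:
        R = N % 12 if N % 12 >= 4 else N % 12 + 12
        assert (N**2 - R**2) % 48 == 0  # the numerator really is divisible by 48

print("recursion matches brute force for 3 <= N < 200")
```

Agreement over a finite range is evidence rather than proof, but it would catch a typo in the base cases or in the choice of R.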

Here’s a plot of T(N).

[1] J. H. Jordan, Ray Walch and R. J. Wisner. Triangles with Integer Sides. The American Mathematical Monthly, Vol. 86, No. 8 (Oct., 1979), pp. 686-689

The post Counting triangles with integer sides first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/11/08/integer-triangles/feed/ 1
Ripples and hyperbolas https://www.johndcook.com/blog/2020/11/06/ripples-and-hyperbolas/ https://www.johndcook.com/blog/2020/11/06/ripples-and-hyperbolas/#comments Fri, 06 Nov 2020 14:03:10 +0000 https://www.johndcook.com/blog/?p=64742 I ran across a paper [1] this morning on the differential equation y‘ = sin(xy). The authors recommend having students explore numerical solutions to this equation and discover theorems about its solutions. Their paper gives numerous theorems relating solutions and the hyperbolas xy = a: how many times a solution crosses a hyperbola, at what […]

The post Ripples and hyperbolas first appeared on John D. Cook.

]]>
I ran across a paper [1] this morning on the differential equation

y′ = sin(xy).

The authors recommend having students explore numerical solutions to this equation and discover theorems about its solutions.

Their paper gives numerous theorems relating solutions and the hyperbolas xy = a: how many times a solution crosses a hyperbola, at what angle, under what conditions a solution can be tangent to a hyperbola, etc.

The plot above is based on a plot in the original paper, but easier to read. It wasn’t so easy to make nice plots 40 years ago. In the original plot the solutions and the asymptotes were plotted with the same thickness and color, making them hard to tell apart.

## More differential equation posts

[1] Wendell Mills, Boris Weisfeiler and Allan M. Krall. Discovering Theorems with a Computer: The Case of y′ = sin(xy). The American Mathematical Monthly, Vol. 86, No. 9 (Nov., 1979), pp. 733-739

The post Ripples and hyperbolas first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/11/06/ripples-and-hyperbolas/feed/ 1
Informative stopping https://www.johndcook.com/blog/2020/11/04/informative-stopping/ https://www.johndcook.com/blog/2020/11/04/informative-stopping/#comments Wed, 04 Nov 2020 13:59:55 +0000 https://www.johndcook.com/blog/?p=64603 When the rule for stopping an experiment depends on the data in the experiment, the results could be biased if the stopping rule isn’t taken into account in the analysis [1]. For example, suppose Alice wants to convince Bob that π has a greater proportion of even digits than odd digits. Alice: I’ll show you […]

The post Informative stopping first appeared on John D. Cook.

]]>
When the rule for stopping an experiment depends on the data in the experiment, the results could be biased if the stopping rule isn’t taken into account in the analysis [1].

For example, suppose Alice wants to convince Bob that π has a greater proportion of even digits than odd digits.

Alice: I’ll show you that π has more even digits than odd digits by looking at the first N digits. How big would you like N to be?

Bob: At least 1,000. Of course more data is always better.

Alice: Right. And how many more even than odd digits would you find convincing?

Bob: If there are at least 10 more evens than odds, I’ll believe you.

Alice: OK. If you look at the first 2589 digits, there are 13 more even digits than odd digits.

Now if Alice wanted to convince Bob that there are more odd digits, she could do that too. If you look at the first 2077 digits, 13 more are odd than even.

No matter what two numbers Bob gives, Alice can find a sample size that will give the result she wants. Here’s Alice’s Python code.

    from mpmath import mp
    import numpy as np

    N = 3000
    mp.dps = N+2
    digits = str(mp.pi)[2:]

    parity = np.ones(N, dtype=int)
    for i in range(N):
        if digits[i] in ['1', '3', '5', '7', '9']:
            parity[i] = -1
    excess = parity.cumsum()
    print(excess[-1])
    print(np.where(excess == 13))
    print(np.where(excess == -13))


The number N is a guess at how far out she might have to look. If it doesn’t work, she increases it and runs the code again.

The array parity contains a 1 in positions where the digits of π (after the decimal point) are even and a -1 where they are odd. The cumulative sum shows how many more even than odd digits there have been up to a given point, a negative number meaning there have been more odd digits.

Alice thought that stopping when there are exactly 10 more of the parity she wants would look suspicious, so she looked for places where the difference was 13.

Here are the results:

    [ 126,  128,  134,  …,  536, 2588, … 2726]
    [ 772,  778,  780,  …,  886, 2076, … 2994]


There’s one minor gotcha. The array excess is indexed from zero, so Alice reports 2589 rather than 2588 because the 2589th digit has index 2588.

Bob’s mistake was that he specified a minimum sample size. By saying “at least 1,000” he gave Alice the freedom to pick the sample size to get the result she wanted. If he specified an exact sample size, there probably would be either more even digits or more odd digits, but there couldn’t be both. And if he were sophisticated enough, he could pick an excess value that would be unlikely given that sample size.
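To see how much freedom a minimum sample size gives away, here is a small simulation sketch. It is not Alice's code: it uses pseudo-random digits from Python's `random` module as a stand-in for the digits of π, and the names `excess_path` and `success_rates`, along with the default parameters, are my own assumptions chosen to mirror the dialog.

```python
import random

def excess_path(n_digits: int, rng: random.Random) -> list:
    """Running count of (evens - odds) for a stream of fair random digits."""
    path, excess = [], 0
    for _ in range(n_digits):
        excess += 1 if rng.randrange(10) % 2 == 0 else -1
        path.append(excess)
    return path

def success_rates(trials=500, n_min=1000, n_max=3000, threshold=10, seed=0):
    """Fraction of trials in which the 'more evens' claim can be supported."""
    rng = random.Random(seed)
    fixed = flexible = 0
    for _ in range(trials):
        path = excess_path(n_max, rng)
        # Fixed rule: look at exactly n_min digits.
        if path[n_min - 1] >= threshold:
            fixed += 1
        # Flexible rule: stop at any sample size from n_min to n_max.
        if max(path[n_min - 1:]) >= threshold:
            flexible += 1
    return fixed / trials, flexible / trials

fixed_rate, flexible_rate = success_rates()
print(f"exactly N digits: {fixed_rate:.2f}, any N >= minimum: {flexible_rate:.2f}")
```

The flexible rate can never be lower than the fixed rate, because stopping at exactly the minimum is one of the options the flexible rule allows. How much higher it is depends on the threshold and on how far Alice is willing to search.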

## Related posts

[1] This does not contradict the likelihood principle; it says that informative stopping rules should be incorporated into the likelihood function.

The post Informative stopping first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/11/04/informative-stopping/feed/ 3
Expert determination for CCPA https://www.johndcook.com/blog/2020/11/04/expert-determination-for-ccpa/ https://www.johndcook.com/blog/2020/11/04/expert-determination-for-ccpa/#respond Wed, 04 Nov 2020 13:00:53 +0000 https://www.johndcook.com/blog/?p=64613 California’s CCPA regulation has been amended to say that data considered deidentified under HIPAA is considered deidentified under CCPA. The amendment was proposed last year and was finally signed into law on September 25, 2020. This is good news because it’s relatively clear what deidentification means under HIPAA compared to CCPA. In particular, HIPAA has […]

The post Expert determination for CCPA first appeared on John D. Cook.

]]>

California’s CCPA regulation has been amended to say that data considered deidentified under HIPAA is considered deidentified under CCPA. The amendment was proposed last year and was finally signed into law on September 25, 2020.

This is good news because it’s relatively clear what deidentification means under HIPAA compared to CCPA. In particular, HIPAA has two well-established alternatives for determining that data have been adequately deidentified:

1. Safe Harbor, or
2. Expert determination.

The latter is especially important because most useful data doesn’t meet the requirements of Safe Harbor.

I provide companies with HIPAA expert determination. And now by extension I can provide expert determination under CCPA.

I’m not a lawyer, and so nothing I write should be considered legal advice. But I work closely with lawyers to provide expert determination. If you would like to discuss how I could help you, let’s talk.

The post Expert determination for CCPA first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/11/04/expert-determination-for-ccpa/feed/ 0
Category theory for programmers made easier https://www.johndcook.com/blog/2020/11/02/category-theory-for-programmers/ https://www.johndcook.com/blog/2020/11/02/category-theory-for-programmers/#comments Mon, 02 Nov 2020 18:16:29 +0000 https://www.johndcook.com/blog/?p=64471 I imagine most programmers who develop an interest in category theory do so after hearing about monads. They ask someone what a monad is, and they’re told that if they really want to know, they need to learn category theory. Unfortunately, there are couple unnecessary difficulties anyone wanting to understand monads etc. is likely to […]

The post Category theory for programmers made easier first appeared on John D. Cook.

]]>
I imagine most programmers who develop an interest in category theory do so after hearing about monads. They ask someone what a monad is, and they’re told that if they really want to know, they need to learn category theory.

Unfortunately, there are a couple of unnecessary difficulties anyone wanting to understand monads etc. is likely to face immediately. One is some deep set theory.

“A category is a collection of objects …”

“You mean like a set?”

“Ah, well, no. You see, Bertrand Russell showed that …”

There are reasons for such logical niceties, but they don’t matter to someone who wants to understand programming patterns.

Another complication is morphisms.

“As I was saying, a category is a collection of objects and morphisms between objects …”

“You mean like functions?”

“Well, they might be functions, but more generally …”

Yes, Virginia, morphisms are functions. It’s true that they might not always be functions, but they will be functions in every example you care about, at least for now.

Category theory is a framework for describing patterns in function composition, and so that’s why things like monads find their ultimate home in category theory. But doing category theory rigorously requires some setup that people eager to get into applications don’t have to be concerned with.

Patrick Honner posted on Twitter recently that his 8-year-old child asked him what area is. My first thought on seeing that was that a completely inappropriate answer would be that this is a deep question that wasn’t satisfactorily settled until the 20th century using measure theory. My joking response to Patrick was

Well, first we have to define σ-algebras. They’re kinda like topologies, but closed under countable union and intersection instead of arbitrary union and finite intersection. Anyway, a measure is a …

It would be ridiculous to answer a child this way, and it is nearly as ridiculous to burden a programmer with unnecessary logical nuance when they’re trying to find out why something is called a functor, or a monoid, or a monad, etc.

I saw an applied category theory presentation that began with “A category is a graph …” That sweeps a lot under the rug, but it’s not a bad conceptual approximation.

So my advice to programmers learning category theory is to focus on the arrows in the diagrams. Think of them as functions; they probably are in your application [1]. Think of category theory as a framework for describing patterns. The rigorous foundations can be postponed, perhaps indefinitely, just as an 8-year-old child doesn’t need to know measure theory to begin understanding area.

## More category theory posts

[1] The term “contravariant functor” has unfortunately become deprecated. In more modern presentations, all functors are covariant, but some are covariant in an opposite category. That does make the presentation more slick, but at the cost of turning arrows around that used to represent functions and now don’t really. In my opinion, category theory would be more approachable if we got rid of all “opposite categories” and said that functors come in two flavors, covariant and contravariant, at least in introductory presentations.

The post Category theory for programmers made easier first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/11/02/category-theory-for-programmers/feed/ 6
Is every number a random Fibonacci number? https://www.johndcook.com/blog/2020/10/31/random-fibonacci-conjecture/ https://www.johndcook.com/blog/2020/10/31/random-fibonacci-conjecture/#comments Sat, 31 Oct 2020 22:03:55 +0000 https://www.johndcook.com/blog/?p=64297 The previous post looked at random Fibonacci sequences. These are defined by f1 = f2 = 1, and fn = fn-1 ± fn-2 for n > 2, where the sign is chosen randomly to be +1 or -1. Conjecture: Every integer can appear in a random Fibonacci sequence. Here’s why I believe this might be […]

The post Is every number a random Fibonacci number? first appeared on John D. Cook.

]]>
The previous post looked at random Fibonacci sequences. These are defined by

f₁ = f₂ = 1,

and

fₙ = fₙ₋₁ ± fₙ₋₂

for n > 2, where the sign is chosen randomly to be +1 or -1.

Conjecture: Every integer can appear in a random Fibonacci sequence.

Here’s why I believe this might be true. The values in a random Fibonacci sequence of length n are bounded between –Fₙ₋₃ and Fₙ [1]. This range grows like O(φⁿ) where φ is the golden ratio. But the number of ways to pick + and – signs in a random Fibonacci sequence equals 2ⁿ.

By the pigeon hole principle, some choices of signs must lead to the same numbers: if you put 2ⁿ balls in φⁿ boxes, some boxes get more than one ball since φ < 2. That’s not quite rigorous since the range is O(φⁿ) rather than exactly φⁿ, but that’s the idea. The graph included in the previous post shows multiple examples where different random Fibonacci sequences overlap.

Now the pigeon hole principle doesn’t show that the conjecture is true, but it suggests that there could be enough different sequences that it might be true. The fact that the ratio of balls to boxes grows exponentially doesn’t hurt either.

Empirically, it appears that as you look at longer and longer random Fibonacci sequences, gaps in the range are filled in.

The following graphs consider all random Fibonacci sequences of length n, plotting the smallest positive integer and the largest negative integer not in the range. For the negative integers, we take the absolute value. Both plots are on a log scale.

First positive number missing:

Absolute value of first negative number missing:

The span between the largest and smallest possible random Fibonacci sequence value is growing exponentially with n, and the run of consecutive integers within that range is apparently also growing exponentially with n.

The following Python code was used to explore the gaps.

    import numpy as np
    from itertools import product

    def random_fib_range(N):
        # Collect every value that appears in some random Fibonacci sequence of length N.
        r = set()
        x = np.ones(N, dtype=int)
        for signs in product((-1,1), repeat=(N-2)):
            for i in range(2, N):
                b = signs[i-2]
                x[i] = x[i-1] + b*x[i-2]
                r.add(int(x[i]))
        return sorted(list(r))

    def stats(r):
        zero_location = r.index(0)

        # r is sorted, so these are the min and max values
        neg_gap = r[0]  # minimum
        pos_gap = r[-1] # maximum

        # Walk down from zero until a negative integer is missing.
        for i in range(zero_location-1, -1, -1):
            if r[i] != r[i+1] - 1:
                neg_gap = r[i+1] - 1
                break

        # Walk up from zero until a positive integer is missing.
        for i in range(zero_location+1, len(r)):
            if r[i] != r[i-1] + 1:
                pos_gap = r[i-1] + 1
                break

        return  (neg_gap, pos_gap)

    for N in range(5,25):
        r = random_fib_range(N)
        print(N, stats(r))


## Proof

Update: Nathan Hannon gives a simple proof of the conjecture by induction in the comments.

You can create the series (1, 2) and (1, 3). Now assume you can create (1, n). Then you can create (1, n+2) via (1, n, n+1, 1, n+2). So you can create any positive even number starting from (1, 2) and any odd positive number from (1, 3).

You can do something analogous for negative numbers via (1, n, n-1, -1, n-2, n-3, -1, 2-n, 3-n, 1, n-2).

This proof can be used to create an upper bound on the time required to hit a given integer, and a lower bound on the probability of hitting a given integer during a random Fibonacci sequence.
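The inductive step for positive integers translates directly into code. The following is a sketch of that construction, not anything from the comment thread itself. The function names `sequence_reaching` and `is_random_fib` are my own, and the sketch handles only positive targets.

```python
def sequence_reaching(m: int) -> list:
    """A random Fibonacci sequence ending in m (m >= 1), built by the inductive step."""
    if m < 1:
        raise ValueError("this sketch only handles positive targets")
    if m == 1:
        return [1, 1]
    # Start from a sequence ending in (1, 2) or (1, 3), matching the parity of m.
    seq = [1, 1, 2] if m % 2 == 0 else [1, 1, 2, 1, 3]
    while seq[-1] < m:
        n = seq[-1]
        # (1, n) -> (1, n, n+1, 1, n+2), using signs +, -, +
        seq += [n + 1, 1, n + 2]
    return seq

def is_random_fib(seq: list) -> bool:
    """Check f1 = f2 = 1 and that each later term is f[i-1] + f[i-2] or f[i-1] - f[i-2]."""
    return seq[:2] == [1, 1] and all(
        seq[i] - seq[i - 1] in (seq[i - 2], -seq[i - 2])
        for i in range(2, len(seq))
    )

for m in (2, 7, 10):
    s = sequence_reaching(m)
    print(m, len(s), is_random_fib(s))
```

Each pass through the loop adds three terms and raises the last value by 2, so the length of the sequence grows linearly in m. That is the kind of upper bound on hitting time the proof provides, though it says nothing about whether shorter sequences exist.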

Nathan’s construction requires more steps to produce new negative numbers, but that is consistent with the range of random Fibonacci sequences being wider on the positive side, [–Fₙ₋₃, Fₙ].

***

[1] To minimize the random Fibonacci sequence, you can choose the signs so that the values are 1, 1, 0, -1, -1, -2, -3, -5, … Note that the absolute value of this sequence is the ordinary Fibonacci sequence with 3 extra terms spliced in. That’s why the lower bound is –Fₙ₋₃.

The post Is every number a random Fibonacci number? first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/10/31/random-fibonacci-conjecture/feed/ 3
Random Fibonacci numbers https://www.johndcook.com/blog/2020/10/31/random-fibonacci-numbers/ https://www.johndcook.com/blog/2020/10/31/random-fibonacci-numbers/#respond Sat, 31 Oct 2020 18:43:11 +0000 https://www.johndcook.com/blog/?p=64281 The Fibonacci numbers are defined by F1 = F2 = 1, and for n > 2, Fn = Fn-1 + Fn-2. A random Fibonacci sequence f is defined similarly, except the addition above is replaced with a subtraction with probability 1/2. That is, f1 = f2 = 1, and for n > 2, fn = […]

The post Random Fibonacci numbers first appeared on John D. Cook.

]]>
The Fibonacci numbers are defined by F₁ = F₂ = 1, and for n > 2,

Fₙ = Fₙ₋₁ + Fₙ₋₂.

A random Fibonacci sequence f is defined similarly, except the addition above is replaced with a subtraction with probability 1/2. That is, f₁ = f₂ = 1, and for n > 2,

fₙ = fₙ₋₁ + b fₙ₋₂

where b is +1 or -1, each with equal probability.

Here’s a graph of three random Fibonacci sequences.

Here’s the Python code that was used to produce the sequences above.

    import numpy as np

    def rand_fib(length):
        f = np.ones(length)
        for i in range(2, length):
            b = np.random.choice((-1,1))
            f[i] = f[i-1] + b*f[i-2]
        return f


It’s easy to see that the nth random Fibonacci number can be as large as the nth ordinary Fibonacci number if all the signs happen to be positive. But the numbers are typically much smaller.

The nth (ordinary) Fibonacci number is asymptotically proportional to φⁿ, where φ is the golden ratio, φ = (1 + √5)/2 = 1.618…

Another way to say this is that

Fₙ^(1/n) → φ as n → ∞.

The nth random Fibonacci number does not have an asymptotic value (it wanders randomly between positive and negative values), but with probability 1 the absolute values satisfy

|fₙ|^(1/n) → 1.1320 (approximately) as n → ∞.

This was proved in 1960 [1].

Here’s a little Python code to show that we get results consistent with this result using simulation.

    N = 500
    x = [abs(rand_fib(N)[-1])**(1/N) for _ in range(10)]
    print(f"{np.mean(x)} ± {np.std(x)}")


This produced

1.1323 ± 0.0192

which includes the theoretical value 1.1320.

Update: The next post looks at whether every integer appears in a random Fibonacci sequence. Empirical evidence suggests the answer is yes.

## Related posts

[1] Furstenberg and Kesten. Products of random matrices. Ann. Math. Stat. 31, 457-469.

The post Random Fibonacci numbers first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/10/31/random-fibonacci-numbers/feed/ 0
Edsger Dijkstra, blogger https://www.johndcook.com/blog/2020/10/30/dijkstras-blog/ https://www.johndcook.com/blog/2020/10/30/dijkstras-blog/#comments Fri, 30 Oct 2020 13:30:58 +0000 https://www.johndcook.com/blog/?p=64209 I’ve been thinking about Edsger Dijkstra lately because I suspect some of the ideas he developed will be useful for a project I’m working on. While searching for some of Dijkstra’s writings I ran across the article Edsger Dijkstra: The Man Who Carried Computer Science on His Shoulders. It occurred while reading this article that […]

The post Edsger Dijkstra, blogger first appeared on John D. Cook.

]]>

I’ve been thinking about Edsger Dijkstra lately because I suspect some of the ideas he developed will be useful for a project I’m working on.

While searching for some of Dijkstra’s writings I ran across the article Edsger Dijkstra: The Man Who Carried Computer Science on His Shoulders. It occurred to me while reading this article that Dijkstra was essentially a blogger before there were blogs.

Here is a description of his writing from the article:

… Dijkstra’s research output appears respectable, but otherwise unremarkable by current standards. In this case, appearances are indeed deceptive. Judging his body of work in this manner misses the mark completely. Dijkstra was, in fact, a highly prolific writer, albeit in an unusual way.

In 1959, Dijkstra began writing a series of private reports. Consecutively numbered and with his initials as a prefix, they became known as EWDs. He continued writing these reports for more than forty years. The final EWD, number 1,318, is dated April 14, 2002. In total, the EWDs amount to over 7,700 pages. Each report was photocopied by Dijkstra himself and mailed to other computer scientists.

His large collection of small articles sounds a lot like a blog to me.

You can find Dijkstra’s “blog” here.

The post Edsger Dijkstra, blogger first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/10/30/dijkstras-blog/feed/ 3
Gruntled vs disgruntled https://www.johndcook.com/blog/2020/10/29/gruntled-vs-disgruntled/ https://www.johndcook.com/blog/2020/10/29/gruntled-vs-disgruntled/#comments Thu, 29 Oct 2020 13:16:27 +0000 https://www.johndcook.com/blog/?p=64133 My wife and I were talking this morning and the phrase”less disingenuous” came up. I thought about how sometimes a positive word fades into obscurity while the negative form lives on. The first example that came to mind is gruntled vs disgruntled. Yes, the former is an English word, but a rare one. Here’s a […]

The post Gruntled vs disgruntled first appeared on John D. Cook.

]]>
My wife and I were talking this morning and the phrase “less disingenuous” came up. I thought about how sometimes a positive word fades into obscurity while the negative form lives on. The first example that came to mind is gruntled vs disgruntled. Yes, the former is an English word, but a rare one.

Here’s a comparison of the frequency of gruntled vs disgruntled from 1860 to 2000.

In 2000, disgruntled was about 200x more common than gruntled in the books in Google’s English corpus.

But if you look further back, gruntled was used a little more often.

But it turns out that the people who were gruntled in the 19th century were chiefly British. If we look at just the American English corpus, no one was gruntled.

There’s a rise in the frequency of disgruntled as you look backward from 1815, which prompted me to look further back. Looking at just the American English corpus, a lot of people were disgruntled between 1766 and 1776 for some reason.

## More word frequency comparisons

The post Gruntled vs disgruntled first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/10/29/gruntled-vs-disgruntled/feed/ 3
Doing well https://www.johndcook.com/blog/2020/10/27/doing-well/ https://www.johndcook.com/blog/2020/10/27/doing-well/#comments Tue, 27 Oct 2020 13:20:08 +0000 https://www.johndcook.com/blog/?p=63993 The first time I went a few days without blogging, someone sent me a concerned email asking whether I was OK. And in case anyone has had similar thoughts this week, I’m doing well. I’m busy, though my rate of blogging is fairly independent of how busy I am. Sometimes being busy gives me lots […]

The post Doing well first appeared on John D. Cook.

]]>
The first time I went a few days without blogging, someone sent me a concerned email asking whether I was OK. And in case anyone has had similar thoughts this week, I’m doing well.

I’m busy, though my rate of blogging is fairly independent of how busy I am. Sometimes being busy gives me lots of ideas of things to blog about.

Many small businesses have been crushed this year, but I’m grateful that my business has grown despite current events. For now I have all the work I care to do, and a promising stream of projects in the pipeline. Of course things could change suddenly, but ever was it so.

On a more personal note, my family is also doing well and growing. My daughter is getting married soon. It will be a small wedding with live streaming, quite different from our latest family wedding but we’re looking forward to it just as much.

The post Doing well first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/10/27/doing-well/feed/ 4
More fun with quatrefoils https://www.johndcook.com/blog/2020/10/22/more-fun-with-quatrefoils/ https://www.johndcook.com/blog/2020/10/22/more-fun-with-quatrefoils/#comments Thu, 22 Oct 2020 12:00:51 +0000 https://www.johndcook.com/blog/?p=63584 In a comment to my previous post on quatrefoils, Jan Van lint suggested a different equation for quatrefoils: r = a + |cos(2θ)| Here are some examples of how these curves look for varying values of a. As a increases, the curves get rounder. We can quantify this by looking at the angle between the […]

The post More fun with quatrefoils first appeared on John D. Cook.

]]>
In a comment to my previous post on quatrefoils, Jan Van lint suggested a different equation for quatrefoils:

r = a + |cos(2θ)|

Here are some examples of how these curves look for varying values of a.

As a increases, the curves get rounder. We can quantify this by looking at the angle between the tangents on either side of the cusps. By symmetry, we can pick any one of the four cusps, so we’ll work with the one at θ = π/4 for convenience.

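The slopes of the tangent lines are the left and right derivatives of y with respect to x, which in polar coordinates is

dy/dx = (r′ sin θ + r cos θ) / (r′ cos θ – r sin θ)

where r′ is the derivative of r with respect to θ.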

Now the derivative of

a + |cos(2θ)|

with respect to θ at θ = π/4 is 2 from one side and -2 from the other.

Since sine and cosine are equal at π/4, they cancel out in the ratio above, and so the two derivatives, the slopes of the two tangent lines, are (2+a)/(2-a) and (2-a)/(2+a). The slopes are reciprocals of each other, which is what we’d expect since the quatrefoils are symmetric about the line θ = π/4.

The angles of the two tangent lines are the inverse tangents of the slopes, and so the angle between the two tangent lines is

arctan((2+a)/(2-a)) – arctan((2-a)/(2+a)).

Note that as a goes to zero, so does the angle between the tangent lines.

Here’s a plot of the angle as a function of a.

You could start with a desired angle and solve the equation above numerically for the value of a that gives the angle. From the graph above, it looks like if we wanted the curves to intersect at 90° we should pick a around 2. In fact, we should pick a exactly equal to 2. There the slopes are (2+2)/(2-2) = ∞ and (2-2)/(2+2) = 0, i.e. one tangent line is perfectly vertical and the other is perfectly horizontal.
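Here is a sketch of that numerical solution in Python. It computes the angle from the two slopes given above, (2+a)/(2-a) and (2-a)/(2+a), and then inverts it by bisection. The function names `cusp_angle` and `a_for_angle` are my own, and the sketch assumes 0 ≤ a ≤ 2, where the angle increases with a.

```python
from math import atan2, degrees

def cusp_angle(a: float) -> float:
    """Angle in degrees between the two tangent lines at the cusp at theta = pi/4."""
    # atan2(y, x) equals arctan(y/x) when x > 0 and avoids dividing by zero at a = 2.
    return degrees(atan2(2 + a, 2 - a) - atan2(2 - a, 2 + a))

def a_for_angle(target: float, lo: float = 0.0, hi: float = 2.0) -> float:
    """Bisection for the a in [lo, hi] whose cusp angle is target degrees."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if cusp_angle(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for a in (0.0, 0.5, 1.0, 2.0):
    print(a, cusp_angle(a))

print(a_for_angle(60.0))
```

Bisection is a deliberately plain choice here. Any root finder would do, since the angle is a smooth, increasing function of a on this interval.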

The post More fun with quatrefoils first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/10/22/more-fun-with-quatrefoils/feed/ 3
The word problem https://www.johndcook.com/blog/2020/10/19/the-word-problem/ https://www.johndcook.com/blog/2020/10/19/the-word-problem/#comments Tue, 20 Oct 2020 00:19:04 +0000 https://www.johndcook.com/blog/?p=63409 Most people have heard of word problems, but not as many have heard of the word problem. If you’re imagining that the word problem is some superlatively awful word problem, I can assure you it’s not. It’s both simpler and weirder than that. The word problem is essentially about whether you can always apply algebraic […]

The post The word problem first appeared on John D. Cook.

]]>
Most people have heard of word problems, but not as many have heard of the word problem. If you’re imagining that the word problem is some superlatively awful word problem, I can assure you it’s not. It’s both simpler and weirder than that.

The word problem is essentially about whether you can always apply algebraic rules in an automated way. The reason it is called the word problem is that you start with a description of your algebraic system in terms of symbols (“letters”) and concatenations of symbols (“words”) subject to certain rules, also called relations.

## The word problem for groups

For example, you can describe a group by saying it contains a and b, and it satisfies the relations

a² = b²

and

a⁻¹ba = b⁻¹.

A couple things are implicit here. We’ve said this is a group, and since every element in a group has an inverse, we’ve implied that a⁻¹ and b⁻¹ are in the group as well. Also from the definition of a group comes the assumption that multiplication is associative, that there’s an identity element, and that inverses work like they’re supposed to.

In the example above, you could derive everything about the group from the information given. In particular, someone could give you two words (strings made up of a, b, a⁻¹, and b⁻¹) and you could determine whether they are equal by applying the rules. But in general, this is not possible for groups.

## Undecidable

The bad news is that in general this isn’t possible. In computer science terminology, the word problem is undecidable. There is no algorithm that can tell whether two words are equal given a list of relations, at least not in general. There are special cases where the word problem is solvable, but a general algorithm is not possible.

## The word problem for semigroups

I presented the word problem above in the context of groups, but you could look at the word problem in more general contexts, such as semigroups. A semigroup is closed under some associative binary operation, and that’s it. There need not be any inverses or even an identity element.

Here’s a concrete example of a semigroup whose word problem has been proven to be undecidable. As before we have two symbols, a and b. And because we are in a semigroup, not a group, there are no inverses. Our semigroup consists of all finite sequences made out of a’s and b’s, subject to these five relations.

aba²b² = b²a²ba

a²bab²a = b²a³ba

aba³b² = ab²aba²

b³a²b²a²ba = b³a²b²a⁴

a⁴b²a²ba = b²a⁴

Source: Term Rewriting and All That
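To make "applying the rules" concrete, here is a small Python sketch that searches for a chain of rewrites between two words in this semigroup. It is a bounded breadth-first search, not a decision procedure: finding a chain proves two words are equal, but failing to find one within the step limit proves nothing in general, which is exactly where undecidability bites. The function names and the `max_steps` parameter are made up for this illustration.

```python
from collections import deque

# The five relations, written out as plain strings (a^2 becomes "aa", etc.).
RELATIONS = [
    ("abaabb", "bbaaba"),
    ("aababba", "bbaaaba"),
    ("abaaabb", "abbabaa"),
    ("bbbaabbaaba", "bbbaabbaaaa"),
    ("aaaabbaaba", "bbaaaa"),
]

def neighbors(word):
    """Yield every word obtained by applying one relation once, in either direction."""
    for lhs, rhs in RELATIONS:
        for old, new in ((lhs, rhs), (rhs, lhs)):
            start = word.find(old)
            while start != -1:
                yield word[:start] + new + word[start + len(old):]
                start = word.find(old, start + 1)

def find_derivation(u, v, max_steps=6):
    """Breadth-first search for a chain of rewrites from u to v.

    Returns the chain as a list of words, or None if no chain was found
    within max_steps. None means "unknown", not "unequal".
    """
    parent = {u: None}
    frontier = deque([(u, 0)])
    while frontier:
        word, depth = frontier.popleft()
        if word == v:
            chain = []
            while word is not None:
                chain.append(word)
                word = parent[word]
            return chain[::-1]
        if depth < max_steps:
            for nxt in neighbors(word):
                if nxt not in parent:
                    parent[nxt] = word
                    frontier.append((nxt, depth + 1))
    return None

print(find_derivation("abaabb", "bbaaba"))    # one application of the first relation
print(find_derivation("aabaabb", "abbaaba"))  # same relation applied inside a longer word
print(find_derivation("ab", "ba"))            # no relation applies to "ab" at all
```

In the last call the search comes back empty because no relation can touch a word that short, so in that case the words really are different. For longer words an empty result only means the search gave up.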

## Experience

When I first saw groups presented this way, as symbols and relations, I got my hopes up that a large swath of group theory could be automated. A few minutes later my naive hopes were dashed. So in my mind I thought “Well, then this is hopeless.”

But that is not true. Sometimes the word problem is solvable. It’s like many other impossibility theorems. There’s no fifth degree analog of the quadratic formula in general, but there are fifth degree polynomials whose roots can be found in closed form. There’s no program that can tell whether any arbitrary program will halt, but that doesn’t mean you can’t tell whether some programs halt.

It didn’t occur to me at the time that it would be worthwhile to explore the boundaries, learning which word problems can or cannot be solved. It also didn’t occur to me that I would run into things like the word problem in practical applications, such as simplifying symbolic expressions and optimizing their evaluation. Undecidable problems lurk everywhere, but you can often step around them.


]]>
https://www.johndcook.com/blog/2020/10/19/the-word-problem/feed/ 3
Real-time analytics https://www.johndcook.com/blog/2020/10/19/real-time-analytics/ https://www.johndcook.com/blog/2020/10/19/real-time-analytics/#comments Mon, 19 Oct 2020 13:50:18 +0000 https://www.johndcook.com/blog/?p=63388 There’s an ancient saying “Whom the gods would destroy they first make mad.” (Mad as in crazy, not mad as in angry.) I wrote a variation of this on Twitter: Whom the gods would destroy, they first give real-time analytics. Having more up-to-date information is only valuable up to a point. Past that point, you’re […]

The post Real-time analytics first appeared on John D. Cook.

]]>
There’s an ancient saying “Whom the gods would destroy they first make mad.” (Mad as in crazy, not mad as in angry.) I wrote a variation of this on Twitter:

Whom the gods would destroy, they first give real-time analytics.

Having more up-to-date information is only valuable up to a point. Past that point, you’re more likely to be distracted by noise. The closer you look at anything, the more irregularities you see, and the more likely you are to over-steer [1].

I don’t mean to imply that the noise isn’t real. (More on that here.) But there’s a temptation to pay more attention to the small variations you don’t understand than the larger trends you believe you do understand.

I became aware of this effect when simulating Bayesian clinical trial designs. The more often you check your stopping rule, the more often you will stop [2]. You want to monitor a trial often enough to shut it down, or at least pause it, if things change for the worse. But monitoring too often can cause you to stop when you don’t want to.
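The effect is easy to reproduce in a toy simulation. The sketch below is not a Bayesian design. As a stand-in it uses a plain z-test on data with no real effect and stops whenever |z| exceeds 1.96 at an interim look. The function names, the threshold, and the sample sizes are all assumptions chosen for illustration.

```python
import random

def stops_early(data, look_every, threshold=1.96):
    """Return True if |z| exceeds the threshold at any interim look."""
    total = 0.0
    for n, x in enumerate(data, start=1):
        total += x
        if n % look_every == 0:
            z = total / n ** 0.5  # z-statistic assuming mean 0 and variance 1
            if abs(z) > threshold:
                return True
    return False

def stopping_rate(look_every, n_trials=500, n_obs=200, seed=0):
    """Fraction of simulated null trials that get stopped."""
    rng = random.Random(seed)  # same seed, so every schedule sees the same data
    stops = 0
    for _ in range(n_trials):
        data = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        stops += stops_early(data, look_every)
    return stops / n_trials

# Look once at the end, then every 50, every 10, and after every observation.
for look_every in (200, 50, 10, 1):
    print(look_every, stopping_rate(look_every))
```

Because each schedule here looks at a superset of the time points of the one before it, and all schedules see the same simulated data, the stopping rate cannot go down as the looks become more frequent.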

## Flatter than glass

A long time ago I wrote about the graph below.

The graph looks awfully jagged, until you look at the vertical scale. The curve represents the numerical difference between two functions that are exactly equal in theory. As I explain in that post, the curve is literally smoother than glass, and certainly flatter than a pancake.

## Notes

[1] See The Logic of Failure for a discussion of how over-steering is a common factor in disasters such as the Chernobyl nuclear failure.

[2] Bayesians are loath to talk about things like α-spending, but when you’re looking at stopping frequencies, frequentist phenomena pop up.


]]>
https://www.johndcook.com/blog/2020/10/19/real-time-analytics/feed/ 6
Naive modeling https://www.johndcook.com/blog/2020/10/18/naive-modeling/ https://www.johndcook.com/blog/2020/10/18/naive-modeling/#comments Sun, 18 Oct 2020 23:44:16 +0000 https://www.johndcook.com/blog/?p=63362 In his book The Algorithm Design Manual, Steven Skiena has several sections called “War Stories” where he talks about his experience designing algorithms for clients. Here’s an excerpt of a story about finding the best airline ticket prices. “Look,” I said at the start of the first meeting. “This can’t be so hard. Consider a […]

The post Naive modeling first appeared on John D. Cook.

]]>

In his book The Algorithm Design Manual, Steven Skiena has several sections called “War Stories” where he talks about his experience designing algorithms for clients.

Here’s an excerpt of a story about finding the best airline ticket prices.

“Look,” I said at the start of the first meeting. “This can’t be so hard. Consider a graph … The path/fare can be found with Dijkstra’s shortest path algorithm. Problem solved!” I announced waving my hand with a flourish.

The assembled cast of the meeting nodded thoughtfully, then burst out laughing.

Skiena had greatly underestimated the complexity of the problem, but he learned, and was able to deliver a useful solution.

This reminds me of a story about a calculus professor who wrote a letter to a company that sold canned food, explaining how they could use less metal for the same volume by changing the dimensions of their can. Someone wrote back thanking him for his suggestion and listing reasons why the optimization problem was far more complicated than he had imagined. If anybody has a link to that story, please let me know.

Related post: Bring out your equations!


]]>
https://www.johndcook.com/blog/2020/10/18/naive-modeling/feed/ 7
Opening Windows files from bash and eshell https://www.johndcook.com/blog/2020/10/17/windows-files-from-eshell/ https://www.johndcook.com/blog/2020/10/17/windows-files-from-eshell/#comments Sat, 17 Oct 2020 15:06:10 +0000 https://www.johndcook.com/blog/?p=63265 I often work in a sort of amphibious environment, using Unix software on Windows. As you can well imagine, this causes headaches. But I’ve found such headaches are generally more manageable than the headaches from alternatives I’ve tried. On the Windows command line, you can type the name of a file and Windows will open […]

The post Opening Windows files from bash and eshell first appeared on John D. Cook.

]]>
I often work in a sort of amphibious environment, using Unix software on Windows. As you can well imagine, this causes headaches. But I’ve found such headaches are generally more manageable than the headaches from alternatives I’ve tried.

On the Windows command line, you can type the name of a file and Windows will open the file with the default application associated with its file extension. For example, typing foo.docx and pressing Enter will open the file by that name using Microsoft Word, assuming that is your default application for .docx files.

Unix shells don’t work that way. The first thing you type at the command prompt must be a command, and foo.docx is not a command. The Windows command line generally works this way too, but it makes an exception for files with recognized extensions; the command is inferred from the extension and the file name is an argument to that command.

## WSL bash

When you’re running bash on Windows, via WSL (Windows Subsystem for Linux), you can run the Windows utility start which will open a file according to its extension. For example,

    cmd.exe /C start foo.pdf

will open the file foo.pdf with your default PDF viewer.

You can also use start to launch applications without opening a particular file. For example, you could launch Word from bash with

    cmd.exe /C start winword.exe

## Emacs eshell

Eshell is a shell written in Emacs Lisp. If you’re running Windows and you do not have access to WSL but you do have Emacs, you can run eshell inside Emacs for a Unix-like environment.

If you try running

    start foo.pdf

that will probably not work because eshell does not use the Windows PATH environment variable.

I got around this by creating a Windows batch file named mystart.bat and putting it in my path. The batch file simply calls start with its argument:

    start %1

Now I can open foo.pdf from eshell with

    mystart foo.pdf

The solution above for bash

    cmd.exe /C start foo.pdf

also works from eshell.

(I just realized I said two contradictory things: that eshell does not use your path, and that it found a batch file in my path. I don’t know why the latter works. I keep my batch files in c:/bin, which is a Unix-like location, and maybe eshell looks there, not because it’s in my Windows path, but because it’s in what it would expect to be my path based on Unix conventions. I’ve searched the eshell documentation, and I don’t see how to tell what it uses for a path.)


]]>
https://www.johndcook.com/blog/2020/10/17/windows-files-from-eshell/feed/ 4
Generating all primitive Pythagorean triples with linear algebra https://www.johndcook.com/blog/2020/10/16/primitive-pythagorean-triples/ https://www.johndcook.com/blog/2020/10/16/primitive-pythagorean-triples/#comments Fri, 16 Oct 2020 13:38:46 +0000 https://www.johndcook.com/blog/?p=63187 A Pythagorean triple is a set of positive integers that can be the lengths of sides of a right triangle, i.e. numbers a, b, and c such that a² + b² = c². A primitive Pythagorean triple (PPT) is a Pythagorean triple whose elements are relatively prime. For example, (50, 120, 130) is a Pythagorean […]

The post Generating all primitive Pythagorean triples with linear algebra first appeared on John D. Cook.

]]>
A Pythagorean triple is a set of positive integers that can be the lengths of sides of a right triangle, i.e. numbers a, b, and c such that

a² + b² = c².

A primitive Pythagorean triple (PPT) is a Pythagorean triple whose elements are relatively prime. For example, (50, 120, 130) is a Pythagorean triple, but it’s not primitive because all the entries are divisible by 10. But (5, 12, 13) is a primitive Pythagorean triple.

A method of generating all PPTs has been known since the time of Euclid, but I recently ran across a different approach to generating all PPTs [1].

Let’s standardize things a little by assuming our triples have the form (a, b, c) where a is odd, b is even, and c is the hypotenuse [2]. In every PPT one of the sides is even and one is odd, so we will assume the odd side is listed first.

It turns out that all PPTs can be found by multiplying the column vector [3, 4, 5] repeatedly by matrices M0, M1, or M2. In [1], Romik uses the sequence of matrix multiplications needed to create a PPT as a trinary number associated with the PPT.

The three matrices are given as follows.

Note that all three matrices have the same entries, though with different signs. If you number the columns starting at 1 (as mathematicians commonly do and computer scientists may not) then Mk is the matrix whose kth column is negative. There is no 0th column, so M0 is the matrix with no negative columns. The numbering I’ve used here differs from that used in [1].

For example, the primitive Pythagorean triple [5, 12, 13] is formed by multiplying [3, 4, 5] on the left by M2. The PPT [117, 44, 125] is formed by multiplying [3, 4, 5] on the left by M1 M1 M2.
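The matrices themselves do not appear in the text above. The Python sketch below assumes they are the matrices with rows (1, 2, 2), (2, 1, 2), (2, 2, 3), with the kth column negated to form Mk. That assumption matches the description of the signs and reproduces both examples. The helper names are made up for illustration.

```python
from math import gcd

# Assumed matrices: identical entries, with column k negated in Mk.
M0 = [[1, 2, 2], [2, 1, 2], [2, 2, 3]]
M1 = [[-1, 2, 2], [-2, 1, 2], [-2, 2, 3]]
M2 = [[1, -2, 2], [2, -1, 2], [2, -2, 3]]

def apply(matrix, vector):
    """Multiply a 3x3 matrix by a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def triple(matrices, start=(3, 4, 5)):
    """Apply a product of matrices, listed left to right, to the start vector."""
    vector = list(start)
    for matrix in reversed(matrices):  # the rightmost matrix acts first
        vector = apply(matrix, vector)
    return vector

def is_primitive_triple(t):
    """Check a^2 + b^2 = c^2 with no common factor."""
    a, b, c = t
    return a * a + b * b == c * c and gcd(gcd(a, b), c) == 1

print(triple([M2]))          # [5, 12, 13]
print(triple([M1, M1, M2]))  # [117, 44, 125]
print(triple([M0]))          # [21, 20, 29]
```

In each result the first entry is odd and the second is even, in keeping with the convention that the odd side is listed first.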


[1] The dynamics of Pythagorean triples by Dan Romik

[2] Either a is odd and b is even or vice versa, so we let a be the odd one.

If a and b were both even, c would be even, and the triple would not be primitive. If a and b were both odd, c² would be divisible by 2 but not by 4, and so it couldn’t be a square.


]]>
https://www.johndcook.com/blog/2020/10/16/primitive-pythagorean-triples/feed/ 1
Playing around with a rational rose https://www.johndcook.com/blog/2020/10/14/playing-around-with-a-rational-rose/ https://www.johndcook.com/blog/2020/10/14/playing-around-with-a-rational-rose/#comments Wed, 14 Oct 2020 14:55:45 +0000 https://www.johndcook.com/blog/?p=62993 A “rose” in mathematics is typically a curve with polar equation r = cos(kθ) where k is a positive integer. If k is odd, the resulting graph has k “petals” and if k is even, the plot has 2k petals. Sometimes the term rose is generalized to the case of non-integer k. This is the […]

The post Playing around with a rational rose first appeared on John D. Cook.

]]>
A “rose” in mathematics is typically a curve with polar equation

r = cos(kθ)

where k is a positive integer. If k is odd, the resulting graph has k “petals” and if k is even, the plot has 2k petals.

Sometimes the term rose is generalized to the case of non-integer k. This is the sense in which I’m using the phrase “rational rose.” I’m not referring to an awful piece of software by that name [1]. This post will look at a particular rose with k = 2/3.

My previous post looked at

r = cos(2θ/3)

and gave the plot below.

Unlike the case where k is an integer, the petals overlap.

In this post I’d like to look at two things:

1. The curvature in the figure above, and
2. Differences between polar plots in Python and Mathematica

## Curvature

The graph above has radius 1 since cosine ranges from -1 to 1. The curve is made of arcs that are approximately circular, with the radius of these approximating circles being roughly 1/2, sometimes bigger and sometimes smaller. So we would expect the curvature to oscillate roughly around 2. (The curvature of a circle of radius r is 1/r.)

The curvature can be computed in Mathematica as follows.

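    (* x and y are not defined in the original snippet; this parametrization of r = cos(2t/3) is assumed *)
    x[t_] := Cos[2 t/3] Cos[t]
    y[t_] := Cos[2 t/3] Sin[t]
    numerator = D[x[t], {t, 1}] D[y[t], {t, 2}] -
                D[x[t], {t, 2}] D[y[t], {t, 1}]
    denominator = (D[x[t], t]^2 + D[y[t], t]^2)^(3/2)
    Simplify[numerator / denominator]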


This produces

A plot shows that the curvature does indeed oscillate roughly around 2.

The minimum curvature is 13/9, which the curve takes on at polar coordinate (1, 0), as well as at other points. That means that the curve starts out like a circle of radius 9/13 ≈ 0.7.

The maximum curvature is 3 and occurs at the origin. There the curve is approximately a circle of radius 1/3.
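The extreme values can be double-checked numerically. The Python sketch below uses the polar form of the curvature formula, (r² + 2r′² − r r″) / (r² + r′²)^(3/2), rather than the parametric form in the Mathematica code. The two give the same value for a curve written as r(θ). The function name and grid size are arbitrary choices for illustration.

```python
import math

def curvature(theta, k=2/3):
    """Curvature of r = cos(k*theta) from the polar curvature formula."""
    r = math.cos(k * theta)
    r1 = -k * math.sin(k * theta)      # dr/dtheta
    r2 = -k * k * math.cos(k * theta)  # second derivative of r
    return (r * r + 2 * r1 * r1 - r * r2) / (r * r + r1 * r1) ** 1.5

# Sample the curve on a fine grid of theta values.
thetas = [6 * math.pi * i / 100000 for i in range(100001)]
values = [curvature(t) for t in thetas]

print(min(values), 13 / 9)              # minimum, attained where |r| = 1
print(max(values), 3)                   # maximum, attained where r = 0
print(curvature(0), curvature(3 * math.pi / 4))
```

At θ = 0 the formula reduces to 1 + 4/9 = 13/9, and at θ = 3π/4, where the curve first passes through the origin, it reduces to (8/9) / (8/27) = 3.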

## Matplotlib vs Mathematica

To make the plot we’ve been focusing on, I plotted

r = cos(2θ/3)

in Mathematica, but in matplotlib I had to plot

r = |cos(2θ/3)|.

In both cases, θ runs from 0 to 8π. To highlight the differences in the way the two applications make polar plots, let’s plot over 0 to 2π with both.

Mathematica produces what you might expect.

    PolarPlot[Cos[2 t/3], {t, 0, 2 Pi}]

Matplotlib produces something very different. It handles negative r values by moving the point r = 0 to a circle in the middle of the plot. Notice the r-axis labels at about 22° running from -1 to 1.

    import matplotlib.pyplot as plt
    from numpy import linspace, pi, cos
    theta = linspace(0, 2*pi, 1000)
    plt.polar(theta, cos(2*theta/3))


Note also that in Mathematica, the first argument to PolarPlot is r(θ) and the second is the limits on θ. In matplotlib, the first argument is θ and the second argument is r(θ).

Note that in this particular example, taking the absolute value of the function being plotted was enough to make matplotlib act like I expected. That only happened to be true when plotting over the entire range 0 to 8π. In general you have to do more work than this. If we insert absolute value in the plot above, still plotting from 0 to 2π, we do not reproduce the Mathematica plot.

    plt.polar(theta, abs(cos(2*theta/3)))
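One way to do the extra work is to use the usual polar convention that a point with negative radius r at angle θ is the same point as radius −r at angle θ + π. The Python sketch below applies that conversion point by point and checks that the Cartesian positions are unchanged. The function names are made up for this example, and this is only one possible approach.

```python
import math

def rose(theta):
    """The rose r = cos(2*theta/3)."""
    return math.cos(2 * theta / 3)

def to_nonnegative_polar(theta, r):
    """Return an equivalent (theta, r) pair with r >= 0."""
    if r < 0:
        return (theta + math.pi) % (2 * math.pi), -r
    return theta % (2 * math.pi), r

def to_cartesian(theta, r):
    return r * math.cos(theta), r * math.sin(theta)

thetas = [2 * math.pi * i / 1000 for i in range(1001)]
points = [to_nonnegative_polar(t, rose(t)) for t in thetas]

# Every converted point should land at the same place in the plane.
worst = max(
    math.dist(to_cartesian(t, rose(t)), to_cartesian(*p))
    for t, p in zip(thetas, points)
)
print(worst)
```

The converted angles and radii could then be passed to plt.polar. The angle jumps by π wherever r changes sign, so a line plot may draw a stray connecting segment there unless the curve is split at those points or drawn with markers.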


[1] Rational Rose was horribly buggy when I used it in the 1990s. Maybe it’s not so buggy now. But I imagine I still wouldn’t like the UML-laden style of software development it was built around.


]]>
https://www.johndcook.com/blog/2020/10/14/playing-around-with-a-rational-rose/feed/ 2
Quatrefoils https://www.johndcook.com/blog/2020/10/13/quatrefoils/ https://www.johndcook.com/blog/2020/10/13/quatrefoils/#comments Wed, 14 Oct 2020 02:06:21 +0000 https://www.johndcook.com/blog/?p=62971 I was reading The 99% Invisible City this evening, and there was a section on quatrefoils. Here’s an example of a quatrefoil from Wikipedia. There’s no single shape known as a quatrefoil. It’s a family of shapes that look something like the figure above. I wondered how you might write a fairly simple mathematical equation […]

The post Quatrefoils first appeared on John D. Cook.

]]>
I was reading The 99% Invisible City this evening, and there was a section on quatrefoils. Here’s an example of a quatrefoil from Wikipedia.

There’s no single shape known as a quatrefoil. It’s a family of shapes that look something like the figure above.

I wondered how you might write a fairly simple mathematical equation to draw a quatrefoil. Some quatrefoils are just squares with semicircles glued on their edges. That’s no fun.

Here’s a polar equation I came up with that looks like a quatrefoil, if you ignore the interior lines.

This is the plot of r = cos(2θ/3).

Update: Based on a suggestion in the comments, I’ve written another post on quatrefoils using an equation that has a parameter to control the shape.


]]>
https://www.johndcook.com/blog/2020/10/13/quatrefoils/feed/ 3
Kronecker sum https://www.johndcook.com/blog/2020/10/11/kronecker-sum/ https://www.johndcook.com/blog/2020/10/11/kronecker-sum/#comments Sun, 11 Oct 2020 23:25:11 +0000 https://www.johndcook.com/blog/?p=62850 I’m working on a project these days that uses four different kinds of matrix product, which made me wonder if there’s another kind of product out there that I could find some use for. In the process of looking around for other matrix products, I ran across the Kronecker sum. I’ve seen Kronecker products many […]

The post Kronecker sum first appeared on John D. Cook.

]]>
I’m working on a project these days that uses four different kinds of matrix product, which made me wonder if there’s another kind of product out there that I could find some use for.

In the process of looking around for other matrix products, I ran across the Kronecker sum. I’ve seen Kronecker products many times, but I’d never heard of Kronecker sums.

The Kronecker sum is defined in terms of the Kronecker product, so if you’re not familiar with the latter, you can find a definition and examples here. Essentially, you multiply each scalar element of the first matrix by the second matrix as a block matrix.

The Kronecker product of an m × n matrix A and a p × q matrix B is an mp × nq matrix K = A ⊗ B. You could think of K as an m × n matrix whose entries are p × q blocks.

So, what is the Kronecker sum? It is defined for two square matrices, an n × n matrix A and an m × m matrix B. The sizes of the two matrices need not match, but the matrices do need to be square.  The Kronecker sum of A and B is

A ⊕ B = A ⊗ Iₘ + Iₙ ⊗ B

where Iₘ and Iₙ are identity matrices of size m and n respectively.

Does this make sense dimensionally? The left side of the (ordinary) matrix addition is nm × nm, and so is the right side, so the addition makes sense.

However, the Kronecker sum is not commutative, and usually things called “sums” are commutative. Products are not always commutative, but it goes against convention to call a non-commutative operation a sum. Still, the Kronecker sum is kinda like a sum, so it’s not a bad name.

I don’t have any application in mind (yet) for the Kronecker sum, but presumably it was defined for a good reason, and maybe I’ll run across an application, maybe even on the project alluded to at the beginning.

There are several identities involving Kronecker sums, and here’s one I found interesting:

exp(A) ⊗ exp(B) = exp(A ⊕ B).

If you haven’t seen the exponential of a matrix before, basically you stick your matrix into the power series for the exponential function.

## Examples

First, let’s define a couple matrices A and B.

We can compute the Kronecker sums

S = A ⊕ B

and

T = B ⊕ A

with Mathematica to show they are different.

    A = {{1, 2}, {3, 4}}
    B = {{1, 0, 1}, {1, 2, 0}, {2, 0, 3}}
    S = KroneckerProduct[A, IdentityMatrix[3]] +
        KroneckerProduct[IdentityMatrix[2], B]
    T = KroneckerProduct[B, IdentityMatrix[2]] +
        KroneckerProduct[IdentityMatrix[3], A]


This shows

and so the two matrices are not equal.
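The same computation can be sketched in plain Python, with matrices as lists of rows. The helper names here (`kron`, `kron_sum`, and so on) are made up for this example and are not from any library.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker product: each entry of A is replaced by that entry times B."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]

def add(X, Y):
    """Ordinary entrywise matrix addition."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def kron_sum(A, B):
    """Kronecker sum of square matrices: A (x) I_m + I_n (x) B."""
    return add(kron(A, identity(len(B))), kron(identity(len(A)), B))

A = [[1, 2], [3, 4]]
B = [[1, 0, 1], [1, 2, 0], [2, 0, 3]]

S = kron_sum(A, B)
T = kron_sum(B, A)

for row in S:
    print(row)
print()
for row in T:
    print(row)
print(S == T)  # False
```

Both results are 6 × 6 and have the same diagonal entries in a different order, but the matrices themselves differ.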

We can compute the matrix exponentials of A and B with the Mathematica function MatrixExp to see that

(I actually used MatrixExp[N[A]] and similarly for B so Mathematica would compute the exponentials numerically rather than symbolically. The latter takes forever and it’s hard to read the result.)

Now we have

and so the two matrices are equal.
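Here is a rough numerical check of the identity in plain Python. It assumes the same A and B as in the Mathematica code and computes the matrix exponential by summing the power series directly. That is adequate for small matrices with nonnegative entries like these but is not how a serious library would do it. All function names are made up for illustration.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker product of matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def expm(X, terms=100):
    """Matrix exponential by summing I + X + X^2/2! + ... (naive power series)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, X)]
        result = add(result, term)
    return result

def kron_sum(A, B):
    return add(kron(A, identity(len(B))), kron(identity(len(A)), B))

def all_close(X, Y, tol=1e-9):
    return all(
        math.isclose(x, y, rel_tol=tol, abs_tol=tol)
        for rx, ry in zip(X, Y)
        for x, y in zip(rx, ry)
    )

A = [[1, 2], [3, 4]]
B = [[1, 0, 1], [1, 2, 0], [2, 0, 3]]

lhs = kron(expm(A), expm(B))   # exp(A) (x) exp(B)
rhs = expm(kron_sum(A, B))     # exp(A (+) B)
print(all_close(lhs, rhs))     # True
```

The identity holds because A ⊗ Iₘ and Iₙ ⊗ B commute, so the exponential of their sum factors into the product of their exponentials.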

Even though the identity

exp(A) ⊗ exp(B) = exp(A ⊕ B)

may look symmetrical, it’s not. The matrices on the left do not commute in general. And not only are A ⊕ B and B ⊕ A different in general, their exponentials are also different. For example


]]>
https://www.johndcook.com/blog/2020/10/11/kronecker-sum/feed/ 3