John D. Cook: Applied Mathematics Consulting
https://www.johndcook.com/blog

There’s more going on here
https://www.johndcook.com/blog/2020/09/17/theres-more-going-on-here/
Thu, 17 Sep 2020
At a new faculty orientation, a professor encouraged us rookies to teach intro courses and to keep coming back to teach them periodically. I didn’t fully appreciate what he said at the time, but I remembered it, even after leaving academia a couple of years later.

Now I think I have an idea what he was referring to. There’s a LOT of stuff swept under the rug, out of necessity, when teaching intro courses. The students think they’re starting at the beginning, and maybe junior faculty think the same thing, but they’re really starting in medias res.

For example, Michael Spivak’s Physics for Mathematicians makes explicit many of the implicit assumptions in a freshman mechanics class. Hardly anyone could learn physics if they had to start with Spivak. Instead, you do enough homework problems that you intuitively get a feel for things you can’t articulate and don’t fully understand. But it’s satisfying to read Spivak later and feel justified in thinking that things didn’t quite add up.

When you learn to read English, you’re told a lot of half-truths or quarter-truths. You’re told, for example, that English has 10 vowel sounds, when in reality it has more than 20, depending on how you count them. A child learning to read shouldn’t be burdened with a college-level course in phonetics, so it’s appropriate not to be too candid about the complexities of language at first.

It would have been easier for me to teach statistics fresh out of college than it was to teach a few courses later while working at MD Anderson. As a fresh graduate I could have taught out of a standard textbook in good conscience. By the time I did teach statistics classes, I was aware of how much of the material was not entirely true or not practical.

I was thinking this morning about how there’s much more going on in a simple change of coordinates than is apparent at first. Tensor calculus is essentially the science of changing coordinates. It points out hidden structure, and creates conventions for making calculations manageable and for reducing errors. That’s not to say tensor calculus is easy but rather to say that changes of coordinates are hard.

Related post: Coming full circle

The post There's more going on here first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/09/17/theres-more-going-on-here/feed/ 0
Superfactorial
https://www.johndcook.com/blog/2020/09/14/superfactorial/
Tue, 15 Sep 2020
The factorial of a positive integer n is the product of the numbers from 1 up to and including n:

n! = 1 × 2 × 3 × … × n.

The superfactorial of n is the product of the factorials of the numbers from 1 up to and including n:

S(n) = 1! × 2! × 3! × … × n!.

For example,

S(5) = 1! 2! 3! 4! 5! = 1 × 2 × 6 × 24 × 120 = 34560.

Here are three examples of where superfactorial pops up.

Vandermonde determinant

If V is the n × n matrix whose (i, j) entry is iʲ⁻¹, then its determinant is S(n − 1). For instance,

\begin{vmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16& 64 \end{vmatrix} = S(3) = 3!\, 2!\, 1! = 12

V is an example of a Vandermonde matrix.
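
To double-check this claim numerically, here’s a short Python sketch (my addition, not from the original post) that builds the 4 × 4 Vandermonde matrix above and compares its determinant to S(3):

    import numpy as np
    from math import factorial

    def superfactorial(n):
        # S(n) = 1! * 2! * ... * n!
        p = 1
        for k in range(1, n + 1):
            p *= factorial(k)
        return p

    n = 4
    # Vandermonde matrix whose (i, j) entry is i**(j-1), 1 <= i, j <= n
    V = np.array([[i**(j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)])
    assert round(np.linalg.det(V)) == superfactorial(n - 1)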

Permutation tensor

One way to define the permutation symbol uses superfactorial:

\epsilon_{i_1 i_2 \cdots i_n} = \frac{1}{S(n-1)} \prod_{1 \leq k < j \leq n} (i_j - i_k)

Barnes G-function

The Barnes G-function extends superfactorial to the complex plane analogously to how the gamma function extends factorial. For positive integers n,

G(n) = \prod_{k=1}^{n-1}\Gamma(k) = S(n-2)

Here’s a plot of G(x), produced by

    Plot[BarnesG[x], {x, -2, 4}]

in Mathematica.

More posts related to factorial

Symplectic Euler
https://www.johndcook.com/blog/2020/09/12/symplectic-euler/
Sat, 12 Sep 2020
This post will look at simple numerical approaches to solving the predator-prey (Lotka-Volterra) equations. It turns out that the simplest approach does poorly, but a slight variation does much better.

Following [1] we will use the equations

u′ = u (v − 2)
v′ = v (1 − u)

Here u represents the predator population over time and v represents the prey population. When the prey v increase, the predators u increase, leading to a decrease in prey, which leads to a decrease in predators, etc. The exact solutions are periodic.

Euler’s method replaces the derivatives with finite difference approximations to compute the solution in increments of time of size h. The explicit Euler method applied to our example gives

u(t + h) = u(t) + h u(t) (v(t) − 2)
v(t + h) = v(t) + h v(t) (1 − u(t)).

The implicit Euler method gives

u(t + h) = u(t) + h u(t + h) (v(t + h) − 2)
v(t + h) = v(t) + h v(t + h) (1 − u(t + h)).

This method is implicit because the unknowns, the value of the solution at the next time step, appear on both sides of the equation. This means we’d either need to do some hand calculations first, if possible, to solve for the solutions at time t + h, or use a root-finding method at each time step to solve for the solutions.

Implicit methods are more difficult to implement, but they can have better numerical properties. See this post on stiff equations for an example where implicit Euler is much more stable than explicit Euler. I won’t plot the implicit Euler solutions here, but the implicit Euler method doesn’t do much better than the explicit Euler method in this example.

It turns out that a better approach than either explicit Euler or implicit Euler in our example is a compromise between the two: use explicit Euler to advance one component and use implicit Euler on the other. This is known as symplectic Euler for reasons I won’t get into here but would like to discuss in a future post.

If we use explicit Euler on the predator equation but implicit Euler on the prey equation we have

u(t + h) = u(t) + h u(t) (v(t + h) − 2)
v(t + h) = v(t) + h v(t + h) (1 − u(t)).

Conversely, if we use implicit Euler on the predator equation but explicit Euler on the prey equation we have

u(t + h) = u(t) + h u(t + h) (v(t) − 2)
v(t + h) = v(t) + h v(t) (1 − u(t + h)).

In both cases the implicit equation is linear in the unknown, so it can be solved in closed form by hand; that’s what the code below does.

Let’s see how explicit Euler compares to either of the symplectic Euler methods.

First some initial setup.

    import numpy as np

    h  = 0.08  # Step size
    u0 = 6     # Initial condition
    v0 = 2     # Initial condition
    N  = 100   # Number of time steps

    u = np.empty(N)
    v = np.empty(N)
    u[0] = u0
    v[0] = v0

Now the explicit Euler solution can be computed as follows.

    for n in range(N-1):
        u[n+1] = u[n] + h*u[n]*(v[n] - 2)
        v[n+1] = v[n] + h*v[n]*(1 - u[n])

The two symplectic Euler solutions are

    for n in range(N-1):
        v[n+1] = v[n]/(1 + h*(u[n] - 1))
        u[n+1] = u[n] + h*u[n]*(v[n+1] - 2)

and

    for n in range(N-1):
        u[n+1] = u[n] / (1 - h*(v[n] - 2))
        v[n+1] = v[n] + h*v[n]*(1 - u[n+1])

Now let’s see what our solutions look like, plotting (u(t), v(t)). First explicit Euler applied to both components:
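
Here’s a minimal matplotlib sketch (mine, not the original plotting code) for drawing such a phase plot from the u and v arrays computed above:

    import matplotlib.pyplot as plt

    # plot the discrete solution in the (u, v) phase plane
    plt.plot(u, v)
    plt.xlabel("u (predator)")
    plt.ylabel("v (prey)")
    plt.show()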

And now the two symplectic methods, applying explicit Euler to one component and implicit Euler to the other.

Next, let’s make the step size 10x smaller and the number of steps 10x larger.

Now the explicit Euler method does much better, though the solutions are still not quite periodic.

The symplectic method solutions hardly change. They just become a little smoother.

More differential equations posts

[1] Ernst Hairer, Christian Lubich, Gerhard Wanner. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations.

Identifying someone from their heart beat
https://www.johndcook.com/blog/2020/09/11/identification-ekg-ecg/
Fri, 11 Sep 2020

[Image: electrocardiogram of a toddler]

How feasible would it be to identify someone from electrocardiogram (EKG, ECG) data? (Apparently the abbreviation “EKG” is more common in America and “ECG” in the UK.)

Electrocardiograms are unique, but unique doesn’t necessarily mean identifiable. Unique data isn’t identifiable without some way to map it to identities. If you shuffle a deck of cards, you will probably produce an arrangement that has never occurred before. But without some sort of registry mapping card deck orders to their shufflers, there’s no chance of identification. (For identification, you’re better off dusting the cards for fingerprints, because there are registries of fingerprints.)

According to one survey [1], researchers have tried a wide variety of methods for identifying people from electrocardiograms. They’ve used time-domain features such as peak amplitudes, slopes, variances, etc., as well as a variety of frequency-domain (DFT) features. It seems that all these methods work moderately well, but none are great, and there’s no consensus regarding which approach is best.

If you have two EKGs on someone, how readily can you tell that they belong to the same person? The answer depends on the size of the set of EKGs you’re comparing it to. The studies surveyed in [1] do some sort of similarity search, comparing a single EKG to tens of candidates. The methods surveyed had an overall success rate of around 95%. But these studies were based on small populations; at least at the time of publication no one had looked at matching a single EKG against thousands of possible matches.

In short, an electrocardiogram can identify someone with high probability once you know that they belong to a relatively small set of people for which you have electrocardiograms.

More identification posts

[1] Antonio Fratini et al. Individual identification via electrocardiogram analysis. Biomed Eng Online. 2015; 14: 78. doi 10.1186/s12938-015-0072-y

Overestimating the number of idiots
https://www.johndcook.com/blog/2020/09/09/overestimating-idiots/
Wed, 09 Sep 2020
A comment on one of my recent blog posts on Gray codes led me to an article by Mark Dominus, Gray code at the pediatrician’s office, which led me to his article explaining why the pediatrician used what was apparently an unnecessarily sophisticated piece of equipment.

Mark segues from appreciating the pediatrician’s stadiometer purchase to appreciating source code that he initially thought was idiotic.

Time and time again people would send me perfectly idiotic code, and when I asked why they had done it that way the answer was not that they were idiots, but that there was some issue I had not appreciated, some problem they were trying to solve that was not apparent. … These appeared at first to be insane, but on investigation turned out to be sane but clumsy. … [A]ssume that bad technical decisions are made rationally, for reasons that are not apparent.

The last sentence deserves to be widely used. I’d suggest calling it Dominus’s law, but unfortunately Mark’s name ends in “s”, and that lowers the probability of a possessive form of his name catching on as an eponymous law. However, there is a Gauss’s law and a few other similar examples, so maybe the name will catch on.

Inverse Gray code
https://www.johndcook.com/blog/2020/09/08/inverse-gray-code/
Tue, 08 Sep 2020
The previous post looked at Gray code, a way of encoding digits so that the encodings of consecutive integers differ in only one bit. This post will look at how to compute the inverse of Gray code.

The Gray code of a non-negative integer n is given by

    def gray(n):
        return n ^ (n >> 1)

Bit-level operations

In case you’re not familiar with the notation, the >> operator shifts the bits of its argument. The code above shifts the bits of n one place to the right. In the process, the least significant bit falls off the end. We could replace n >> 1 with n // 2 if we like, i.e. integer division by 2, rounding down if n is odd. The ^ operator stands for XOR, exclusive OR. A bit of x ^ y is 0 if both corresponding bits in x and y are the same, and 1 if they are different.

Computing the inverse

The inverse of Gray code is a little more complicated. If we assume n < 2³², then we can compute the inverse Gray code of n by

    def inverse_gray32(n):
        assert(0 <= n < 2**32)
        n = n ^ (n >> 1)
        n = n ^ (n >> 2)
        n = n ^ (n >> 4)
        n = n ^ (n >> 8)
        n = n ^ (n >> 16)
        return n

For n of any size, we can compute its inverse Gray code by

    def inverse_gray(n):
        x = n
        e = 1
        while x:
            x = n >> e
            e *= 2
            n = n ^ x
        return n

If n is a 32-bit integer, inverse_gray32 is potentially faster than inverse_gray because of the loop unrolling.

Plots

Here’s a plot of the Gray code function and its inverse.

Proof via linear algebra

How do we know that what we’re calling the inverse Gray code really is the inverse? Here’s a proof for 32-bit integers n.

    def test_inverse32():
        for i in range(32):
            x = 2**i
            assert(inverse_gray32(gray(x)) == x)
            assert(gray(inverse_gray32(x)) == x)

How is that a proof? Wouldn’t you need to try all possible 32-bit integers if you wanted a proof by brute force?

If we think of 32-bit numbers as vectors in a 32-dimensional vector space over the binary field, addition is defined by XOR. So XOR is linear by definition. It’s easy to see that shifts are also linear, and the composition of linear functions is linear. This means that gray and inverse_gray32 are linear transformations. If the two linear transformations are inverses of each other on the elements of a basis, they are inverses everywhere. The unit basis vectors in our vector space are simply the powers of 2.

Matrix representation

Because Gray code and its inverse are linear transformations, they can be defined by matrix multiplication (over the binary field). So we could come up with 32 × 32 binary matrices for Gray code and its inverse. Matrix multiplication would give us a possible, but inefficient, way to implement these functions. Alternatively, you could think of the code above as clever ways to implement multiplication by special matrices very efficiently!

OK, so what are the matrices? For n-bit numbers, the matrix giving the Gray code transformation has dimension n by n. It has 1’s on the main diagonal, and on the diagonal just below the main diagonal, and 0s everywhere else. The inverse of this matrix, the matrix for the inverse Gray code transformation, has 1s on the main diagonal and everywhere below.

Here are the matrices for n = 4.

\begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}

The matrix on the left is for Gray code, the next matrix is for inverse Gray code, and the last matrix is the identity. NB: The equation above only holds when you’re working over the binary field, i.e. addition is carried out mod 2, so 1 + 1 = 0.

To transform a number, represent it as a vector of length n, with the least significant bit in the first component, and multiply by the appropriate matrix.
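
Here’s a small numpy sketch (my addition) checking this matrix description for n = 4. It uses a row-vector convention: the bit vector, least significant bit first, multiplies the matrix on the left, with arithmetic mod 2.

    import numpy as np

    def gray(n):
        return n ^ (n >> 1)

    N = 4
    # 1s on the main diagonal and the diagonal just below it
    G = np.eye(N, dtype=int) + np.eye(N, k=-1, dtype=int)
    # 1s on the main diagonal and everywhere below
    Ginv = np.tril(np.ones((N, N), dtype=int))

    def to_bits(x):
        return np.array([(x >> i) & 1 for i in range(N)])  # LSB first

    def from_bits(b):
        return sum(int(bit) << i for i, bit in enumerate(b))

    for x in range(2**N):
        assert from_bits(to_bits(x) @ G % 2) == gray(x)
        assert from_bits(to_bits(gray(x)) @ Ginv % 2) == x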

Relation to popcount

It’s easy to see by induction that a number is odd if and only if its Gray code has an odd number of 1s. The number 1 is its own Gray code, and as we move from the Gray code of n to the Gray code of n + 1 we change one bit, so we change the parity of the number of 1s.

There’s a standard C function popcount that counts the number of 1s in a number’s binary representation, and the last bit of the popcount is the parity of the number of 1s. I blogged about this before here. If you look at the code at the bottom of that post, you’ll see that it’s the same as inverse_gray32.

The code in that post works because you can compute whether a word has an odd or even number of 1s by testing whether it is the Gray code of an odd or even number.

Gray code
https://www.johndcook.com/blog/2020/09/08/gray-code/
Tue, 08 Sep 2020
Suppose you want to list the numbers from 0 to N in such a way that only one bit at a time changes between consecutive numbers. It’s not clear that this is even possible, but in fact it’s easy using Gray code, a method named after Frank Gray.

To convert an integer to its Gray code, take the XOR of the number with the number shifted right by 1.

    def gray(n): return n ^ (n >> 1)

Here’s a little Python code to demonstrate this.

    for i in range(8): print(format(gray(i), '03b'))

produces

    000
    001
    011
    010
    110
    111
    101
    100

Note that the numbers from 0 to 7 appear once, and each differs in exactly one bit from the numbers immediately above and below.

Let’s visualize the bits in Gray code with black squares for 1s and white squares for 0s.

The code that produced the plot is very brief.

    import matplotlib.pyplot as plt

    N = 4
    M = 2**N
    for i in range(N):
        for j in range(M):
            if gray(j) & 1<<i: # ith bit set
                plt.fill_between([j, j+1], i, i+1, color="k")
    plt.axes().set_aspect(1)
    plt.ylabel("Gray code bit")

The bits are numbered from least significant to most significant, starting with 1.

If you want to see the sequence carried out further, increase N. If you do, you may want to comment out the line that sets the aspect ratio to be square because otherwise your plot will be much longer than tall.

Note that the Gray code of 2ⁿ − 1 is 2ⁿ⁻¹, i.e. it only has one bit set. So if you were to wrap the sequence around, making a cylinder, you’d only change one bit at a time while going around the cylinder.

This could make a simple classroom activity. Students could reproduce the plot above by filling in squares on graph paper, then cut out the rectangle and tape the ends together.

See the next post for how to compute the inverse Gray code.

Related posts

Memorizing numbers and enumerating possibilities
https://www.johndcook.com/blog/2020/09/07/major-system/
Mon, 07 Sep 2020
This post will illustrate two things: how to memorize numbers, and how to enumerate products of sets in Python.

Major system

There’s a way of memorizing numbers by converting digits to consonant sounds, then adding vowels to make memorable words. It’s called the “major” mnemonic system, though it’s not certain where the system or the name came from.

I learned the major system as a child, but never used it. I thought about it more recently when I ran across the article Harry Potter and the Mnemonic Major System by Kris Fris.

Here is the encoding system in the form of a Python dictionary, using IPA symbols for the consonant sounds.

    digit_encodings = {
        0: ['s', 'z'],
        1: ['t', 'd', 'ð', 'θ'],
        2: ['n', 'ŋ'],
        3: ['m'],
        4: ['r'],
        5: ['l'],
        6: ['ʤ', 'ʧ', 'ʃ', 'ʒ'],
        7: ['k', 'g'],
        8: ['f', 'v'],
        9: ['p', 'b']
    }

The method is based on consonant sounds, not spelling. For example, the word “circle” would be a possible encoding of the number 0475. Note that the soft ‘c’ in circle encodes a 0 and the hard ‘c’ encodes a 7.

It’s curious that some digits have only one associated consonant sound, while others have up to four. The method was not originally designed for English, at least not modern English, and there may be some historical vestiges of other languages in the system. Even so, the sounds are fairly evenly spread out phonetically. It may not be optimal for English speakers, but some care went into designing it.

For more on the major system, see its Wikipedia page.

Enumerating set products

There are multiple consonant sound choices for a given digit, and so it would be natural to enumerate the possibilities when searching for a suitable encoding of a number.

If you only ever worked with two digit numbers, you could use a pair of for loops. But then if you wanted to work with three digit numbers, you’d need three nested for loops. For a fixed number of digits, this is messy, and for a variable number of digits it’s unworkable.

The function product from itertools gives an elegant solution. Pass in any number of iterable objects, and it will enumerate their Cartesian product.

    from itertools import product

    def enumerate_ipa(number):
        encodings = [digit_encodings[int(d)] for d in str(number)]
        for p in product(*encodings):
            print("".join(p))

In the code above, encodings is a list of lists of phonetic symbols. But product takes multiple iterables as separate arguments, not a single list of lists. The * operator takes care of unpacking encodings into the form product wants.
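
For example, here’s the function applied to 318; the eight outputs, shown as comments, match the list in the example below.

    enumerate_ipa(318)
    # mtf
    # mtv
    # mdf
    # mdv
    # mðf
    # mðv
    # mθf
    # mθv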

Example

Suppose you wanted to use the major system to memorize

1/π = .31830988

I’m not recommending that you memorize this number, or that you use the major system if you do want to memorize it, but let’s suppose that’s what you’d like to do.

You could break the digits into groups however you like, but here’s what you get if you divide them into 318, 309, and 88.

The digits in 318 could be encoded as mtf, mtv, mdf, mdv, mðf, mðv, mθf, or mθv.

The digits in 309 could be encoded as msp, msb, mzp, or mzb.

The digits in 88 could be encoded as ff, fv, vf, or vv.

So one possibility would be to encode 31830988 as “midwife mishap fife.” You could think of a pie that got turned upside down. It was the midwife’s mishap, knocked over while she was playing a fife.

Notice that the “w” sound in “midwife” doesn’t encode anything, nor does the “h” sound in “mishap.”

Related posts

Looking at the bits of a Unicode (UTF-8) text file
https://www.johndcook.com/blog/2020/09/06/unicode-file-bits/
Sun, 06 Sep 2020
Suppose you type a little text into a text file, say “123”. If you open this file in a hex editor you’ll see

    3132 33

because the ASCII value for the character ‘1’ is 0x31 in hex, ‘2’ corresponds to 0x32, and ‘3’ corresponds to 0x33. If your file is saved as UTF-8 rather than ASCII, the bytes are exactly the same: by design, UTF-8 is backward compatible with the first 128 ASCII characters.

Next, let’s add some Greek letters. Now our file contains “123 αβγ”. The lower-case Greek alphabet starts at 0x03B1, so these three characters are 0x03B1, 0x03B2, and 0x03B3. Now let’s look at the file in our hex editor.

    3132 3320 CEB1 CEB2 CEB3
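
If you don’t have a hex editor handy, you can see the same bytes from Python. This little sketch (my addition) prints the hex dump above:

    s = "123 αβγ"
    print(" ".join(f"{b:02x}" for b in s.encode("utf-8")))
    # 31 32 33 20 ce b1 ce b2 ce b3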

The B1, B2, and B3 look familiar, but why do they have “CE” in front rather than “03”? This has to do with the details of UTF-8 encoding. If we looked at the same file with UTF-16 encoding, representing each character with 16 bits, the results look more familiar.

    FEFF 0031 0032 0033 0020 03B1 03B2 03B3

So our ASCII characters—1, 2, 3, and space—are padded with a couple zeros, and we see the Unicode values of our Greek letters as we expect. But what’s the FEFF at the beginning? That’s a byte order mark (BOM) that my text editor inserted. This is an invisible marker saying that the bytes are stored in big-endian mode.

Going back to UTF-8, the ASCII characters are more compact, i.e. no zero padding, but why do the Greek letters start with “CE”?

    3132 3320 CEB1 CEB2 CEB3

As I go into detail here, UTF-8 is a clever way to save space when representing mostly ASCII text. Since ASCII bytes start with 0, a byte starting with 1 signals that something special is happening and that the following bytes are to be interpreted differently.

In binary, 0xCE expands to

    11001110

I’ll space the bits out into groups to make them easier to talk about.

    1 1 0 01110

The first 1 says that this byte does not simply represent a single character but is part of the encoding of a sequence of bytes encoding a character. The first 1 and the first 0 are bookends. The number of 1s in between says how many of the next bytes are part of this character. The bits after the first 0 are the first data bits of the character, and the rest follow in the next byte.

The continuation bytes begin with 10, and the remaining six bits are parts of a character. You know they’re not the start of a new character because there are no 1s between the first 1 and the first 0. With UTF-8, you can look at a byte in isolation and know whether it is an ASCII character, the beginning of a non-ASCII character, or the continuation of a non-ASCII character.

So now let’s look at 0xCEB1, with some spaces added.

    1 1 0 01110 10 110001

The data bits, 01110110001, are the bits of our character, and the binary number 1110110001 is 0x03B1 in hex. So we get the Unicode value for α. Similarly the rest of the bytes encode β and γ.

It was a coincidence that the last two hex characters of our Greek letters were recognizable in the hex dump of the UTF-8 encoding. We’ll always see the last hex character of the Unicode value in the hex dump, but not always the last two.

For another example, let’s look at a higher Unicode value, U+FB31. This is בּ, the Hebrew letter bet with a dot in the middle. This shows up in a hex editor as

    EFAC B1

or in binary as

    111011111010110010110001

Let’s break this up as before.

    1 11 0 1111 10 101100 10 110001

The first bit is a 1, so we know we have some decoding to do. There are two 1s between the first 1 and the first 0. This says that the bits for our character are stored in the remainder of the first byte and in the following two bytes.

So the bits of our character are

    1111101100110001

which in hex is 0xFB31, the Unicode value of our character.
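
As a sanity check, here’s a quick Python verification of both claims (my addition):

    ch = chr(0xFB31)  # Hebrew letter bet with a dot in the middle
    assert ch.encode("utf-8").hex() == "efacb1"
    assert int("1111101100110001", 2) == 0xFB31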

More Unicode posts

How much do you really use?
https://www.johndcook.com/blog/2020/09/03/how-much-do-you-really-use/
Thu, 03 Sep 2020
I’ve been doing a little introspection lately about what software I use, not at an application level but at a feature level.

LaTeX

It started with looking at what parts of LaTeX I use. I wrote about this in April, and I revisited it this week in response to some client work [1]. LaTeX is second nature for me, so I figure that by now I’ve used a lot of its features.

Except I haven’t. When I searched my hard disk I found I’ve used a few hundred commands and a couple dozen packages. I’m fluent in LaTeX after using it for most of my life, but I don’t know it exhaustively. Far from it. Also, Pareto stuck his head in yet again: the 20% of commands I use most often account for about 80% of the command instances in my files.

Python

So next I started looking at what parts of Python I use by searching for import statements. I make heavy use of the usual suspects for applied math — numpy, scipy, sympy, matplotlib, etc. — but other than those I don’t use a lot of packages.

I was a little surprised to see that collections and functools are the non-mathematical packages I’ve used the most. A lot of the packages I’ve used have been hapax legomena, specialized packages I use one time.
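
If you’d like to try the same introspection, here’s a rough sketch of such a script (my code, not what I actually ran) that tallies top-level imports across a directory of Python files:

    import re
    from collections import Counter
    from pathlib import Path

    counts = Counter()
    for path in Path(".").rglob("*.py"):
        for line in path.read_text(errors="ignore").splitlines():
            m = re.match(r"\s*(?:import|from)\s+(\w+)", line)
            if m:
                counts[m.group(1)] += 1

    print(counts.most_common(10))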

Other tools

I imagine I’d get similar results if I looked at what parts of other programming languages and applications I use often. And for things I use less often, the percentage of features I use must be tiny.

When I needed to learn SQL, I took a look at the language standard. There were a bazillion statements defined in the standard, of which I may have used 10 by now. I imagine there are database gurus who haven’t used more than a few dozen statements.

Encouragement

I find all this encouraging because it means the next big, intimidating thing I need to learn probably won’t be that big in practice.

If you need to learn a new programming language and see that the “nutshell” book on the language is 1,500 pages, don’t be discouraged. The part you need to know may be fragments that amount to 30 pages, though it’s impossible to know in advance which 30 pages they will be.

Related posts

[1] I have a client that does a lot with org-mode. Rather than writing LaTeX files directly, they write org-mode files and export them to LaTeX. Although you can use LaTeX generously inside org-mode, you do have to do a few things a little differently. But it has its advantages.

  • Org has lighter markup and is easier to read.
  • You can run code inside an org file and have the source and/or the results inserted into the document.
  • Org cleans up after itself, deleting .aux files etc.
  • Org will automatically rerun the LaTeX compiler if necessary to get cross references right.
  • The outline features in org give you code folding.
  • You can export the same source to HTML or plain text if you’d like.

Make boring work harder
https://www.johndcook.com/blog/2020/09/01/make-boring-work-harder/
Tue, 01 Sep 2020

I was searching for something this morning and ran across several pages where someone blogged about software they wrote to help write their dissertations. It occurred to me that this is a pattern: I’ve seen a lot of writing tools that came out of someone writing a dissertation or some other book.

The blog posts leave the impression that the tools required more time to develop than they would save. This suggests that developing the tools was a form of moral compensation, procrastinating by working on something that feels like it’s making a contribution to what you ought to be doing.

Even so, developing the tools may have been a good idea. As with many things in life, it makes more sense when you ask “Compared to what?” If the realistic alternative to futzing around with scripts was to write another chapter of the dissertation, then developing the tools was not the best use of time, assuming they don’t actually save more time than they require.

But if the realistic alternative was binge watching some TV series, then writing the tools may have been a very good use of time. Any time the tools save is profit if the time that went into developing them would otherwise have been wasted.

Software developers are often criticized for developing tools rather than directly developing the code they’re paid to write. Sometimes these tools really are a good investment. But even when they’re not, they may be better than the realistic alternative. They may take time away from Facebook rather than time away from writing production code.

Another advantage to tool building, aside from getting some benefit from time that otherwise would have been wasted, is that it builds momentum. If you can’t bring yourself to face the dissertation, but you can bring yourself to write a script for writing your dissertation, you might feel more like facing the dissertation afterward.

Related post: Automate to save mental energy, not time

Sum of divisor powers
https://www.johndcook.com/blog/2020/08/30/sum-of-divisor-powers/
Mon, 31 Aug 2020
The function σₖ takes an integer n and returns the sum of the kth powers of divisors of n. For example, the divisors of 14 are 1, 2, 7, and 14. If we set k = 3 we get

σ₃(14) = 1³ + 2³ + 7³ + 14³ = 3096.

A couple of special cases use different notation:

  • σ₀(n) is the number of divisors of n and is often written d(n).

  • σ₁(n) is the sum of the divisors of n and the function is usually written σ(n) with no subscript.

In Python you can compute σₖ(n) using divisor_sigma from SymPy. You can get a list of the divisors of n using the function divisors, so the bit of code below illustrates that divisor_sigma computes what it’s supposed to compute.

    from sympy import divisor_sigma, divisors

    n, k = 365, 4
    a = divisor_sigma(n, k)
    b = sum(d**k for d in divisors(n))
    assert(a == b)

The Wikipedia article on σₖ gives graphs for k = 1, 2, and 3, and these graphs imply that σₖ gets smoother as k increases. Here is a similar graph to those in the article.

The plots definitely get smoother as k increases, but the plots are not on the same vertical scale. In order to make the plots more comparable, let’s look at the kth root of σₖ(n). This amounts to taking the Lebesgue k norm of the divisors of n.

Now that the curves are on a more similar scale, let’s plot them all on a single plot rather than in three subplots.
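
Here is a sketch of plotting code along these lines (my own plotting range, not necessarily the original’s):

    import matplotlib.pyplot as plt
    from sympy import divisor_sigma

    ns = range(1, 200)
    for k in [1, 2, 3]:
        plt.plot(ns, [float(divisor_sigma(n, k))**(1/k) for n in ns], label=f"k = {k}")
    plt.xlabel("n")
    plt.legend()
    plt.show()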

If we leave out k = 1 and add k = 4, we get a similar plot.

The plot for k = 2 that looked smooth compared to k = 1 now looks rough compared to k = 3 and 4.

An almost integer pattern in many bases
https://www.johndcook.com/blog/2020/08/30/almost-integer/
Sun, 30 Aug 2020
A few days ago I mentioned a passing comment in a video by Richard Borcherds. This post picks up on another off-hand remark from that post.

Borcherds was discussing why exp(π √67) and exp(π √163) are nearly integers.

exp(π √67) = 147197952744 − ε₁
exp(π √163) = 262537412640768744 − ε₂

where ε₁ and ε₂ are on the order of 10⁻⁶ and 10⁻¹² respectively.

He called attention to the 744 at the end and commented that this isn’t just an artifact of base 10: in many other bases, these numbers end in that base’s representation of 744. This is what I wanted to expand on.

To illustrate Borcherds’ remark in hexadecimal, note that

    147197952744 -> 0x2245ae82e8 
    262537412640768744 -> 0x3a4b862c4b402e8 
    744 -> 0x2e8 

Borcherds’ comment is equivalent to saying

147197952744 = 744 mod m

and

262537412640768744 = 744 mod m

for many values of m. Equivalently 147197952000 and 262537412640768000 have a lot of factors; every factor of these numbers is a base where the congruence holds.

So for how many bases m are these numbers congruent to 744?

The number of factors of a number n is written d(n). This is a multiplicative function, meaning that for relatively prime numbers a and b,

d(ab) = d(a) d(b).

Note that

147197952000 = 2¹⁵ × 3³ × 5³ × 11³

and

262537412640768000 = 2¹⁸ × 3³ × 5³ × 23³ × 29³

It follows that

d(147197952000) =
d(2¹⁵ × 3³ × 5³ × 11³) =
d(2¹⁵) d(3³) d(5³) d(11³).

Now for any prime power pᵏ,

d(pᵏ) = k + 1,

and so

d(147197952000) = 16 × 4 × 4 × 4.

Similarly

d(262537412640768000) = 19 × 4 × 4 × 4 × 4.
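
You can confirm these counts, and spot-check the congruence, with a few lines of SymPy (my addition):

    from sympy import divisor_count

    assert divisor_count(147197952000) == 16 * 4**3
    assert divisor_count(262537412640768000) == 19 * 4**4

    # a few bases m dividing 147197952000
    for m in [1000, 1024, 1331, 3375]:
        assert 147197952744 % m == 744 % m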

For more about almost integers, watch Borcherds’ video and see this post.

Org entities
https://www.johndcook.com/blog/2020/08/28/org-entities/
Fri, 28 Aug 2020
This morning I found out that Emacs org-mode has its own markup entities, analogous to HTML entities or LaTeX commands. Often they’re identical to LaTeX commands. For example, \approx is the approximation symbol ≈, exactly as in LaTeX.

So what’s the advantage of org-entities? In fact, how does Emacs even know whether \approx is a LaTeX command or an org entity?

If you use the command C-c C-x \ , Emacs will show you the compiled version of the entity, i.e. ≈ rather than the command \approx. This is global: all entities are displayed. The org entities would be converted to symbols if you export the file to HTML or LaTeX, but this gives you a way to see the symbols before exporting.

Here’s something that’s possibly surprising, possibly useful. The symbol you see is for display only. If you copy and paste it to another program, you’ll see the entity text, not the symbol. And if you use C-c C-x \ again, you’ll see the command again, not the symbol. Note that the full name of the command is org-toggle-pretty-entities, with “toggle” in the middle.

If you use set-input-method to enter symbols using LaTeX commands or HTML entities as I described here, Emacs inserts a Unicode character, and the change is irreversible. Once you type the LaTeX command \approx or the corresponding HTML entity &asymp;, any knowledge of how that character was entered is lost. So org entities are useful when you want to see Unicode characters but want your source file to remain strictly ASCII.

Incidentally, there are org entities for Hebrew letters, but only the first four, presumably because these are the only ones used as mathematical symbols.

To see a list of org entities, use the command org-entities-help. Even if you never use org entities, the org entity documentation makes a nice reference for LaTeX commands and HTML entities. Here’s a screenshot of the first few lines.

[Screenshot: first few lines of org-entities-help]

Related posts

How big is the monster?
https://www.johndcook.com/blog/2020/08/27/how-big-is-the-monster/
Thu, 27 Aug 2020

[Image: Voyager 1 photo of Jupiter, January 6, 1979]

Symmetries are captured by mathematical groups. And just as you can combine kinds of symmetry to form new symmetries, you can combine groups to form new groups.

So-called simple groups are the building blocks of groups as prime numbers are the building blocks of integers [1].

Finite simple groups have been fully classified, and they fall into several families, with 26 exceptions that fall outside any of these families. The largest of these exceptional groups is called the monster.

The monster is very large, containing approximately 8 × 10⁵³ elements. I saw a video by Richard Borcherds where he mentioned in passing that the number of elements in the group is a few orders of magnitude larger than the number of atoms in the earth.

I tried to find a comparison that is closer to 8 × 10⁵³ and settled on the number of atoms in Jupiter.

The mass of Jupiter is about 2 × 10²⁷ kg. The planet is roughly 3/4 hydrogen and 1/4 helium by mass, and from that you can calculate that Jupiter contains about 10⁵⁴ atoms.
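
Here’s the back-of-envelope arithmetic in Python (my sketch, using round numbers):

    avogadro = 6.022e23      # atoms per mole
    mass = 2e27 * 1000       # mass of Jupiter in grams
    # hydrogen is about 1 g/mol and helium about 4 g/mol
    moles = 0.75 * mass / 1 + 0.25 * mass / 4
    print(f"{avogadro * moles:.1e}")  # roughly 1e54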

Before doing my calculation with Jupiter, I first looked at lengths such as the diameter of the Milky Way in angstroms. But even the diameter of the observable universe in angstroms is far too small, only about 9 × 10³⁶.

More on finite simple groups

[1] The way you build up groups from simple groups is more complicated than the way you build integers out of primes, but there’s still an analogy there.

The photo at the top of the post was taken by Voyager 1 on January 6, 1979.

Square waves and cobwebs
https://www.johndcook.com/blog/2020/08/26/square-waves-and-cobwebs/
Wed, 26 Aug 2020
This is a follow-up to yesterday’s post. In that post we looked at iterates of the function

f(x) = exp( sin(x) )

and noticed that even iterations of f converged to a square wave. Odd iterates also converge to a square wave, but a different one. The limit of odd iterations is the limit of even iterations turned upside down.

Jan Van lint correctly pointed out in the comments

If you plot y=f(f(x)) and y=x, you can see that there are two stable fixed points, with one unstable fixed point in between.

Here’s the plot he describes.

You can see that there are fixed points between 1.5 and 2.0, between 2.0 and 2.5, and between 2.5 and 3. The following Python code will find these fixed points for us.

    from numpy import exp, sin
    from scipy.optimize import brentq

    def f(x): return exp(sin(x))
    def g(x): return f(f(x)) - x

    brackets = [(1.5, 2.0), (2.0, 2.5), (2.5, 3)]
    roots = [brentq(g, *b) for b in brackets]

This shows that the fixed points of g are

    1.514019042996987
    2.219107148913746
    2.713905124458644

If we apply f to each of these fixed points, we get the same numbers again, but in the opposite order. This is why the odd iterates and even iterates are upside-down from each other.
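
Continuing the Python session above, you can check this directly:

    for r in roots:
        print(r, f(r))

The middle number is a fixed point of f itself, and f swaps the outer two.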

Next we’ll show a couple cobweb plots to visualize the convergence of the iterates. We’ll also show that the middle fixed point is unstable by looking at two iterations, one starting slightly below it and the other starting slightly above it.

The first starts at 2.2191 and progresses down toward the lower fixed point.

The second starts at 2.2192 and progresses up toward the upper fixed point.

Related posts

Unexpected square wave
https://www.johndcook.com/blog/2020/08/25/unexpected-square-wave/
Tue, 25 Aug 2020
Last night a friend from Vanderbilt, Father John Rickert, sent me a curious math problem. (John was a PhD student in the math department while I was a postdoc. He went on to be a Catholic priest after graduating.) He said that if you look at iterates of

f(x) = exp( sin(x) )

the plots become square.

Here’s a plot to start the discussion, looking at f(x), f(f(x)), f(f(f(x))), and f(f(f(f(x)))).
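
Here’s a minimal Python sketch (my code, with my own choice of interval) that draws these four iterates:

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 2*np.pi, 1000)
    y = x.copy()
    for i in range(4):
        y = np.exp(np.sin(y))  # apply f once more
        plt.plot(x, y, label=f"{i+1} iteration(s)")
    plt.legend()
    plt.show()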

One thing that is apparent is that there’s a big difference between applying f once and applying it at least two times. The former can range between 1/e and e, but the latter must be between exp(sin(1/e)) and e.

The plots overlap in a way that’s hard to understand, so let’s spread them out, plotting one iteration at a time.

Now we can see the plots becoming flatter as the number of iterations increases. We can also see that even and odd iterations are roughly mirror images of each other. This suggests we should plot just the even or just the odd iterates. Here we go:

This shows more clearly how the plots are squaring off pretty quickly as the number of iterations increases.

My first thought when John showed me his plot was that it must have something to do with the contraction mapping theorem. And I suppose it does, but not in a simple way. The function f(x) is not a contraction mapping for any x. But f(f(x)) is a contraction mapping for some x, and further iterates are contractions for more values of x.

Update: The next post calculates the fixed points of f(f(x)) and demonstrates how iterates converge to the stable fixed points and move away from the unstable one.

My second thought was that it would be interesting to look at the Fourier transform of these iterates. For any finite number of iterations, the result is a periodic, analytic function, and so eventually the Fourier coefficients must go to zero rapidly. But I suspect they initially go to zero slowly, like those of a square wave would. I intend to update this post after I’ve had a chance to look at the Fourier coefficients.

Update: As expected, the Fourier coefficients decay slowly. I plotted the Fourier sine coefficients for f(f(f(f(x)))) using Mathematica.

    f[x_] := Exp[Sin[x]]
    g[x_] := f[f[f[f[x]]]]
    ListPlot[ Table[
        NFourierSinCoefficient[g[x], x, i], 
        {i, 30}]
    ]

This produced the plot below.

The even-numbered coefficients are zero, but the odd-numbered coefficients are going to zero very slowly. Since the function g(x) is infinitely differentiable, for any k you can go far enough out in the Fourier series that the coefficients go to zero faster than n⁻ᵏ. But initially the Fourier coefficients decay like 1/n, which is typical of a discontinuous function, like a square wave, rather than an infinitely differentiable function.

By the way, if you replace sine with cosine, you get similar plots, but with half as many flat spots per period.

Related posts

Not quite going in circles
https://www.johndcook.com/blog/2020/08/24/not-quite-going-in-circles/
Mon, 24 Aug 2020

[Image: foggy path]

Sometimes you feel like you’re running around in circles, not making any progress, when you’re on the verge of a breakthrough. An illustration of this comes from integration by parts.

A common example in calculus classes is to integrate ex sin(x) using integration by parts (IBP). After using IBP once, you get an integral similar to the one you started with. And if you apply IBP again, you get exactly the integral you started with.

At this point you believe all is lost. Apparently IBP isn’t going to work and you’ll need to try another approach.

\begin{align*} \int e^x \sin x \, dx &= -e^x \cos x + \int e^x \cos x \, dx \\ &= e^x(\sin x - \cos x) - \int e^x \sin x \, dx\end{align*}

But then the professor pulls a rabbit out of a hat. By moving the integral on the right side to the left, you can solve for the unknown integral in terms of the debris IBP left along the way.

\int e^x \sin x \, dx = \frac{e^x}{2}(\sin x - \cos x)

So you weren’t going in circles after all. You were making progress when it didn’t feel like it.

It’s not that gathering unknowns to one side is a new idea; you would have seen that countless times before you got to calculus. But that’s not how integration usually works. You typically start with the assigned integral and steadily chip away at it, progressing in a straight line toward the result. The trick is seeing that a simple idea from another context can be applied in this context.

In the calculation above we first let u = eˣ and v′ = sin(x) on the left. Then when we come to the first integral on the right, we again set u = eˣ and this time v′ = cos(x).

But suppose you come to the second integral and think “This is starting to look futile. Maybe I should try something different. This time I’ll let eˣ be the v′ term rather than the u term.” In that case you really will run in circles. You’ll get exactly the same expression on both sides.
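
Concretely, taking u = cos(x) and v′ = eˣ in the second integral gives

\int e^x \sin x \, dx = -e^x \cos x + \left(e^x \cos x + \int e^x \sin x \, dx\right)

and the right side collapses back to the left side, telling you nothing.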

It’s hard to say in calculus, or in life in general, whether persistence or creativity is called for. But in this instance, persistence pays off. If you set u to the exponential term both times, you win; if you switch horses midstream, you lose.

Another way to look at the problem is that the winning strategy is to be persistent in your approach to IBP, but use your creativity at the end to realize that you’re nearly done, just when it may seem you need to start over.

If you’d like to read more about coming back where you started, see Coming full circle and then this example from signal processing.

Transliterating Hebrew
https://www.johndcook.com/blog/2020/08/23/transliterating-hebrew/
Sun, 23 Aug 2020
Yesterday I wrote about cjhebrew, a LaTeX package that lets you insert Hebrew text by using a sort of transliteration scheme. That reminded me of unidecode, a Python package for transliterating Unicode to ASCII, that I wrote about before. I wondered how the two compare, and so this post will answer that question.

Transliteration is a crude approximation. I started to say it’s no substitute for a proper translation, but in fact sometimes it is a substitute for a proper translation. It takes in the smallest context possible—one character—and is utterly devoid of nuance, but it still might be good enough for some purposes. It might, for example, help in searching some text for relevant content worth the effort of a proper translation.

Here’s a short bit of code to display unidecode‘s transliterations of the Hebrew alphabet.

    import unidecode

    for i in range(22+5):
        ch = chr(i + ord('א'))
        print(ch, unidecode.unidecode(ch))

I wrote 22 + 5 rather than 27 above to give a hint that the extra values are the final forms of five letters [1]. Also if ord('א') doesn’t work for you, you can replace it with 0x05d0.

Here’s a comparison of the transliterations used in cjhebrew and unidecode. I’ve abbreviated the column headings to make a narrower table.

|---------+---+----+----|
| Unicode |   | cj | ud |
|---------+---+----+----|
| U+05d0  | א | '  | A  |
| U+05d1  | ב | b  | b  |
| U+05d2  | ג | g  | g  |
| U+05d3  | ד | d  | d  |
| U+05d4  | ה | h  | h  |
| U+05d5  | ו | w  | v  |
| U+05d6  | ז | z  | z  |
| U+05d7  | ח | .h | KH |
| U+05d8  | ט | .t | t  |
| U+05d9  | י | y  | y  |
| U+05da  | ך | K  | k  |
| U+05db  | כ | k  | k  |
| U+05dc  | ל | l  | l  |
| U+05dd  | ם | M  | m  |
| U+05de  | מ | m  | m  |
| U+05df  | ן | N  | n  |
| U+05e0  | נ | n  | n  |
| U+05e1  | ס | s  | s  |
| U+05e2  | ע | `  | `  |
| U+05e3  | ף | P  | p  |
| U+05e4  | פ | p  | p  |
| U+05e5  | ץ | .S | TS |
| U+05e6  | צ | .s | TS |
| U+05e7  | ק | q  | q  |
| U+05e8  | ר | r  | r  |
| U+05e9  | ש | /s | SH |
| U+05ea  | ת | t  | t  |
|---------+---+----+----|

The transliterations are pretty similar, despite different design goals. The unidecode module is trying to pick the best mapping to ASCII characters. The cjhebrew package is trying to use mnemonic ASCII sequences to map into Hebrew. The former doesn’t need to be unique, but the latter does. The post on cjhebrew explains, for example, that it uses capital letters for final forms of Hebrew letters.

Here’s the corresponding table for vowel points (niqqud).

|---------+---+----+----|
| Unicode |   | cj | ud |
|---------+---+----+----|
| U+05b0  | ְ  | :  | @  |
| U+05b1  | ֱ  | E: | e  |
| U+05b2  | ֲ  | a: | a  |
| U+05b3  | ֳ  | A: | o  |
| U+05b4  | ִ  | i  | i  |
| U+05b5  | ֵ  | e  | e  |
| U+05b6  | ֶ  | E  | e  |
| U+05b7  | ַ  | a  | a  |
| U+05b8  | ָ  | A  | a  |
| U+05b9  | ֹ  | o  | o  |
| U+05ba  | ֺ  | o  | o  |
| U+05bb  | ֻ  | u  | u  |
|---------+---+----+----|
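The ud column above can be reproduced with the same sort of loop used earlier for the consonants; the cj column comes from the package documentation, so there’s nothing to compute.

    import unidecode

    # vowel points (niqqud) occupy code points U+05b0 through U+05bb
    for cp in range(0x05b0, 0x05bc):
        print(f"U+{cp:04x}", unidecode.unidecode(chr(cp)))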

Related posts

[1] Unicode lists the final form of each letter before its ordinary form. For example, final kaf has Unicode value U+05da and kaf has value U+05db.

The post Transliterating Hebrew first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/23/transliterating-hebrew/feed/ 1
Permutable polynomials https://www.johndcook.com/blog/2020/08/22/permutable-polynomials/ https://www.johndcook.com/blog/2020/08/22/permutable-polynomials/#comments Sun, 23 Aug 2020 01:49:17 +0000 https://www.johndcook.com/blog/?p=59607 Two polynomials p(x) and q(x) are said to be permutable if p(q(x)) = q(p(x)) for all x. It’s not hard to see that Chebyshev polynomials are permutable. First, Tn(x) = cos (n arccos(x)) where Tn is the nth Chebyshev polyomial. You can take this as a definition, or if you prefer another approach to defining […]

The post Permutable polynomials first appeared on John D. Cook.

]]>
Two polynomials p(x) and q(x) are said to be permutable if

p(q(x)) = q(p(x))

for all x. It’s not hard to see that Chebyshev polynomials are permutable.

First,

T_n(x) = cos(n arccos(x))

where T_n is the nth Chebyshev polynomial. You can take this as a definition, or if you prefer another approach to defining the Chebyshev polynomials, it’s a theorem.

Then it’s easy to show that

T_m(T_n(x)) = T_{mn}(x)

because

cos(m arccos(cos(n arccos(x)))) = cos(mn arccos(x)).

Then the polynomials T_m and T_n must be permutable because

T_m(T_n(x)) = T_{mn}(x) = T_n(T_m(x))

for all x.
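If you’d like to verify the identity without grinding through the trig, here’s a quick symbolic check for small degrees using SymPy’s chebyshevt. It’s a sanity check, not a proof.

    from sympy import symbols, expand, chebyshevt

    x = symbols('x')

    # T_m(T_n(x)) and T_{mn}(x) should expand to the same polynomial
    for m in range(1, 6):
        for n in range(1, 6):
            assert expand(chebyshevt(m, chebyshevt(n, x))) == expand(chebyshevt(m*n, x))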

There’s one more family of polynomials that are permutable, and that’s the power polynomials x^k. They are trivially permutable because

(x^m)^n = x^{mn} = (x^n)^m.

It turns out that the Chebyshev polynomials and the power polynomials are essentially [1] the only permutable sequences of polynomials.

Related posts

[1] Here’s what “essentially” means. A set of polynomials, at least one of each positive degree, that all permute with each other is called a chain. Two polynomials p and q are similar if there is an affine polynomial

λ(x) = ax + b

such that

p(x) = λ⁻¹( q( λ(x) ) ).

Then any permutable chain is similar to either the power polynomials or the Chebyshev polynomials. For a proof, see Chebyshev Polynomials by Theodore Rivlin.

The post Permutable polynomials first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/22/permutable-polynomials/feed/ 1
Including a little Hebrew in an English LaTeX document https://www.johndcook.com/blog/2020/08/22/hebrew-latex/ https://www.johndcook.com/blog/2020/08/22/hebrew-latex/#comments Sat, 22 Aug 2020 17:59:19 +0000 https://www.johndcook.com/blog/?p=59571 I was looking up how to put a little Hebrew inside a LaTeX document and ran across a good answer on tex.stackexchange. Short answer: use the cjhebrew package. In a nutshell, you put your Hebrew text between \< and > using the cjhebrew package’s transliteration. You write left-to-right, and the text will appear right-to-left. For […]

The post Including a little Hebrew in an English LaTeX document first appeared on John D. Cook.

]]>
I was looking up how to put a little Hebrew inside a LaTeX document and ran across a good answer on tex.stackexchange. Short answer: use the cjhebrew package.

In a nutshell, you put your Hebrew text between \< and > using the cjhebrew package’s transliteration. You write left-to-right, and the text will appear right-to-left. For example, \<'lp> produces

aleph in Hebrew

using ‘ for א, l for ל, and p for ף.
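In context, a complete if minimal document might look like the following. This is a sketch, assuming the cjhebrew package is installed; I haven’t exercised any of the package’s options.

    \documentclass{article}
    \usepackage{cjhebrew}
    \begin{document}
    The Hebrew word for father is \<'b>,  % aleph bet, typeset right-to-left
    and its first letter alone is \<'>.   % aleph
    \end{document}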

The code for each Hebrew letter is its English transliteration, with three footnotes.

First, when two Hebrew letters roughly correspond to the same English letter, one form may have a dot in front of it. For example, ט and ת both make a t sound; the former is encoded as .t and the latter as t.

Second, five Hebrew letters have a different form when used at the end of a word [1]. For such letters the final form is the capitalized value of the regular form. For example, פ and its final form ף are denoted by p and P respectively. The package will automatically choose between regular and final forms, but you can override this by using the capital letter in the middle of a word or by using a | after a regular form at the end of a word.

Finally, the letter ש is written with a /s. The author already used s for ס and .s for צ, so he needed a new symbol to encode a third letter corresponding to s [2]. Also ש has a couple other forms. The letter can make either the sh or s sound, and you may see dots on top of the letter to distinguish these. The cjhebrew package uses +s for ש with a dot on the top right, the sh sound, and ,s for ש with a dot on the top left, the s sound.

Here is the complete consonant transliteration table from the cjhebrew documentation.

Note that the code for א is a single quote ' and the code for ע is a back tick (grave accent) `.

You can also add vowel points (niqqud). These are also represented by their transliteration to English sounds, with one exception. The sh’va is either silent or represents a schwa sound, so there’s no convenient transliteration. But the sh’va looks like a colon, so it is represented by a colon. See the package documentation for more details.

Related posts

[1] You may have seen something similar in Greek with sigma σ and final sigma ς. Even English had something like this. For example, people used to use a different form of t at the end of a word when writing cursive. My mother wrote this way.

[2] It would be more phonetically faithful to transliterate צ as ts, but that would make the LaTeX package harder to implement since it would have to disambiguate whether ts represents צ or תס.

The post Including a little Hebrew in an English LaTeX document first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/22/hebrew-latex/feed/ 1
All English vowel sounds in one sentence https://www.johndcook.com/blog/2020/08/20/english-vowel-sounds/ https://www.johndcook.com/blog/2020/08/20/english-vowel-sounds/#comments Thu, 20 Aug 2020 16:24:21 +0000 https://www.johndcook.com/blog/?p=59481 Contrary to popular belief, English has more than five or ten vowel sounds. The actual number is disputed because of disagreements over when two sounds are sufficiently distinct to be classified as separate sounds. I’ve heard some people say 15, some 17, some over 20. I ran across a podcast episode recently that mentioned a […]

The post All English vowel sounds in one sentence first appeared on John D. Cook.

]]>
Contrary to popular belief, English has more than five or ten vowel sounds. The actual number is disputed because of disagreements over when two sounds are sufficiently distinct to be classified as separate sounds. I’ve heard some people say 15, some 17, some over 20.

I ran across a podcast episode recently that mentioned a sentence that demonstrates a different English vowel sound in each word:

Who would know naught of art must learn, act, and then take his ease [1].

The hosts noted that to get all the vowels in, you need to read the sentence with non-rhotic pronunciation, i.e. suppressing the r in art.

I’ll run this sentence through some software that returns the phonetic spelling of each word in IPA symbols to see the distinct vowel sounds that way. First I’ll use Python, then Mathematica.

Python

Let’s run this through some Python code that converts English words to IPA notation so we can look at the vowels.

    import eng_to_ipa as ipa

    text = "Who would know naught of art must learn, act, and then take his ease."
    print(ipa.convert(text))

This gives us

hu wʊd noʊ nɔt əv ɑrt məst lərn, ækt, ənd ðɛn teɪk hɪz iz

Which includes the following vowel symbols:

  1. u
  2. ʊ
  3. oʊ
  4. ɔ
  5. ə
  6. ɑ
  7. ə
  8. ə
  9. æ
  10. ə
  11. ɛ
  12. eɪ
  13. ɪ
  14. i

This has some duplicates: 5, 7, 8, and 10 are all schwa symbols.
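Here’s one way to pull that list out of the transcription programmatically. The vowel inventory in the regular expression is my own, chosen to cover just the symbols that appear here, and the diphthongs oʊ and eɪ are listed first so they match as single symbols.

    import re

    transcription = "hu wʊd noʊ nɔt əv ɑrt məst lərn, ækt, ənd ðɛn teɪk hɪz iz"

    # match diphthongs before single vowel characters
    vowels = re.findall("oʊ|eɪ|[uʊɔəɑæɛɪi]", transcription)
    print(vowels)            # 14 symbols, one per word
    print(len(set(vowels)))  # 11 distinct symbols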

By default eng_to_ipa gives one way to write each word in IPA notation. There is an optional argument, retrieve_all, that defaults to False but may return more alternatives when set to True. However, in our example the only difference is that the second alternative writes and as ænd rather than ənd.

It looks like the eng_to_ipa module doesn’t transcribe vowels with sufficient resolution to distinguish some of the sounds in the model sentence. For example, it doesn’t seem to distinguish the stressed sound ʌ from the unstressed ə.

Mathematica

Here’s Mathematica code to split the model sentence into words and show the IPA pronunciation of each word.

    text = "who would know naught of art must \
        learn, act, and then take his ease" 
    ipa[w_] := WordData[w, "PhoneticForm"]
    Map[ipa, TextWords[text]]

This returns

    {"hˈu", "wˈʊd", "nˈoʊ", "nˈɔt", "ˈʌv", "ˈɒrt", "mˈʌst", 
    "lˈɝn", "ˈækt", "ˈænd", "ðˈɛn", "tˈeɪk", "hˈɪz", "ˈiz"}

By the way, I had to write the first word as “who” because WordData won’t do it for me. If you ask for

    ipa["Who"]

Mathematica will return

    Missing["NotAvailable"]

though it works as expected if you send it “who” rather than “Who.”

Let’s remove the stress marks and join the words together so we can compare the Python and Mathematica output. The top line is from Python and the bottom is from Mathematica.

    hu wʊd noʊ nɔt əv ɑrt məst lərn ækt ænd ðɛn teɪk hɪz iz
    hu wʊd noʊ nɔt ʌv ɒrt mʌst lɝn  ækt ænd ðɛn teɪk hɪz iz

There are a few differences, summarized in the table below. Since the symbols are a little difficult to tell apart, I’ve included their Unicode code points.

    |-------+------------+-------------|
    | Word  | Python     | Mathematica |
    |-------+------------+-------------|
    | of    | ə (U+0259) | ʌ (U+028C)  |
    | must  | ə (U+0259) | ʌ (U+028C)  |
    | art   | ɑ (U+0251) | ɒ (U+0252)  |
    | learn | ə (U+0259) | ɝ (U+025D)  |
    |-------+------------+-------------|

Mathematica makes some distinctions that Python missed.

Update: See the first comment below for variations on how the model sentence can be pronounced and how to get more distinct vowel sounds out of it.

More linguistics posts

[1] After writing this post I saw the sentence in writing, and the fourth word is “aught” rather than “naught.” This doesn’t change the vowel since the two words rhyme, but Mathematica doesn’t recognize the word “aught.”

The post All English vowel sounds in one sentence first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/20/english-vowel-sounds/feed/ 1
Three notations by Iverson https://www.johndcook.com/blog/2020/08/18/three-notations-by-iverson/ https://www.johndcook.com/blog/2020/08/18/three-notations-by-iverson/#comments Tue, 18 Aug 2020 12:27:39 +0000 https://www.johndcook.com/blog/?p=59116 The floor of y is the greatest integer less than or equal to y and is denoted ⌊y⌋. Similarly, the ceiling of y is the smallest integer greater than or equal to y and is denoted ⌈y⌉. Both of these notations were introduced by Kenneth Iverson. Before Iverson’s notation caught on, you might see [x] for […]

The post Three notations by Iverson first appeared on John D. Cook.

]]>
The floor of y is the greatest integer less than or equal to y and is denoted ⌊y⌋.

Similarly, the ceiling of y is the smallest integer greater than or equal to y and is denoted ⌈y⌉.

Both of these notations were introduced by Kenneth Iverson. Before Iverson’s notation caught on, you might see [x] for the floor of x, and I don’t know whether there was a notation for ceiling.

There was also a lack of standardization over whether [x] meant to round x down or round it to the nearest integer. Iverson’s notation caught on because it’s both mnemonic and symmetrical.

Iverson also invented the notation of using a Boolean expression inside square brackets to indicate the function that is 1 when the argument is true and 0 when it is false. I find this notation very convenient. I’ve used it on projects for two different clients recently.

Here’s an equation from Concrete Mathematics using all three Iverson notations discussed here:

⌈x⌉ – ⌊x⌋ = [x is not an integer].

In words, the ceiling of x minus the floor of x is 1 when x is not an integer and 0 when x is an integer.
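In most programming languages the Iverson bracket is simply the conversion of a Boolean to an integer, so the identity above is easy to spot-check. A sketch in Python:

    from math import floor, ceil

    def iverson(p):
        # Iverson bracket: 1 if the condition holds, 0 otherwise
        return int(p)

    for x in [-2.0, -1.3, 0.0, 0.5, 2.0, 7.25]:
        assert ceil(x) - floor(x) == iverson(x != floor(x))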

Related links

The post Three notations by Iverson first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/18/three-notations-by-iverson/feed/ 1
Entering symbols in Emacs https://www.johndcook.com/blog/2020/08/18/emacs-symbols/ https://www.johndcook.com/blog/2020/08/18/emacs-symbols/#respond Tue, 18 Aug 2020 12:21:29 +0000 https://www.johndcook.com/blog/?p=59312 Emacs has a relatively convenient way to add accents to letters or to insert a Unicode character if you know the code point for the value. See these notes. But usually you don’t know the Unicode values of symbols. Then what do you do? TeX commands You enter symbols by typing their corresponding TeX commands […]

The post Entering symbols in Emacs first appeared on John D. Cook.

]]>
Emacs has a relatively convenient way to add accents to letters or to insert a Unicode character if you know the code point for the value. See these notes.

But usually you don’t know the Unicode values of symbols. Then what do you do?

TeX commands

You enter symbols by typing their corresponding TeX commands by using

    M-x set-input-method RET tex

After doing that, you could, for example, enter π by typing \pi.

You’ll see the backslash as you type the command, but once you finish you’ll see the symbol instead [1].

HTML entities

You may know the HTML entity for a symbol and want to use that to enter characters in Emacs. Unfortunately, the following does NOT work.

    M-x set-input-method RET html

However, there is a slight variation on this that DOES work:

    M-x set-input-method RET sgml

Once you’ve set your input method to sgml, you could, for example, type &radic; to insert a √ symbol.

Why SGML rather than HTML?

HTML was created by simplifying SGML (Standard Generalized Markup Language). Emacs is older than HTML, and so maybe Emacs supported SGML before HTML was written.

There may be some useful SGML entities that are not in HTML, though I don’t know. I imagine these days hardly anyone knows anything about SGML beyond the subset that lives on in HTML and XML.

Changing input modes

If you want to move between your default input mode and TeX mode, you can use the command toggle-input-method. This is bound to C-\ by default; with a prefix argument, i.e. C-u C-\, it prompts you for which input method to switch to.

You can see a list of all available input methods with list-input-methods. Most of these are spoken languages, such as Arabic or Welsh, rather than technical input modes like TeX and SGML.

More Emacs posts

[1] I suppose there could be a problem if one command were a prefix of another. That is, if there were symbols \foo and \foobar and you intended to insert the latter, Emacs might think you’re done after you’ve typed the former. But I can’t think of a case where that would happen. TeX commands are nearly prefix codes. There are TeX commands like \tan and \tanh, but these don’t represent symbols per se. Emacs doesn’t need any help to insert the letters “tan” or “tanh” into a file.

The post Entering symbols in Emacs first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/18/emacs-symbols/feed/ 0
Adding phase-shifted sine waves https://www.johndcook.com/blog/2020/08/17/adding-phase-shifted-sine-waves/ https://www.johndcook.com/blog/2020/08/17/adding-phase-shifted-sine-waves/#comments Mon, 17 Aug 2020 20:19:23 +0000 https://www.johndcook.com/blog/?p=59269 Suppose you have two sinusoidal functions with the same frequency ω but with different phases and different amplitudes: f(t) = A sin(ωt) and g(t) = B sin(ωt + φ). Then their sum is another sine wave with the same frequency h(t) = C sin(ωt + ψ). Note that this includes cosines as a special case […]

The post Adding phase-shifted sine waves first appeared on John D. Cook.

]]>
Suppose you have two sinusoidal functions with the same frequency ω but with different phases and different amplitudes:

f(t) = A sin(ωt)

and

g(t) = B sin(ωt + φ).

Then their sum is another sine wave with the same frequency

h(t) = C sin(ωt + ψ).

Note that this includes cosines as a special case since a cosine is a sine with phase shift φ = 90°.

Sum of two phase-shifted sine waves with the same frequency is another sine wave

This post will

  • prove that the sum of sine waves is another sine wave,
  • show how to find its amplitude and phase, and
  • discuss the significance of this result in signal processing.

Finding the amplitude and phase

Note f + g and h both satisfy the second order differential equation

y″ = – ω² y

Therefore if they also satisfy the same initial conditions y(0) and y′(0) then they’re the same function.

The functions  f + g and h are equal at 0 if

B sin(φ) = C sin(ψ).

and their derivatives are equal at 0 if

ω A + ω B cos(φ) = ω C cos(ψ).

Taking ratios says that

tan(ψ) = B sin(φ) / (A + B cos(φ))

or

ψ = arctan( B sin(φ) / (A + B cos(φ)) ).

Once we have ψ, we solve for C and find

C = B sin(φ) / sin(ψ).
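Here’s a quick numerical check of these formulas. The test values are arbitrary, and I use arctan2 rather than arctan so that ψ lands in the correct quadrant.

    import numpy as np

    A, B, phi, omega = 1.3, 0.7, 0.9, 2.0  # arbitrary test values

    psi = np.arctan2(B*np.sin(phi), A + B*np.cos(phi))
    C = B*np.sin(phi) / np.sin(psi)

    t = np.linspace(0, 10, 1001)
    sum_of_sines = A*np.sin(omega*t) + B*np.sin(omega*t + phi)
    single_sine = C*np.sin(omega*t + psi)
    print(np.max(np.abs(sum_of_sines - single_sine)))  # ~1e-16, equal up to rounding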

Special case of sine and cosine

Let’s look at the special case of φ = 90°, i.e. adding A sin(ωt) and B cos(ωt). Then sin(φ) = 1 and cos(φ) = 0, and the equation for ψ simplifies to

ψ = arctan(B/A).

If an angle has tangent B/A, then its sine is B / √(A² + B²), and so we have

C = √(A² + B²).

Linear time invariant (LTI) systems

A linear, time-invariant system can differentiate or integrate signals. It can change their amplitude or phase. And it can add two signals together.

It’s easy to see that changing the amplitude or phase of a signal doesn’t change its frequency. It’s also easy to see that differentiation and integration of sine waves doesn’t change their frequency. But it’s not as clear that adding two sines with the same frequency doesn’t change their frequency. Here we’ve shown that’s the case.

Bode plots show how an LTI system changes the amplitude and phase of sinusoidal inputs as a function of frequency. They don’t need to show what happens to the frequency because the frequency doesn’t change.

More signal processing posts

The post Adding phase-shifted sine waves first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/17/adding-phase-shifted-sine-waves/feed/ 2
A different kind of computational survival https://www.johndcook.com/blog/2020/08/15/computational-survival/ https://www.johndcook.com/blog/2020/08/15/computational-survival/#comments Sat, 15 Aug 2020 15:09:26 +0000 https://www.johndcook.com/blog/?p=59180 Last year I wrote a post about being a computational survivalist, someone able to get their work done with just basic command line tools when necessary. This post will be a different take on the same theme. I just got a laptop from an extremely security-conscious client. I assume it runs Windows 10 and that […]

The post A different kind of computational survival first appeared on John D. Cook.

]]>
Abandoned shopping mall

Last year I wrote a post about being a computational survivalist, someone able to get their work done with just basic command line tools when necessary. This post will be a different take on the same theme.

I just got a laptop from an extremely security-conscious client. I assume it runs Windows 10 and that I will not be able to install any software without an act of Congress. I don’t know yet because I haven’t booted it up. And in fact I cannot boot it up because I don’t yet have the badge to unlock it.

If being able to work with just default command line tools is like wilderness survival, being able to work with only consumer software is like urban survival, like trying to live in an abandoned shopping mall.

There must be some scientific software on the laptop. I imagine I may have to re-implement from scratch some tools that aren’t installed. I’ve been in that situation before.

One time I was an expert witness on a legal case and had to review the other side’s software. I could only work on their laptop, from their attorney’s office, with no network connection and no phone. I could request some software to be installed before I arrived, so I asked them to put Python on the laptop. I could bring books into the room with the laptop, so I brought the Python Cookbook with me.

If you don’t have grep, sed, or awk, but you do have Perl, you can roll your own version of the utilities in a few lines of code. For example, see Perl as a better grep.
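The same trick works in whatever language happens to be installed. Here’s a bare-bones grep stand-in in Python, a sketch rather than a replacement for the real thing:

    import re
    import sys

    # usage: python grep.py PATTERN < file
    pattern = re.compile(sys.argv[1])
    for line in sys.stdin:
        if pattern.search(line):
            print(line, end="")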

I always use LaTeX for writing math, but the equation editor in Microsoft Word supports a large amount of LaTeX syntax. Or at least it did when I last tried it a decade ago.

The Windows command line has more Unix-like utilities than you might imagine. Several times I’ve typed a Unix command at the Windows cmd.exe prompt, thought “Oh wait. I’m on Windows, so that won’t work,” and the command works. The biggest difference between the Windows and Linux command lines is not the utilities per se. You can install many of the utilities, say through GOW.

The biggest difference in command lines is that on Windows, each utility parses its own arguments, whereas on Linux the shell parses the arguments first and passes the result to the utilities. So, for example, passing multiple files to a utility may or may not work on Windows, depending on the capability of the utility. On Linux, this just works because it is the shell itself rather than the utilities launched from the shell that orchestrates the workflow.

I expect this new project will be very interesting, and worth putting up with the minor annoyances of not having my preferred tools at my fingertips. And maybe it won’t be as hard as I imagine to request new software. If not, it can be fun to explore workarounds.

It’s sort of a guilty pleasure to find a way to get by without the right tool for the job. It would be a waste of time under normal circumstances, and not something the client should be billed for, but you can hack with a clear conscience when you’re forced into doing so.

The post A different kind of computational survival first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/15/computational-survival/feed/ 11
Multiply, divide, and floor https://www.johndcook.com/blog/2020/08/14/multiply-divide-and-floor/ https://www.johndcook.com/blog/2020/08/14/multiply-divide-and-floor/#respond Fri, 14 Aug 2020 13:28:28 +0000 https://www.johndcook.com/blog/?p=59114 Let n be a positive integer and x any real number. If you multiply x by n, then divide by n, of course you get x back. Now suppose you multiply x by n, round down, then divide by n, and round down again. Do you get x back? Not necessarily. The last step rounds down […]

The post Multiply, divide, and floor first appeared on John D. Cook.

]]>
Let n be a positive integer and x any real number. If you multiply x by n, then divide by n, of course you get x back.

Now suppose you multiply x by n, round down, then divide by n, and round down again. Do you get x back?

Not necessarily. The last step rounds down to an integer, so you couldn’t possibly get x back unless x was an integer to begin with.

However, you do get back x rounded down to the nearest integer. In symbols,

\left\lfloor \frac{\lfloor nx \rfloor}{n} \right\rfloor = \lfloor x \rfloor

Here ⌊y⌋ is the floor of y, the greatest integer less than or equal to y. I found this via Problem 5 here.

The equation says that, in a limited sense, multiplication and division commute with taking floors. But you do have to assume n is an integer, and it’s important that you multiply first, then divide. The corresponding equation where you divide first doesn’t always hold.
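Both claims are easy to test empirically. The sketch below uses exact rational arithmetic so that floating point rounding can’t muddy the waters.

    from fractions import Fraction
    from math import floor
    from random import randint

    for _ in range(10_000):
        n = randint(1, 100)
        x = Fraction(randint(-10_000, 10_000), randint(1, 100))
        # multiply first, then divide: always recovers floor(x)
        assert floor(Fraction(floor(n*x), n)) == floor(x)

    # divide first and the identity can fail: n = 2, x = 3/2
    n, x = 2, Fraction(3, 2)
    print(floor(n * floor(x / n)), floor(x))  # prints 0 1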

The relationship above may not seem so surprising if you haven’t worked with floors, but generally floors and ceilings don’t play so nicely with other operations. The most tedious chapter of Concrete Mathematics is probably the one devoted to manipulating expressions with floors and ceilings. Once you expect these manipulations to be difficult, which they usually are, you find it remarkable that something would work out so simply.

I don’t know of an immediate application for this identity, though I vaguely recall wanting to use something like this and concluding that it probably wasn’t true.

 

The post Multiply, divide, and floor first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/14/multiply-divide-and-floor/feed/ 0
Continued fractions with period 1 https://www.johndcook.com/blog/2020/08/13/continued-fraction-period/ https://www.johndcook.com/blog/2020/08/13/continued-fraction-period/#respond Fri, 14 Aug 2020 00:36:06 +0000 https://www.johndcook.com/blog/?p=59089 A while back I wrote about continued fractions of square roots. That post cited a theorem that if d is not a perfect square, then the continued fraction representation of d is periodic. The period consists of a palindrome followed by 2⌊√d⌋. See that post for details and examples. One thing the post did not […]

The post Continued fractions with period 1 first appeared on John D. Cook.

]]>
A while back I wrote about continued fractions of square roots. That post cited a theorem that if d is not a perfect square, then the continued fraction representation of √d is periodic. The period consists of a palindrome followed by 2⌊√d⌋. See that post for details and examples.

One thing the post did not address is the length of the period. The post gave the example that the continued fraction for √5 has period 1, i.e. the palindrome part is empty.

\sqrt{5} = 2 + \cfrac{1}{4+ \cfrac{1}{4+ \cfrac{1}{4+ \cfrac{1}{4+ \ddots}}}}

There’s a theorem [1] that says this pattern happens if and only if d = n² + 1. That is, the continued fraction for √d is periodic with period 1 if and only if d is one more than a square. So if we wanted to find the continued fraction expression for √26, we know it would have period 1. And because each period ends in 2⌊√26⌋ = 10, we know all the coefficients after the initial 5 are equal to 10.

\sqrt{26} = 5 + \cfrac{1}{10+ \cfrac{1}{10+ \cfrac{1}{10+ \cfrac{1}{10+ \ddots}}}}
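Here’s a short script that computes these coefficients using the standard recurrence for the continued fraction of a quadratic irrational. It’s a sketch that assumes d is a positive integer and not a perfect square.

    from math import isqrt

    def sqrt_cf(d, terms=8):
        # continued fraction coefficients of sqrt(d), d not a perfect square
        a0 = isqrt(d)
        m, q, a = 0, 1, a0
        coeffs = [a0]
        for _ in range(terms - 1):
            m = a*q - m
            q = (d - m*m) // q
            a = (a0 + m) // q
            coeffs.append(a)
        return coeffs

    print(sqrt_cf(5))   # [2, 4, 4, 4, 4, 4, 4, 4]
    print(sqrt_cf(26))  # [5, 10, 10, 10, 10, 10, 10, 10]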

[1] Samuel S. Wagstaff, Jr. The Joy of Factoring. Theorem 6.15.

The post Continued fractions with period 1 first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/13/continued-fraction-period/feed/ 0
A bevy of ones https://www.johndcook.com/blog/2020/08/13/a-bevy-of-ones/ https://www.johndcook.com/blog/2020/08/13/a-bevy-of-ones/#respond Thu, 13 Aug 2020 23:14:12 +0000 https://www.johndcook.com/blog/?p=59077 Take any positive integer d that is not a multiple of 2 or 5. Then there is some integer k such that d × k has only 1’s in its decimal representation. For example, take d = 13. We have 13 × 8457 = 111111. Or if we take d = 27, 27 × 4115226337448559670781893 = […]

The post A bevy of ones first appeared on John D. Cook.

]]>
Take any positive integer d that is not a multiple of 2 or 5. Then there is some integer k such that d × k has only 1’s in its decimal representation. For example, take d = 13. We have

13 × 8547 = 111111.

Or if we take d = 27,

27 × 4115226337448559670781893 = 111111111111111111111111111.

Let’s change our perspective and start with the string of 1’s. If d is not a multiple of 2 or 5, then there is some number made up of only 1’s that is divisible by d. And in fact, the number of 1’s is no more than d.

This theorem generalizes to any integer base b > 1. If d is relatively prime to b, then there is a base b number with d or fewer digits, all of them 1, which is divisible by d [1].

The following Python code tells us, for a given k relatively prime to b, how many 1’s we need to string together in base b to get a multiple of k. If k shares a factor with b, the code returns 0 because no string of 1’s will ever be divisible by k.

    from math import gcd
    
    def all_ones(n, b = 10):
        # base b number made of n ones, e.g. all_ones(3) == 111
        return sum(b**i for i in range(n))
    
    def find_multiple(k, b = 10):
        # smallest n such that the base b number of n ones is divisible by k;
        # 0 if k shares a factor with b, since then no such n exists
        if gcd(k, b) > 1:
            return 0
        for n in range(1, k+1):
            if all_ones(n, b) % k == 0:
                return n

The two Python functions above default to base 10 if a base isn’t provided.

We could find a multiple of 5 whose hexadecimal representation is all 1’s by calling

    print(find_multiple(5, 16))

and this tells us that 11111₁₆ is a multiple of 5, and in fact

5 × 369d₁₆ = 11111₁₆.
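Continuing the example, the multiplier 369d falls out of the functions above by dividing the string of 1’s by 5:

    n = find_multiple(5, 16)  # 5, i.e. five 1's
    r = all_ones(n, 16)       # 0x11111
    print(hex(r // 5))        # 0x369d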

[1] Elmer K. Hayashi. Factoring Integers Whose Digits Are all Ones. Mathematics Magazine, Vol. 49, No. 1 (Jan., 1976), pp. 19-22

The post A bevy of ones first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/13/a-bevy-of-ones/feed/ 0
Symbol pronunciation https://www.johndcook.com/blog/2020/08/12/symbol-pronunciation/ https://www.johndcook.com/blog/2020/08/12/symbol-pronunciation/#comments Thu, 13 Aug 2020 01:18:51 +0000 https://www.johndcook.com/blog/?p=59028 I was explaining to someone this evening that I’m in the habit of saying “bang” rather than “exclamation point.” Here’s a list of similar nicknames for symbols. These nicknames could complement the NATO phonetic alphabet if you needed to read symbols out loud, say over the phone. You might, for example, pronounce “HL&P” as “Hotel […]

The post Symbol pronunciation first appeared on John D. Cook.

]]>
I was explaining to someone this evening that I’m in the habit of saying “bang” rather than “exclamation point.” Here’s a list of similar nicknames for symbols.

These nicknames could complement the NATO phonetic alphabet if you needed to read symbols out loud, say over the phone. You might, for example, pronounce “HL&P” as “Hotel Lima Pretzel Papa.”

Or you might use them to have one-syllable names for every symbol. This is a different objective than maximizing phonetic distinctiveness. For example, referring to # and $ as “hash” and “cash” is succinct, but could easily be misheard.

You could also optimize for being clever. Along those lines, I like the idea of pronouncing the symbols ( and ) as “wane” and “wax”, by analogy with phases of the moon, though this would be a bad choice if you want to be widely and quickly understood.

Even if someone understood the allusion to lunar phases, and they knew which one looks like an opening parenthesis and which one looks like a closing parenthesis, they might get your meaning backward because they don’t live in the same hemisphere you do! I implicitly assumed the perspective of the northern hemisphere in the paragraph above. Someone from the southern hemisphere would understand “wane” to mean ) and “wax” to mean (.

Phases of the moon in northern and southern hemispheres

The post Symbol pronunciation first appeared on John D. Cook.

]]>
https://www.johndcook.com/blog/2020/08/12/symbol-pronunciation/feed/ 9