Illegible work

When James Scott uses the word legible, he doesn’t refer to handwriting that is clear enough to read. He uses the word more broadly to mean something that is easy to classify, something that is bureaucrat-friendly. A thing is illegible if it is hard to pigeonhole. I first heard the term from Venkatesh Rao’s essay A Big Little Idea Called Legibility.

Much of the work I do is illegible. If the work were legible, companies would have an employee who does it [1] and they wouldn’t call a consultant.

Here’s a template for a conversation I’ve had a few times:

“We’ve got kind of an unusual problem. It’s related to some things I’ve seen you write about. Have you heard of …?”

“No, what’s that?”

“Never mind. We’d like you to help us with a project. …”

Years ago, when people heard that I worked in statistics they’d ask what programming language I worked in. They expected me to say R or SAS or something like that, but I’d say C++. Not that I recommend doing statistics in C++ [2] in general, but people came to me with unusual projects that they couldn’t get done with standard tools. If an R module would have done what they wanted, they wouldn’t have knocked on my door.

Doing illegible work is a lot of fun, but it’s hard to market. Try typing “Someone who can help with a kinda off the wall math / computer science project” into Google. It’s not helpful. Search engines can only answer questions that are legible to search engines. Illegible work is more likely to come from word of mouth than from a search engine.

***

[1] Sometimes companies call a consultant because they have occasional need for some skill, something they do not need often enough to justify hiring a full-time employee to do. Or maybe they have the skills in house to do a project but don’t have anyone available. Or maybe they want an outside auditor. But in this post I’m focusing on weird projects.

[2] When I mention C++, I know some people are thinking “But isn’t C++ terribly complex?” Why yes, yes it is. But my colleagues and I already knew C++, and we stuck to a sane subset of the language. It was not unusual to rewrite R code in C++ and make it 100x faster.

“Why don’t you just use C?” These days I’m more likely to write C than C++.  Clients don’t want me to write enterprise applications, just small numerical libraries, and they usually ask for C.

Length of periods in the (infinite) periodic table

A few days ago I wrote about what the periodic table would look like if we extended it, assuming the patterns that hold for known elements continue to hold.

That post reported that the number of elements in the nth period works out to

 P_n = \frac{(-1)^n(2n+3) + 2n^2 + 6n + 5}{4}

There’s a simpler expression for P_n:

 P_n = 2\left\lfloor \frac{n+2}{2} \right\rfloor^2

Here ⌊x⌋ is the largest integer no greater than x.
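Here’s a quick Python check (my addition, not from the original post) that the floor expression agrees with the longer formula above:

    def p_long(n):
        return ((-1)**n * (2*n + 3) + 2*n**2 + 6*n + 5) // 4

    def p_short(n):
        return 2 * ((n + 2) // 2) ** 2

    print(all(p_long(n) == p_short(n) for n in range(1, 101)))  # True
    print([p_short(n) for n in range(1, 8)])                    # [2, 8, 8, 18, 18, 32, 32]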

I posted this yesterday on @AlgebraFact. Surely many people have come up with the same formula, but it doesn’t seem to be well known.

Doubly periodic but not analytic

A sine wave is the canonical periodic function, so an obvious way to create a periodic function of two variables would be to multiply two sine waves:

f(x, y) = sin(x) sin(y)

This function is doubly periodic: periodic in the horizontal and vertical directions.

Now suppose you want to construct a doubly periodic function of a complex variable z = x + iy. One thing you might try is

f(x + iy) = sin(x) sin(y)

This is a function of a complex variable, technically, but it’s not what we usually have in mind when we speak of functions of a complex variable. We’re usually interested in complex functions that are differentiable as complex functions.

If h is the increment in the definition of a derivative, we require the limit as h approaches 0 to be the same no matter what route h takes. On the real line, h could go to zero from the left or from the right. That’s all the possibilities. But in the complex plane, h could approach the origin from any angle, or it could take a more complicated route such as spiraling toward 0.

It turns out that it is sufficient that the limit be the same whether h goes to 0 along the real or imaginary axis. This gives us the Cauchy-Riemann equations. If

f(x, y) = u(x, y) + i v(x, y)

then the Cauchy-Riemann equations require the partial derivative of u with respect to x to equal the partial derivative of v with respect to y, and the partial derivative of v with respect to x to be the negative of the partial of u with respect to y.

u_x = v_y
v_x = −u_y

These equations imply that if v = 0 then u is constant. A real-valued function of a complex variable cannot be analytic unless it’s constant.

So can we rescue our example by making up an imaginary component v? If so

f(x, y) = sin(x) sin(y) + i v(x, y)

would have to satisfy the Cauchy-Riemann equations. The first equation would require

v(x, y) = − cos(x) cos(y) + a(x)

for some function a of x alone, but the second would require

v(x, y) = cos(x) cos(y) + b(y)

for some function b of y alone. Note the negative sign in the former but not in the latter. I’ll leave it as an exercise for the reader to show that these two requirements are contradictory.
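If you’d rather let a computer algebra system do the bookkeeping, here’s a small SymPy sketch (my own check, not part of the original argument) that carries out both integrations and exhibits the mismatch:

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    u = sp.sin(x) * sp.sin(y)

    # First Cauchy-Riemann equation u_x = v_y: integrate u_x with respect to y.
    v1 = sp.integrate(sp.diff(u, x), y)   # -cos(x)*cos(y), plus an arbitrary function of x
    # Second Cauchy-Riemann equation v_x = -u_y: integrate -u_y with respect to x.
    v2 = sp.integrate(-sp.diff(u, y), x)  # cos(x)*cos(y), plus an arbitrary function of y

    # For a single v to exist, v2 - v1 would have to be of the form a(x) - b(y).
    print(sp.simplify(v2 - v1))           # 2*cos(x)*cos(y), which is not of that form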

Liouville’s theorem says that a bounded analytic function must be constant. If an analytic function f is doubly periodic, then it must either be constant or have a singularity. There are many such functions, known as elliptic functions.


Letter-like Unicode symbols

Unicode provides a way to distinguish symbols that look alike but have different meanings.

We can illustrate this with the following Python code.

    import unicodedata as u

    for pair in [('K', 'K'), ('Ω', 'Ω'), ('ℵ', 'א')]:
        for c in pair:
            print(format(ord(c), '4X'), u.bidirectional(c), u.name(c))

This produces

      4B L LATIN CAPITAL LETTER K
    212A L KELVIN SIGN
     3A9 L GREEK CAPITAL LETTER OMEGA
    2126 L OHM SIGN
    2135 L ALEF SYMBOL
     5D0 R HEBREW LETTER ALEF

Even though K and K look similar, the former is a Roman letter and the latter is a symbol for temperature in Kelvin. Similarly, Ω and Ω are semantically different even though they look alike.

Or rather, they probably look similar. A font may or may not use different glyphs for different code points. The editor I’m using to write this post uses a font that makes no distinction between the ohm sign and the Greek omega. The letter K and the Kelvin symbol are slightly different if I look very closely. The two alefs appear substantially different.

Note that the mathematical symbol alef is a left-to-right character and the Hebrew letter alef is a right-to-left character. The former could be useful to tell a word processor “This isn’t really a Hebrew letter; it’s a symbol that looks like a Hebrew letter. Don’t change the direction of my text or switch any language-sensitive features like spell checking.”

These letter-like symbols can be used to provide semantic information, but they can also be used to deceive. For example, a malicious website could change a K in a URL to a Kelvin sign.
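One way to guard against that kind of spoofing, sketched below with a made-up URL, is to apply Unicode compatibility normalization (NFKC), which maps the Kelvin sign and the ohm sign back to the ordinary letters they resemble:

    import unicodedata

    url = "www.\u212Aelvin.example.com"              # contains KELVIN SIGN (U+212A), not the letter K
    normalized = unicodedata.normalize('NFKC', url)

    print(url == normalized)   # False: normalization changed a character
    print(normalized)          # www.Kelvin.example.com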


Greek letter paradox

The Greek letter paradox is seeing the same symbol in two contexts and assuming it means the same thing. Maybe it’s used in many contexts, but I first heard it in the context of comparing statistical models.

I used the phrase in my previous post, looking at

α exp(5t) + β t exp(5t)

and

α exp(4.999 t) + β exp(5.001 t).

In both cases, α is the coefficient of a term equal to or approximately equal to exp(5t), so in some sense α means the same thing in both equations. But as that post shows, the value of α depends on its full context. In that sense, it’s a coincidence that the same letter is used in both equations.

When the two functions above are solutions to a differential equation and to a tiny perturbation of the same equation, the values of α and β are very different even though the resulting functions are nearly equal (for moderately small t).

Double roots and ODEs

This post will resolve a sort of paradox. The process of solving a difference or differential equation is different when the characteristic equation has a double root. But intuitively there shouldn’t be much difference between having a double root and having two roots very close together.

I’ll first say how double roots affect finding solutions to difference and differential equations, then I’ll focus on differential equations and look at the difference between a double root and two roots close together.

Double roots

The previous post looked at a second order difference equation and a second order differential equation. Both are solved by analogous techniques. The first step in solving either the difference equation

ay_n + by_{n−1} + cy_{n−2} = 0

or the differential equation

ay″ + by′ + cy = 0

is to find the roots of the characteristic equation

aλ² + bλ + c = 0.

If there are two distinct roots to the characteristic equation, then each gives rise to an independent solution to the difference or differential equation. The roots may be complex, and that changes the qualitative behavior of the solution, but it doesn’t change the solution process.

But what if the characteristic equation has a double root λ? One solution is found the same way: λ^n for the difference equation and exp(λt) for the differential equation. We know from general theorems that a second independent solution must exist, but how do we find it?

For the difference equation a second solution is nλ^n, and for the differential equation t exp(λt) is a second solution.

Close roots

For the rest of the post I’ll focus on differential equations, though similar remarks apply to difference equations.

If λ = 5 is a double root of our differential equation then the general solution is

α exp(5t) + β t exp(5t)

for some constants α and β. But if our roots are λ = 4.999 and λ = 5.001 then the general solution is

α exp(4.999 t) + β exp(5.001 t)

Surely these two solutions can’t be that different. If the parameters for the differential equation are empirically determined, can we even tell the difference between a double root and a pair of distinct roots a hair’s breadth apart?

On the other hand, β exp(5t) and β t exp(5t) are quite different for large t. If t = 100, the latter is 100 times larger! However, this bumps up against the Greek letter paradox: assuming that parameters with the same name in two equations mean the same thing. When we apply the same initial conditions to the two solutions above, we get different values of α and β for each, so comparing exp(5t) and t exp(5t) is the wrong comparison. The right comparison is between the solutions to two initial value problems.

We’ll look at two examples, one with λ positive and one with λ negative.

Example with λ > 0

If λ > 0, then solutions grow exponentially as t increases. The size of β t exp(λt) will eventually matter, regardless of how small β is. However, any error in the problem, such as measurement error in the initial conditions or error in computing the solution numerically, will also grow exponentially. The difference between a double root and two nearby roots may matter less than, say, how accurately the initial conditions are known.

Let’s look at

y″ − 10 y′ + 25 y = 0

and

y″ − 10 y′ + 24.9999 y = 0

both with initial conditions y(0) = 3 and y′(0) = 4.

The characteristic equation for the former has a double root λ = 5 and that of the latter has roots 4.99 and 5.01.

The solution to the first initial value problem is

3 exp(5t) − 11 t exp(5t)

and the solution to the second initial value problem is

551.5 exp(4.99t) − 548.5 exp(5.01t).

Notice that the coefficients are very different, illustrating the Greek letter paradox.

The expressions for the two solutions look different, but they’re indistinguishable for small t, say less than 1. But as t gets larger the solutions diverge. The same would be true if we solved the first equation twice, with y(0) = 3 and y(0) = 3.01.
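Here’s a short numerical check of that claim (my addition), printing the gap between the two solutions at a few values of t:

    import numpy as np

    def y_double(t):    # double root at 5
        return (3 - 11*t) * np.exp(5*t)

    def y_split(t):     # roots at 4.99 and 5.01
        return 551.5*np.exp(4.99*t) - 548.5*np.exp(5.01*t)

    for t in [0.1, 1.0, 5.0, 10.0]:
        print(t, abs(y_double(t) - y_split(t)))   # the gap grows rapidly with t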

Example with λ < 0

Now let’s look at

y″ + 10 y′ + 25 y = 0

and

y″ + 10 y′ + 24.9999 y = 0

both with initial conditions y(0) = 3 and y′(0) = 4. Note that the velocity coefficient changed sign from -10 to 10.

Now the characteristic equation of the first differential equation has a double root at -5, and the second has roots -4.99 and -5.01.

The solution to the first initial value problem is

3 exp(−5t)  + 19 t exp(−5t)

and the solution to the second initial value problem is

951.5 exp(−4.99t) − 948.5 exp(−5.01t).

Again the written forms of the two solutions look quite different, but their plots are indistinguishable.

Unlike the case λ > 0, with λ < 0 the difference between the solutions is bounded. Before, the solutions started out close together and eventually diverged. Now the solutions stay close together forever.

Here’s a plot of the difference between the two solutions.

So the maximum separation between the two solutions occurs somewhere around t = 0.5 where the solutions differ in the sixth decimal place.
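Here’s a rough numerical version of that comparison (my addition), locating the maximum gap on a grid:

    import numpy as np

    t = np.linspace(0, 3, 10001)
    y_double = (3 + 19*t) * np.exp(-5*t)
    y_split  = 951.5*np.exp(-4.99*t) - 948.5*np.exp(-5.01*t)

    gap = np.abs(y_double - y_split)
    print(t[np.argmax(gap)], gap.max())   # maximum near t = 0.5, in the sixth decimal place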

Conclusion

Whether the characteristic equation has a double root or two close roots makes a significant difference in the written form of the solution to a differential equation. If the roots are positive, the solutions are initially close together then diverge. If the roots are negative, the solutions are always close together.


Difference equations and differential equations

Difference equations are very much analogous to differential equations. Difference equations are more elementary, but differential equations are more familiar.

It seems odd to use a more advanced thing to explain a simpler thing, like saying a quartet is a symphony orchestra with only four instruments. But if for some reason you managed to become familiar with orchestras before learning what a quartet is, this explanation would be helpful.

Differential equations are a standard part of the curriculum for science and engineering students, but difference equations are not. And not without reason: differential equations are a prerequisite for future courses but difference equations are not. But if a curriculum were to include difference equations and differential equations, it might make sense to teach the former first because the topic is more tangible.

To illustrate the similarity between the two topics, this post will solve the difference equation [1]

F_{n+2} = F_{n+1} + F_n

and the differential equation

y″ = y′ + y

The difference equation defines the Fibonacci numbers, with one set of initial conditions, and the Lucas numbers with different initial conditions. The differential equation is analogous in ways we’ll see below.

Both equations are solved by assuming the form of the solution with a free parameter. Two values of the parameter will be found via a quadratic equation, giving two independent solutions. All solutions are linear combinations of these two solutions.

Difference equation example

The Fibonacci numbers start with F_0 = 0 and F_1 = 1. From there it’s obvious that F_2 = 1, F_3 = 2, and so forth.

The Lucas numbers start with L_0 = 2 and L_1 = 1. From there we see L_2 = 3, L_3 = 4, etc.

Bootstrapping the solution starting from initial conditions is trivial. However, this does not give us a general expression for the solution, and it does not let us explore the solutions of the difference equation for all initial conditions.

So let’s go back and solve the difference equation first, then apply initial conditions, analogous to what one typically does for ordinary differential equations.

Assume solutions have the form F_n = x^n for some x to be determined. Then

x^{n+2} − x^{n+1} − x^n = 0

for all non-negative n and so

x² − x − 1 = 0

and the two roots are the golden ratio

ϕ = (1 + √5)/2

and its complement

1 − ϕ = (1 − √5)/2.

So the general solution to the recurrence relation

F_{n+2} = F_{n+1} + F_n

is

F_n = aϕ^n + b(1 − ϕ)^n

for constants a and b to be determined by initial conditions. For the Fibonacci numbers, a = 1/√5 and b = −a. For the Lucas numbers, a = b = 1.

Our assumption of the form of the solution is justified because it worked. There can’t be any other solutions of another form because clearly the initial conditions determine the third element of the sequence, and the second and third elements determine the fourth, etc.
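As a sanity check, here are a few lines of Python (my addition) confirming that these constants reproduce the first several Fibonacci and Lucas numbers:

    from math import sqrt

    phi = (1 + sqrt(5)) / 2
    a = 1 / sqrt(5)

    fib   = [round(a*phi**n - a*(1 - phi)**n) for n in range(8)]
    lucas = [round(phi**n + (1 - phi)**n)     for n in range(8)]

    print(fib)    # [0, 1, 1, 2, 3, 5, 8, 13]
    print(lucas)  # [2, 1, 3, 4, 7, 11, 18, 29]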

Differential equation example

Assume solutions have the form y = exp(kt) for some k to be determined. Sticking this into the differential equation shows

k² exp(kt) − k exp(kt) − exp(kt) = 0

and so

k² − k − 1 = 0.

This is the same equation as above, and so the roots are ϕ and 1 − ϕ. The general solution is

y(t) = a exp(ϕt) + b exp((1−ϕ)t)

where the constants a and b are determined by the initial values of y and y′.

It’s not as clear in this case that we’ve found all the solutions and that our solution is unique given initial conditions, but it’s true.
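For the skeptical, here’s a short SymPy check (my addition) that the general solution above does satisfy y″ = y′ + y:

    import sympy as sp

    t, a, b = sp.symbols('t a b')
    phi = (1 + sp.sqrt(5)) / 2
    y = a*sp.exp(phi*t) + b*sp.exp((1 - phi)*t)

    # y'' - y' - y should simplify to zero for all a, b, and t.
    print(sp.simplify(sp.diff(y, t, 2) - sp.diff(y, t) - y))   # 0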


[1] Someone might object that this is a recurrence relation and not a difference equation: each term is defined in terms of its predecessors, but not in terms of function differences. But you could rewrite the equation in terms of differences.

If ΔF_n = F_n − F_{n−1} then we can rewrite our equation as Δ²F_n − 3ΔF_n + F_n = 0.

Generating functions for polynomial sequences

The previous post looked at a generating function for a specific polynomial sequence. This post will look at generating functions for polynomial sequences in general. (There’s an alternating term in the previous post that isn’t polynomial, but we’ll address that too.)

The starting point for this post is a simple observation:

x\left(\frac{d}{dx} x^n \right) = n x^n

If we let xD be the operator that differentiates a function then multiplies the result by x, we have

(xD) x^n = n x^n

We can apply xD m times, each time multiplying x^n by a factor of n.

(xD)^m x^n = n^m x^n

And more generally, for any polynomial p(x) we have

p(xD) x^n = p(n) x^n

Now let S be a set of integers and form a generating function F(x) by summing x^n over n in S.

F(x) = \sum_{n \in S} x^n

Then we have

\begin{align*} p(xD)F(x) &= p(xD)\sum_S x^n \\ &= \sum_S p(xD)x^n \\ &= \sum_S p(n)x^n \end{align*}

In words, multiplying the nth term of a generating function by p(n) is the same as operating on the generating function by p(xD).
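Here’s a small SymPy illustration of that statement (my own example, with p(n) = n² + 1 chosen arbitrarily and S the positive integers):

    import sympy as sp

    x = sp.symbols('x')

    def xD(expr):
        """Differentiate with respect to x, then multiply by x."""
        return x * sp.diff(expr, x)

    F = x / (1 - x)             # generating function for S = {1, 2, 3, ...}
    pF = xD(xD(F)) + F          # apply p(xD) = (xD)^2 + 1

    lhs = sp.series(pF, x, 0, 6).removeO()
    rhs = sum((n**2 + 1) * x**n for n in range(1, 6))
    print(sp.expand(lhs - rhs))  # 0: each coefficient was multiplied by p(n) = n^2 + 1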

Example

The previous post computed the generating function of

Z_n = \frac{(-1)^n(3n+6) + 2n^3 + 12n^2 + 25n - 6}{12}

using Mathematica. Here we will compute the generating function again using the result derived below.

There we computed

g(x) = \sum_{n=1}^\infty Z_nx^n

by summing over the positive integers. But Z_n is not quite a polynomial function of n. Aside from the alternating term it is a cubic polynomial in n. The alternating term is a polynomial in n if we restrict ourselves to even values of n, and it is another polynomial if we restrict ourselves to odd values of n.

Define

\begin{align*} F_1(n) &= \frac{2n^3 + 12n^2 + 25n - 6}{12} \\ F_2(n) &= \frac{3n+6}{12} \\ F_3(n) &= -\frac{3n+6}{12} \end{align*}

Then we have

\sum_{n} Z_nx^n = \sum_{n} F_1(n)x^n + \sum_{n \text{ even}} F_2(n)x^n + \sum_{n \text{ odd}} F_3(n)x^n

for positive integer n, splitting our original generating function into three generating functions, each summed over a different set of integers.

Define

\begin{align*} g_1(x) &= \sum_{n \geq 1} x^n = \frac{x}{1-x}\\ g_2(x) &= \sum_{n \geq 1,\, n \text{ even}} x^n = \frac{x^2}{1 - x^2}\\ g_3(x) &= \sum_{n \geq 1,\, n \text{ odd}} x^n = \frac{x}{1 - x^2} \end{align*}

Then

g(x) = F_1(xD)g_1(x) + F_2(xD)g_2(x) + F_3(xD)g_3(x)

If we expand the line above, we should get the same expression for g(x) as in the previous post.