Polynomial analog of the Collatz conjecture

The Collatz conjecture, a.k.a. the 3n + 1 problem, a.k.a. the hailstone conjecture, asks whether the following sequence always terminates.

Start with a positive integer n.

  1. If n is even, set nn /2. Otherwise n ← 3n + 1.
  2. If n = 1, stop. Otherwise go back to step 1.

The Collatz conjecture remains an unsolved problem, though there has been progress toward a proof. Some people, however, are skeptical whether the conjecture is true.

This post will look at a polynomial analog of the Collatz conjecture. Instead of starting with a positive integer, we start with a polynomial with coefficients in the integers mod m.

If the polynomial is divisible by x, then divide by x. Otherwise, multiply by (x + 1) and add 1. Here x is analogous to 2 and (x + 1) is analogous to 3 in the (integer) Collatz conjecture.

Here is Mathematica code to carry out the polynomial Collatz operation.

    c[p_, m_] := 
        PolynomialMod[
            If[ (p /. x -> 0) == 0, 
                 p/x, 
                 (x + 1) p + 1
            ], 
            m
        ]

If m = 2, the process always converges. In fact, it converges in at most n² + 2n steps where n is the degree of the polynomial [1].

Here’s an example starting with the polynomial x7 + x3 + 1.

    Nest[c[#, 2] &, x^7 + x^3 + 1, 15]

This returns 1, and so in this instance 15 iterations are enough.

If m = 3, however, the conjecture is false. In [1] the authors report that the sequence starting with x² + 1 is periodic with period 6.

The following code produces the sequence of values.

    NestList[c[#, 3] &, x^2 + 1, 6]

This returns the sequence

  • 1 + x2
  • 2 + x + x2 + x3
  • 2 x2 + 2 x3 + x4
  • 2 x + 2 x2 + x3
  • 2 + 2 x + x2
  • x + x3
  • 1 + x2

Related posts

[1] Kenneth Hicks, Gary L. Mullen, Joseph L. Yucas and Ryan Zavislak. A Polynomial Analogue of the 3n + 1 Problem. The American Mathematical Monthly, Vol. 115, No. 7. pp. 615-622.

Major memory system telephone keypad

This weekend I had to enter an alphabetic passcode on a numeric keypad. The keypad used the same letter-to-digit convention as a phone, but the letters were not printed on the keypad. That made me think about how much better the Major system is.

I wondered what phone keypads would look like if they used the Major memory system, and so I made the image below.

0: S Z, 1: T D ð θ, 2: N ŋ, 3: M, 4: R, 5: L, 6: ʤ ʧ ʃ ʒ, 7: K G, 8: F V, 9: P B

The Major memory system is a way of encoding numbers as words to make them easier to memorize. The system associates a consonant sound with each digit; you’re free to insert any vowels you like. For example, if you wanted to memorize 745, you might encode it as gorilla.

Note that gorilla decodes to 745 and not 7455 because the word has only one L sound, even though it is spelled with two Ls.

One nice feature of the Major system is that if you multiply a number by 10, you can pluralize its mnemonic. For example, a possible mnemonic for 7450 is gorillas.

The Major system emphasizes sounds because humans remember sounds more easily than symbols. If you have a photographic memory for symbols, just memorize the digits and don’t bother with any mnemonic system.

Some of the sounds associated with digits are not represented by a single letter in English and so the keypad above contains a few IPA (International Phonetic Alphabet) symbols. The number 1 is encoded by any of the sounds “t”, “d”, or “th.” The IPA symbol θ represents the th sound in think and the ð represents the th sound in this. The symbol ŋ as a possible encoding for 2 represents the ng sound in sing. And the number 6 can be encoded as one of several similar sounds: ch, sh, soft g, or soft z.

The conventional phone keypad looks simpler: 2 = A, B, or C, 3 = D, E, or F, etc. It’s the kind of thing James Scott would call “legible,” something that looks simple on paper and warms a bureaucrat’s heart, but doesn’t necessarily work well in practice. The sounds associated with the letters for a given digit have nothing in common, so a number can be represented by dissimilar sounds, and similar sounds do not represent the same number.

Encoding telephone numbers as words is rarely possible using the conventional keypad letters. Phone numbers that do correspond to memorable words are highly valued. Every letter has to correspond to a digit, and it matters how the word is spelled.

The Major system is much more flexible since you’re free to supply vowels as you wish, and you can choose from a wide variety of words that spell a single consonant sound in different ways.

Related posts

Approximation error over unit disk

The previous post ended by plotting the error in approximating exp(x) with (2+x)/(2 – x):

error in linear and bilinear approximations to exp

The error is much larger on the right end than the left. That made me wonder what the error might look like in the complex plane. Does the slice along real axis exhibit the minimum and maximum error, or are the other places in the unit disk where the error is even smaller or even larger?

Here’s a plot of the error over the unit square.

Here’s the Mathematica code that produced the plot.

    ComplexPlot3D[ Exp[z] - (2 + z)/(2 - z), 
        {z, -1 - I, 1 + I}, PlotRange -> All]

It looks like the error might indeed be minimized and maximized along the real axis.

Incidentally, we can tell the error has a zero of order three at the origin because the sequence of colors, representing the phase, goes through three cycles.

We know the maximum and minimum of the absolute value of the error over the unit disk both occur on the boundary, i.e. on the unit circle, by the maximum modulus theorem. So we plot the error along the rim of the unit disk by letting z = exp(2πit) for 0 ≤ t ≤ 1.

Here’s the code that produced the plot.

    Plot[Abs[f[Exp[2 Pi I t]]], {t, 0, 1}]

It appears the error is maximized when t is 0 or 1, i.e. when z = 1, and minimized when t = 1/2, i.e. when z = -1.

Simple derivation of exponential approximation

I was watching one of Brian Douglas’ videos on control theory (Discrete Control #5) and ran into a simple derivation of an approximation I presented earlier.

Back in April I wrote several post on simple approximations for log, exp, etc. In this post I gave an approximation for the exponential function:

\exp(x) \approx \frac{2 + x}{2 - x}

The control theory video arrives at the same approximation as follows.

\begin{align*} \exp(x) &= \exp(x/2)\, \exp(x/2) \\ &= \frac{\exp(x/2)}{\exp(-x/2)} \\ &\approx \frac{1 + x/2}{1 - x/2} \\ &= \frac{2 + x}{2-x} \end{align*}

As I believe I’ve suggested before here, in a derivation like the one above, where you have mostly equalities and one or two approximations, pay special attention to the approximation steps. The approximation step above uses a first order Taylor approximation in the numerator and denominator.

The plot below shows that the approximation above (the bilinear approximation) is more accurate than doing a single Taylor approximation, approximating exp(x) by 1 + x (linear approximation).

exp, linear approx, bilinear approx

Here’s a plot focusing on the error in the bilinear and linear approximations.

error in linear and bilinear approximations to exp

The bilinear approximation is hard to tell from 0 in the plot above for x up to 0.5.

The derivation above is simple, but why is the result so good? An explanation in terms of Padé approximation is given here.

Upper case, lower case, title case

Converting text to all upper case or all lower case is a fairly common task.

One way to convert text to upper case would be to use the tr utility to replace the letters a through z with the letters A through Z. For example,

    $ echo Now is the time | tr '[a-z]' '[A-Z]'
    NOW IS THE TIME

You could convert to lower case by reversing the arguments to tr.

The approach above works if your text consists of only unadorned Roman letters. But it wouldn’t work, for example, if you gave it a jalapeño or π:

    $ echo jalapeño π | tr '[a-z]' '[A-Z]'
    JALAPEñO π

Using the character classes [:lower:] and [:upper:] won’t help either.

Tussling with Unicode

One alternative would be to use the uc command from the Unicode::Tussle package [1] I mentioned a few days ago. There’s also a lc counterpart, and a tc for title case. These utilities handle far more than Roman letters.

    $ echo jalapeño π | uc
    JALAPEÑO Π

Unicode capitalization rules are a black hole, but we’ll just look at one example and turn around quickly before we cross the event horizon.

Suppose you want to send all the letters in the Greek word σόφος to upper case.

    $ echo σόφος | uc
    ΣΌΦΟΣ

Greek has two lower case forms of sigma: ς at the end of a word and σ everywhere else. But there’s only one upper case sigma, so both get mapped to Σ. This means that if we convert the text to upper case and then to lower case, we won’t end up exactly where we started.

    $ echo σόφος | uc | lc
    σόφοσ

Note that the lc program chose σ as the lower case of Σ and didn’t take into account that it was at the end of a word.

Related posts

[1] “Tussle” is an acronym for Tom [Christiansen]’s Unicode Scripts So Life is Easier.

Continued fraction from entropy source testing

NIST publication 800-90B, Recommendations for the Entropy Sources Used for Random Bit Generation, contains an interesting continued fraction.

Define a function

F\left(\frac{1}{z}\right) = \Gamma(3,z) z^{-3} e^z

where

\Gamma(a, b) = \int_b^\infty t^{a-1} e^{-t} \, dt

is the incomplete gamma function.

NIST 800-90B gives [1] a continued fraction implement F, but the continued fraction includes a parameter k for no apparent reason. NIST cites a paper that in turn cites Abramowitz and Stegun formula 6.5.31.

It appears that the k in NIST 800-90B corresponds to a-1 in A&S 6.5.31, where a is the first argument to the incomplete gamma function and the exponent on 1/z, and so a = 3.

The continued fraction in A&S is

\cfrac{1}{z+\cfrac{1-a}{1+\cfrac{1}{z+\cfrac{2-a}{1+ \cfrac{2}{z +\ddots}}}}}

However, it’s not entirely clear how to fill in the dots. Apparently the denominators alternate between z and 1, but what about the numerators. The pattern we are give is

1, 1-a, 1, 2-a, 2

That’s as far as A&S goes. My initial guess was that we should ignore the 1 at the beginning of the series. In that case the rest of the series would be

3-a, 3, 4-a, 4, …

and that turns out to be right. If you don’t ignore the first 1, it can be hard to figure out how it fits into the pattern (because it doesn’t!).

How many terms of the continued fraction do we need? It depends on the size of the argument z. For large z, say z > 4, the terms shown above give a good approximation. But for smaller z, the approximation is terrible. For example, F(1) = 5, but the continued fraction evaluated at 1 gives -3/2.

The NIST paper requires evaluating F for values around 1, and it would take some work to figure out how far to carry the continued fraction before it is reliable in that range.

Related posts

[1] Why does NIST include a way to compute F? The incomplete gamma function is difficult to implement well and some mathematical libraries don’t include it. Having a stand-alone implementation of this function might make a testing program more self-contained.

Removing Unicode formatting

Several people responded to my previous post asserting that screen readers would not be able to read text formatted via Unicode variants. Maybe some screen readers can’t handle this, but there’s no reason they couldn’t.

Before I go any further, I’d like to repeat my disclaimer from the previous post:

It’s a dirty hack, and I’d recommend not overdoing it. But it could come in handy occasionally. On the other hand, some people may not see what you intend them to see.

This formatting is gimmicky and there are reasons to only use it sparingly or not at all. But I don’t see why screen readers need to be stumped by it.

In the example below, I format the text “The quick brown fox” by running it through unifont as in the previous post.

If we pipe the output through unidecode then we mostly recover the original text. (I wrote about unidecode here.)

    $ unifont The quick brown fox | unidecode 

            Double-Struck: The quick brown fox
                Monospace: The quick brown fox
               Sans-Serif: The quick brown fox
        Sans-Serif Italic: The quick brown fox
          Sans-Serif Bold: The quick brown fox
   Sans-Serif Bold Italic: The quick brown fox
                   Script: T he quick brown fox
                   Italic: The quick brown fox
                     Bold: The quick brown fox
              Bold Italic: The quick brown fox
                  Fraktur: T he quick brown fox
             Bold Fraktur: T he quick brown fox

The only problem is that sometimes there’s an extra space after capital letters. I don’t know whether this is a quirk of unifont or unidecode.

This isn’t perfect, but it’s a quick proof of concept that suggests this shouldn’t be a hard thing for a screen reader to do.

Maybe you don’t want to normalize Unicode characters this way all the time, but you could have some configuration option to only do this for Twitter, or to only do it for characters outside a certain character range.

How to format text in Twitter

Twitter does not directly provide support for formatting text in bold, italic, etc. But it does support Unicode characters [1], and so a hack to get around the formatting limitation is to replace letters with Unicode variants.

For example, you could tweet

How to include bold or italic text in a tweet.

I cheated in the line above, using bold and italic formatting rather than Unicode characters because some readers might not be able to read it.

Here’s a screenshot of the actual Unicode text in Emacs. You can see the text in the footnotes [2].

This is plain text. I have asked for the details on the ‘b’ in bold, and the bottom windows shows that it is not the common U+0062 for ‘b’ down in the ASCII range, but U+1D5EF up in the Supplementary Multilingual Plane. Similarly, the i in italic above is not U+0069 but U+1D456.

Here’s how the text appears in Twitter:

It’s a dirty hack, and I’d recommend not overdoing it. But it could come in handy occasionally. On the other hand, some people may not see what you intend them to see. Here’s a portion of a screenshot from an Android device:

How to include XXXX or XXXXXX test

As a very rough rule of thumb, characters with smaller Unicode values are more likely to display correctly everywhere. Math symbols like ∞ (U+221E) work everywhere as far as I know. I wouldn’t depend on any Unicode character above 0xFFFF.

Update: Several people have said this formatting poses a problem for speech readers. The next post explains why it shouldn’t. (Maybe it does cause a problem, but it wouldn’t have to.)

How to produce Unicode formatting

I produced the Unicode text above using the programs unifont and unisupers from the Perl module Unicode::Tussle. See this post for how to install the module. Here’s a screenshot of using these utilities from the command line.

To use unifont, type the text you’d like to format after the command. It then shows the text formatted several ways using Unicode characters. I typed “bold” and copied the bold version of the word. The text could be anything; it’s a coincidence that I gave it text that was also a format name. For example, I created the double-struck R and C above with the command

    unifont R C

The unisupers command does not take an argument but instead takes its input from standard input. So I hit return after the command name and then typed ‘n’ to get the superscript n.

Related posts

[1] Twitter supports Unicode characters, but there’s a question of whether readers will have fonts installed to display the characters. I wrote eight years ago about some symbols users were and were not likely to see, but my impression is that the situation has improved quite a bit since then.

[2] Here’s the actual text of the tweet:

How to include  or  text in a tweet.
Weierstrass function.
Im: ℂ -> ℝ
ℝⁿ -> ℝᵐ

(I pasted the text into my blogging software, but it looks like it is deleting the words “bold” and “italic.”)

Predator-Prey period

The Lotka-Volterra equations are a system of nonlinear differential equations for modeling a predator-prey ecosystem. After a suitable change of units the equations can be written in the form

\begin{align*} \frac{dx}{dt} &= \phantom{-}ax(y-1) \\ \frac{dy}{dt} &= -by(x-1) \end{align}

where ab = 1. Here x(t) is the population of prey at time t and y(t) is the population of predators. For example, maybe x represents rabbits and y represents foxes, or x represents Eloi and y represents Morlocks.

It is well known that the Lotka-Volterra equations have periodic solutions. It is not as well known that you can compute the period of a solution without having to first solve the system of equations.

This post will show how to compute the period of the system. First we’ll find the period by solving the equations for a few different initial conditions, then we’ll show how to directly compute the period from the system parameters.

Phase plot

Here is a plot of (x(t), y(t)) showing that the solutions are periodic.

Predator-prey phase plots for varying initial conditions

And here’s the Python code that made the plot above.

    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.integrate import solve_ivp
    
    # Lotka-Volterra equations
    def lv(t, z, a, b):
        x, y = z
        return [a*x*(y-1), -b*y*(x-1)]
    
    begin, end = 0, 20
    t = np.linspace(begin, end, 300)
    
    styles = ["-", ":", "--"]
    prey_init = [2, 4, 8]
    a = 1.5
    b = 1/a
    
    for style, p0 in zip(styles, prey_init):
        sol = solve_ivp(lv, [begin, end], [p0, 1], t_eval=t, args=(a, b))
        plt.plot(sol.y[0], sol.y[1], style)
    
    plt.xlabel("prey")
    plt.ylabel("preditor")
    plt.legend([f"$p_0$ = {p0}" for p0 in prey_init])    

Note that the derivative of x is zero when y = 1. Since our initial condition sets y(0) = 1, we’re specifying the maximum value of x. (If our initial values of x were less than 1, we’d be specifying the minimum of x.)

Time plot

The components x and y have the same period, and that period depends on a and b, and on the initial conditions. The plot below shows how the period increases with x(0), which as we noted above is the maximum value of x.

Predator-prey solutions over time for varying initial conditions

And here’s the code that made the plot.

    fig, ax = plt.subplots(3, 1)
    for i in range(3):
        sol = solve_ivp(lv, [begin, end], [prey_init[i], 1], t_eval=t, args=(a, b))
        ax[i].plot(t, sol.y[0], "-")
        ax[i].plot(t, sol.y[1], "--")    
        ax[i].set_xlabel("$t$")
        ax[i].set_ylabel("population")
        ax[i].set_title(f"x(0) = {prey_init[i]}")
    plt.tight_layout()

Finding the period

In [1] the author develops a way to compute the period of the system as a function of its parameters without solving the differential equations.

First, calculate an invariant h of the system:

h = b(x - \log(x) - 1) + a(y - \log(y) - 1)

Since this is an invariant we can evaluate it anywhere, so we evaluate it at the initial conditions.

Then the period only depends on a and h. (Recall we said we can scale the equations so that ab = 1, so a and b are not independent parameters.)

If h is not too large, we can compute the approximate period using an asymptotic series.

P(h) = 2\pi\left( 1 + \frac{\sigma}{6} h + \frac{\sigma^2}{144} h^2 + \left(\frac{\sigma}{360} - \frac{139\sigma^3}{38880}\right) h^3 + \cdots \right)

where σ = (a + b)/2 = (a² + 1)/2a.

We use this to find the periods for the example above.

    def find_h(a, x0, y0):
        b = 1/a
        return b*(x0 - np.log(x0) - 1) + a*(y0 - np.log(y0) - 1)

    def P(h, a):
        sigma = 0.5*(a + 1/a)
        s = 1 + sigma*h/6 + sigma**2*h**2/144
        return 2*np.pi*s

    print([P(find_h(1.5, p0, 1), 1.5) for p0 in [2, 4, 8]])

This predicts periods of roughly 6.5, 7.5, and 10.5, which is what we see in the plot above.

When h is larger, the period can be calculated by numerically evaluating an integral given in [1].

Related posts

[1] Jörg Waldvogel. The Period in the Volterra-Lotka Predator-Prey Model. SIAM Journal on Numerical Analysis, Dec., 1983, Vol. 20, No. 6, pp. 1264-1272

Error estimates for splines with more boundary conditions

Yesterday I wrote about rates of convergence for natural cubic splines. This brief post reports similar results for more boundary conditions.

As explained in the earlier post, a cubic spline that interpolates a function f at n+1 points satisfies 4n -2 equations in 4n variables. Two more equations are necessary to uniquely determine the interpolating cubic spline.

A natural cubic spline S over [a, b] satisfies

S”(a) = 0
S”(b) = 0.

There are other ways of coming up with two more equations. You might match first derivatives at the end points

S‘(a) = f‘(a)
S‘(b) = f‘(b)

or second derivatives

S”(a) = f”(a)
S”(b) = f”(b).

Or if f is periodic you could require first and second derivatives of the spline to match at a and b.

S‘(a) = S‘(b)
S”(a) = S”(b).

In each case, you get the same order of convergence as we reported earlier for the natural cubic spline, assuming f is 4 times continuously differentiable over [a, b]. The error goes down like O(1/n4).

Source: D. Kershaw. A Note on the Convergence of Interpolatory Cubic Splines. SIAM Journal on Numerical Analysis, Mar., 1971, Vol. 8, No. 1 pp. 67-74.

Example

When we use natural cubic splines to fit exp(x) over [0, 2π] in the earlier post, the error was fairly large. We get better results when we specify the derivatives at the two ends of the interval. Here’s what we get for 11 points:

The error is about 7 times smaller than with natural cubic splines.

And here’s what we get for 21 points:

In this case the error is about 8x smaller than with natural cubic splines.

When we go from 11 to 21 points, specifying the derivatives on each end, the error drops by about a factor of 15, close value of 16 we’d expect in the limit.

Incidentally, to specify the boundary conditions of the spline in Python we need to add an option parameter bc_type in the call to CubicSpline. Otherwise the code is the same as before. In our example

    bc_type=((1, 1), (1, np.exp(2*np.pi)))

We set bc_type equal to a pair of pairs. The first pair gives a boundary condition on the left end and the second gives a boundary condition on the right. The first argument of the inner pair in both cases is 1, because we’re specifying 1st derivatives. Use a 2 to specify 2nd derivatives. The second argument to the pair the value of the derivative: exp(0) = 1 on the left, and exp(2π) on the right.