Make boring work harder

I was searching for something this morning and ran across several pages where someone blogged about software they wrote to help write their dissertations. It occurred to me that this is a pattern: I’ve seen a lot of writing tools that came out of someone writing a dissertation or some other book.

The blog posts leave the impression that the tools required more time to develop than they would save. This suggests that developing the tools was a form of moral compensation, procrastinating by working on something that feels like it’s making a contribution to what you ought to be doing.

Even so, developing the tools may have been a good idea. As with many things in life, it makes more sense when you ask “Compared to what?” If the realistic alternative to futzing around with scripts was to write another chapter of the dissertation, then developing the tools was not the best use of time, assuming they don’t actually save more time than they require.

But if the realistic alternative was binge watching some TV series, then writing the tools may have been a very good use of time. Any time the tools save is profit if the time that went into developing them would otherwise have been wasted.

Software developers are often criticized for developing tools rather than directly developing the code they’re paid to write. Sometimes these tools really are a good investment. But even when they’re not, they may be better than the realistic alternative. They may take time away from Facebook rather than time away from writing production code.

Another advantage to tool building, aside from getting some benefit from time that otherwise would have been wasted, is that it builds momentum. If you can’t bring yourself to face the dissertation, but you can bring yourself to write a script for writing your dissertation, you might feel more like facing the dissertation afterward.

Related post: Automate to save mental energy, not time

Sum of divisor powers

The function σₖ takes an integer n and returns the sum of the kth powers of the divisors of n. For example, the divisors of 14 are 1, 2, 7, and 14. If we set k = 3 we get

σ₃(14) = 1³ + 2³ + 7³ + 14³ = 3096.

A couple of special cases use different notation.

  • σ₀(n) is the number of divisors of n and is usually written d(n).

  • σ₁(n) is the sum of the divisors of n and the function is usually written σ(n) with no subscript.

In Python you can compute σₖ(n) using divisor_sigma from SymPy. You can get a list of the divisors of n using the function divisors, so the bit of code below illustrates that divisor_sigma computes what it’s supposed to compute.

    from sympy import divisor_sigma, divisors

    n, k = 365, 4
    a = divisor_sigma(n, k)
    b = sum(d**k for d in divisors(n))
    assert a == b

The Wikipedia article on σₖ gives graphs for k = 1, 2, and 3, and these graphs suggest that σₖ gets smoother as k increases. Here is a similar graph to those in the article.

The plots definitely get smoother as k increases, but the plots are not on the same vertical scale. In order to make the plots more comparable, let’s look at the kth root of σₖ(n). This amounts to taking the Lebesgue k-norm of the divisors of n.
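As a sanity check on that rescaling, here’s a small computation, a sketch assuming SymPy is available and reusing n = 365 from the example above. The k-norm of the divisors decreases as k grows and is squeezed toward n itself, which is why the rescaled curves share a similar vertical range.

```python
from sympy import divisor_sigma

def divisor_norm(n, k):
    # kth root of sigma_k(n), i.e. the Lebesgue k-norm of the divisors of n
    return float(divisor_sigma(n, k)) ** (1 / k)

# The k-norms decrease with k and are squeezed toward n itself.
print([round(divisor_norm(365, k), 2) for k in (1, 2, 3)])
```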

Now that the curves are on a more similar scale, let’s plot them all on a single plot rather than in three subplots.

If we leave out k = 1 and add k = 4, we get a similar plot.

The plot for k = 2 that looked smooth compared to k = 1 now looks rough compared to k = 3 and 4.

An almost integer pattern in many bases

A few days ago I mentioned a passing comment in a video by Richard Borcherds. This post picks up on another off-hand remark from that video.

Borcherds was discussing why exp(π √67) and exp(π √163) are nearly integers.

exp(π √67) = 147197952744 − ε₁
exp(π √163) = 262537412640768744 − ε₂

where ε₁ and ε₂ are on the order of 10⁻⁶ and 10⁻¹² respectively.
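These near misses are easy to check numerically with a high-precision library. Here’s a quick sketch using mpmath, which ships alongside SymPy; the error bounds in the assertions are deliberately loose.

```python
from mpmath import mp, exp, pi, sqrt

mp.dps = 50  # enough decimal digits to resolve the tiny errors

eps1 = 147197952744 - exp(pi * sqrt(67))
eps2 = 262537412640768744 - exp(pi * sqrt(163))

print(eps1)  # on the order of 10^-6
print(eps2)  # on the order of 10^-12
```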

He called attention to the 744 at the end and commented that this isn’t just an artifact of base 10. In many other bases, these numbers end in that base’s representation of 744. This is what I wanted to expand on.

To illustrate Borcherds’ remark in hexadecimal, note that

    147197952744 -> 0x2245ae82e8 
    262537412640768744 -> 0x3a4b862c4b402e8 
    744 -> 0x2e8 

Borcherds’ comment is equivalent to saying

147197952744 ≡ 744 (mod m)

and

262537412640768744 ≡ 744 (mod m)

for many values of m. Equivalently 147197952000 and 262537412640768000 have a lot of factors; every factor of these numbers is a base where the congruence holds.

So for how many bases m are these numbers congruent to 744?

The number of divisors of a number n is written d(n). This is a multiplicative function, meaning that for relatively prime numbers a and b,

d(ab) = d(a) d(b).

Note that

147197952000 = 2¹⁵ · 3³ · 5³ · 11³

and

262537412640768000 = 2¹⁸ · 3³ · 5³ · 23³ · 29³

It follows that

d(147197952000) =
d(2¹⁵ · 3³ · 5³ · 11³) =
d(2¹⁵) d(3³) d(5³) d(11³).

Now for any prime power pᵏ,

d(pᵏ) = k + 1,

and so

d(147197952000) = 16 × 4 × 4 × 4 = 1024

and

d(262537412640768000) = 19 × 4 × 4 × 4 × 4 = 4864.
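These counts are easy to confirm with SymPy, along with a spot check of the congruence itself for a few divisor bases. A minimal sketch:

```python
from sympy import divisor_count

n1 = 147197952000
n2 = 262537412640768000

# d(n) equals the product of (exponent + 1) over the prime factorization
assert divisor_count(n1) == 16 * 4 * 4 * 4        # 1024
assert divisor_count(n2) == 19 * 4 * 4 * 4 * 4    # 4864

# every divisor m of n1 makes n1 + 744 = 147197952744 congruent to 744 mod m
for m in [16, 1000, 3600, n1]:
    assert (n1 + 744) % m == 744 % m
```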

For more about almost integers, watch Borcherds’ video and see this post.

Org entities

This morning I found out that Emacs org-mode has its own markup entities, analogous to HTML entities or LaTeX commands. Often they’re identical to LaTeX commands. For example, \approx is the approximation symbol ≈, exactly as in LaTeX.

So what’s the advantage of org entities? In fact, how does Emacs even know whether \approx is a LaTeX command or an org entity?

If you use the command C-c C-x \ , Emacs will show you the compiled version of the entity, i.e. ≈ rather than the command \approx. This is global: all entities are displayed. The org entities would be converted to symbols if you export the file to HTML or LaTeX, but this gives you a way to see the symbols before exporting.

Here’s something that’s possibly surprising, possibly useful. The symbol you see is for display only. If you copy and paste it to another program, you’ll see the entity text, not the symbol. And if you type C-c C-x \ again, you’ll see the command again, not the symbol. Note that the full name of the command is org-toggle-pretty-entities, with “toggle” in the middle.

If you use set-input-method to enter symbols using LaTeX commands or HTML entities as I described here, Emacs inserts a Unicode character, and the insertion is irreversible. Once you type the LaTeX command \approx or the corresponding HTML entity &asymp;, any knowledge of how that character was entered is lost. So org entities are useful when you want to see Unicode characters but want your source file to remain strictly ASCII.

Incidentally, there are org entities for Hebrew letters, but only the first four, presumably because these are the only ones used as mathematical symbols.

To see a list of org entities, use the command org-entities-help. Even if you never use org entities, the org entity documentation makes a nice reference for LaTeX commands and HTML entities. Here’s a screenshot of the first few lines.

First few lines of org-entities-help

Related posts

How big is the monster?

Voyager 1 photo of Jupiter. Jan. 6, 1979

Symmetries are captured by mathematical groups. And just as you can combine kinds of symmetry to form new symmetries, you can combine groups to form new groups.

So-called simple groups are the building blocks of groups as prime numbers are the building blocks of integers [1].

Finite simple groups have been fully classified, and they fall into several families, with 26 exceptions that fall outside any of these families. The largest of these exceptional groups is called the monster.

The monster is very large, containing approximately 8 × 10⁵³ elements. I saw a video by Richard Borcherds where he mentioned in passing that the number of elements in the group is a few orders of magnitude larger than the number of atoms in the earth.

I tried to find a comparison that is closer to 8 × 10⁵³ and settled on the number of atoms in Jupiter.

The mass of Jupiter is about 2 × 10²⁷ kg. The planet is roughly 3/4 hydrogen and 1/4 helium by mass, and from that you can calculate that Jupiter contains about 10⁵⁴ atoms.
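The arithmetic behind that estimate can be sketched in a few lines. The round numbers here, Jupiter’s mass and the 3/4 to 1/4 hydrogen/helium split, are the approximations from the paragraph above.

```python
m_jupiter = 2e27    # kg, approximate mass of Jupiter
m_h  = 1.67e-27     # kg, approximate mass of a hydrogen atom
m_he = 6.64e-27     # kg, approximate mass of a helium atom

# 3/4 hydrogen and 1/4 helium by mass
atoms = 0.75 * m_jupiter / m_h + 0.25 * m_jupiter / m_he
print(f"{atoms:.2e}")  # just under 10^54
```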

Before doing my calculation with Jupiter, I first looked at lengths such as the diameter of the Milky Way in angstroms. But even the diameter of the observable universe in angstroms is far too small, only about 9 × 10³⁶.

More on finite simple groups

[1] The way you build up groups from simple groups is more complicated than the way you build integers out of primes, but there’s still an analogy there.

The photo at the top of the post was taken by Voyager 1 on January 6, 1979.

Square waves and cobwebs

This is a follow-up to yesterday’s post. In that post we looked at iterates of the function

f(x) = exp( sin(x) )

and noticed that even iterations of f converged to a square wave. Odd iterates also converge to a square wave, but a different one. The limit of odd iterations is the limit of even iterations turned upside down.

Jan van Lint correctly pointed out in the comments:

If you plot y=f(f(x)) and y=x, you can see that there are two stable fixed points, with one unstable fixed point in between.

Here’s the plot he describes.

You can see that there are fixed points between 1.5 and 2.0, between 2.0 and 2.5, and between 2.5 and 3. The following Python code will find these fixed points for us.

    from numpy import exp, sin
    from scipy.optimize import brentq

    def f(x): return exp(sin(x))
    def g(x): return f(f(x)) - x

    # bracketing intervals for the three roots of g
    brackets = [(1.5, 2.0), (2.0, 2.5), (2.5, 3.0)]
    roots = [brentq(g, *b) for b in brackets]

This shows that the fixed points of g are approximately 1.5140, 2.2191, and 2.7139.
If we apply f to each of these fixed points, we get the same numbers again, but in the opposite order. This is why the odd iterates and even iterates are upside-down from each other.

Next we’ll show a couple cobweb plots to visualize the convergence of the iterates. We’ll also show that the middle fixed point is unstable by looking at two iterations, one starting slightly below it and the other starting slightly above it.

The first starts at 2.2191 and progresses down toward the lower fixed point.

The second starts at 2.2192 and progresses up toward the upper fixed point.
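The instability of the middle fixed point is also easy to verify without plotting. Here’s a minimal sketch that iterates g(x) = f(f(x)) from the two starting points above: the lower start drifts down toward one stable fixed point and the upper start drifts up toward the other.

```python
from numpy import exp, sin

def g(x):
    # two applications of f(x) = exp(sin(x))
    return exp(sin(exp(sin(x))))

lo, hi = 2.2191, 2.2192  # straddle the unstable fixed point
for _ in range(200):
    lo, hi = g(lo), g(hi)

print(lo, hi)  # lo near the lower stable fixed point, hi near the upper one
```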

Related posts

Unexpected square wave

Last night a friend from Vanderbilt, Father John Rickert, sent me a curious math problem. (John was a PhD student in the math department while I was a postdoc. He went on to be a Catholic priest after graduating.) He said that if you look at iterates of

f(x) = exp( sin(x) )

the plots become square.

Here’s a plot to start the discussion, looking at f(x), f(f(x)), f(f(f(x))), and f(f(f(f(x)))).

One thing that is apparent is that there’s a big difference between applying f once and applying it at least two times. The former can range between 1/e and e, but the latter must be between exp(sin(1/e)) ≈ 1.43 and e.

The plots overlap in a way that’s hard to understand, so let’s spread them out, plotting one iteration at a time.

Now we can see the plots becoming flatter as the number of iterations increases. We can also see that even and odd iterations are roughly mirror images of each other. This suggests we should make a plot of just even or just odd iterates. Here we go:

This shows more clearly how the plots are squaring off pretty quickly as the number of iterations increases.

My first thought when John showed me his plot was that it must have something to do with the contraction mapping theorem. And I suppose it does, but not in a simple way. The function f(x) is not a contraction mapping for any x. But f(f(x)) is a contraction mapping for some x, and further iterates are contractions for more values of x.

Update: The next post calculates the fixed points of f(f(x)) and demonstrates how iterates converge to the stable fixed points but move away from the unstable one.

My second thought was that it would be interesting to look at the Fourier transform of these iterates. For any finite number of iterations, the result is a periodic, analytic function, and so eventually the Fourier coefficients must go to zero rapidly. But I suspect they initially go to zero slowly, like those of a square wave would. I intend to update this post after I’ve had a chance to look at the Fourier coefficients.

Update: As expected, the Fourier coefficients decay slowly. I plotted the Fourier sine coefficients for f(f(f(f(x)))) using Mathematica.

    f[x_] := Exp[Sin[x]]
    g[x_] := f[f[f[f[x]]]]
    ListPlot[Table[
        NFourierSinCoefficient[g[x], x, i],
        {i, 30}]]

This produced the plot below.

The even-numbered coefficients are zero, but the odd-numbered coefficients are going to zero very slowly. Since the function g(x) is infinitely differentiable, for any k you can go far enough out in the Fourier series that the coefficients go to zero faster than n⁻ᵏ. But initially the Fourier coefficients decay like 1/n, which is typical of a discontinuous function, like a square wave, rather than an infinitely differentiable function.
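For readers without Mathematica, roughly the same computation can be sketched in Python with SciPy’s numerical integrator. This is an illustration under the same definitions as above, not the Mathematica code’s exact output.

```python
from math import exp, pi, sin

from scipy.integrate import quad

def g(x):
    # four applications of f(x) = exp(sin(x))
    for _ in range(4):
        x = exp(sin(x))
    return x

def b(n):
    # nth Fourier sine coefficient of g over a 2 pi period
    integral, _ = quad(lambda x: g(x) * sin(n * x), 0, 2 * pi, limit=200)
    return integral / pi

# even-numbered coefficients vanish because g(pi - x) = g(x)
print([round(b(n), 4) for n in range(1, 7)])
```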

By the way, if you replace sine with cosine, you get similar plots, but with half as many flat spots per period.

Related posts

Not quite going in circles

foggy path

Sometimes you feel like you’re running around in circles, not making any progress, when you’re on the verge of a breakthrough. An illustration of this comes from integration by parts.

A common example in calculus classes is to integrate eˣ sin(x) using integration by parts (IBP). After using IBP once, you get an integral similar to the one you started with. And if you apply IBP again, you get exactly the integral you started with.

At this point you believe all is lost. Apparently IBP isn’t going to work and you’ll need to try another approach.

\begin{align*} \int e^x \sin x \, dx &= -e^x \cos x + \int e^x \cos x \, dx \\ &= e^x(\sin x - \cos x) - \int e^x \sin x \, dx\end{align*}

But then the professor pulls a rabbit out of a hat. By moving the integral on the right side to the left, you can solve for the unknown integral in terms of the debris IBP left along the way.

\int e^x \sin x \, dx = \frac{e^x}{2}(\sin x - \cos x)
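As a quick sanity check, SymPy confirms the result: differentiating the right-hand side recovers the original integrand.

```python
from sympy import cos, diff, exp, simplify, sin, symbols

x = symbols('x')
antiderivative = exp(x) / 2 * (sin(x) - cos(x))

# the derivative of the right-hand side is the original integrand
assert simplify(diff(antiderivative, x) - exp(x) * sin(x)) == 0
```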

So you weren’t going in circles after all. You were making progress when it didn’t feel like it.

It’s not that gathering unknowns to one side is a new idea; you would have seen that countless times before you got to calculus. But that’s not how integration usually works. You typically start with the assigned integral and steadily chip away at it, progressing in a straight line toward the result. The trick is seeing that a simple idea from another context can be applied in this context.

In the calculation above we first let u = eˣ and v′ = sin(x) on the left. Then when we come to the first integral on the right, we again set u = eˣ and this time v′ = cos(x).

But suppose you come to the second integral and think “This is starting to look futile. Maybe I should try something different. This time I’ll let eˣ be the v′ term rather than the u term.” In that case you really will run in circles. You’ll get exactly the same expression on both sides.

It’s hard to say in calculus, or in life in general, whether persistence or creativity is called for. But in this instance, persistence pays off. If you set u to the exponential term both times, you win; if you switch horses midstream, you lose.

Another way to look at the problem is that the winning strategy is to be persistent in your approach to IBP, but use your creativity at the end to realize that you’re nearly done, just when it may seem you need to start over.

If you’d like to read more about coming back where you started, see Coming full circle and then this example from signal processing.

Transliterating Hebrew

Yesterday I wrote about cjhebrew, a LaTeX package that lets you insert Hebrew text by using a sort of transliteration scheme. That reminded me of unidecode, a Python package for transliterating Unicode to ASCII, that I wrote about before. I wondered how the two compare, and so this post will answer that question.

Transliteration is a crude approximation. I started to say it’s no substitute for a proper translation, but in fact sometimes it is a substitute for a proper translation. It takes in the smallest context possible—one character—and is utterly devoid of nuance, but it still might be good enough for some purposes. It might, for example, help in searching some text for relevant content worth the effort of a proper translation.

Here’s a short bit of code to display unidecode‘s transliterations of the Hebrew alphabet.

    import unidecode

    for i in range(22 + 5):
        ch = chr(i + ord('א'))
        print(ch, unidecode.unidecode(ch))

I wrote 22 + 5 rather than 27 above to give a hint that the extra values are the final forms of five letters [1]. Also if ord('א') doesn’t work for you, you can replace it with 0x05d0.

Here’s a comparison of the transliterations used in cjhebrew and unidecode. I’ve abbreviated the column headings to make a narrower table.

| Unicode |   | cj | ud |
| U+05d0  | א | '  | A  |
| U+05d1  | ב | b  | b  |
| U+05d2  | ג | g  | g  |
| U+05d3  | ד | d  | d  |
| U+05d4  | ה | h  | h  |
| U+05d5  | ו | w  | v  |
| U+05d6  | ז | z  | z  |
| U+05d7  | ח | .h | KH |
| U+05d8  | ט | .t | t  |
| U+05d9  | י | y  | y  |
| U+05da  | ך | K  | k  |
| U+05db  | כ | k  | k  |
| U+05dc  | ל | l  | l  |
| U+05dd  | ם | M  | m  |
| U+05de  | מ | m  | m  |
| U+05df  | ן | N  | n  |
| U+05e0  | נ | n  | n  |
| U+05e1  | ס | s  | s  |
| U+05e2  | ע | `  | `  |
| U+05e3  | ף | P  | p  |
| U+05e4  | פ | p  | p  |
| U+05e5  | ץ | .S | TS |
| U+05e6  | צ | s  | TS |
| U+05e7  | ק | q  | q  |
| U+05e8  | ר | r  | r  |
| U+05e9  | ש | /s | SH |
| U+05ea  | ת | t  | t  |

The transliterations are pretty similar, despite different design goals. The unidecode module is trying to pick the best mapping to ASCII characters. The cjhebrew package is trying to use mnemonic ASCII sequences to map into Hebrew. The former doesn’t need to be unique, but the latter does. The post on cjhebrew explains, for example, that it uses capital letters for final forms of Hebrew letters.

Here’s the corresponding table for vowel points (niqqud).

| Unicode |   | cj | ud |
| U+05b0  | ְ  | :  | @  |
| U+05b1  | ֱ  | E: | e  |
| U+05b2  | ֲ  | a: | a  |
| U+05b3  | ֳ  | A: | o  |
| U+05b4  | ִ  | i  | i  |
| U+05b5  | ֵ  | e  | e  |
| U+05b6  | ֶ  | E  | e  |
| U+05b7  | ַ  | a  | a  |
| U+05b8  | ָ  | A  | a  |
| U+05b9  | ֹ  | o  | o  |
| U+05ba  | ֺ  | o  | o  |
| U+05bb  | ֻ  | u  | u  |

Related posts

[1] Unicode lists the final forms of letters before the ordinary forms. For example, final kaf has Unicode value U+05da and kaf has value U+05db.

Permutable polynomials

Two polynomials p(x) and q(x) are said to be permutable if

p(q(x)) = q(p(x))

for all x. It’s not hard to see that Chebyshev polynomials are permutable. Recall that

Tₙ(x) = cos(n arccos(x))

where Tₙ is the nth Chebyshev polynomial. You can take this as a definition, or if you prefer another approach to defining the Chebyshev polynomials, it’s a theorem.

Then it’s easy to show that

Tₘ(Tₙ(x)) = Tₘₙ(x)

because

cos(m arccos(cos(n arccos(x)))) = cos(mn arccos(x)).

Then the polynomials Tₘ and Tₙ must be permutable because

Tₘ(Tₙ(x)) = Tₘₙ(x) = Tₙ(Tₘ(x))

for all x.
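A minimal sketch with SymPy’s chebyshevt makes the identity concrete for one particular pair of polynomials:

```python
from sympy import chebyshevt, expand, symbols

x = symbols('x')
m, n = 2, 3

# T_m(T_n(x)) = T_mn(x) = T_n(T_m(x))
lhs = expand(chebyshevt(m, chebyshevt(n, x)))
mid = expand(chebyshevt(m * n, x))
rhs = expand(chebyshevt(n, chebyshevt(m, x)))
assert lhs == mid == rhs
```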

There’s one more family of polynomials that are permutable, and that’s the power polynomials xᵏ. They are trivially permutable because

(xᵐ)ⁿ = xᵐⁿ = (xⁿ)ᵐ.

It turns out that the Chebyshev polynomials and the power polynomials are essentially [1] the only permutable sequence of polynomials.

Related posts

[1] Here’s what “essentially” means. A set of polynomials, at least one of each positive degree, that all permute with each other is called a chain. Two polynomials p and q are similar if there is an affine polynomial

λ(x) = ax + b

such that

p(x) = λ⁻¹(q(λ(x))).

Then any permutable chain is similar to either the power polynomials or the Chebyshev polynomials. For a proof, see Chebyshev Polynomials by Theodore Rivlin.