Morse code golf

You can read the title of this post as ((Morse code) golf) or as (Morse (code golf)).

Morse code is a sort of approximate Huffman coding of letters: letters are assigned symbols so that more common letters can be transmitted more quickly. You can read about how well Morse code achieves this design objective here.

But digits in Morse code are kinda strange. I imagine they were an afterthought, tacked on after encodings had been assigned to each of the letters, and so had to avoid encodings that were already in use. Here are the assignments:

    |-------+-------|
    | Digit | Code  |
    |-------+-------|
    |     1 | .---- |
    |     2 | ..--- |
    |     3 | ...-- |
    |     4 | ....- |
    |     5 | ..... |
    |     6 | -.... |
    |     7 | --... |
    |     8 | ---.. |
    |     9 | ----. |
    |     0 | ----- |
    |-------+-------|

There’s no attempt to relate transmission length to frequency. Maybe the idea was that all digits are equally common. While in some contexts this is true, it’s not true in general for mathematical and psychological reasons.

There is a sort of mathematical pattern to the Morse code symbols for digits. For 1 ≤ n ≤ 5, the symbol for n is n dots followed by 5-n dashes. For 6 ≤ n ≤ 9, the symbol is n-5 dashes followed by 10-n dots. The same rule extends to 0 if you think of 0 as 10.

A more mathematically satisfying way to assign symbols would have been binary numbers padded to five places:

    0 -> .....
    1 -> ....-
    2 -> ...-.
    etc.

Because the Morse encoding of digits is awkward, it’s not easy to describe succinctly. And here is where golf comes in.

The idea of code golf is to write the shortest program that does some task. Fewer characters is better, just as in golf the lowest score wins.

Here’s the challenge: Write two functions as small as you can, one to encode digits as Morse code and another to decode Morse digits. Share your solutions in the comments below.
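To get the ball rolling, here’s an ungolfed Python sketch of the two functions, using the dots-and-dashes pattern described above (treating 0 as 10). A starting point, not a golfed answer:

    # Reference implementation, using the pattern above:
    # for n = 1..5 the code is n dots then 5-n dashes;
    # for n = 6..10 (0 treated as 10) it's n-5 dashes then 10-n dots.

    def encode(d):
        n = 10 if d == 0 else d
        if n <= 5:
            return "." * n + "-" * (5 - n)
        return "-" * (n - 5) + "." * (10 - n)

    def decode(code):
        dots = code.count(".")
        # A leading dot means 1 through 5: the digit is the dot count.
        # A leading dash means 6 through 9 or 0.
        return dots if code[0] == "." else (10 - dots) % 10

    assert all(decode(encode(d)) == d for d in range(10))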


Squircle corner radius

I’ve written several times about the “squircle,” a sort of compromise between a square and a circle. It looks something like a square with rounded corners, but it’s not. Instead of flat sides (zero curvature) joined to circular corners (constant positive curvature), its curvature varies continuously.

A natural question is just what kind of circle approximates the corners. This post answers that question, finding the radius of curvature of the osculating circle.

The squircle has a parameter p which determines how close the curve is to a circle or a square.

|x|^p + |y|^p = 1

The case p = 2 corresponds to a circle, and in the limit as p goes to infinity you get a square.

We’ll work in the first quadrant so we can ignore absolute values. The curvature at each point is complicated [1] but simplifies in the corner to

2^{\frac{1}{p} - \frac{1}{2}} (p-1)

and the radius of curvature is the reciprocal of this. So for moderately large p, the radius of curvature is approximately √2/(p-1).
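Here’s a small Python sketch evaluating the exact corner radius and the √2/(p − 1) approximation; the values of p are the ones used in the images below.

    # Radius of curvature at the corner of |x|^p + |y|^p = 1:
    # the reciprocal of the curvature 2^(1/p - 1/2) (p - 1).
    def corner_radius(p):
        return 2 ** (0.5 - 1 / p) / (p - 1)

    assert corner_radius(2) == 1  # p = 2 is the unit circle

    for p in [2.5, 3.5, 10]:
        print(p, corner_radius(p), 2 ** 0.5 / (p - 1))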

In the image at the top of the post, p = 3.5. Here’s an image with a larger value of p, p = 10.

And here’s one with a smaller value, p = 2.5.

When p = 2 we get a circle. When p is between 1 and 2 we get more of a diamond than a square. Notice in the image below with p = 1.5 the osculating circle is larger than the squircle, and the “corner” is nearly the whole side.

Finally, for p between 0 and 1 the sides of the diamond cave in giving a concave shape. Now the osculating circle is on the outside.


[1] The general expression is

\frac{(p-1) (x y)^{p+1} \left(x^p+y^p\right)}{\left(y^2 x^{2 p}+x^2 y^{2 p}\right)^{3/2}}

Triple words

A couple days ago I wrote a post about some doubled words I found on my site. Someone asked about triple words, so I looked. Here are some of the things I found.

One example was a post where I commented on a song from Fiddler on the Roof in which Tevye sings

If I were a rich man,
Yubba dibby dibby dibby dibby dibby dibby dum.

Another example is a post on cryptography that describes a very large number as “over 400 million million million million.”

As I mentioned in the post on double words, logarithms of logarithms come up often in number theory. For example, the time it would take Shor’s algorithm to factor a number n is

O( log(n)² log(log(n)) log(log(log(n))) )

I mentioned that in a post and added in a footnote

Obligatory old joke: What sound does a number theorist make when drowning? log log log …

Finally, I wrote a post called Gamma gamma gamma!, an allusion to the WWII film Tora! Tora! Tora!. The post explains how the gamma function, the gamma constant, and the gamma distribution are all related.

Kissing circle

Curvature is a measure of how tightly a curve bends. A circle of radius r has curvature 1/r. So a small circle has high curvature and a big circle has small curvature.

In general the curvature of a curve at a point is defined to be the curvature of the circle that best fits the curve at that point. This circle is called the osculating circle, which means the circle that “kisses” the curve at that point.

From Online Etymology Dictionary:

osculate (v.) “to kiss (one another),” 1650s, from Latin osculatus, past participle of osculari “to kiss,” from osculum “a kiss; pretty mouth, sweet mouth,” literally “little mouth,” diminutive of os “mouth”

The center of the osculating circle is called the center of curvature and the radius of the circle is called the radius of curvature.

I’ll give two examples. The first is a set of ellipses that all have the unit circle as their osculating circle on the left end. So they all have curvature 1.

An ellipse with equation

\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1

has curvature that varies from point to point. At x = –a the curvature simplifies to a/b². So to make the graph above, I used a range of values for a and set each corresponding value of b to √a.
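As a numerical check, here’s a Python sketch that computes the curvature at the left end from the standard parametrization (a cos t, b sin t) and confirms that setting b = √a makes it 1. (The particular values of a are arbitrary.)

    import numpy as np

    def ellipse_curvature(a, b, t):
        # kappa = |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2) for (a cos t, b sin t)
        xp, yp = -a * np.sin(t), b * np.cos(t)
        xpp, ypp = -a * np.cos(t), -b * np.sin(t)
        return abs(xp * ypp - yp * xpp) / (xp**2 + yp**2) ** 1.5

    for a in [1.5, 2.0, 3.0]:
        b = np.sqrt(a)
        print(ellipse_curvature(a, b, np.pi))  # a/b^2 = 1 in each case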

Next, instead of fixing the osculating circle I’ll fix the curve. I’ll use the equation for an egg that I’ve used before and plot the osculating circles at each end. The plot below uses a = 3, b = 2, and k = 0.1.

The curvature at the fat end is a(1-ka)/b² and the curvature at the pointy end is a(1+ka)/b². These are derived here.

Setting k = 0 recovers the curvature a/b² at each end of the ellipse used above.


Double words

Double words such as “the the” are a common source of writing errors. On the other hand, some doubled words are legitimate. You might, for example, find “had had” or “that that” in a grammatically correct sentence.

I’ve been looking through my web site to purge erroneous double words, and found a few doubles that are correct in context but would probably be incorrect elsewhere.

In ordinary English prose, long long is probably not what the author intended. There should either be a comma between the two words or a different choice of words. But in C code snippets, you’ll see long long as a type of integer. Also, it is common in many programming languages for a type and a variable to have the same name with varying capitalization, such as FILE file in C.

There are several pages on my site that refer to the Blum Blum Shub cryptographic random number generator. (The name of this algorithm always makes me think of a line from Night at the Museum.)

There are several pages on this site that use log log, always in the context of number theory. Logarithms of logarithms come up frequently in that context.

I also refer to unknown unknowns. The press ridiculed Donald Rumsfeld mercilessly when he first used this expression, but now the phrase is commonly used because more people understand that it names an important concept. It comes up frequently in statistics because so much attention is focused on known unknowns, even though unknown unknowns are very often the weakest link.

***

By the way, if you’d like to make a list of doubled words in a file, you could run the following shell one-liner:

   egrep -i -o '\<([a-z]+) \1\>' myfile | sort | uniq > doubles

I used something like this on a backup of my site to search for doubled words.

Approximating rapidly divergent integrals

A while back I ran across a paper [1] giving a trick for evaluating integrals of the form

I(M) = \int_a^M \exp(f(x)) \, dx

where M is large and f is an increasing function. For large M, the integral is asymptotically

A(M) = \frac{\exp(f(M))}{f'(M)}.

That is, the ratio of A(M) to I(M) goes to 1 as M goes to infinity.

This looks like a strange variation on Laplace’s approximation. And although Laplace’s method is often useful in practice, no applications of the approximation above come to mind. Any ideas? I have a vague feeling I could have used something like this before.

There is one more requirement on the function f. In addition to being an increasing function, it must also satisfy

\lim_{x\to\infty} \frac{f''(x)}{(f'(x))^2} = 0.

In [1] the author gives several examples, including f(x) = x², which satisfies the condition above since f″(x)/(f′(x))² = 1/(2x²) → 0. If we wanted to approximate

\int_0^{100} \exp(x^2)\, dx

the method above gives

exp(10000)/200 = 4.4034 × 10^4340

whereas the correct value to five significant figures is 4.4036 × 10^4340.

Even getting an estimate of the order of magnitude for such a large integral could be useful, and the approximation does better than that.
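Here’s a quick verification in Python. It uses the identity that the integral of exp(x²) from 0 to M equals (√π/2) erfi(M), with mpmath supplying the arbitrary precision such large numbers require.

    from mpmath import mp, exp, erfi, sqrt, pi

    mp.dps = 30  # 30 significant digits of working precision is plenty

    M = 100
    exact = sqrt(pi) / 2 * erfi(M)  # integral of exp(x^2) from 0 to M
    approx = exp(M**2) / (2 * M)    # A(M) = exp(f(M))/f'(M) with f(x) = x^2

    print(approx)          # 4.4034...e+4340
    print(exact)           # 4.4036...e+4340
    print(approx / exact)  # approximately 0.99995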

[1] Ira Rosenholtz. Estimating Large Integrals: The Bigger They Are, The Harder They Fall. The College Mathematics Journal, Vol. 32, No. 5 (Nov., 2001), pp. 322-329

Best approximation of a catenary by a parabola

A parabola and a catenary can look very similar but are not the same. The graph of

y = x²

is a parabola and the graph of

y = cosh(x) = (e^x + e^-x)/2

is a catenary. You’ve probably seen parabolas in a math class; you’ve seen a catenary if you’ve seen the St. Louis arch.

Depending on the range and scale, parabolas and catenaries can be too similar to distinguish visually, though over a wide enough range the exponential growth of the catenary becomes apparent.

For example, for x between -1 and 1, it’s possible to scale a parabola to match a catenary so well that the graphs practically overlap. The blue curve is a catenary and the orange curve is a parabola.

The graph above looks orange because the latter essentially overwrites the former. The relative error in approximating the catenary by the parabola is about 0.6%.

But when x ranges over -10 to 10, the best parabola fit is not good at all. The catenary is much flatter in the middle and much steeper at the sides. On this wider scale the hyperbolic cosine function is essentially e^|x|/2.

Here’s an intermediate case, -3 < x < 3, where the parabola fits the catenary pretty well, though one can easily see that the curves are not the same.

Now for some details. How are we defining “best” when we say best fit, and how do we calculate the parameters for this fit?

I’m using a least-squares fit, minimizing the L² norm of the error over the interval [−M, M]. That is, I’m approximating

cosh(x)

with

c + kx²

and finding c and k that minimize the integral

\int_{-M}^M (\cosh(x) - c - kx^2)^2\, dx

The optimal values of c and k vary with M. As M increases, c decreases and k increases.

It works out that the optimal value of c is

-\frac{3 \left(M^2 \sinh (M)+5 \sinh (M)-5 M \cosh (M)\right)}{2 M^3}

and the optimal value of k is

\frac{15 \left(M^2 \sinh (M)+3 \sinh (M)-3 M \cosh (M)\right)}{2 M^5}

Here’s a log-scale plot of the L² norm of the error, the square root of the integral above, for the optimal parameters as a function of M.
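For anyone who wants to reproduce this, here’s a Python sketch that evaluates the closed forms for c and k above and computes the error norm numerically (scipy assumed):

    import numpy as np
    from scipy.integrate import quad

    def c_opt(M):
        return -3 * (M**2 * np.sinh(M) + 5 * np.sinh(M) - 5 * M * np.cosh(M)) / (2 * M**3)

    def k_opt(M):
        return 15 * (M**2 * np.sinh(M) + 3 * np.sinh(M) - 3 * M * np.cosh(M)) / (2 * M**5)

    def error_norm(M):
        # L2 norm of cosh(x) - c - k x^2 over [-M, M] at the optimal c, k
        c, k = c_opt(M), k_opt(M)
        integral, _ = quad(lambda x: (np.cosh(x) - c - k * x**2) ** 2, -M, M)
        return np.sqrt(integral)

    for M in [1, 3, 10]:
        print(M, c_opt(M), k_opt(M), error_norm(M))

As a sanity check, as M → 0 these formulas give c → 1 and k → 1/2, matching the Taylor series cosh(x) ≈ 1 + x²/2.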


Five places the Sierpiński triangle shows up

The Sierpiński triangle is a fractal that comes up in unexpected places. I’m not that interested in fractals, and yet I’ve mentioned the Sierpiński triangle many times on this blog just because I run into it while looking at something else.

The first time I wrote about the Sierpiński triangle was when it came up in the context of a simple random process called the chaos game.

Unbiased chaos game results

Next I ran into Sierpiński in the context of cellular automata, specifically Rule 90. A particular initial condition for this rule leads to the image below. With other initial conditions you don’t get such a clean Sierpiński triangle, but you do get similar variations on the theme.

Rule 90 with one initial bit set

Next I ran into Sierpiński in the context of low-level programming. The following C code prints an asterisk whenever the bitwise AND of the two loop indices is zero.

    #include <stdio.h>
    #define N 64   /* grid size */
    int main(void) {
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++)
                printf("%c", (i&j) ? ' ' : '*');
            printf("\n");
        }
    }

A screenshot of the output shows our familiar triangle.


Then later I wrote a post looking at constructible n-gons, n-sided figures that can be constructed using only a straight edge and a compass. These only exist for special values of n. If you write these special values in binary, and replace the 1’s with a black square and the 0’s with a blank, you get yet another Sierpiński triangle.

Finally, if you look at the odd numbers in Pascal’s triangle, they also form a Sierpiński triangle.
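As a quick illustration of that last one, the following Python sketch prints the parity pattern of the first rows of Pascal’s triangle, marking the odd entries with asterisks:

    # Odd entries of the first 16 rows of Pascal's triangle.
    # Working mod 2 keeps the entries from growing.
    row = [1]
    for _ in range(16):
        print("".join("*" if x else " " for x in row))
        row = [1] + [(a + b) % 2 for a, b in zip(row, row[1:])] + [1]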

Evolute of an egg

The lines perpendicular to a curve are tangent to a second curve called its evolute. The lines perpendicular to the ellipse below are tangent to the curve inside, called an astroid.

If we replace the ellipse with an egg, we get a similar shape, but less symmetric.

The equation for the egg is described here with parameters a = 3, b = 2, and k = 0.1. The ellipse above has the same a and b but k = 0.

I made the lines slightly transparent, setting alpha = 0.4, so the graph would be darker where many lines cross.
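Here is roughly how such a figure can be generated in Python, sketched for the ellipse case; the segment length and the number of lines are arbitrary choices of mine, not necessarily what was used for the figures.

    import numpy as np
    import matplotlib.pyplot as plt

    a, b = 3, 2  # ellipse (x/a)^2 + (y/b)^2 = 1

    for s in np.linspace(0, 2 * np.pi, 250):
        x, y = a * np.cos(s), b * np.sin(s)
        # (b cos s, a sin s) is parallel to the gradient of (x/a)^2 + (y/b)^2,
        # so it points along the normal to the ellipse at (x, y).
        nx, ny = b * np.cos(s), a * np.sin(s)
        r = np.hypot(nx, ny)
        L = 4  # half-length of each normal segment
        plt.plot([x - L * nx / r, x + L * nx / r],
                 [y - L * ny / r, y + L * ny / r],
                 alpha=0.4, linewidth=0.5)

    plt.axis("equal")
    plt.show()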

Related post: Envelopes of epicycloids

Sample size calculation

If you’re going to run a test on rabbits, you have to decide how many rabbits you’ll use. This is your sample size. A lot of what statisticians do in practice is calculate sample sizes.

A researcher comes to talk to a statistician. The statistician asks what effect size the researcher wants to detect. Do you think the new thing will be 10% better than the old thing? If so, you’ll need to design an experiment with enough subjects to stand a good chance of detecting a 10% improvement. Roughly speaking, sample size is inversely proportional to the square of effect size. So if you want to detect a 5% improvement, you’ll need 4 times as many subjects as if you want to detect a 10% improvement.
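To see the inverse-square rule in action, here’s a sketch using the power calculation for a two-sample t-test in statsmodels. The t-test setting and the standardized effect sizes below are my own illustration, not a scenario from the post.

    from statsmodels.stats.power import TTestIndPower

    power = TTestIndPower()

    # Sample size per group for 80% power at the 5% significance level.
    n_small = power.solve_power(effect_size=0.2, alpha=0.05, power=0.8)
    n_large = power.solve_power(effect_size=0.4, alpha=0.05, power=0.8)

    print(n_small, n_large, n_small / n_large)  # the ratio is close to 4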

You’re never guaranteed to detect an improvement [1]. The race is not always to the swift, nor the battle to the strong. So it’s not enough to think about what kind of effect size you want to detect; you also have to think about how likely you want to be to detect it.

Here’s what often happens in practice. The researcher makes an arbitrary guess at what effect size she expects to see. Then initial optimism may waver and she decides it would be better to design the experiment to detect a more modest effect size. When asked how high she’d like her chances to be of detecting the effect, she thinks 100% but says 95% since it’s necessary to tolerate some chance of failure.

The statistician comes back and says the researcher will need a gargantuan sample size. The researcher says this is far outside her budget. The statistician asks what the budget is, and what the cost per subject is, and then the real work begins.

The sample size the negotiation will converge on is the budget divided by the cost per sample. The statistician will fiddle with the effect size and probability of detecting it until the inevitable sample size is reached. This sample size, calculated to 10 decimal places and rounded up to the next integer, is solemnly reported with a post hoc justification containing no mention of budgets.

Sample size is always implicitly an economic decision. If you’re willing to make it explicitly an economic decision, you can compute the expected value of an experiment by placing a value on the possible outcomes. You make some assumptions—you always have to make assumptions—and calculate the probability under various scenarios of reaching each conclusion for various sample sizes, and select the sample size that leads to the best expected value.


[1] There are three ways an A/B test can turn out: A wins, B wins, or there isn’t a clear winner. There’s a tendency to not think enough about the third possibility. Interim analysis often shuts down an experiment not because there’s a clear winner, but because it’s becoming clear there is unlikely to be a winner.