Nomenclatural abomination

David Hogg calls conventional statistical notation a “nomenclatural abomination”:

The terminology used throughout this document enormously overloads the symbol p(). That is, we are using, in each line of this discussion, the function p() to mean something different; its meaning is set by the letters used in its arguments. That is a nomenclatural abomination. I apologize, and encourage my readers to do things that aren’t so ambiguous (like maybe add informative subscripts), but it is so standard in our business that I won’t change (for now).

I found this terribly confusing when I started doing statistics. The meaning is not explicit in the notation but implicit in the conventions surrounding its use, conventions that were foreign to me since I was trained in mathematics and came to statistics later. When I would use letters like f and g for functions, collaborators would say “I don’t know what you’re talking about.” Nor did I understand what they were talking about, since they used one letter for everything.
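To make the overloading concrete, here is Bayes’ theorem written first in the conventional style and then with the kind of informative subscripts Hogg suggests. The random variable names X and Y are my own choice for illustration, not Hogg’s.

```latex
% Conventional notation: the single symbol p stands for four different functions.
p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}

% With subscripts, each density has its own name.
p_{X \mid Y}(x \mid y) = \frac{p_{Y \mid X}(y \mid x)\, p_X(x)}{p_Y(y)}
```

In the first line you can only tell the functions apart by the letters in their arguments. In the second line the arguments could be renamed freely without changing the meaning.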

Posted in Statistics

Monads are hard because …

Here’s a nice quip from Luke Gorrie on Twitter:

Monads are hard because there are so many bad monad tutorials getting in the way of finally finding Wadler’s nice paper.

Here’s the paper by Philip Wadler that I expect Luke Gorrie had in mind: Monads for functional programming.

Here’s the key line from Wadler’s paper:

Pure functional languages have this advantage: all flow of data is made explicit. And this disadvantage: sometimes it is painfully explicit.

That’s the problem monads solve: they let you leave implicit some of the repetitive code otherwise required by functional programming. That simple but critical point is left out of many monad tutorials.

Dan Piponi wrote a blog post You Could Have Invented Monads! (And Maybe You Already Have) very much in the spirit of Wadler’s paper. He starts with the example of adding logging to a set of functions. You could have every function return not just the return value that you’re interested in but also the updated state of the log. This quickly gets messy. For example, suppose your basic math functions write to an error log if you call them with illegal arguments. That means your square root function, for example, has to take as input not just a real number but also the state of the error log before it was called. Monads give you a way of effectively doing the same thing, but conceptually separating the code that takes square roots from the code that maintains the error log.
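Here is a rough Python sketch of the logging idea, in the spirit of Piponi’s example but not taken from his post. The names `safe_sqrt`, `safe_log`, `unit`, and `bind` are my own, and representing the log as a list of strings is an assumption made for illustration.

```python
import math

# A "logged" function returns a pair: (result, list of log messages).
def safe_sqrt(x):
    if x < 0:
        return (float("nan"), [f"sqrt: illegal argument {x}"])
    return (math.sqrt(x), [])

def safe_log(x):
    if x <= 0:
        return (float("nan"), [f"log: illegal argument {x}"])
    return (math.log(x), [])

def unit(x):
    """Wrap a plain value together with an empty log."""
    return (x, [])

def bind(logged, f):
    """Apply f to the value inside `logged` and concatenate the two logs."""
    value, log = logged
    new_value, new_log = f(value)
    return (new_value, log + new_log)

# The math functions never see the log; bind threads it between calls.
good_value, good_log = bind(bind(unit(16.0), safe_sqrt), safe_log)
bad_value, bad_log = bind(bind(unit(-4.0), safe_sqrt), safe_log)
```

The point is the separation: `safe_sqrt` knows nothing about the log’s previous state, and all the bookkeeping lives in `bind`.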

As for “so many bad monad tutorials,” see Brent Yorgey on the monad tutorial fallacy.

By the way, this post is not Yet Another Monad Tutorial. It’s simply an advertisement for the tutorials by Philip Wadler and Dan Piponi.

Posted in Software development

Log semiring

Here’s a strange way to do arithmetic on the real numbers.

First, we’ll need to include +∞ and -∞ with the reals.

We define the new addition of two elements x and y to be -log (exp(-x) + exp(-y) ).

We define the new multiplication to be ordinary addition. (!)

In this new arithmetic +∞ is the additive identity and 0 is the multiplicative identity.

This new algebraic structure is called the log semiring. It’s called a semiring because it satisfies all the properties of a ring except that elements don’t necessarily have additive inverses. We’ll get into the details of the definition below.


Let’s put a subscript S on everything associated with our semiring in order to distinguish them from their more familiar counterparts. Then we can summarize the definitions above as

  • a +S  b = -log (exp(-a) + exp(-b) )
  • a *S  b = a + b
  • 0S = +∞
  • 1S = 0

Note that if we define

f(a, b) = a +S  b

then

f(a, b) = -g(-a, -b)

where g(a, b) = log(exp(a) + exp(b)) is the soft maximum of a and b.
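The definitions above translate into a few lines of Python. This is only a sketch: the function names are mine, and computing the semiring addition through a numerically stable soft maximum is an implementation choice to avoid overflow, not part of the definition.

```python
import math

INF = float("inf")

def soft_max(a, b):
    """Soft maximum g(a, b) = log(exp(a) + exp(b)), computed stably."""
    m = max(a, b)
    if math.isinf(m):
        return m  # +inf dominates; if both are -inf the result is -inf
    # Factor out exp(m) so the remaining exponent is never positive.
    return m + math.log1p(math.exp(-abs(a - b)))

def s_add(a, b):
    """Semiring addition: -log(exp(-a) + exp(-b)) = -g(-a, -b)."""
    return -soft_max(-a, -b)

def s_mul(a, b):
    """Semiring multiplication is ordinary addition."""
    return a + b

ZERO_S = INF   # additive identity
ONE_S = 0.0    # multiplicative identity
```

For example, `s_add(3.0, ZERO_S)` returns 3.0 and `s_mul(ONE_S, 5.0)` returns 5.0, as the identities require.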


Finally, we list the axioms of a semiring. Note that these equations all hold when we interpret +, *, 0, and 1 in the context of S, i.e. we imagine that each has a subscript S and is as defined above.

  • (a + b) + c = a + (b + c)
  • 0 + a = a + 0 = a
  • a + b = b + a
  • (a * b) * c = a * (b * c)
  • 1 * a = a * 1 = a
  • a * (b + c) = (a * b) + (a * c)
  • (a + b) * c = (a * c) + (b * c)
  • 0 * a = a * 0 = 0

Each of these follows immediately from writing out the definitions.
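As one worked example, here is the left distributive law for finite a, b, and c, with the subscript S written explicitly. The other axioms can be checked the same way.

```latex
\begin{aligned}
a *_S (b +_S c) &= a - \log\left(e^{-b} + e^{-c}\right) \\
                &= -\log\left(e^{-a}\right) - \log\left(e^{-b} + e^{-c}\right) \\
                &= -\log\left(e^{-(a+b)} + e^{-(a+c)}\right) \\
                &= (a *_S b) +_S (a *_S c)
\end{aligned}
```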

Posted in Math

Do the stars go up or down?

The thing that sparked my interest in category theory was a remark from Ted Odell regarding the dual of a linear transformation. As I recall, he said something like “There’s a reason the star goes up instead of down” and mumbled something about category theory. I took it he didn’t think highly of category theory, but my interest was piqued.

Posted in Math

Writing down an unwritten language

In this post I interview Greg Greenlaw, a friend of mine who served as a missionary to the Nakui tribe in Papua New Guinea and developed their writing system. (Nakui is pronounced like “knock we.”)

JC: When you went to PNG to learn Nakui was there any writing system?

GG: No, they had no way of writing words or numbers. They had names for only seven numbers — that was the extent of their counting system — but they could coordinate meetings more than a few days in the future by tying an equal number of knots in two vines. Each party would take a vine with them and loosen a knot each morning until they counted down to the appointed time — like an advent calendar, but without numbers!


Posted in Uncategorized

Imaginary gold, silver, bronze, …

The previous post gave a relationship between the imaginary unit i and the golden ratio. This post highlights a comment to that post explaining that the relationship generalizes to generalizations of the golden ratio.

GlennF pointed out that taking the larger root of the equation

\phi_n = n + \frac{1}{\phi_n}

defines the golden ratio when n = 1, the silver ratio when n = 2, etc. φ_n is also the continued fraction made entirely of n’s.

With this definition, we have

2 \sin \left( i \log \phi_n \right)  = ni
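The identity is easy to check numerically. The sketch below uses the closed form for the larger root, which comes from rewriting the defining equation as the quadratic x² − nx − 1 = 0. The function names are my own.

```python
import cmath
import math

def metallic_ratio(n):
    """Larger root of x = n + 1/x, i.e. of x^2 - n*x - 1 = 0."""
    return (n + math.sqrt(n * n + 4)) / 2

def continued_fraction(n, depth=50):
    """Approximate the continued fraction n + 1/(n + 1/(n + ...))."""
    x = float(n)
    for _ in range(depth):
        x = n + 1 / x
    return x

for n in range(1, 6):
    phi_n = metallic_ratio(n)
    lhs = 2 * cmath.sin(1j * math.log(phi_n))
    # Check 2 sin(i log phi_n) = n i up to floating point error.
    assert abs(lhs - n * 1j) < 1e-12
    # Check that the continued fraction of all n's converges to phi_n.
    assert abs(continued_fraction(n) - phi_n) < 1e-12
```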

Posted in Math

Imaginary gold

This morning Andrew Stacey posted a beautiful identity I’d never seen before relating the golden ratio ϕ and the imaginary unit i:

2 \sin( i \log(\phi) ) = i

Here’s a proof:

By Euler’s formula,

 \sin z = \frac{\exp(iz) - \exp(-iz)}{2i}

and so

\begin{aligned}
2 \sin( i\log\phi) &= \frac{\exp(-\log\phi) - \exp(\log\phi)}{i} \\
&= -i \left(\frac{1}{\phi} - \phi\right) \\
&= i
\end{aligned}

Related posts:

Golden ratio and special angles

Golden strings and the rabbit ratio

Posted in Math

Mental callouses

In describing writing his second book, Tom Leinster says

… I’m older and, I hope, more able to cope with stress: just as carpenters get calloused hands that make them insensitive to small abrasions, I like to imagine that academics get calloused minds that allow them not to be bothered by small stresses and strains.

Mental callouses are an interesting metaphor. Without the context above, “calloused minds” would have a negative connotation. We say people are calloused or insensitive if they are unconcerned for other people, but Leinster is writing of people unperturbed by distractions.

You could read the quote above as implying that only academics develop mental discipline, though I’m sure that’s not what was intended. Leinster is writing a personal post about the process of writing books. He’s an academic, and so he speaks of academics.

Not only do carpenters become more tolerant of minor abrasions, they also become better at avoiding them. I’m not sure that I’m becoming more tolerant of stress and distractions as I get older, but I do think I’m getting a little better at anticipating and avoiding stress and distractions.



Posted in Creativity

Categories, Birds, and Frogs

Freeman Dyson divided mathematicians into birds and frogs in his essay by that title.

Some mathematicians are birds, others are frogs. Birds fly high in the air and survey broad vistas of mathematics out to the far horizon. They delight in concepts that unify our thinking and bring together diverse problems from different parts of the landscape. Frogs live in the mud below and see only the flowers that grow nearby. They delight in the details of particular objects, and they solve problems one at a time.

It’s an interesting metaphor. Like all metaphors it has its limits and Dyson discusses that. Some people are somewhere between a bird and a frog, whatever kind of creature that would be, and some alternate between being birds and frogs.

The other day I thought about Dyson’s classification and wondered whether category theorists would be birds or frogs. At first category theory seems avian, looking for grand patterns across mathematics. But as you wander further in, it seems more batrachian, absorbed in drawing little boxes and arrows.

I find it interesting that category theory can be profound or trivial, depending on your perspective.

The motivations and applications are profound. Category theory has been called “metamathematics” because it formalizes analogies between diverse areas of math. But basic category theory itself is very close to its axioms. The path from first principles to common definitions and theorems in category theory is much shorter than, say, the path from the definition of the real numbers to the fundamental theorem of calculus.

(This diagram quantifies the last claim to some extent: the graph of concept dependencies in category theory is more wide than deep, and not that deep. Unfortunately I don’t have a similar diagram for calculus.)

Related post: Concepts, explosions, and developments

Posted in Math

“Conventional” is relative

I found this line from Software Foundations amusing:

… we can ask Coq to “extract,” from a Definition, a program in some other, more conventional, programming language (OCaml, Scheme, or Haskell) with a high-performance compiler.

Most programmers would hardly consider OCaml, Scheme, or Haskell “conventional” programming languages, but they are conventional relative to Coq. As the authors said, these languages are “more conventional,” not “conventional.”

I don’t mean to imply anything negative about OCaml, Scheme, or Haskell. They have their strengths — I briefly mentioned the advantages of Haskell just yesterday — but they’re odd birds from the perspective of the large majority of programmers who work in C-like languages.

Posted in Software development

Jon Bentley’s design principles

Design principles from Programming Pearls by Jon Bentley:

  • Work on the right problem.
  • Explore the design space of solutions.
  • Look at the data.
  • Use the back of the envelope.
  • Build prototypes.
  • Make tradeoffs when you have to.
  • Keep it simple.



Posted in Uncategorized

Real World Haskell

I’m reading Real World Haskell because one of my clients’ projects is written in Haskell. Some would say that “real world Haskell” is an oxymoron because Haskell isn’t used in the real world, as illustrated by a recent xkcd cartoon.

It’s true that Haskell accounts for a tiny portion of the world’s commercial software and that the language is more popular in research. (There would be no need to put “real world” in the title of a book on PHP, for example. You won’t find a lot of computer science researchers using PHP for its elegance and nice theoretical properties.) But people do use Haskell on real projects, particularly when correctness is a high priority.[1] In any case, Haskell is “real world” for me since one of my clients uses it. As I wrote about before, applied is in the eye of the client.

I’m not that far into Real World Haskell yet, but so far it’s just what I was looking for. Another book I’d recommend is Graham Hutton’s Programming in Haskell. It makes a good introduction to Haskell because it’s small (184 pages) and focused on the core of the language, not so much on “real world” complications.

A very popular introduction to Haskell is Learn You a Haskell for Great Good. I have mixed feelings about that one. It explains most things clearly and the informal tone makes it easy to read, but the humor becomes annoying after a while. It also introduces some non-essential features of the language up front that could wait until later or be left out of an introductory book.


[1] Everyone would say that it’s important for their software to be correct. But in practice, correctness isn’t always the highest priority, nor should it be necessarily. As the probability of error approaches zero, the cost of development approaches infinity. You have to decide what probability of error is acceptable given the consequences of the errors.

It’s more important that the software embedded in a pacemaker be correct than the software that serves up this blog. My blog fails occasionally, but I wouldn’t spend $10,000 to cut the error rate in half. Someone writing pacemaker software would jump at the chance to reduce the probability of error so much for so little money.

On a related note, see Maybe NASA could use some buggy software.

Posted in Software development

Theory and practice

Nice line from Erik Meijer via Twitter:

Happiness is when you drill a tunnel from two completely different sides (theory <–> practice) and then they line up *exactly*.

Posted in Uncategorized

Where combinator names come from

Today I found out where the one-letter names of some functions in combinatory logic come from. I’d seen these before (for example, in To Mock a Mockingbird) but I had no idea what inspired the names.

These functions — I, K, S, T, and Z — are known as the Schönfinkel combinators, and their names are somewhat mnemonic in German. (Only somewhat. Don’t get your hopes up.)

Definition       | Name | Name origin
λx. x            | I    | Identitätsfunktion (identity function)
λx,y. x          | K    | Konstanzfunktion (constant function)
λx,y,z. xz(yz)   | S    | Verschmelzungsfunktion (amalgamation function)
λx,y,z. xzy      | T    | Vertauschungsfunktion (exchange function)
λx,y,z. x(yz)    | Z    | Zusammensetzungsfunktion (composition function)

Source: Practical Foundations of Mathematics, footnote on page 89. Available online here.

If you’re not familiar with the notation in the function definitions, see this introduction to lambda calculus.
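If you’d rather experiment than read, the five definitions in the table can be transcribed as curried Python lambdas. This is only a sketch of the definitions as given; the helper functions `inc`, `dbl`, and `sub` are made up for the examples.

```python
# Curried versions of the combinators: each lambda takes one argument at a time.
I = lambda x: x                                # λx. x
K = lambda x: lambda y: x                      # λx,y. x
S = lambda x: lambda y: lambda z: x(z)(y(z))   # λx,y,z. xz(yz)
T = lambda x: lambda y: lambda z: x(z)(y)      # λx,y,z. xzy
Z = lambda x: lambda y: lambda z: x(y(z))      # λx,y,z. x(yz)

# Hypothetical helpers used only to exercise the combinators.
inc = lambda n: n + 1
dbl = lambda n: 2 * n
sub = lambda a: lambda b: a - b

assert K(1)(2) == 1              # K ignores its second argument
assert S(K)(K)(42) == I(42)      # S K K behaves like the identity
assert T(sub)(1)(10) == 9        # T swaps the arguments: sub(10)(1)
assert Z(inc)(dbl)(5) == 11      # Z composes: inc(dbl(5))
```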

Posted in Computing

What good is an old weather forecast?

Why would anyone care about what the weather was predicted to be once you know what the weather actually was? Because people make decisions based in part on weather predictions, not just weather. Eric Floehr of ForecastWatch told me that people are starting to realize this and are increasingly interested in his historical prediction data.

This morning I thought about what Eric said when I saw a little snow. Ice was predicted for last Tuesday, and schools all over the Houston area closed. As it turned out, there was only a tiny amount of ice and the streets were clear. This morning there actually is snow and ice in the area, though not much, and the schools are all open. (There’s snow out in Cypress where I live, but I don’t think there is in Houston proper.)

Aftermath of last Tuesday’s storm

Related posts:

Interview with Eric Floehr
Accuracy versus perceived accuracy
History of weather prediction

Posted in Statistics