Classical programming

The classical education model is based on the trivium of grammar, logic, and rhetoric. See, for example, Dorothy Sayers’ essay The Lost Tools of Learning.

The grammar stage of the trivium may be grammar in the literal, linguistic sense, but it applies more generally to absorbing the basics of any subject, often through rote learning.

The logic stage is more analytic, examining the relationships between the pieces gathered in the grammar stage. Students learn to construct sound arguments.

The rhetoric stage is focused on eloquent and persuasive expression. It is more outwardly focused than the previous stages, more considerate of others. Students learn to create arguments that are not only logically correct, but also memorable, enjoyable, and effective.

It would be interesting to see a classical approach to teaching programming. Programmers often don’t get past the logic stage, writing code that works (as far as they can tell). The rhetoric stage would train programmers to look for solutions that are not just probably correct, but so clear that they are persuasively correct. The goal would be to write code that is testable, maintainable, and even occasionally eloquent.

[Photo: Parthenon replica in Nashville, TN.]

Benchmarking C++, Python, R, etc.

The other day Travis Oliphant pointed out an interesting paper: A Comparison of Programming Languages in Economics. The paper benchmarks several programming languages on a computational problem in economics.

All the usual disclaimers about benchmarks apply; your mileage may vary. See the paper for details.

Here I give my summary of their summary of their results. The authors ran separate benchmarks on Mac and Windows. The results were qualitatively the same, so I just report the Windows results here.

Times in the table below are relative to the fastest C++ run.

Language                    Time
C++                         1.00
Java                        2.10
Julia                       2.70
CPython                   155.31
Python with Numba           1.57
R                         505.09
R using compiler package  243.38


The most striking result is that the authors were able to run their Python code 100x faster, achieving performance comparable to C++, by using Numba.

Weak static type systems

Comment by Simon Peyton Jones in an interview:

People often dislike static type systems because they’ve only met weak ones. A weak or not very expressive type system gets in your way all the time. It prevents you from writing functions you want to write that you know are fine. … The solution is not to abandon the type system but to make the type system more expressive.

In particular, he mentions Haskell’s polymorphic types and type inference as ways to make strong static typing convenient to use.
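
As a small illustration of what he means (my example, not from the interview), an expressive type system lets one polymorphic definition cover every type, with no casts and no boilerplate, and type inference means you rarely have to spell the types out:

-- One definition works at all types a, b, and c.
compose :: (b -> c) -> (a -> b) -> a -> c
compose f g x = f (g x)

-- Omitting the signature is fine too: GHC infers the most general
-- type, so the checker verifies your code without getting in the way.
compose' f g x = f (g x)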

Haskell / Category theory decoder ring

Haskell uses a lot of ideas from category theory, but the correspondence between Haskell and category theory can be a little hard to see at times.

One difficulty is that although Haskell articles use terms like functor and monad from category theory, they seldom actually talk about categories per se. If we’ve got functors, where are the categories? (This reminds me of Darth Vader asking “If this is a consular ship, where is the ambassador?”)

In Haskell literature, everything implicitly lives in Hask, the category of Haskell types, or in some subcategory of Hask. This means that the category itself is not the focus of attention. In category theory, functors often operate between very different classes of objects, such as topological spaces and their fundamental groups, and so it’s more important to state what category something lives in.

Another potential stumbling block is to think of Haskell types as categories and values as objects. That would be reasonable, since in computer science an “object” is an instance of a type. But the right correspondence is to think of Haskell types as categorical objects. Instances of types are below the level of abstraction we’re working at. This is analogous to how category theory treats objects as black boxes with no way to talk about what’s inside.
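
To make the correspondence concrete, here is a sketch of my own: a Haskell Functor instance is an endofunctor on Hask, acting on types (the objects) and on functions (the morphisms).

-- On objects:   Maybe sends the type Int to the type Maybe Int.
-- On morphisms: fmap sends halve :: Int -> Double
--               to fmap halve :: Maybe Int -> Maybe Double.
halve :: Int -> Double
halve n = fromIntegral n / 2

halved :: Maybe Int -> Maybe Double
halved = fmap halve

-- The functor laws are exactly the categorical ones:
--   fmap id == id
--   fmap (g . h) == fmap g . fmap h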

Finally, Haskell monads look a little different from categorical monads. Haskell’s return corresponds directly to unit, usually written as η, in category theory. But Haskell monads have a bind operator >>= while mathematical monads have a join operator μ. These are not the same operation, though each can be implemented in terms of the other:

-- join collapses two layers of monadic structure into one;
-- it can be defined in terms of bind:
join :: Monad m => m (m a) -> m a
join x = x >>= id

-- Conversely, bind can be defined in terms of join and fmap:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
x >>= f = join (fmap f x)
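
As a quick sanity check, here is how the two operators behave in the Maybe monad (my example, not from the original post):

-- join collapses one layer of structure:
--   join (Just (Just 3))  ==  Just 3
--   join Nothing          ==  Nothing
-- bind maps and then collapses:
--   Just 3  >>= (\x -> Just (x + 1))  ==  Just 4
--   Nothing >>= (\x -> Just (x + 1))  ==  Nothing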

To read more along these lines, see the Wikibooks article on Haskell and Category theory.

Update: Stephen Diehl suggested I mention the differences between the idealized category Hask and the implementation of the Haskell language. These are discussed here.

Related: Applied category theory

Look-behind regex

Look-behind is one of those advanced/obscure regular expression features that I don’t use frequently enough to remember the syntax, but just frequently enough that I wish I could remember it.

Look-behind can be positive or negative. Positive look-behind says “match at this position only if the preceding text matches the following pattern.” Negative look-behind says to match only if the preceding text does not match the pattern.

The syntax in Perl and similar regular expression implementations is (?<= … ) for positive look-behind and (?<! … ) for negative look-behind. For the longest time I couldn’t remember whether the next symbol after ? was the direction (i.e. < for behind) or the polarity (= for positive, ! for negative). I was more likely to guess wrong unless I’d used the syntax recently.

The reason I was tempted to get these wrong is that I thought “positive look-behind” and “negative look-behind.” That’s how these patterns are described. But this means the words and symbols come in a different order. If you think look-behind positive and look-behind negative then the words and the symbols come in the same order:

look      (?
behind    <
positive  =
negative  !

Maybe this syntax comes more naturally to people who speak French and other languages where adjectives follow the thing they describe. English word order was tripping me up.

By the way, the syntax for look-ahead patterns is simpler: just leave out the <. The default direction for look-around patterns is forward. You don’t have to remember whether the symbol for direction or polarity comes first because there is no symbol for direction.
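
Here is a short example of both flavors, assuming the regex-pcre package, which supports look-around (my illustration, not from the original post):

import Text.Regex.PCRE ((=~))

main :: IO ()
main = do
  -- look-behind positive: digits only when preceded by "FY"
  print ("FY2023" =~ "(?<=FY)[0-9]+" :: Bool)   -- True
  -- look-behind negative: "bar" only when not preceded by "foo"
  print ("foobar" =~ "(?<!foo)bar" :: Bool)     -- False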

Programming languages and magic

In the context of programming languages, “magic” is often a pejorative term for code that does something other than what it appears to do.

Programmers seem to have a love/hate relationship with magic. Even people who say they don’t like magic (e.g. because it’s hard to debug) end up using it. The Haskell community prides itself on having a transparent language with no magic, and yet monads are slightly magical. The whole purpose of a monad is to hide explicit data flow, though in a principled way. Haskell’s do notation is more magical, and templates are more magical still. (However, I do hear some Haskellers express disdain for templates.)
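
For example (my illustration), do notation hides the binds. The two definitions below are equivalent, and in the Maybe monad the hidden plumbing silently propagates Nothing:

addM :: Maybe Int -> Maybe Int -> Maybe Int
addM mx my = do
  x <- mx
  y <- my
  return (x + y)

-- What the do block desugars to:
addM' :: Maybe Int -> Maybe Int -> Maybe Int
addM' mx my = mx >>= \x -> my >>= \y -> return (x + y)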

People who like magic tend to use the word “automagic” instead. It means about the same thing as “magic” but with a positive connotation.

To conclude with a couple of sweeping generalizations: magic fans tend to be tool-oriented (such as Microsoft developers), while magic detractors tend to be language-oriented (such as Haskell developers).

Update: Someone asked me on Twitter about the difference between abstraction and magic. I’d say abstraction hides details, but magic is actively misleading or ironic.

Don’t be a technical masochist

There’s an old joke from Henny Youngman:

I told the doctor I broke my leg in two places. He told me to quit going to those places.

Sometimes tech choices are that easy: if something is too hard, stop doing it. A great deal of pain comes from using a tool outside its intended use, and often that’s avoidable.

For example, when regular expressions get too hard, I stop using regular expressions and write a little procedural code. Or when Python is too slow, I try some simple ways of speeding it up, and if that’s not good enough I switch from Python to C++. If something is too hard to do in Windows, I’ll do it in Linux, and vice versa.

Sometimes there’s not a better tool available and you just have to slog through with what you have. And sometimes you don’t have the freedom to use a better tool even though one is available. But a lot of technical pain is self-imposed. If you keep breaking your leg somewhere, stop going there.