When High Performance Computing Is Not High Performance

Everybody cares about codes running fast on their computers. Hardware improvements over recent decades have made this possible. But how well are we taking advantage of hardware speedups?

Consider these two C++ code examples. Assume here n = 10000000.

// Snippet 1: two separate loops
void sub(int* a, int* b) {
    for (int i=0; i<n; ++i)
        a[i] = i + 1;
    for (int i=0; i<n; ++i)
        b[i] = a[i];
}

// Snippet 2: one fused loop
void sub(int* a, int* b) {
    for (int i=0; i<n; ++i) {
        const int j = i + 1;
        a[i] = j;
        b[i] = j;
    }
}

Which runs faster? Both are simple and give identical results (assuming no aliasing). However, on modern architectures, depending on the compilation setup, one will generally run significantly faster than the other.

In particular, Snippet 2 would be expected to run faster than Snippet 1. In Snippet 1, elements of the array “a”, which is too large to be cached, must be retrieved from memory after being written, but this is not required for Snippet 2. The trend for over two decades has been for compute speed of newly delivered systems to grow much faster than memory speed, and the disparity is extreme today. The performance of these kernels is bound almost entirely by memory bandwidth. Thus Snippet 2, a fused loop version of Snippet 1, improves speed by reducing main memory access.
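
A rough way to see the difference is to count main-memory traffic per array element. The sketch below is back-of-the-envelope arithmetic, not a benchmark. It assumes 4-byte ints, assumes neither array stays in cache, and ignores effects such as write-allocate traffic and prefetching. The name traffic_bytes is made up for illustration.

```python
N = 10_000_000        # array length used in the text
BYTES_PER_INT = 4     # assumption: 32-bit int

def traffic_bytes(reads_per_elem, writes_per_elem, n=N, size=BYTES_PER_INT):
    """Bytes moved to or from main memory under the simple model above."""
    return (reads_per_elem + writes_per_elem) * n * size

# Snippet 1: write a[i], later read a[i] back, write b[i].
unfused = traffic_bytes(reads_per_elem=1, writes_per_elem=2)

# Snippet 2: write a[i] and b[i]; the value j stays in a register.
fused = traffic_bytes(reads_per_elem=0, writes_per_elem=2)

print(unfused, fused, unfused / fused)
```

Under these assumptions the unfused version moves about one and a half times as much data, which is the kind of gap a bandwidth-bound kernel would feel directly.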

Libraries like C++ STL are unlikely to help, since this operation is too specialized to expect a library to support it (especially the fused loop version). Also, the compiler cannot safely fuse the loops automatically without specific instructions that the pointers are unaliased, and even then is not guaranteed to do so.

Thankfully, high level computer languages since the 1950s have raised the programming abstraction level for all of us. Naturally, many of us would like to just implement the required business logic in our codes and let the compiler and the hardware do the rest. But sadly, one can’t always just throw the code on a computer and expect it to run fast. Increasingly, as hardware becomes more complex, giving attention to the underlying architecture is critical to getting high performance.

Leading zeros

The confusion between numbers such as 7 and 007 comes up everywhere. We know they’re different—James Bond isn’t Agent 7—and yet the distinction isn’t quite trivial.

How should software handle the two kinds of numbers? The answer isn’t as simple as “Do what the user expects” because different users have different expectations.

Excel

If you type 007 into Excel, by default the software will respond as if to say “Got it. Seven.” If you configure a cell to be text, then it will retain the leading zeros. Many people find this surprising, myself included.

But you can be sure that Microsoft has good reasons for the default behaviors it chooses. These are often business reasons rather than technical reasons. Microsoft wants to please the majority of its user base, not tech wizards. Not only are wizards an unprofitable minority, wizards can take care of themselves.

Zip codes

Someone relayed the following conversation to me recently.

“It took me longer than I thought, but I got the zip codes wrangled.”

“Leading zeros trip you up?”

“Yeah, how did you guess?”

“This isn’t my 01st rodeo.”

I’ve run into this, as has almost everyone who has ever worked with zip codes. The Boston zip code 02134 is not the number 2,134.
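
Here is a minimal Python sketch of how the zero gets lost and how it can be restored. Restoring only works if you know the intended width, which is assumed here to be five digits for US zip codes. The variable names are illustrative.

```python
zip_as_text = "02134"
zip_as_int = int(zip_as_text)      # the leading zero is gone after conversion

# Two ways to pad back to a known width of 5.
restored = f"{zip_as_int:05d}"
also_restored = str(zip_as_int).zfill(5)

print(zip_as_int, restored, also_restored)
```

Keeping the zip code as text from the start avoids the round trip entirely.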

Octal

In Perl the expression (02134 > 2000) evaluates to false. That is because in some software, including the perl interpreter, a leading zero indicates that a number is written in octal, i.e. base 8. So 02134 represents 2134 in base 8, which is 1116 in base 10, and that is less than 2000.
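
The same arithmetic can be checked in Python, shown here as an illustration rather than as a statement about Perl. Python 3 rejects a bare 02134 literal as a syntax error and requires the explicit 0o prefix for octal, so the ambiguity has to be resolved by the programmer.

```python
literal = 0o2134                 # explicit octal literal
parsed_octal = int("02134", 8)   # parse the digits as base 8
parsed_decimal = int("02134")    # default base 10; leading zero ignored

print(literal, parsed_octal, parsed_decimal)
print(literal > 2000)
```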

Update: I’d forgotten that C acts the same way until Wayne reminded me in the comments. I don’t think I’ve ever (deliberately) used that feature in C.

Dates

I’m an American, and I use American-style dates in public correspondence. But privately I use YYYY-MM-DD dates so that dates always sort as intended, regardless of whether a particular piece of software interprets these symbols as numbers, text, or dates.
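
A small sketch of why this works: zero-padded YYYY-MM-DD strings sort chronologically even when compared as plain text, while American-style strings do not. The sample dates are made up.

```python
from datetime import datetime

iso_dates = ["2023-11-05", "2023-02-14", "2022-12-31"]
us_dates = ["11/05/2023", "02/14/2023", "12/31/2022"]

iso_sorted = sorted(iso_dates)   # text order matches date order
us_sorted = sorted(us_dates)     # text order puts December 2022 last

# American-style strings need to be parsed as dates to sort correctly.
us_by_date = sorted(us_dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(iso_sorted)
print(us_sorted)
print(us_by_date)
```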

Computer science versus software engineering

From a computer science perspective, the root of the problem is not being explicit about data types. In computer science lingo, 7 and 2134 are integers, while 007 and 02134 are “words” built on the “alphabet” consisting of the digits 0 through 9. Integers and words have different data types. Furthermore, 007 and 02134 are not just words but representations of different data types: one is a serial number and the other is a postal code. And neither is an octal number.

Objects of different data types may have similar text representations, but these representations are to be interpreted differently. And they have different sort orders, which may not correspond to their sort order as text. End of discussion.
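
To make the point about sort order concrete, here is a short Python sketch that sorts the same strings once as text and once as integers. The sample values are arbitrary.

```python
values = ["10", "9", "007", "2134"]

as_text = sorted(values)               # character-by-character comparison
as_integers = sorted(values, key=int)  # compare by numeric value

print(as_text)
print(as_integers)
```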

This is fine for computer science, but it doesn’t address the software engineering problem of meeting user expectations. It will not do to say “Just make the user specify his types.” The average user doesn’t know what that means.

So what do you do? The software could make educated guesses, but then what? Ask the user for confirmation that the software guessed correctly? Or presume the guess was correct but provide a way to fix the assumption in case it was not? Demand that the user be more specific? The solution depends on context.
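
As one illustration of an educated guess, here is a hypothetical heuristic: treat a string of digits as an integer unless it has a leading zero, in which case keep it as text. The function guess_type and its rule are assumptions made for this sketch, not how Excel or any other product actually decides.

```python
def guess_type(field: str) -> str:
    """Hypothetical heuristic: digits with a leading zero are kept as text."""
    s = field.strip()
    if s.isascii() and s.isdigit():
        if len(s) > 1 and s.startswith("0"):
            return "text"      # e.g. 007 or 02134: probably an identifier
        return "integer"
    return "text"

for sample in ["7", "007", "02134", "2134", "0", "N/A"]:
    print(repr(sample), guess_type(sample))
```

A guess like this still needs an escape hatch, since a user may really have meant the number seven when typing 007.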

Even if you want to meet the expectations of a particular group, such as Excel users or Perl programmers, those expectations may evolve over time. We expect different behavior from software than we did a generation ago. But we also expect backward compatibility! So even within an individual you have conflicting expectations. There is no simple solution, even for such a simple problem of how to handle leading zeros.

Naming Awk

The Awk programming language was named after the initials of its creators. In the preface to a book that just came out, The AWK Programming Language, Second Edition, the authors give a little background on this.

Naming a language after its creators shows a certain paucity of imagination. In our defense, we didn’t have a better idea, and by coincidence, at some point in the process we were in three adjacent offices in the order Aho, Weinberger, and Kernighan.

By the way, here’s a nice line from near the end of the book.

Realistically, if you’re going to learn only one programming language, Python is the one. But for small programs typed at the command line, Awk is hard to beat.

A small programming language

Paul Graham said “Programming languages teach you not to want what they don’t provide.” He meant that as a negative: programmers using less expressive languages don’t know what they’re missing. But you could also take that as a positive: using a simple language can teach you that you don’t need features you thought you needed.

Awk

I read the original awk book recently, published in 1988. It’s a small book for a small language. The language has grown since 1988, especially the Gnu implementation gawk, and yet from the beginning the language had a useful set of features. Most of what has been added since then is of no use to me.

How I use awk

It has been years since I’ve written an awk program that is more than one line. If something would require more than one line of awk, I probably wouldn’t use awk. I’m not morally opposed to writing longer awk programs, but awk’s sweet spot is very short programs typed at the command line.

At one point when I was saying how I like little awk programs, someone suggested I use Perl one-liners instead because then I’d have access to Perl’s much richer set of features, in particular Perl regular expressions. Along those lines, see these notes on how to write Perl one-liners to mimic sed, grep, and awk.

But when I was reading the awk book I thought about how I rarely need the features awk doesn’t have, not for the way I use awk. If I were writing a large program, not only would I want more features, I’d want a different language.

Now my response to the suggestion to use Perl one-liners would be that the simplicity of awk helps me focus by limiting my options. Awk is a jig. In Paul Graham’s terms, awk teaches me not to want what it doesn’t provide.

Regular expressions

At first I wished awk were more expressive in its regular expression implementation. But awk’s minimal regex syntax is consistent with the aesthetic of the rest of the language. Awk has managed to maintain its elegant simplicity by resisting calls to add minor conveniences that would complicate the language. The maintainers are right not to add the regex features I miss.

Awk does not support, for example, \d for digits. You have to type [0-9] instead. In exchange for such minor inconveniences you get a simple but adequate regular expression implementation that you could learn quickly. See notes on awk’s regex features here.
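
For comparison, here is how the two spellings behave in Python's re module, used only as an illustration since awk itself lacks \d. On ASCII input the explicit class [0-9] finds the same matches as \d. In Python 3 string patterns, \d also matches non-ASCII decimal digits, so the explicit class is the narrower of the two.

```python
import re

text = "Boston 02134, Beverly Hills 90210"

with_class = re.findall(r"[0-9]+", text)   # the spelling awk requires
with_escape = re.findall(r"\d+", text)     # the shorthand awk lacks

# U+0663 is ARABIC-INDIC DIGIT THREE: matched by \d but not by [0-9].
escape_matches_other_digits = re.fullmatch(r"\d", "\u0663") is not None
class_matches_other_digits = re.fullmatch(r"[0-9]", "\u0663") is not None

print(with_class, with_escape)
print(escape_matches_other_digits, class_matches_other_digits)
```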

The awk book describes regular expressions in four leisurely pages. Perl regular expressions are an order of magnitude more complex, but not an order of magnitude more useful.

Simple example of Kleisli composition

Mars Climate Orbiter, artist conception, via NASA

When a program needs to work with different systems of units, it’s best to consistently use one system for all internal calculations and convert to another system for output if necessary. Rigidly following this convention can prevent bugs, such as the one that caused the crash of the Mars Climate Orbiter.

For example, maybe you need to work in degrees and radians. It would be sensible to do all calculations in radians, because that’s what software libraries expect, and output results in degrees, because that’s what humans expect.
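
A minimal sketch of that convention in Python: convert to radians at the boundary, do all arithmetic in radians, and convert back to degrees only for display. The helper names to_internal and to_display are hypothetical.

```python
import math

def to_internal(angle_deg):
    """Convert user input in degrees to the internal unit, radians."""
    return math.radians(angle_deg)

def to_display(angle_rad):
    """Convert an internal angle in radians to degrees for output."""
    return math.degrees(angle_rad)

a = to_internal(30.0)
b = to_internal(60.0)
total = a + b                # all arithmetic happens in radians

sine = math.sin(total)       # math.sin expects radians
shown = to_display(total)    # convert once, at the output boundary

print(sine, shown)
```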

Now suppose you have a function that takes in a length and doubles it, and another function takes in a length and triples it. Both functions take in length in kilometers but print the result in miles.

You would like the composition of the two functions to multiply a length by six. And as before, the composition would take in a length in kilometers and return a length in miles.

Here’s how we could implement this badly.

    miles_per_km = 5/8 # approx

    def double(length_km): 
        return 2*length_km*miles_per_km

    def triple(length_km): 
        return 3*length_km*miles_per_km

    length_km = 8
    d = double(length_km)
    print("Double:", d)
    t = triple(d)
    print("Triple:", t)

This prints

    Double: 10.0
    Triple: 18.75

The second output should be 30, not 18.75. The result is wrong because we converted from kilometers to miles twice. The correct implementation would be something like the following.

    miles_per_km = 5/8 # approx, same factor as above

    def double(length_km):
        d = 2*length_km
        print("Double:", d*miles_per_km)
        return d

    def triple(length_km):
        t = 3*length_km
        print("Triple:", t*miles_per_km)
        return t

    length_km = 8
    d = double(length_km)
    t = triple(d)

This prints the right result.

    Double: 10.0 
    Triple: 30.0

In abstract terms, we don’t want the composition of f and g to be simply gf.

We have a function f from X to Y that we think of as our core function, and a function T that translates the output. Say f doubles its input and T translates from kilometers to miles. Let f* be the function that takes X to TY, i.e. the combination of f and translation.

Now take another function g from Y to Z and define g* as the function that takes Y to TZ. We want the composition of f* and g* to be

g* ∘ f* = T ∘ g ∘ f.

In the example above, we only want to convert from kilometers to miles once. This is exactly what Kleisli composition does. (“Kleisli” rhymes with “highly.”)

Kleisli composition is conceptually simple. Once you understand what it is, you can probably think of times when it’s what you wanted but you didn’t have a name for it.

Writing code to encapsulate Kleisli composition takes some infrastructure (i.e. monads), and that’s a little complicated, but the idea of what you’re trying to achieve is not. Notice in the example above, what the functions print is not what they return; the print statements are a sort of side channel. That’s the mark of a monad.
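
Here is one way the idea could look in Python without a monad library. It is a sketch, not a general implementation: each function returns its core result in kilometers together with a list of messages in miles, and a hypothetical helper named kleisli composes two such functions by passing the core result along and concatenating the messages. The conversion factor 5/8 is approximate.

```python
MILES_PER_KM = 5 / 8  # approximate

def double(length_km):
    d = 2 * length_km
    return d, [f"Double: {d * MILES_PER_KM}"]

def triple(length_km):
    t = 3 * length_km
    return t, [f"Triple: {t * MILES_PER_KM}"]

def kleisli(f, g):
    """Apply f, then g to f's core result, and combine the side channels."""
    def composed(x):
        y, log_f = f(x)
        z, log_g = g(y)
        return z, log_f + log_g
    return composed

sextuple = kleisli(double, triple)
value, log = sextuple(8)

print(value)
for line in log:
    print(line)
```

The conversion to miles happens only in the side channel, so composing the functions never converts a value that has already been converted.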

Kleisli categories

The things we’ve been talking about are formalized in terms of Kleisli categories. You start with a category C and define another category that has the same objects as C does but has a different notion of composition, i.e. Kleisli composition.

Given a monad T on C, the Kleisli category C_T has the same objects as C. An arrow f* from X to Y in C_T corresponds to an arrow f from X to TY in C. In symbols,

Hom_{C_T}(X, Y) = Hom_C(X, TY).

Mr. Kleisli’s motivation for defining his categories was to answer a more theoretical question—whether all monads arise from adjunctions—but more practically we can think of Kleisli categories as a way of formalizing a variation on function composition.

Getting pulled back in

“Just when I thought I was out, they pull me back in.” — Michael Corleone, The Godfather, Part 3

My interest in category theory goes in cycles. Something will spark my interest in it, and I’ll dig a little further. Then I reach my abstraction tolerance and put it back on the shelf. Then sometime later something else comes up and the cycle repeats. Each time I get a little further.

A conversation with a client this morning brought me back to the top of the cycle: category theory may be helpful in solving a concrete problem they’re working on.

I’m skeptical of applied category theory that starts with categories. I’m more bullish on applications that start from the problem domain, a discussion something like this.

“Here’s a pattern that we’re trying to codify and exploit.”

“Category theory has a name for that, and it suggests you might also have this other pattern or constraint.”

“Hmm. That sounds plausible. Let me check.”

I think of category theory as a pattern description language, a way to turn vague analogies into precise statements. Starting from category theory and looking for applications is less likely to succeed.

When I left academia the first time, I got a job as a programmer. My first assignment was to make some change to an old Fortran program, and I started asking a lot of questions about context. My manager cut me off saying “You’ll never get here from there.” I had to work bottom-up, starting from the immediate problem. That lesson has stuck with me ever since.

Sometimes you do need to start from the top and work your way down, going from abstract to concrete, but less often than I imagined early in my career.

Code katas taken more literally

Karate class

Code katas are programming exercises intended to develop programming skills, analogous to the way katas develop martial art skills.

But literal katas are choreographed. They are rituals rather than problem-solving exercises. There may be an element of problem solving, such as figuring how to better execute the prescribed movements, but katas are rehearsal rather than improvisation.

CodeKata.com brings up the analogy to musical practice in the opening paragraph of the home page. But musical practice is also more ritual than problem-solving, at least for classical music. A musician might go through major and minor scales in all 12 keys, then maybe a chromatic scale over the range of the instrument, then two different whole-tone scales, etc.

A code kata would be more like a jazz musician improvising a different melody to the same chord changes every day. (Richie Cole would show off by improvising over the chord changes to Cherokee in all twelve keys. I don’t know whether this was a ritual for him or something he would pull out for performances.)

This brings up a couple questions. What would a more literal analog of katas look like for programming? Would these be useful?

I could imagine someone going through a prescribed sequence of keystrokes that exercise a set of software features that they wanted to keep top of mind, sorta like practicing penmanship by writing out a pangram.

This is admittedly kind of an odd idea. It makes sense that the kinds of exercises programmers are interested in require problem solving rather than recall. But maybe it would appeal to some people.

***

Image “karate training” by Genista is licensed under CC BY-SA 2.0.

Visualizing C operator precedence

Here’s an idea for visualizing C operator precedence. You snake your way through the diagram starting from left to right.

Operators at the same precedence level are on the same horizontal level.

Following the arrows for changing directions, you move from left-to-right through the operators that associate left-to-right and you move right-to-left through the operators that associate right-to-left.

Although this diagram is specifically for C, many languages follow the same precedence with minor exceptions. For example, all operators that Perl shares with C follow the same precedence as C.

visualization of C operator precedence

Tool recursion

“Literature about Lisp rarely resists that narcissistic pleasure of describing Lisp in Lisp.” — Christian Queinnec, Lisp in Small Pieces

Applying software development tools to themselves has a dark side and a light side.

There’s a danger of becoming obsessed with one’s tools and never getting around to using them. If it’s your job to cut down a tree, there’s some benefit to sharpening your ax, but you can’t only sharpen your ax. At some point you hit diminishing returns and it’s time to start chopping.

But there are benefits to self-referential systems, such as macros that use Lisp to generate Lisp, or writing a C compiler in C, or using Emacs to tweak Emacs. There’s a kind of consistency that results, and there can be a compound return on effort. But as with writing a recursive procedure, there has to be a base case, a point at which you stop recursing. Otherwise you go down the black hole of becoming absorbed in your tool and never using it for work.

Even though I’ve used Emacs for a long time, I’ve never understood the recursive fascination some people have with it. For example, part of the elevator pitch for Emacs is that it’s self-documenting. You can pull up help on Emacs from inside Emacs. But you can also type your questions into a search engine without having to learn an arcane help API. What’s the advantage to the former?

For one thing, using Emacs help inside Emacs works without a network connection. For another, you avoid the risk of being distracted by something you see when you’re using your web browser. But the most subtle benefit is the compound effect of self-reference. You can use the same navigation commands in the help system that you can when editing text, you can execute code snippets in place, etc.

When I hear “Isn’t it cool that you can do X in X?” my first thought is “Yeah, that sounds cool” but my second thought is “But why would you want to do that? Sounds like it could be really hard.” I’m starting to appreciate that there are sometimes long-term benefits to these sorts of recursive tool uses even if they’re not optimal in the short run.

Nota bene

I was looking at the J programming language yesterday and I was amused to see that it uses “NB.” to mark the rest of a line of source code as a comment, just like # in Python or // in C++. This makes comments in J look like comments in English prose.

“NB” abbreviates the Latin phrase nota bene meaning “note well.” It’s been used to mark side notes in English for about three centuries.

Most programming languages couldn’t use “NB” or “NB.” as a comment marker because it would be inconsistent with conventions for identifiers, but J’s unconventional code syntax allows it to use conventional English notation for comments.

Why J?

I was looking at J because I have a project looking at its younger sister Remora. As described in this paper,

Remora is a higher-order, rank-polymorphic array-processing programming language, in the same general class of languages as APL and J. It is intended for writing programs to be executed on parallel hardware.

J keeps the array-oriented core of APL but drops its infamous symbols. Remora syntax is even closer to the mainstream, being written like a Lisp. (Some might object that Lisp isn’t mainstream, but it sure is compared to APL or J.)

APL comment symbol

Learning about J’s comment marker made me curious what its APL counterpart was. APL had custom symbols for everything, including comments. Comments began with ⍝ (U+235D), the idea being that the symbol looked like a lamp, giving light to the poor soul mentally parsing code.

The full name for the lamp symbol is “APL FUNCTIONAL SYMBOL UP SHOE JOT.” Since this section of code is explicitly for APL symbols, why not call the symbol COMMENT or LAMP rather than UP SHOE JOT?

I suppose the comment symbol looks like the bottom of a shoe. There’s also a symbol ⍦ (U+2366) [1] with the name “APL FUNCTIONAL SYMBOL DOWN SHOE STILE,” and so “up” and “down” must refer to the orientation of the part of the symbol that looks like ∩ and ∪. But what about “jot” and “stile”?
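
The official names of both characters can be looked up with Python's standard unicodedata module. This short sketch just prints each code point next to its name.

```python
import unicodedata

symbols = ("\u235D", "\u2366")
names = [unicodedata.name(ch) for ch in symbols]

for ch, name in zip(symbols, names):
    print(f"U+{ord(ch):04X}", name)
```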

A jot is a small character. The name is related to the Greek letter iota (ι) and the Hebrew letter yod (י). But if ∩ and ∪ are a shoe, the “jot” is a fairly large circle. Does “jot” have some other meaning?

A “stile” is a step or a rung, as in a turnstile. I suppose the vertical bar on top of ∪ is a stile.

[1] What is this character for in APL? Unicode includes it as an APL symbol, but it’s not included in Wikipedia’s list of APL symbols.