The right level of abstraction

Mark Dominus wrote a blog post yesterday entitled Why I never finish my Haskell programs (part 1 of ∞). In a nutshell, there’s always another layer of abstraction. “Instead of just adding lists of numbers, I can do addition-like operations on list-like containers of number-like things!”

Is this a waste of time? It depends entirely on context.

I can think of two reasons to pursue high levels of abstraction. One is reuse. You have multiple instances of things that you want to handle simultaneously. The other reason is clarity. Sometimes abstraction makes things simpler, even if you only have one instance of your abstraction. Dijkstra had the latter in mind when he said

The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.

Both of these can backfire. You could make your code so reusable (in your mind) that nobody else wants to use it. Your bird’s eye view can become a Martian’s eye view that loses essential details. [1]

It’s easy, and often appropriate, to criticize high levels of abstraction. I could imagine asking “Just how often do you need to do addition-like operations on list-like containers of number-like things? We’ve got to ship by Friday. Why don’t you just add lists of numbers for now.”
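
To make the two levels concrete, here is a minimal sketch in Python rather than Dominus's Haskell. The names add_lists and combine_with are made up for illustration and do not come from his post. The first function just adds lists of numbers. The second does an "addition-like" operation on "list-like" containers of whatever the operation accepts.

```python
from functools import reduce
from operator import add
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def add_lists(xs: list[float], ys: list[float]) -> list[float]:
    """The concrete version: elementwise sum of two lists of numbers."""
    return [x + y for x, y in zip(xs, ys)]


def combine_with(op: Callable[[T, T], T], *containers: Iterable[T]) -> list[T]:
    """The abstract version (hypothetical helper): fold any binary
    operation elementwise across any number of iterables."""
    return [reduce(op, items) for items in zip(*containers)]


print(add_lists([1, 2, 3], [10, 20, 30]))          # [11, 22, 33]
print(combine_with(add, [1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
print(combine_with(add, ("a", "b"), ("x", "y")))   # ['ax', 'by']
print(combine_with(max, [1, 5], [4, 2], [3, 3]))   # [4, 5]
```

The general version handles strings and maxima as well as sums, but whether that is worth anything depends on whether you ever need those cases. If you only ship the first call, the first function is easier to read.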

And yet, sometimes what seems like excessive abstraction can pay off. I remember an interview with John Tate a few years ago in which he praised Alexander Grothendieck.

He just had an instinct for the right degree of generality. Some people make things too general, and they’re not of any use. But he just had an instinct to put whatever theory he thought about in the most general setting that was still useful. Not generalization for generalization’s sake but the right generalization. He was unbelievable.

I was taken aback by Tate saying that Grothendieck found just the right level of abstraction. But Tate is in a position to judge and I am not.

From my perspective, Grothendieck’s work, what glimpses I’ve seen, looks gratuitously abstract. Basic category theory is about as abstract as my mind can go, but category theory was the floor of the castle Grothendieck built in the sky. And yet he built his castle to solve specific challenging problems in number theory, and succeeded. (Maybe his castle in the sky turned into a Winchester Mansion later in life. I can’t say.)


[1] I’m more sympathetic to the clarity argument than the reuse argument. The former gives immediate feedback. You try something because you think it will make things more clear. Did it, at least in your opinion? Does anyone else find it helpful? But reuse is speculative because it happens in the future. (If you have several cases in hand that you want to handle uniformly, that’s a safer bet. You might just call that “use” rather than “reuse.” My skepticism is more about anticipated reuse.)

In software development in particular, I believe it’s easier to make your code re-editable than reusable. It’s easier to predict that code will need to do something different in the future than it is to predict exactly what that something will be.

Antidepressants for van Gogh

In a recent interview, Tyler Cowen discusses complacency, (neuro-)diversity, etc.

Let me give you a time machine and send you back to Vincent van Gogh, and you have some antidepressants to make him better. What actually would you do, should you do, could you do? We really don’t know. Maybe he would have had a much longer life and produced more wonderful paintings. But I worry about the answer to that question.

And I think in general, for all the talk about diversity, we’re grossly undervaluing actual human diversity and actual diversity of opinion. Ways in which people—they can be racial or ethnic but they don’t have to be at all—ways in which people are actually diverse, and obliterating them somewhat. This is my Tocquevillian worry and I think we’ve engaged in the massive social experiment of a lot more anti-depressants and I think we don’t know what the consequences are. I’m not saying people shouldn’t do it. I’m not trying to offer any kind of advice or lecture.

I don’t share Cowen’s concern regarding antidepressants. I haven’t thought about it before. But I am concerned with how much we drug restless boys into submission. (Girls too, of course, but it’s usually boys.)

Grateful for failures

I’ve been thinking lately about different things I’ve tried that didn’t work out and how grateful I am that they did not.

The first one that comes to mind is my academic career. If I’d been more successful with grants and publications as a postdoc, it would have been harder to decide to leave academia. I’m glad I left when I did.

When I was in high school I was a fairly good musician. At one point I decided that if I made the all-state band I would major in music. Thank God I didn’t make it.

I’ve looked back at projects that I hoped to get, and realized it’s a good thing that they didn’t come through.

In each of these examples, I’ve been forced to turn away from something I was moderately good at to pursue something that’s a better fit for me.

I wonder what failure I’ll be grateful for next.


How about one good one?

I’m no fan of tobacco companies or their advertising tactics, but I liked the following story.

When the head of a mammoth [advertising] agency solicited the Camel Cigarette account, he promised to assign thirty copywriters to it, but the canny head of R. J. Reynolds replied, “How about one good one?” Then he gave his account to a young copywriter called Bill Esty, in whose agency it has remained for twenty-eight years.

One really good person can accomplish more than thirty who aren’t so good, especially in creative work.

Source: Confessions of an Advertising Man


Well, F = ma.

Three or four very short stories on the difficulty of learning to use simple things. Depends whether you count the last section as a story.

* * *

When I was taking freshman physics and we were stuck on a problem, the professor would say “Well, F = ma.”

True, but absolutely useless. Yes, we know that F = ma. (Force equals mass times acceleration.) Nobody thought “Oh, that’s it. I was thinking F = ma². That explains everything.” Newton’s laws are simple (in a sense) but subtle to apply. The difficult part isn’t the abstract principles but their application to concrete problems.

* * *

The heart of Bayesian statistics is not much more complicated than F = ma. It’s the statement that

posterior ∝ likelihood × prior.

It takes a few years to learn how to apply that equation well. And when people try to help, their advice sounds about as useless as “Well, F = ma.”
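
For what it's worth, here is about the smallest instance of that equation I can think of, sketched in Python. Everything specific in it is an assumption made up for illustration: a coin with unknown probability of heads, 7 heads observed in 10 flips, a uniform prior, and a grid of candidate values standing in for the continuous parameter.

```python
# Grid approximation of posterior ∝ likelihood × prior for a coin's
# unknown probability of heads. The data (7 heads in 10 flips) and the
# uniform prior are invented for this example.
heads, flips = 7, 10
n = 1001
thetas = [i / (n - 1) for i in range(n)]  # candidate values in [0, 1]

prior = [1.0 for _ in thetas]  # uniform prior (assumption)
likelihood = [t**heads * (1 - t) ** (flips - heads) for t in thetas]

# Multiply pointwise, then normalize so the weights sum to 1.
unnormalized = [like * p for like, p in zip(likelihood, prior)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

# The exact posterior here is Beta(8, 4), whose mean is 8/12.
posterior_mean = sum(t * w for t, w in zip(thetas, posterior))
print(posterior_mean)
```

The code is the easy part, which is rather the point. Choosing the likelihood and the prior for a real problem is where the years go.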

* * *

Learning to use Unix was hard. When I asked for help getting started, a lab assistant said “Go read the man pages.” That’s about as hostile as saying “Want to learn English? Read a dictionary.” Fortunately I knew other people who were helpful. One of them told me about the book.

But still, it took a while to get the gestalt of Unix. I knew how to use a handful of utilities, and kept thinking everything would be fine once I knew maybe 10x as many utilities. Then one day I was talking with a friend who seemed fluent working with Unix. I asked him how he did a few things and realized he used the same tools I did, but used them better. It was almost as if he’d said “I just use F = ma” except when he said it, things clicked.

* * *

The motivation for this post, the thing that brought these stories to mind, was listening to a podcast. The show had some good advice, things that I know I need to do, but nothing I hadn’t heard many times before. The hard part is working out what the particulars mean for me personally.

It often takes someone else to help us see what’s right in front of us. I’m grateful for the people who have helped me work out the particulars of things I was convinced of but couldn’t see how to apply. Sometimes I have the pleasure of being able to do that for someone else.

If it were easy …

“If it were easy, someone would have done it.” Maybe not.

Maybe the thing is indeed easy, and has been done before. Then someone was the first to do it. The warning that it had been done before didn’t apply to this person, even though it would apply to the subsequent people with the same idea.

This reminds me of the story of two economists walking down the street. They notice a $20 bill on the sidewalk and the first asks “Aren’t you going to pick it up?” The second replies “No, it’s not really there. If it were, someone would have picked it up by now.”

Sometimes a solution is easy, but nobody has had the audacity to try it. Or maybe circumstances have changed so that something is easy now that wasn’t before.

Sometimes a solution is easy for you, if not for many others. See how much less credible the opening sentence sounds with “for you” inserted: “If it were easy for you, someone would have done it.”

Structure in jazz and math

Last night I went to a concert by the Branford Marsalis Quartet. One of the things that impressed me about the quartet was how creative they are while also being squarely within a tradition. People who are not familiar with jazz may not realize how structured it is and how much it respects tradition. The spontaneous and creative aspects of jazz are more obvious than the structure. In some ways jazz is more tightly structured than classical music. To use Francis Schaeffer’s phrase, there is form and freedom, freedom within form.

Every field has its own structure, its tropes, its traditions. Someone unfamiliar with the field can be overwhelmed, not having the framework that an insider has to understand things. They may think something is completely original when in fact the original portion is small.

In college I used to browse the journals in the math library and be completely overwhelmed. I didn’t learn until later that usually very little in a journal article is original, and even the original part isn’t that original. There’s a typical structure for a paper in PDEs, for example, just as there are typical structures for romantic comedies, symphonies, or space operas. A paper in partial differential equations might look like this:

  1. Motivation / previous work
  2. Weak formulation of PDE
  3. Craft function spaces and PDE as operator
  4. A priori estimates imply operator properties
  5. Well posedness results
  6. Regularity

An expert knows these structures. They know what’s boilerplate, what’s new, and just how new the new part is. When I wrote something up for my PhD advisor I remember him saying “You know what I find most interesting?” and pointing to one inequality. The part he found interesting, the only part he found interesting, was not that special from my perspective. It was all hard work for me, but only one part of it stood out as slightly original to him. An expert in partial differential equations sees a PDE paper the way a professional musician listens to another or the way a chess master sees a chess board.

While a math journal article may look totally incomprehensible, an expert in that specialization might see 10% of it as somewhat new. An interesting contrast to this is the “abc conjecture.” Three and a half years ago Shinichi Mochizuki proposed a proof of this conjecture. But his approach is so entirely idiosyncratic that nobody has been able to understand it. Even after a recent conference held for the sole purpose of penetrating this proof, nobody but Mochizuki really understands it. So even though most original research is not that original, once in a while something really new comes out.


I was listening to a classical music station yesterday, and I heard the story of a professional pianist whose hand was injured in an accident. He then started learning trumpet and two years later he was a professional trumpeter. I didn’t catch the musician’s name.

I was not surprised that a professional in one instrument could become a professional in another, but I was surprised that he did it in only two years. It probably helped that he was no longer able to play piano; I imagine if he wanted to learn trumpet in addition to piano he would not have become so proficient so quickly.

If you go by the rule of thumb that it takes about 10 years to master anything, this professional pianist was 80% of the way to becoming a professional trumpeter before he touched a trumpet. Or to put it another way, 80% of being a professional musician who plays trumpet is becoming a professional musician.

Transferable skills are more difficult to acquire and more valuable than the context in which they’re exercised.

The Mozart Myth

I don’t know how many times I’ve heard about how Mozart would compose entire musical scores in his head and only write them down once they were finished. Even authors who stress that creativity requires false starts and hard work have said that Mozart may have been an exception. But maybe he wasn’t.

In his new book How to Fly a Horse, Kevin Ashton says that the Mozart story above is a myth based on a forged letter. According to Ashton,

Mozart’s real letters—to his father, to his sister, and to others—reveal his true creative process. He was exceptionally talented, but he did not write by magic. He sketched his compositions, revised them, and sometimes got stuck. He could not work without a piano or harpsichord. He would set work aside and return to it later. … Masterpieces did not come to him complete in uninterrupted streams of imagination, nor without an instrument, nor did he write them whole and unchanged. The letter is not only forged, it is false.


Looking ten years ahead

From Freeman Dyson:

Economic forecasting is useful for predicting the future up to about ten years ahead. Beyond ten years the quantitative changes which the forecast assesses are usually sidetracked or made irrelevant by qualitative changes in the rules of the game. Qualitative changes are produced by human cleverness … or by human stupidity … Neither cleverness nor stupidity are predictable.

Source: Infinite in All Directions, Chapter 10, Engineers’ Dreams.