Humble Lisp programmers

Maybe from the headline you were expecting a blank post? No, that’s not where I’m going.

Yesterday I was on Amazon.com and noticed that nearly all the books they recommended for me were either about Lisp or mountain climbing. I thought this was odd, and mentioned it on Twitter. Carl Vogel had a witty reply: “I guess they weren’t sure whether you want to figuratively or literally look down on everyone.”

The stereotype Lisp programmer does look down on everyone. But this is based on a tiny, and perhaps unrepresentative, sample of people writing about Lisp compared to the much larger number of people who are writing in Lisp.

Lisp has been around for over 50 years and shows no signs of going away. There are a lot of people writing Lisp in obscurity. Kate Gregory said something similar about C++ developers, calling them the dark matter of programmers because there are lot of them but they don’t make much of a splash. They’re quietly doing their job, not speaking at conferences or writing much about their language.

I imagine there are a lot of humble Lisp programmers. It takes some humility to commit to an older technology, especially given the pervasive neomania of the programmer community. It also takes some humility to work on projects that have been around for years or that are deep within the infrastructure of more visible projects, which is where I expect a lot of Lisp lives.

You can do very clever things in Lisp, but you don’t have to. As Ed Post famously said, “The determined Real Programmer can write FORTRAN programs in any language.” There must be a lot of code out there that writes (f x) instead of f(x) but otherwise isn’t that different from FORTRAN.

Formal methods let you explore the corners

I heard someone say the other day that the advantage of formal software validation methods is that they let you explore the corners, cases where intuition doesn’t naturally take you.

This made me think of corners in the geometric sense. If you have a sphere in a box in high dimensions, nearly all the volume is in the corners, i.e. outside the sphere. This is more than a metaphor. You can think of software options geometrically, with each independent choice corresponding to a dimension. Paths through a piece of software that are individually rare may account for nearly all use when considered together.

With a circle inside a square, nearly 78.5% of the area is inside the circle. With a ball sitting inside a 3-D box, 52.4% of the volume is inside the ball. As the dimension increases, the proportion of volume inside the sphere rapidly decreases. For a 10-dimensional sphere sitting in a 10-dimensional box, 0.25% of the volume is in the sphere. Said another way, 99.75% of the volume is in the corners.

When you go up to 100 dimensions, the proportion of volume inside the sphere is about 2 parts in 1070, a 1 followed by 70 zeros [1]. If 100 dimensions sounds like pure fantasy, think about a piece of software with more than 100 features. Those feature combinations multiply like geometric dimensions [2].

Here’s a little Python code you could use to see how much volume is in a sphere as a function of dimension.

    from scipy.special import gamma
    from math import pi

    def unit_sphere_volume(n): 
        return pi**(0.5*n)/gamma(0.5*n + 1)

    def unit_cube_volume(n): 
        return 2**n

    def ratio(n):
        return unit_sphere_volume(n) / unit_cube_volume(n)

    print( [ratio(n) for n in range(1, 20)] )

* * *

[1] There are names for such extremely large numbers. These names are hardly ever used—scientific notation is much more practical— but they’re fun to say. 1070 is ten duovigintillion in American nomenclature, ten undecilliard in European.

[2] Geometric dimensions are perfectly independent, but software feature combinations are not. In terms of logic, some combinations may not be possible. Or in terms of probability, the probability of exploring some paths is conditional on the probability of exploring other paths. Even so, there are inconceivably many paths through any large software system. And in large-scale operations, events that should “never happen” happen regularly.

Literate programming: presenting code in human order

Presentation order

People best understand computer programs in a different order than compilers do. This is a key idea of literate programming, and one that distinguishes literate programs from heavily commented programs.

Traditional source code, no matter how heavily commented, is presented in the order dictated by the compiler. The computer is the primary audience. Literate programming is more humanistic in the sense that the primary audience is a human. The computer has to go to extra effort to arrange the code for its needs. As Donald Knuth describes it in his book on literate programming,

The practitioner of literate programming … strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that nicely reinforce each other. [emphasis added]

There are two steps in processing literate programs: weaving and tangling. You take files containing prose and code, and weave them into documentation and tangle them into source code. Tools like Sweave and Pweave focus on the weave process, as their names imply. The weave side of literate programming has gotten the most attention.

A half-hearted approach to literate programming doesn’t require much of a tangle process. A well-commented program has no tangle step at all. A *weave document that follows the order of the source code has a trivial tangle step: save the code to its own file, manually or automatically, but don’t rearrange it. But a full-fledged literate program may make the tangle program work harder, rearranging code fragments from human-friendly to compiler-friendly order.

Careful explanation vs. unit tests

The most obvious feature of literate programming is that it requires careful explanation. Here’s more from the paragraph I quoted above, filling in the part I left out.

The practitioner of literate programming can be regarded as an essayist, whose main concern is with explanation and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible …

The discipline of explaining every piece of code leads to better code. It serves a similar purpose to writing unit tests. I saw somewhere—I can’t remember where now— that Knuth hates unit testing and sees it as redundant effort. Presumably this is because unit testing and literate programming overlap. Unit tests are a kind of commentary on code, explaining how it used, exploring its limitations, etc.

Knuth understands that literate programming doesn’t replace the need for testing, just unit testing. He explained somewhere—again I forget where—that he would test TeX by spending days at a time trying fiendishly to break it.

My misunderstanding and experience

When I read Knuth’s book, I could see the value of carefully explaining code. What I didn’t appreciate was the value of presenting code in a different order than the order of source code.

I’m working on a project now where a few lines of code may require a few paragraphs of explanation. That’s what got me thinking about literate programming. My intention was to write my documentation in the same order as the code. It took a while to realize I had stumbled on an ideal application of literate programming: a complicated algorithm that needs to be explained carefully, both in order to hand over to the client and to reduce the chances of errors. The best order to understand this algorithm is definitely not top-down going through the code.

Why literate programming has not taken off

I think I understand better now why literate programming hasn’t gained much of an audience. I used to think that it was because developers hate writing prose. That’s part of it. Most programmers I’ve worked with would much rather write a hundred lines of unit tests than write one complete sentence.

But that’s not the whole story. There are quite a few programmers who are willing and able to write prose. Why don’t more of them use literate programming?

I think part of the reason is that having a non-trivial tangle process is a barrier to adoption. A programmer can decide to start writing more extensive comments, gradually working up to essay-like explanations. But it’s one thing to say “I’m going to heavily comment my C code” and quite another to say “I’m not going to write C per se any more. I’m going to write CWEB files that compile to C.” Even if a programmer wants to write CWEB in secret, just checking in the tangled C files, other programmers will edit these C files and the changes won’t be reflected in the CWEB source. Also, the output of tangle is less readable than ordinary code. The programmer secretly using CWEB as a preprocessor would appear to be writing undocumented code.

Tricky code benefits from a literate presentation, but routine code does not benefit so much. You either have to have two ways of writing code—straight source for routine code and literate programs for the tricky parts—or impose the overhead of literate programming everywhere. Most code is mundane and repetitive, challenging because of its volume rather than its cleverness. Knuth doesn’t write this kind of code. He only writes code that benefits from a literate presentation.

To write a good literate program, not only do you need to be able to program, and need to be willing and able to write good prose, on top of that you need to have a good sense for story telling, arranging the code for the benefit of other readers. If this is done poorly, the result is harder to understand than traditional programs.

I may use literate programming more now that I’m starting to understand it, at least for my own benefit and hopefully for the benefit of clients. I usually deliver algorithms or libraries, not large applications, and so it wouldn’t be too much work to create two versions of my results. I could create a literate program, then weave a report, and manually edit the tangled code into a traditional form convenient for the client.

Agile software development and homotopy

One of the things I learned from my tenure as a software project manager was that a project is more likely to succeed if there’s a way to get where you want to go continuously. You want to move a project from A to B gradually, keeping a working code base all along the way. At the end of each day, the software may not be fully functional, but it should at least build. Anything that requires a big bang change, tearing the system apart for several days and putting it back together, is less likely to succeed.

This is very much like the idea of homotopy from topology, a continuous deformation of one thing into another. No discontinuities along the way—no ripping, no jumping suddenly from one thing to another.

Interview with Chris Toomey of Upcase

The other day I spoke to Chris Toomey from thoughtbot. Chris runs Upcase, thoughtbot’s online platform for learning about Rails, test-driven development, clean code, and more. I was curious about his work with Ruby on Rails since I know little about that world. And at a little deeper level, I wanted to get his thoughts on how programming languages are used in practice, static vs dynamic, strongly typed vs weakly typed, etc.

Chirs Toomey

JC: Chris, I know you do a lot of work with Ruby on Rails. What do you think of Ruby without Rails? Would you be as interested in Ruby if the Rails framework had been written in some other language?

CT: Let me back up a little bit and give you some of my background. I started out as an engineer and I used VB because it was what I had for the task at hand. Then when I decided to buckle down and become a real developer I chose Python because it seemed like the most engineering-oriented alternative. It seemed less like an enterprise language, more small and nimble. I chose Python over Ruby because of my engineering background. Python seemed more serious, while Ruby seemed more like a hipster language. Ruby sounded frivolous, but I kept hearing good things about it, especially with Rails. So like a lot of people I came to Ruby through Rails. It was the functionality and ease of use that got me hooked, but I do love Ruby as a language, the beauty and expressiveness of it. It reads more like prose than other languages. It’s designed for people rather than machines. But it’s also a very large language and hard to parse because of that. Over time though I’ve seen people abuse the looseness, the freedom in Ruby, and that’s caused me to look at stricter options like Haskell and other functional languages.

JC: I only looked at Ruby briefly, and when I saw the relative number of numerical libraries for Python and Ruby I thought “Well, looks like it’s Python for me.”

It seems like Ruby bears some resemblance to Perl, for better or worse.

CT: Absolutely. Ruby has two spiritual ancestors. One is Perl and the other is Smalltalk. I think both of those are great, and many of the things I love about Ruby come from that lineage. Perl contributed the get-things-done attitude, the looseness and terseness, the freedom to interact at any level of abstraction.

It’s kinda odd. I keep coming back to The Zen of Python. One of the things it says is that explicit is better than implicit, and I really think that’s true. And yet I work in Ruby and Rails where implicit is the name of the game. So I have some cognitive dissonance over that. I love Ruby on Rails, but I’m starting to look at other languages and frameworks to see if something else might fit as well.

JC: Do you have the freedom to choose what language and framework you work in? Do clients just ask for a website, or do they dictate the technology?

CT: We have a mix. A lot of clients just want a web app, but some, especially large companies, want us to use their technology stack. So while we do a lot of Rails, we also do some Python, Haskell, etc.

JC: Do you do everything soup-to-nuts or do you have some specialization?

CT: We have three roles at thoughtbot: designer, web developer, and mobile developer. The designers might do some JavaScript, but they mostly focused on user experience, testing, and design.

JC: How do you keep everything straight? The most intimidating thing to me about web development is all the diverse tools in play: the language for your logic, JavaScript, CSS, HTML, SQL, etc.

CT: There’s definitely some of that, but we outsource some parts of the stack. We host applications on Heroku, giving them responsibility for platform management. They run on top of AWS so they handle all the scaling issues so we can focus on the code. We’ll deploy to other environments if our client insists, but our preference is to go with Heroku.

Similarly, Rails has a lot of functionality for the database layer, so we don’t write a lot of SQL by hand. We’re all knowledgeable of SQL, but we’re not DBA-level experts. We scale up on that as necessary, but we want to focus on the application.

JC: Shifting gears a little bit, how do you program differently in a dynamic language like Ruby than you would in a stricter language like C++? And is that a good thing?

CT: One thing about Ruby, and dynamic languages in general, is that testing becomes all the more critical. There are a lot of potential runtime errors you have to test for. Whereas with something like Haskell you can program a lot of your logic into the type system. Ruby lets you work more freely, but Haskell leads to more robust applications. Some of our internal software at thoughtbot is written in Haskell.

JC: I was excited about using Haskell, but when I used it on a production project I ran into a lot of frustrations that you wouldn’t anticipate from working with Haskell in the small.

CT: Haskell does seem to have a more aggressive learning curve than other languages. There’s a lot of Academia in it, and in a way that’s good. The language hasn’t compromised its vision, and it’s been able to really develop some things thoroughly. But it also has a kind of academic heaviness to it.

There’s a language out there called Elm that’s inspired by Haskell and the whole ML family of languages that compiles down to JavaScript. It presents a friendlier interface to the whole type-driven, functional way of thinking. The developers of the language have put a lot of effort into making it approachable, without having to understand comonads and functors and all that.

JC: My difficulties with Haskell weren’t the theory but things like the lack of tooling and above all the difficulty of package management.

CT: Cabal Hell.

JC: Right.

CT: My understanding is that that’s improved dramatically with new technologies like Stack. We’re scaling up internally on Haskell. That’s the next area we’d like to get into. I’ll be able to say more about that down the road.

* * *

Check out Upcase for training materials on tools like Vim, Git, Rails, and Tmux.

The magic / boilerplate trade-off

Phil Webb had an insightful tweet the other day.

Programming environments oscillate between boilerplate and magic. APIs tend to start out with all the wires exposed. Programming is tedious, but nothing is hidden. Development is hard, but debugging is easy. See, for example, the hello-world program for Win32.

Programmers get tired of this, and create levels of abstraction. Boilerplate is reduced and development gets easier. And if the abstraction is done well, debugging doesn’t get harder, or not much harder. But this abstraction doesn’t go far enough. Programmers feel like things should be easier still. That’s when magic comes in. Magic differs from abstraction in that it not only hides details, it’s actively misleading. Something appears to work one way, the desired way, but something quite different is going on. Development gets easier, but debugging gets much harder. If the magic gets to be too much, developers look to start over with something more transparent, and the cycle begins again.

Magician

Wizards, in the sense of code generators, are closely related to magic. Whereas magic abuses language features to create illusions, wizards generate boilerplate code and become a sort of meta language.

Webb’s comment about boilerplate vs magic came to mind this morning, not from looking at software, but at math. Over time mathematicians discover better ways to organize material, turning some theorems into definitions and vice versa. This makes things easier in the long run, but creates a barrier to entry in the short term. Math moves from boilerplate to magic, from relatively concrete and tedious to abstract but less obviously motivated. The definitions seem magical to the beginner because the applications have been delayed.

When I see this in an area I understand well, I think it’s clever. When I see it in an area I don’t know well, it feels like some sort of guild is trying to keep me out. I know that this isn’t the case—the machinery of mathematics is always created for practical reasons, not to intentionally intimidate anyone—but it can feel that way. It’s not so much that the motivation is deliberately hidden but that it is obscured. An unfortunate aspect of mathematics culture is that people are reluctant to discuss motivation in writing. It’s more likely to come out in conversation or in lectures.

More magic posts

Dimensional analysis and types

ruler and protractor on grid paper

It’s spooky how well dimensional analysis catches errors. If you’re trying to calculate a number of horses, does your final result have units of horses? If it has units of cats or kilograms, something has gone wrong. This is such a simple idea, it’s remarkable that it’s even worth checking.

Dimensions are analogous to type systems for programming languages. Type systems prevent you, for example, from adding a number and a word or from trying to take the square root of an image. You could think of dimensional analysis as a subset of type theory.

Is more better?

If a minimal amount of type checking, i.e. checking units of measurement, is effective at catching errors, more type checking should catch more errors, right?. Although I mostly agree with this argument, it leaves out cost. Stronger type checking catches more errors, but at what cost? Dimensional analysis is practically free. It can literally be a back-of-an-envelope calculation and has great return on effort. What about type systems?

Most people would agree that some minimal amount of type checking is worth the effort. The controversy is over when it is no longer economical to add additional structure. Haskell has a much more expressive type system than less formal languages, and yet the Haskell community is looking for ways to make the type system even more expressive. Some would say we’ve yet to discover the point where additional typing isn’t worth the effort.

Context

As with most technical controversies, the resolution depends on context. You don’t want types to get in your way when you’re writing a quick-and-dirty script. But you might greatly appreciate them in large, mission-critical projects.

It takes skill to get the most benefit from a type system. A good type system won’t automatically do anything for your code. You could actively work against the type system by writing “stringly typed” code, for example. Every function takes in a string and returns a string. You could passively work with the type system, using built-in types as intended but not creating any new types. Or you could actively work with the type system, creating new types so that more logical errors will become compiler errors.

An example of passively using a type system would be distinguishing integers and floating point numbers, say using integers for counting things, floating point numbers for measurements like temperature and mass, and strings for character data. You could argue that there aren’t many natural types in such program, and so a strong type system wouldn’t be that helpful. A more active approach would be, for example, to introduce different types for temperatures and masses. You could go much further than this, organizing data into more complex types.

All other things being equal, I like strong, static typing. But there are usually other factors that outweigh my typing preferences: availability of libraries, convenience, client requirements, etc.

Learning (needlessly) hard technology

A few years ago, a friend told me he was thinking about learning a certain technology because it was really hard to use. This was not something that had to be complex to solve a complex problem, but something that was unnecessarily complex. Why would anyone do that?

His reasoning was that as a consultant, he could make good money supporting a technology that’s hard to use. My friend would have more integrity than to recommend something that he didn’t think was a good solution. Perhaps he was thinking of saying something like this to a client: “I wouldn’t recommend this technology if you were starting from scratch. But since you’re invested in it, I’ll help you with it or help you migrate to something else.”

That sounds like an unpleasant way to earn a living. It also sounds risky. If something really is unnecessarily complex, better alternatives are likely to arise, perhaps suddenly. (This assumes people are free to choose alternatives, not prohibited by law, for example.)

Learning a technology that’s complex for good reasons could be a smart and ethical move. The work is harder at lower levels of abstraction, but someone has to solve the problems others would rather not think about. And since not as many people can do that work, it should pay better and be more secure.

There are a couple dangers, however, associated with choosing a more difficult technology. One is the temptation to use it where it isn’t needed. The other is that the set of problems where it is needed may shrink over time.

More programming posts

The success of OOP

Allen Wirfs-Brock gave the following defense of OOP a few days ago in a series of six posts on Twitter:

A young developer approached me after a conf talk and said, “You must feel really bad about the failure of object-oriented programming.” I was confused. I said, “What do you mean that object-orient programming was a failure. Why do you think that?”

He said, “OOP was supposed to fix all of our software engineering problems and it clearly hasn’t. Building software today is just as hard as it was before OOP. came along.”

“Have you ever look at the programs we were building in the early 1980s? At how limited their functionality and UIs were? OOP has been an incredible success. It enabled us to manage complexity as we grew from 100KB applications to today’s 100MB applications.”

Of course OOP hasn’t solved all software engineering problems. Neither has anything else. But OOP has been enormously successful in allowing ordinary programmers to write much larger applications. It has become so pervasive that few programmers consciously think about it; it’s simply how you write software.

I’ve written several posts poking fun at the excesses of OOP and expressing moderate enthusiasm for functional programming, but I appreciate OOP. I believe functional programming will influence object oriented programming, but not replace it.

Related:

Algorithmic wizardry

Last week I wrote a short commentary on James Hague’s blog post Organization skills beat algorithmic wizardry. This week that post got more traffic than my server could handle. I believe it struck a chord with experienced software developers who know that the challenges they face now are not like the challenges they prepared for in school.

Although I completely agree that “algorithmic wizardry” is over-rated in general, my personal experience has been a little different. My role on projects has frequently been to supply a little bit of algorithmic wizardry. I’ve often been asked to look into a program that is taking too long to run and been able to speed it up by an order of magnitude or two by improving a numerical algorithm. (See an example here.)

James Hague says that “rarely is there some … algorithm that casts a looming shadow over everything else.” I believe he is right, though I’ve been called into projects precisely on those rare occasions when an algorithm does cast a shadow over everything else.