# Big data and the law

Excerpt from the new book Big Data of Complex Networks:

Big Data and data protection law provide for a number of mutual conflicts: from the perspective of Big Data analytics, a strict application of data protection law as we know it today would set an immediate end to most Big Data applications. From the perspective of the law, Big Data is either a big threat … or a major challenge for international and national lawmakers to adopt today’s data protection laws to the latest technological and economic developments.

The author of the chapter on legal matters is Swiss and writes primarily in a European context, though all countries face similar problems.

I’m not a lawyer, though I sometimes work with lawyers, and sometimes help companies with the statistical aspects of HIPAA law. But as a layman the observation above sounds reasonable to me, that strict application of the law could bring many applications to a halt, for better and for worse.

In my opinion the regulations around HIPAA and de-identification are mostly reasonable. The things it prohibits mostly should be prohibited. And it has a common sense provision in the form of expert determination. If your data uses fall outside the regulation’s specific recommendations but don’t endanger privacy, you can have an expert certify that this is the case.

# Branch cuts and Common Lisp

“Nearly everything is really interesting if you go into it deeply enough.” — Richard Feynman

If you thumb through Guy Steele’s book Common Lisp: The Language, 2nd Edition, you might be surprised how much space is devoted to defining familiar functions: square root, log, arcsine, etc. He gives some history of how these functions were first defined in Lisp and then refined by the ANSI (X3J13) standardization committee in 1989.

There are three sources of complexity:

1. Complex number arguments
2. Multi-valued functions
3. +0 and -0

## Complex arguments

The functions under discussion are defined for either real or complex inputs. This does not complicate things much in itself. Defining some functions for complex arguments, such as the exp function, is simple. The complication comes from the interaction of complex arguments with multi-valued functions and floating point representations of zero.

## Multi-valued functions

The tricky functions to define are inverse functions, functions where we have to make a choice of range.

### Real multi-valued functions

Let’s restrict our attention to real numbers for a moment. How do you define the square root of a positive number x? There are two solutions to the equation y² = x, and √x is defined to be the positive solution.

What about the arcsine of x? This is the number whose sine is x. Except there is no “the” number. There are infinitely many numbers whose sine is x, so we have to make a choice. It seems natural to choose values in an interval symmetric about 0, so we take arcsine of x to be the number between -π/2 and π/2 whose sine is x.

Now what about arctangent? As with arcsine, we have to make a choice because for any x there are infinitely many numbers y whose tangent is x. Again it’s convenient to define the range to be in an interval symmetric about zero, so we define the arctangent of x to be the number y between -π/2 and π/2 whose tangent is x. But now we have a subtle complication with tangent we didn’t have with sine because tangent is unbounded. How do we want to define the tangent of a vertical angle? Should we call it ∞ or -∞? What do we want to return if someone asks for the arctangent of ±∞? Should we return π/2 or -π/2?

### Complex multi-valued functions

The discussion shows there are some minor complications in defining inverse functions on the real line. Things get more complicated when working in the complex plane. To take the square root example, it’s easy to say we’re going to define square root so that the square root of a positive number is another positive number. Fine. But which solution to z² = w should we take to be the square root of a complex number w, such as 2 + 3i or -5 + 17i?

Or consider logarithms. For positive numbers x there is only one real number y such that exp(y) = x. But what if we take a negative value of x such as -1? There’s no real number whose exponential is -1, but there is a complex number. In fact, there are infinitely many complex numbers whose exponential is -1. Which one should we choose?
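To make the ambiguity concrete, here is a small Python sketch. It checks that several odd multiples of πi all exponentiate to -1, and then asks Python's `cmath.log` which one it picks. The list `candidates` is just an illustrative name, and the choice `cmath` makes is Python's convention, not necessarily the one discussed for Lisp below.

```python
import cmath
from math import pi

# Odd multiples of pi*i: every one of these is "a" logarithm of -1
candidates = [complex(0, (2 * k + 1) * pi) for k in range(-2, 3)]

for w in candidates:
    # Each exponential is -1 up to floating point rounding error
    print(w, cmath.exp(w))

# The library has to commit to one of them: the principal value
print(cmath.log(-1))
```

The principal value returned here has imaginary part π, which corresponds to restricting the imaginary part of the logarithm to the interval (-π, π].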

## Floating point representations of zero

A little known feature of floating point arithmetic (specified by the IEEE 754 standard) is that there are two kinds of zero: +0 and -0. This sounds bizarre at first, but there are good reasons for this, which I explain elsewhere. But defining functions to work properly with two kinds of zero takes a lot of work. This was the main reason the ANSI Common Lisp committee had to revise their definitions of several transcendental functions. If a function has a branch cut discontinuity along the real axis, for example, you want your definition to be continuous as you approach x + 0i from above and as you approach x - 0i from below.
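Here is a quick illustration in Python rather than Lisp. I am assuming that Python's `cmath.sqrt` honors the sign of a zero imaginary part, as the C99 conventions it follows prescribe. The names `above` and `below` are mine.

```python
import cmath

# Two representations of -4 that differ only in the sign of the zero imaginary part.
# Note: complex(-4.0, -0.0) is used because the literal -4.0 - 0.0j loses the -0.
above = complex(-4.0, 0.0)    # -4 + 0i, the limit from the upper half plane
below = complex(-4.0, -0.0)   # -4 - 0i, the limit from the lower half plane

print(above == below)         # the two inputs compare equal
print(cmath.sqrt(above))      # imaginary part is +2
print(cmath.sqrt(below))      # imaginary part is -2
```

The two inputs are equal as numbers, yet the square root lands on opposite sides of the branch cut, which is exactly the continuity property described above.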

## The Common Lisp solution

I’ll cut to the chase and present the solution the X3J13 committee came up with. For a discussion of the changes this required and the detailed justifications, see Guy Steele’s book.

The first step is to carefully define the two-argument arctangent function (atan y x) for all 16 combinations of y and x being positive, negative, +0, or -0. Then other functions are defined as follows.
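As a rough analogue, Python's `math.atan2(y, x)` takes its arguments in the same order and follows the C and IEEE conventions for signed zeros. This is an illustration of the idea, not the Common Lisp specification itself. The sketch below prints the four cases where both arguments are zero.

```python
from math import atan2

# The four signed-zero corner cases of the two-argument arctangent
for y in (0.0, -0.0):
    for x in (0.0, -0.0):
        print(f"atan2({y!r}, {x!r}) = {atan2(y, x)!r}")
```

The sign of y decides the sign of the result, and the sign of x decides whether the result is a zero or ±π.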

1. Define phase in terms of atan.
2. Define complex abs in terms of real sqrt.
3. Define complex log in terms of phase, complex abs, and real log.
4. Define complex sqrt in terms of complex log.
5. Define everything else in terms of the functions above.

The actual implementations may not follow this sequence, but they have to produce results consistent with this sequence.

The phase of z is defined as the arctangent with arguments the imaginary and real parts of z.

The complex log of z is defined as log |z| + i phase(z).

Square root of z is defined as exp( log(z) / 2 ).
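The chain of definitions above can be sketched in a few lines of Python. The function names `phase`, `clog`, and `csqrt` are mine, and the sketch ignores z = 0, where the logarithm is undefined. The loop compares the results with Python's own `cmath` functions, which I would expect to agree up to rounding error.

```python
import cmath
import math

def phase(z):
    # Two-argument arctangent of the imaginary and real parts
    return math.atan2(z.imag, z.real)

def clog(z):
    # log |z| + i phase(z); abs() of a complex number is a real square root
    return complex(math.log(abs(z)), phase(z))

def csqrt(z):
    # exp(log(z) / 2)
    return cmath.exp(clog(z) / 2)

test_points = (complex(2, 3), complex(-5, 17),
               complex(-4.0, 0.0), complex(-4.0, -0.0))

for z in test_points:
    print(z, clog(z), csqrt(z), cmath.sqrt(z))
```

Because everything is built on `atan2`, the signed-zero behavior along the negative real axis is inherited automatically: the last two test points give square roots with imaginary parts of opposite sign.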

The inverses of circular and hyperbolic functions are defined as follows.
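The authoritative formulas are in Steele's book. As a sketch, here is my reading of them written in Python on top of `cmath.log` and `cmath.sqrt`. Treat the exact form of each formula as an assumption to check against the book. The function names are mine, and the comparison with Python's built-in inverse functions uses a few points away from the branch cuts.

```python
import cmath

def arcsin(z):
    return -1j * cmath.log(1j * z + cmath.sqrt(1 - z * z))

def arccos(z):
    return -1j * cmath.log(z + 1j * cmath.sqrt(1 - z * z))

def arctan(z):
    return (cmath.log(1 + 1j * z) - cmath.log(1 - 1j * z)) / 2j

def arcsinh(z):
    return cmath.log(z + cmath.sqrt(1 + z * z))

def arccosh(z):
    return 2 * cmath.log(cmath.sqrt((z + 1) / 2) + cmath.sqrt((z - 1) / 2))

def arctanh(z):
    return (cmath.log(1 + z) - cmath.log(1 - z)) / 2

# Pair each sketch with the corresponding library function
pairs = [(arcsin, cmath.asin), (arccos, cmath.acos), (arctan, cmath.atan),
         (arcsinh, cmath.asinh), (arccosh, cmath.acosh), (arctanh, cmath.atanh)]

points = (complex(0.3, 0.4), complex(2, 3), complex(-1.5, 0.5))

for ours, theirs in pairs:
    for z in points:
        print(theirs.__name__, z, abs(ours(z) - theirs(z)))
```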

Note that there are many ways to define these functions that seem to be equivalent, and are equivalent in some region. Getting the branch cuts right is what makes this complicated.

# Ultra-reliable software

From a NASA page advocating formal methods:

We are very good at building complex software systems that work 95% of the time. But we do not know how to build complex software systems that are ultra-reliably safe (i.e. P_f < 10^-7/hour).

Developing medium-reliability and high-reliability software are almost entirely different professions. Using typical software development procedures on systems that must be ultra-reliable would invite disaster. But using extremely cautious development methods on systems that can afford to fail relatively often would be an economic disaster.

# Technological allegiances

I used to wonder why people “convert” from one technology to another. For example, someone might convert from Windows to Linux and put a penguin sticker on their car. Or they might move from Java to Ruby and feel obligated to talk about how terrible Java is. They don’t add a new technology, they switch from one to the other. In the words of Stephen Sondheim, “Is it always or, and never and?”

Rivalries seem sillier to outsiders the more similar the two options are. And yet this makes sense. I’ve forgotten the psychological term for this, but it has a name: Similar things compete for space in your brain more than things that are dissimilar. For example, studying French can make it harder to spell English words. (Does literature have two t’s in French and one in English or is it the other way around?) But studying Chinese doesn’t impair English orthography.

It’s been said that academic politics are so vicious because the stakes are so small [1]. Similarly, there are fierce technological loyalties because the differences with competing technologies are so small, small enough to cause confusion. My favorite example: I can’t keep straight which languages use else if, elif, elseif, … in branching.

If you have to learn two similar technologies, it may be easier to devote yourself exclusively to one, then to the other, then use both and learn to keep them straight.

[1] I first heard this attributed to Henry Kissinger, but there’s no agreement on who first said it. Several people have said similar things.

# Speed and correctness

Comment from Paul Phillips on making things easy to understand:

It’s always been “We can’t do it that way. It would be too slow.”

You know what’s slow? Spending all day trying to figure out why it doesn’t work. That’s slow. That’s the slowest thing I know.

# Gentle introduction to R

The R language is closely tied to statistics. Its ancestor was named S, because it was a language for Statistics. The open source descendant could have been named ‘T’, but its creators chose to call it ‘R.’

Most people learn R as they learn statistics: Here’s a statistical concept, and here’s how you can compute it in R. Statisticians aren’t that interested in the R language itself but see it as connective tissue between commands that are their primary interest.

This works for statisticians, but it makes the language hard for non-statisticians to approach. Years ago I managed a group of programmers who supported statisticians. At the time, there were no books for learning R without concurrently learning statistics. This created quite a barrier to entry for programmers whose immediate concern was not the statistical content of an R program.

Now there are more books on R, and some are more approachable to non-statisticians. The most accessible one I’ve seen so far is Learning Base R by Lawrence Leemis. It gets into statistical applications of R—that is ultimately why anyone is interested in R—but it doesn’t start there. The first 40% or so of the book is devoted to basic language features, things you’re supposed to pick up by osmosis from a book focused more on statistics than on R per se. This is the book I wish I could have handed my programmers who had to pick up R.

# Efficiency of C# on Linux

This week I attended Mads Torgersen’s talk Why you should take another look at C#. Afterward I asked him about the efficiency of C# on Linux. When I last looked into it, it wasn’t good. A few years ago I asked someone on my team to try running some C# software on Linux using Mono. The code worked correctly, but it ran several times slower than on Windows.

Mads said that now with .NET Core, C# code runs about as fast on Linux as Windows. Maybe a few percent slower on Linux. Scott Hanselman joined the conversation and explained that with .NET Core, the same code runs on every platform. The Mono project duplicated much of the functionality of the .NET framework, but it was an entirely independent implementation and not as efficient.

I had assumed the difference was due to compiler optimizations or lack thereof, but Scott and Mads said that the difference was mostly the code implementation. There are some compiler optimizations that are better on the Windows side, and so C# might run a little faster on Windows, but the performance is essentially the same on both platforms.

I could recommend C# to a client running Linux if there’s a 5% performance penalty, but a 500% performance penalty was a show-stopper. Now I’d consider using C# on a project where I need more performance than Python or R, but wanted to use something easier to work with than C++.

Years ago I developed software with the Microsoft stack, but I moved away from Microsoft tools when I quit doing the kind of software development the tools are geared for. So I don’t write C# much any more. It’s been about a year since I had a project where I needed to write C# code. But I like the C# language. You can tell that a lot of thought has gone into the design, and the tool support is great. Now that the performance is better on Linux I’d consider using it for numerical simulations.

# Humble Lisp programmers

Maybe from the headline you were expecting a blank post? No, that’s not where I’m going.

Yesterday I was on Amazon.com and noticed that nearly all the books they recommended for me were either about Lisp or mountain climbing. I thought this was odd, and mentioned it on Twitter. Carl Vogel had a witty reply: “I guess they weren’t sure whether you want to figuratively or literally look down on everyone.”

The stereotypical Lisp programmer does look down on everyone. But this is based on a tiny, and perhaps unrepresentative, sample of people writing about Lisp compared to the much larger number of people who are writing in Lisp.

Lisp has been around for over 50 years and shows no signs of going away. There are a lot of people writing Lisp in obscurity. Kate Gregory said something similar about C++ developers, calling them the dark matter of programmers because there are a lot of them but they don’t make much of a splash. They’re quietly doing their job, not speaking at conferences or writing much about their language.

I imagine there are a lot of humble Lisp programmers. It takes some humility to commit to an older technology, especially given the pervasive neomania of the programmer community. It also takes some humility to work on projects that have been around for years or that are deep within the infrastructure of more visible projects, which is where I expect a lot of Lisp lives.

You can do very clever things in Lisp, but you don’t have to. As Ed Post famously said, “The determined Real Programmer can write FORTRAN programs in any language.” There must be a lot of code out there that writes (f x) instead of f(x) but otherwise isn’t that different from FORTRAN.

# Kalman filters and functional programming

A few weeks ago I started a series of posts on various things you could do with a functional fold. In the first post I mentioned that the idea came from a paper by Brian Beckman on Kalman filters and folds:

This post was inspired by a paper by Brian Beckman (in progress) that shows how a Kalman filter can be implemented as a fold. From a Bayesian perspective, the thing that makes the Kalman filter work is that a certain multivariate normal model has a conjugate prior. This post shows that conjugate models more generally can be implemented as folds over the data. That’s interesting, but what does it buy you? Brian’s paper discusses this in detail, but one advantage is that it completely separates the accumulator function from the data, so the former can be tested in isolation.

At the time Brian was working on one big paper in private. This has since been split into several papers and they’re now public.
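To give a flavor of a conjugate model as a fold, here is a toy sketch in Python. It is not Brian's Kalman filter code. It uses the simpler beta-binomial model, and the names `update`, `prior`, and `data` are made up for illustration.

```python
from functools import reduce

def update(belief, observation):
    """Accumulator: fold one Bernoulli observation (0 or 1) into a Beta(a, b) belief."""
    a, b = belief
    return (a + observation, b + (1 - observation))

prior = (1, 1)              # Beta(1, 1), a uniform prior
data = [1, 0, 1, 1, 0, 1]   # made-up observations: four successes, two failures

# The fold: start from the prior and absorb the data one point at a time
posterior = reduce(update, data, prior)
print(posterior)            # (5, 3)
```

The accumulator `update` knows nothing about where the data come from, so it can be tested on its own with a single belief and a single observation.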

# Formal methods let you explore the corners

I heard someone say the other day that the advantage of formal software validation methods is that they let you explore the corners, cases where intuition doesn’t naturally take you.

This made me think of corners in the geometric sense. If you have a sphere in a box in high dimensions, nearly all the volume is in the corners, i.e. outside the sphere. This is more than a metaphor. You can think of software options geometrically, with each independent choice corresponding to a dimension. Paths through a piece of software that are individually rare may account for nearly all use when considered together.

With a circle inside a square, about 78.5% of the area is inside the circle. With a ball sitting inside a 3-D box, 52.4% of the volume is inside the ball. As the dimension increases, the proportion of volume inside the sphere rapidly decreases. For a 10-dimensional sphere sitting in a 10-dimensional box, 0.25% of the volume is in the sphere. Said another way, 99.75% of the volume is in the corners.

When you go up to 100 dimensions, the proportion of volume inside the sphere is about 2 parts in 10^70, a 1 followed by 70 zeros [1]. If 100 dimensions sounds like pure fantasy, think about a piece of software with more than 100 features. Those feature combinations multiply like geometric dimensions [2].

Here’s a little Python code you could use to see how much volume is in a sphere as a function of dimension.

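    from math import pi
    from scipy.special import gamma

    def unit_sphere_volume(n):
        # Volume of the unit ball in n dimensions
        return pi**(0.5*n) / gamma(0.5*n + 1)

    def unit_cube_volume(n):
        # Volume of the cube of side 2 that just contains the unit ball
        return 2**n

    def ratio(n):
        # Fraction of the cube's volume that lies inside the ball
        return unit_sphere_volume(n) / unit_cube_volume(n)

    print([ratio(n) for n in range(1, 20)])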
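For large dimensions the gamma function overflows, so it can help to work with logarithms instead. Here is a sketch using only the standard library's `math.lgamma`. The name `log10_ratio` is mine.

```python
from math import lgamma, log, pi

def log10_ratio(n):
    """log10 of (volume of the unit n-ball) / (volume of the enclosing cube of side 2)."""
    log_ball = 0.5 * n * log(pi) - lgamma(0.5 * n + 1)
    log_cube = n * log(2)
    return (log_ball - log_cube) / log(10)

for n in (2, 3, 10, 100, 1000):
    print(n, log10_ratio(n))
```

For n = 100 this gives a value a little above -70, consistent with the figure of about 2 parts in 10^70.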


* * *

[1] There are names for such extremely large numbers. These names are hardly ever used (scientific notation is much more practical), but they’re fun to say. 10^70 is ten duovigintillion in American nomenclature, ten undecilliard in European.

[2] Geometric dimensions are perfectly independent, but software feature combinations are not. In terms of logic, some combinations may not be possible. Or in terms of probability, the probability of exploring some paths is conditional on the probability of exploring other paths. Even so, there are inconceivably many paths through any large software system. And in large-scale operations, events that should “never happen” happen regularly.

# Literate programming: presenting code in human order

## Presentation order

People best understand computer programs in a different order than compilers do. This is a key idea of literate programming, and one that distinguishes literate programs from heavily commented programs.

Traditional source code, no matter how heavily commented, is presented in the order dictated by the compiler. The computer is the primary audience. Literate programming is more humanistic in the sense that the primary audience is a human. The computer has to go to extra effort to arrange the code for its needs. As Donald Knuth describes it in his book on literate programming,

The practitioner of literate programming … strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that nicely reinforce each other. [emphasis added]

There are two steps in processing literate programs: weaving and tangling. You take files containing prose and code, and weave them into documentation and tangle them into source code. Tools like Sweave and Pweave focus on the weave process, as their names imply. The weave side of literate programming has gotten the most attention.

A half-hearted approach to literate programming doesn’t require much of a tangle process. A well-commented program has no tangle step at all. A *weave document that follows the order of the source code has a trivial tangle step: save the code to its own file, manually or automatically, but don’t rearrange it. But a full-fledged literate program may make the tangle program work harder, rearranging code fragments from human-friendly to compiler-friendly order.
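To show what a non-trivial tangle step does, here is a toy sketch in Python. It is not how CWEB or any real tool works. The `<<name>>` reference syntax is loosely modeled on noweb, and the chunk names and the `tangle` function are hypothetical. Chunks are stored in the order a human might want to read them, and the tangle step expands references to put the code in the order the interpreter needs.

```python
import re

# A line consisting only of <<chunk name>>, possibly indented, is a reference
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")

def tangle(chunks, root):
    """Expand chunk `root` into source lines, recursively replacing references.

    This sketch does no cycle detection, so chunks must not refer to themselves.
    """
    lines = []
    for line in chunks[root]:
        match = CHUNK_REF.match(line)
        if match:
            indent, name = match.groups()
            # Splice in the referenced chunk, preserving the reference's indentation
            lines.extend(indent + sub for sub in tangle(chunks, name))
        else:
            lines.append(line)
    return lines

# Human order: explain the goal first, the helper second, the overall shape last
chunks = {
    "report the result": ["print(mean(data))"],
    "define the mean": ["def mean(xs):", "    return sum(xs) / len(xs)"],
    "program": ["data = [1, 2, 3]", "<<define the mean>>", "<<report the result>>"],
}

print("\n".join(tangle(chunks, "program")))
```

The tangled output defines `mean` before using it, even though the document introduced the pieces in the opposite order.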

## Careful explanation vs. unit tests

The most obvious feature of literate programming is that it requires careful explanation. Here’s more from the paragraph I quoted above, filling in the part I left out.

The practitioner of literate programming can be regarded as an essayist, whose main concern is with explanation and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible …

The discipline of explaining every piece of code leads to better code. It serves a similar purpose to writing unit tests. I saw somewhere (I can’t remember where now) that Knuth hates unit testing and sees it as redundant effort. Presumably this is because unit testing and literate programming overlap. Unit tests are a kind of commentary on code, explaining how it is used, exploring its limitations, etc.

Knuth understands that literate programming doesn’t replace the need for testing, just unit testing. He explained somewhere—again I forget where—that he would test TeX by spending days at a time trying fiendishly to break it.

## My misunderstanding and experience

When I read Knuth’s book, I could see the value of carefully explaining code. What I didn’t appreciate was the value of presenting code in a different order than the order of source code.

I’m working on a project now where a few lines of code may require a few paragraphs of explanation. That’s what got me thinking about literate programming. My intention was to write my documentation in the same order as the code. It took a while to realize I had stumbled on an ideal application of literate programming: a complicated algorithm that needs to be explained carefully, both in order to hand over to the client and to reduce the chances of errors. The best order to understand this algorithm is definitely not top-down going through the code.

## Why literate programming has not taken off

I think I understand better now why literate programming hasn’t gained much of an audience. I used to think that it was because developers hate writing prose. That’s part of it. Most programmers I’ve worked with would much rather write a hundred lines of unit tests than write one complete sentence.

But that’s not the whole story. There are quite a few programmers who are willing and able to write prose. Why don’t more of them use literate programming?

I think part of the reason is that having a non-trivial tangle process is a barrier to adoption. A programmer can decide to start writing more extensive comments, gradually working up to essay-like explanations. But it’s one thing to say “I’m going to heavily comment my C code” and quite another to say “I’m not going to write C per se any more. I’m going to write CWEB files that compile to C.” Even if a programmer wants to write CWEB in secret, just checking in the tangled C files, other programmers will edit these C files and the changes won’t be reflected in the CWEB source. Also, the output of tangle is less readable than ordinary code. The programmer secretly using CWEB as a preprocessor would appear to be writing undocumented code.

Tricky code benefits from a literate presentation, but routine code does not benefit so much. You either have to have two ways of writing code—straight source for routine code and literate programs for the tricky parts—or impose the overhead of literate programming everywhere. Most code is mundane and repetitive, challenging because of its volume rather than its cleverness. Knuth doesn’t write this kind of code. He only writes code that benefits from a literate presentation.

To write a good literate program, you need to be able to program, you need to be willing and able to write good prose, and on top of that you need a good sense for storytelling, arranging the code for the benefit of other readers. If this is done poorly, the result is harder to understand than traditional programs.

I may use literate programming more now that I’m starting to understand it, at least for my own benefit and hopefully for the benefit of clients. I usually deliver algorithms or libraries, not large applications, and so it wouldn’t be too much work to create two versions of my results. I could create a literate program, then weave a report, and manually edit the tangled code into a traditional form convenient for the client.

# Agile software development and homotopy

One of the things I learned from my tenure as a software project manager was that a project is more likely to succeed if there’s a way to get where you want to go continuously. You want to move a project from A to B gradually, keeping a working code base all along the way. At the end of each day, the software may not be fully functional, but it should at least build. Anything that requires a big bang change, tearing the system apart for several days and putting it back together, is less likely to succeed.

This is very much like the idea of homotopy from topology, a continuous deformation of one thing into another. No discontinuities along the way — no ripping, no jumping suddenly from one thing to another.


# Category theory and Koine Greek

When I was in college, I sat in on a communication workshop for Latin American preachers. This was unusual since I’m neither Latin American nor a preacher, but I’m glad I was there.

I learned several things in that workshop that I’ve used ever since. For example, when you’re gesturing about something moving forward in time, move your hand from left to right from the audience’s perspective. Since English speakers (and for the audience of this workshop, Spanish speakers) read from left to right, we think of time progressing from left to right. If you see someone talking about time moving forward, but you see motion from right to left, you feel a subtle cognitive dissonance. (Presumably you should reverse this when speaking to an audience whose primary language is Hebrew or Arabic.)

Another lesson from that workshop, the one I want to focus on here, is that you don’t always need to convey how you arrived at an idea. Specifically, the leader of the workshop said that if you discover something interesting from reading the New Testament in Greek, you can usually present your point persuasively using the text in your audience’s language without appealing to Greek. This isn’t always possible—you may need to explore the meaning of a Greek word or two—but you can use Greek for your personal study without necessarily sharing it publicly. The point isn’t to hide anything, only to consider your audience. In a room full of Greek scholars, bring out the Greek.

This story came up in a recent conversation with Brent Yorgey about category theory. You might discover something via category theory but then share it without discussing category theory. If your audience is well versed in category theory, then go ahead and bring out your categories. But otherwise your audience might be bored or intimidated, as many people would be listening to an argument based on the finer points of Koine Greek grammar. Microsoft’s LINQ software, for example, was inspired by category theory principles, but you’d be hard pressed to find any reference to this because most programmers don’t want to know or need to know where it came from. They just want to know how to use it.

Some things may sound profound when expressed in esoteric language, such as category theory or Koine Greek, that don’t seem so profound in more down-to-earth language. Expressing yourself in a different language helps filter out pedantry from useful ideas. (On the other hand, some things that looked like pure pedantry have turned out to be very useful. Some hairs are worth splitting.)

Sometimes you have to introduce new terms because there isn’t a colloquial counterpart. Monads are a good example, a concept from category theory that has entered software development. A monad is what it is, and analogies to burritos and other foods don’t really help. Better to introduce the term and say plainly what it is.


# Interview with Chris Toomey of Upcase

The other day I spoke to Chris Toomey from thoughtbot. Chris runs Upcase, thoughtbot’s online platform for learning about Rails, test-driven development, clean code, and more. I was curious about his work with Ruby on Rails since I know little about that world. And at a little deeper level, I wanted to get his thoughts on how programming languages are used in practice, static vs dynamic, strongly typed vs weakly typed, etc.

JC: Chris, I know you do a lot of work with Ruby on Rails. What do you think of Ruby without Rails? Would you be as interested in Ruby if the Rails framework had been written in some other language?

CT: Let me back up a little bit and give you some of my background. I started out as an engineer and I used VB because it was what I had for the task at hand. Then when I decided to buckle down and become a real developer I chose Python because it seemed like the most engineering-oriented alternative. It seemed less like an enterprise language, more small and nimble. I chose Python over Ruby because of my engineering background. Python seemed more serious, while Ruby seemed more like a hipster language. Ruby sounded frivolous, but I kept hearing good things about it, especially with Rails. So like a lot of people I came to Ruby through Rails. It was the functionality and ease of use that got me hooked, but I do love Ruby as a language, the beauty and expressiveness of it. It reads more like prose than other languages. It’s designed for people rather than machines. But it’s also a very large language and hard to parse because of that. Over time though I’ve seen people abuse the looseness, the freedom in Ruby, and that’s caused me to look at stricter options like Haskell and other functional languages.

JC: I only looked at Ruby briefly, and when I saw the relative number of numerical libraries for Python and Ruby I thought “Well, looks like it’s Python for me.”

It seems like Ruby bears some resemblance to Perl, for better or worse.

CT: Absolutely. Ruby has two spiritual ancestors. One is Perl and the other is Smalltalk. I think both of those are great, and many of the things I love about Ruby come from that lineage. Perl contributed the get-things-done attitude, the looseness and terseness, the freedom to interact at any level of abstraction.

It’s kinda odd. I keep coming back to The Zen of Python. One of the things it says is that explicit is better than implicit, and I really think that’s true. And yet I work in Ruby and Rails where implicit is the name of the game. So I have some cognitive dissonance over that. I love Ruby on Rails, but I’m starting to look at other languages and frameworks to see if something else might fit as well.

JC: Do you have the freedom to choose what language and framework you work in? Do clients just ask for a web site, or do they dictate the technology?

CT: We have a mix. A lot of clients just want a web app, but some, especially large companies, want us to use their technology stack. So while we do a lot of Rails, we also do some Python, Haskell, etc.

JC: Do you do everything soup-to-nuts or do you have some specialization?

CT: We have three roles at thoughtbot: designer, web developer, and mobile developer. The designers might do some JavaScript, but they’re mostly focused on user experience, testing, and design.

JC: How do you keep everything straight? The most intimidating thing to me about web development is all the diverse tools in play: the language for your logic, JavaScript, CSS, HTML, SQL, etc.

CT: There’s definitely some of that, but we outsource some parts of the stack. We host applications on Heroku, giving them responsibility for platform management. They run on top of AWS so they handle all the scaling issues so we can focus on the code. We’ll deploy to other environments if our client insists, but our preference is to go with Heroku.

Similarly, Rails has a lot of functionality for the database layer, so we don’t write a lot of SQL by hand. We’re all knowledgeable of SQL, but we’re not DBA-level experts. We scale up on that as necessary, but we want to focus on the application.

JC: Shifting gears a little bit, how do you program differently in a dynamic language like Ruby than you would in a stricter language like C++? And is that a good thing?

CT: One thing about Ruby, and dynamic languages in general, is that testing becomes all the more critical. There are a lot of potential runtime errors you have to test for. Whereas with something like Haskell you can program a lot of your logic into the type system. Ruby lets you work more freely, but Haskell leads to more robust applications. Some of our internal software at thoughtbot is written in Haskell.

JC: I was excited about using Haskell, but when I used it on a production project I ran into a lot of frustrations that you wouldn’t anticipate from working with Haskell in the small.

CT: Haskell does seem to have a more aggressive learning curve than other languages. There’s a lot of Academia in it, and in a way that’s good. The language hasn’t compromised its vision, and it’s been able to really develop some things thoroughly. But it also has a kind of academic heaviness to it.

There’s a language out there called Elm that’s inspired by Haskell and the whole ML family of languages that compiles down to JavaScript. It presents a friendlier interface to the whole type-driven, functional way of thinking. The developers of the language have put a lot of effort into making it approachable, without having to understand comonads and functors and all that.

JC: My difficulties with Haskell weren’t the theory but things like the lack of tooling and above all the difficulty of package management.

CT: Cabal Hell.

JC: Right.

CT: My understanding is that that’s improved dramatically with new technologies like Stack. We’re scaling up internally on Haskell. That’s the next area we’d like to get into. I’ll be able to say more about that down the road.

* * *

Check out Upcase for training materials on tools like Vim, Git, Rails, and Tmux.

# The magic / boilerplate trade-off

Phil Webb had an insightful tweet the other day.

Programming environments oscillate between boilerplate and magic. APIs tend to start out with all the wires exposed. Programming is tedious, but nothing is hidden. Development is hard, but debugging is easy. See, for example, the hello-world program for Win32.

Programmers get tired of this, and create levels of abstraction. Boilerplate is reduced and development gets easier. And if the abstraction is done well, debugging doesn’t get harder, or not much harder. But this abstraction doesn’t go far enough. Programmers feel like things should be easier still. That’s when magic comes in. Magic differs from abstraction in that it not only hides details, it’s actively misleading. Something appears to work one way, the desired way, but something quite different is going on. Development gets easier, but debugging gets much harder. If the magic gets to be too much, developers look to start over with something more transparent, and the cycle begins again.

Wizards, in the sense of code generators, are closely related to magic. Whereas magic abuses language features to create illusions, wizards generate boilerplate code and become a sort of meta language.

Webb’s comment about boilerplate vs magic came to mind this morning, not from looking at software, but at math. Over time mathematicians discover better ways to organize material, turning some theorems into definitions and vice versa. This makes things easier in the long run, but creates a barrier to entry in the short term. Math moves from boilerplate to magic, from relatively concrete and tedious to abstract but less obviously motivated. The definitions seem magical to the beginner because the applications have been delayed.

When I see this in an area I understand well, I think it’s clever. When I see it in an area I don’t know well, it feels like some sort of guild is trying to keep me out. I know that this isn’t the case—the machinery of mathematics is always created for practical reasons, not to intentionally intimidate anyone—but it can feel that way. It’s not so much that the motivation is deliberately hidden but that it is obscured. An unfortunate aspect of mathematics culture is that people are reluctant to discuss motivation in writing. It’s more likely to come out in conversation or in lectures.
