Opening black boxes

Rookie programmers don’t know how to reuse code. They write too much original code because they either don’t know about libraries or they don’t know how to use them. And if they do reuse someone else’s code, they copy and paste it, creating maintenance problems.

The next step in professional development is learning to reuse code. Encapsulation! Black boxes! Buy, don’t build! etc.

But this emphasis on reuse and black boxes can go too far. We can be intimidated by these black boxes and afraid to open them. We can come to believe the black boxes were created by superior beings. We can spend more time inferring the behavior of the black boxes than it would take to open them up or rewrite them. Then we pile leaky abstraction on top of leaky abstraction when we treat our own code as black boxes.

Joe Armstrong said in Coders at Work:

Over the years I’ve kind of made a generic mistake … to not open the black box. … It’s worthwhile seeing if the direct route is quicker than the packaged route.

Several of the programmers who were interviewed in the book made similar remarks. They attribute part of their success to being unafraid of black boxes. They gained experience and confidence by taking things apart to see how they work.

Donald Knuth once said in an interview:

I also must confess to a strong bias against the fashion for reusable code. To me, “re-editable code” is much, much better than an untouchable black box or toolkit. I could go on and on about this. … you’ll never convince me that reusable code isn’t mostly a menace.

Knuth returns to this theme in Coders at Work.

There’s this overemphasis on reusable software where you never get to open up the box … It’s nice to have these black boxes but, almost always, if you can look inside the box you can improve it …

Well, Knuth can almost always improve any code he finds. Less talented programmers need to be more humble. But too often programmers who are talented enough to make improvements are reluctant to do so. As Yeats said in his poem The Second Coming,

The best lack all conviction, while the worst are full of passionate intensity.

In any discussion of opening black boxes, someone will bring up the analogy of cars: Not everyone needs to know how a car works inside. I would agree that drivers no longer need to understand how a car works, but automotive engineers do. The problem isn’t users who don’t understand how software works; it’s software developers who don’t understand how software works.

Of course software libraries are extremely valuable. Knuth goes too far when he says reusable code is usually a menace. But I see a disturbing lack of curiosity among programmers. They are far too willing to use code they don’t understand.

Related post: Reusable code versus re-editable code

14 thoughts on “Opening black boxes”

  1. I think it depends upon what the code is for.

    I don’t expect people to understand how crypto code works, and I don’t WANT them messing with it. When a string-manipulation library is well tested, debugged, and optimized, I don’t expect that most programmers can go modify it to any improvement, and they’re more likely to expose it to new bugs, such as index-out-of-bounds errors. And I certainly don’t want people screwing with network libraries unless they really know what they’re doing.

    On the other hand, yes, clearly, some higher-level toolkit that I wrote might easily be improved on by my colleagues. One has to use judgment.

  2. A library or toolkit that is developed with all the “best practices” and using frameworks such as Boost/STL and so on is effectively a black box whether the code is available or not. It’s too difficult to navigate or make meaningful changes to the code without access to the internal development design documents and a long time spent understanding the architecture and details. Faced with such libraries, reimplementing them yourself or simply treating them as black boxes may well be the sensible thing to do.

  3. I fully agree with your posted statements, John. It does depend very much, though, on the methods that different individuals use to learn.

    I personally find it considerably easier to start with something pre-built and learn by adapting it. Instead of having to work from a blank sheet and figure out syntax and methods from scratch, you start from something that has already been done. This becomes considerably easier where the original developer(s) have applied a black box re-use principle. Going back to Janne’s comment, the design is almost entirely reflected in the structure of the code.

    One good example of this that I have applied in the past is the modification of a well-known PHP-based open source bulletin board, PHPBB(2). I went from a person who had never touched PHP to a developer who managed to create a Portal Mod for the project. This was because a) PHP is all encapsulating, b) it used libraries (black box modules) of code, and c) the design was reflected in the code.

    All that said, it depends on the mind of the developer. Some developers think better with a blank sheet and an uncluttered mind – developing code from scratch. Others, like myself, need something to get things moving along.

  4. Doug, along those lines, I remember a company about ten years ago that spoke highly of Microsoft’s first e-commerce framework. They went on to say how they replaced every single line of it with their own code, but they were glad to have had the Microsoft code to start with.

  5. Opening a black box may be fine if said box is an inanimate object.
    However most black boxes are not inanimate objects, they are more like living animals. And you have to be very careful when opening live animals or you will kill them.
    What I want to say is that by modifying actively maintained libraries you won’t benefit from evolutions and bug fixes unless you maintain a separate branch, which is costly. You may also lose the efficiency benefits of shared libraries.

  6. I think I have a similar viewpoint to Knuth here. The problem with a black-box is that it is sealed. So while you know the code you need is in there, the API doesn’t expose it, or exposes it in a terrible fashion. By forcing yourself to work through the given API you won’t be solving the problem in an ideal fashion. If you could modify a few things in the box you’d be much better off.

    If a library is completely closed, without even the option to review the source, let alone modify it, I will actually agree with Knuth and say it is a menace. Often defects arise in using a library simply because the documentation isn’t clear enough, and looking at the code helps solve the problem.

  7. On the other hand, if code is improved here and there, and different pieces start to depend on their own tweaked versions, it is going to be hard to understand the system as a whole. You will need to know all those variations and why they were made.

    Mature components should be extended through the open-closed principle. If a particular use has a particular need, the component should be extended appropriately by writing extensions, not by modifying the internals.

    Code reuse is most likely to benefit maintenance by leaving just one version of the code to read. In most software projects, authoring is just a small part of the cost; maintenance is the real nightmare.
