Here’s something I wish I’d understood early in my career. From Merlin Mann:
If a project doesn’t have an owner, it’s like a chainsaw on a rope swing. Why would anyone even go near that?
According to Richard Feynman, the most important event of the 19th century was the discovery of the laws of electricity and magnetism.
From a long view of the history of mankind — seen from, say, ten thousand years from now — there can be little doubt that the most significant event of the 19th century will be judged as Maxwell’s discovery of the laws of electrodynamics. The American Civil War will pale into provincial insignificance in comparison with this important scientific event of the same decade.
From The Feynman Lectures on Physics, Volume 2.
Related post: Grand unified theory of 19th century math
Discussions of software architecture give the impression that the only concern is the problem domain: how to structure a content management system, how to structure a word processor, etc. This leaves out the people who will be developing the software.
How much do you trust your software developers? How much do you trust their skill and their integrity? Do you want to get out of your developers’ way or do you want to protect yourself against incompetent developers?
This is uncomfortable to talk about, and so the decision is usually left implicit. Nobody wants to say out loud that they’re designing software for an army of mediocre programmers to implement, but that is the default assumption. And rightfully so. Most developers have middling ability, by definition.
(At this point we could go on a rabbit trail debating cause and effect. People rise (and sink) to expectations. One could argue that the assumption of mediocrity is self-fulfilling, and to some extent it is. On the other hand, treating a script kiddie like Donald Knuth isn’t going to make him into another Donald Knuth.)
When outstanding programmers complain about common approaches to developing software, they may not consider that most software is not written by outstanding programmers and what a difference that makes. For example, I’ve heard countless great programmers complain about Java. But Java wasn’t written for great programmers. It was written for average programmers. The restrictions in the language that great programmers chafe at are beneficial to teams of average programmers.
If you trust that your developers are highly competent and self-disciplined, you’ll organize your software differently than if you assume developers have mediocre skill and discipline. One way this shows up is the extent to which you’re willing to rely on convention to maintain order. For example, the architecture behind Emacs is remarkably simple and highly dependent on convention. This approach has served Emacs quite well, but it wouldn’t work for a large team of mediocre developers. (It also wouldn’t work for software controlling a car’s brakes. Bugs in text editors don’t have the same consequences.)
In general, I see more reliance on convention in open source projects than in enterprise projects. A possible explanation is that open source projects have more motivated developers. Not all open source developers are volunteers, but many are. And not only are volunteers more motivated, they’re also easier to dismiss than employees. If someone’s code isn’t up to standard, the project can simply refuse to use their code. In theory the same could be said of an enterprise software project, but in practice it’s not that simple.
In the latest episode of EconTalk, Russ Roberts mentions Jens Rasmussen’s classification of errors into three categories: slips, mistakes, and violations.
So, a slip is: you just do something you immediately realize wasn’t what you meant to do — pushed the wrong button, locked yourself out of your house, forgot your car keys. Mistakes are things you do because your view of the world is wrong. So, you took out a subprime mortgage and bought a house because you thought house prices would continue to rise and you would be able to remortgage your house. Then there’s a violation — something you know is against the rules but you did it anyway, for whatever reason.
Perhaps it would be useful to classify errors more continuously based on how long it takes to regret them. For example, you might know you’ve made a slip a split second later. It typically takes longer to realize you’ve made a mistake. And violating a regulation, assuming it’s a wise regulation, may not lead to negative consequences for quite some time.
One advantage of crude models is that we know they are crude and will not try to read too much from them. With more sophisticated models,
… there is an awful temptation to squeeze the lemon until it is dry and to present a picture of the future which through its very precision and verisimilitude carries conviction. Yet a man who uses an imaginary map, thinking it is a true one, is likely to be worse off than someone with no map at all; for he will fail to inquire whenever he can, to observe every detail on his way, and to search continuously with all his senses and all his intelligence for indications of where he should go.
From Small is Beautiful by E. F. Schumacher.
Obviously crude models are not always better. But I like to have some evidence that a complex model is worthwhile before I invest too much effort in it. And I’m well aware of forces that reward complexity for its own sake.
Update (15 May 2015): From Simple Rules by Donald Sull and Kathleen M. Eisenhardt:
We often assume that the best way to make a decision is by considering all the factors that might influence our choice and weighing their relative importance. Psychologists have found, however, that people tend to overweigh peripheral variables at the expense of critical ones when they try to take all factors into account. … Simple rules minimize the risk of overweighing peripheral considerations by focusing on the criteria most critical for making good decisions.
Beginning musicians think that sheet music contains more information than it does. It’s all they can do to play the notes on the page. Only later do they realize that sheet music is at best a good approximation of what a composer has in mind. Even when they think they’re just playing what’s on the page, their performance is informed by experience not captured in the sheet music.
A decade ago there was a lot of talk of DNA being the blueprint or software of life. But DNA sequences have not been as useful as anticipated. DNA is more like the sheet music of life. The same DNA can be expressed many different ways, just like a piece of sheet music.
Not only is DNA not source code, in a sense even source code is not source code! Source code in the technical sense, a set of computer language files used to build a program, is not source code in the colloquial sense of “everything you need to know.”
In a way, source files are enough to specify a program. Even so, it takes implicit knowledge not in the source code to build a program from the source code. And it takes implicit knowledge to know how to operate the software, no matter how thorough the documentation.
In a very literal sense, a program’s binary form is enough to understand how it works. But only at an agonizingly low level. You could trace how values are loaded into registers and still not know what the software does in any useful sense. Even small amounts of source code can be hard to understand without context. See examples here.
Practically speaking, it takes far more than source code to be able to maintain a program, especially if you want to come up to speed quickly. There’s always extra knowledge needed outside the code. This extra knowledge may be so widespread within a community as to be invisible. It’s still there, though it may take someone outside the community to see it.
For example, I often run into trouble when I need to install something on Linux. When I ask a friend for help, he’ll say “Really? I installed it just fine.” Then I explain where I got stuck and he’ll say “Oh yeah, everyone knows …” This isn’t to pick on Linux. I see the same thing from the other side when I help people with Windows. Some projects require more implicit knowledge than others, but all projects require some. The implicit knowledge may be institutional memory within a company or part of a distributed culture, but it’s always there.
Related post: Scripting and the last mile problem
A few days ago I got a review copy of The Manga Guide to Relativity (ISBN 1593272723). This is an English translation of a book first published in Japanese a couple years ago.
I assume the intended audience, at least for the original Japanese edition, is familiar with manga and wants to learn about relativity. I came from the opposite perspective, more familiar with relativity than manga, so I paid more attention to the background than the foreground. My experience was more like reading The Relativity Guide to Manga.
I expected The Manga Guide to Relativity to be something like The Cartoon Guide to Genetics. However, the former has much less scientific content than the latter. A fair amount of the relativity book is background story, and the substantial parts are repetitive. As I recall, the genetics book was much more dense with information, though presented humorously.
Some parents and teachers will buy The Manga Guide to Relativity to introduce children to science in an entertaining genre. These folks may be surprised to discover the sexual undertones in the book. Americans typically equate comics with children, but the book was originally written for a Japanese audience that does not have the same view.
New Twitter tip accounts
Here are three and a half ways to subscribe to this blog.
What’s the half way? You can use Outlook as an RSS reader, so that’s sorta method 1.5.
Update: This post is obsolete because Twitter ended their RSS support in June 2013.
You can subscribe to any of my Twitter accounts using the RSS feeds listed here.
* * *
At least for now, you can construct a URL to a Twitter account RSS feed by starting with
and appending the account name. For example,
is the RSS feed for Dave Richeson’s Twitter account @divbyzero.
The following table gives links to RSS feeds for each of my daily tip accounts.
According to computer scientist Donald Knuth, someone who has written numerous books, writing software is more difficult than writing books.
The most important lesson I learned during the past nine years [1977 – 1986, when Knuth developed TeX] is that software is hard; and it takes a long time. From now on I shall have significantly greater respect for every successful software tool that I encounter. …
The amount of technical detail in a large system is one thing that makes programming more demanding than book-writing. Another is that programming demands a significantly higher standard of accuracy. Programs don’t simply have to make sense to another human being, they must make sense to a computer.
Emphasis in the original. Taken from Selected Papers on Computer Science. In another paper in the same collection Knuth says
I was surprised to learn that the writing of programs for TeX and for METAFONT proved to be much more difficult than all the other things I had done (like proving theorems and writing books).
I’m starting a new daily tip Twitter account: RLangTip. This account will have one regularly scheduled tip per day, Monday through Friday, on the R language and related topics. I’ll also throw in a few unscheduled tweets now and then.
Some say that aerodynamics can’t explain how a bumblebee flies. Perhaps that was once the case, but as far as I know there are no difficulties now. The bumblebee story persists as an urban legend. And it makes a nice metaphor for things that work better in practice than in theory.
Here’s the passage that brought the bumblebee story to mind.
Almost every software engineering principle that has become generally accepted as useful and valuable, Emacs flouts. The code is 24 years old, huge, and written by hundreds of different people. By rights, the whole thing should blow up. But it works — and works rather well.
This comes from Jim Blandy’s chapter in Beautiful Architecture. Blandy explains that Emacs’ architecture has allowed it to thrive despite some apparent disadvantages. Emacs is mostly written in its own programming language, Emacs Lisp.
Emacs Lisp has no object system, its module system is just a naming convention, all the fundamental text editing operations use implicit global arguments, and even local variables aren’t quite local.
In short, Emacs expects developers to be self-disciplined and does not enforce a great deal of external discipline. However, because the software is so light on bureaucracy, it is easy to customize and to contribute to.
TeX is another bumblebee project. Like Emacs, it has thrived for decades without following currently fashionable software techniques. Knuth implies in this presentation that TeX would have been a dismal failure if it had used technologies that are trendy now.
Donald Knuth explains how he combines theory and practice:
This has always been the main credo of my professional life. I have always tried to develop theories that shed light on the practical things I do, and I’ve always tried to do a variety of practical things so that I have a better chance of discovering rich and interesting theories. It seems to me that my chosen field, computer science — information processing — is a field where theory and practice come together more than in any other discipline, because of the nature of computing machines. …
History teaches us that the greatest mathematicians of past centuries combined theory and practice in their own careers. …
The best theory is inspired by practice. The best practice is inspired by theory.
Taken from Selected Papers on Computer Science.