From the category archives:

Software development

Interview with Clojure author

by John on March 8, 2010

Simple-talk has an interview with Rich Hickey, author of the programming language Clojure (pronounced “closure”). Clojure is a dialect of Lisp designed to run on top of the Java Virtual Machine. The language is also being ported to the .NET framework as Clojure CLR.

Two things stood out to me in the interview: a comparison of Lisp with C++, and a discussion of complexity.

You’ll often hear a programmer argue that language X is better than language Y.  To support their argument, they’ll say they wrote a program in Y, then wrote it in X in less time. For example, someone might argue that Ruby is better than Python because they were able to rewrite their web site using Ruby in half the time it took to write the original Python version. Such arguments are weak because you can write anything faster the second time. The first implementation required analysis and design that the second implementation can reuse entirely or at least learn from.

Rich Hickey argues that he can develop programs in Lisp faster than in C++. He offers as support that he first wrote something in Lisp and then took three times longer to rewrite it in C++. This is just a personal anecdote, not a scientific study, but it carries more weight than the usual anecdote because he’s claiming the first language was more efficient than the second.

In his discussion of incidental complexity, that is, complexity coming from one’s tools rather than from the intrinsic complexity of the problem being solved, Hickey says

I think programmers have become inured to incidental complexity, in particular by confusing familiar or concise with simple. And when they encounter complexity, they consider it a challenge to overcome, rather than an obstacle to remove. Overcoming complexity isn’t work, it’s waste.

The phrase “confusing familiar or concise with simple” is insightful. I never appreciated the arguments about the complexity of C++ until I got a little distance from the language; C++ was so familiar I didn’t appreciate how complex it is until I had a break from writing it. Also, simple solutions are usually concise, but concise solutions may not be simple. I chuckle whenever I hear someone say a problem was simple to solve because they were able to solve it in one line — one long stream of entirely mysterious commands.

Thanks to Omar Gomez for pointing out the interview article.

Related posts:

A little simplicity goes a long way
I disagree with Torvalds about C++
Baklava code

{ 3 comments }

What do you learn just in case you’ll need it in the future, and what do you learn just in time when you do need it?

In general, you learn things in school just in case you’ll need them later. Then once you get a job, you learn more things just in time when you need them.

When you learn just in time, you’re highly motivated. There’s no need to imagine whether you might apply what you’re learning since the application came first. But you can’t learn everything just in time. You have to learn some things before you can imagine using them. You need to have certain patterns in your head before you can recognize them in the wild.

Years ago someone told me that he never learned algebra and has never had a need for it. But I’ve learned algebra and use it constantly. It’s a lucky thing I was the one who learned algebra since I ended up needing it. But of course it’s not lucky. I would not have had any use for it either if I’d not learned it.

The difference between just-in-case and just-in-time is like the difference between training and trying. You can’t run a marathon by trying hard. The first person who tried that died. You have to train for it. You can’t just say that you’ll run 26 miles when you need to and do nothing until then.

Software developers prefer just-in-time learning. There’s so much out there that you aren’t going to need. You can’t learn every detail of every operating system, every programming language, every library, etc. before you do any real work. You can only remember so much arbitrary information without a specific need for it. Even if you could learn it all in the abstract, you’d be decades into your career without having produced anything. On top of that, technological information has a short shelf life, so it’s not worthwhile to learn too much that you’re not sure you have a need for.

On the other hand, you need to know what’s available, even if you’re only going to learn the details just in time. You can’t say “I need to learn about a version control system now” if you don’t even know what version control is. You need to have a survey knowledge of technology just in case. You can learn APIs just in time. But there’s a big gray area in between where it’s hard to know what is worthwhile to learn and when.

Related posts:

Software that gets used
Why programmers write unneeded code
Don’t standardize education, personalize it
Worthless technical books

{ 22 comments }

Amateur software

by John on February 10, 2010

I’m growing increasingly frustrated with amateur software. Before I explain why, let me first be clear on what I do not mean by amateur.

  • Amateur does not mean low quality. Some amateur software is outstanding, and some professional software is terrible.
  • Amateur does not mean open source. Some amateur projects are open source and some are not.

I’m using “amateur software” to mean software projects developed by volunteers. I imagine most amateur software is written by professional developers. These are folks paid to write software for a company by day who then work on something else they love by night.

Open source software is not necessarily amateur software. Linux, for example, is now professional software. Around 75% of Linux kernel development is carried out by people paid to work on Linux. Some of the best software is both open source and at least partially professional.

Volunteers do what they want to do by definition. The problem is that the reverse is also true: volunteers do not do what they do not want to do. And for software developers, writing documentation usually falls in the “do not want to do” column. So does making software easy to install. So does testing in multiple environments.

When a company has an interest in a piece of software, they can pay people to do the tasks the volunteers don’t want to do. In fact, if they’re smart, they will concentrate their efforts precisely on the tasks volunteers don’t want to do. In this way even one or two paid staff can make an enormous contribution to a largely volunteer project.

Some amateur projects are highly polished. These may be small projects led by rare individuals who pay attention to details beyond pure software development. More often, these are large mature projects that have so many volunteers that they have a few who are willing to do tasks that most developers do not want to do.

Related posts:

Shallow bugs versus reported bugs
Software profitability in the middle
Hard to spend money

{ 11 comments }

You can’t force people to provide metadata

by John on February 7, 2010

I ran across a long rant from Steve Yegge this evening about junior programmers. In a nutshell, Yegge says they like to play around with metadata rather than getting real work done.

Here’s an insightful observation Yegge makes along the way.

And Haskell, OCaml and their ilk … try to force people to model everything. Programmers hate that. These languages will never, ever enjoy any substantial commercial success, for the exact same reason the Semantic Web is a failure. You can’t force people to provide metadata for everything they do. They’ll hate you.

Related post:

Probability of semantic markup being correct

{ 9 comments }

Parameterizations are the bane of statistical software. One of the most common errors is to assume that one software package uses the same parameterization as another package. For example, some packages specify the exponential distribution in terms of the mean but others use the rate.
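As a small illustration of the mean-versus-rate pitfall, here is a sketch in Python. The standard library’s `random.expovariate` takes the rate, while NumPy’s `numpy.random.exponential` takes the scale, which is the mean. The helper names and the numbers below are made up for this example.

```python
import random
import statistics

def rate_from_mean(mean):
    """Convert a mean (scale) parameterization to a rate parameterization."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return 1.0 / mean

def sample_exponential_by_mean(mean, n, seed=0):
    """Draw n exponential samples specified by their mean.

    random.expovariate expects the rate, so the mean is converted first.
    """
    rng = random.Random(seed)
    rate = rate_from_mean(mean)
    return [rng.expovariate(rate) for _ in range(n)]

# Intended: an exponential distribution with mean 0.5.
right_mean = statistics.fmean(sample_exponential_by_mean(0.5, 100_000))

# The mistake: handing the mean to a function that expects the rate.
rng = random.Random(0)
wrong_mean = statistics.fmean(rng.expovariate(0.5) for _ in range(100_000))

print(f"converted correctly: sample mean is about {right_mean:.2f}")
print(f"mean passed as rate: sample mean is about {wrong_mean:.2f}")
```

The second sample has a mean near 2 rather than 0.5, a factor of four off, and nothing in the code raises an error to say so.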

{ 4 comments }

Little programs versus big programs

by John on February 3, 2010

From You Are Not a Gadget:

Little programs are delightful to write in isolation, but the process of maintaining large-scale software is always miserable. … Technologists wish every program behaved like a brand-new, playful little program, and will use any available psychological strategy to avoid thinking about computers realistically.

Related posts:

Writes large, correct programs
Why there will always be programmers

{ 3 comments }

New Python podcast: A little bit of Python

by John on February 1, 2010

There’s a new Python podcast: A little bit of Python with Michael Foord, Brett Cannon, Jesse Noller, Steve Holden, and Andrew Kuchling.

So far I’ve found the first episode most interesting. It discusses the “moratorium”, the plan to give Python library authors time to catch up with Python 3 before extending the core language further. This sounds like a very smart move.

Related posts:

Good enough for Google and NASA
Plain Python

{ 1 comment }

Software sins of omission

by John on January 12, 2010

The Book of Common Prayer contains the confession

… we have left undone those things which we ought to have done, and we have done those things which we ought not to have done.

The things left undone are called sins of omission; things which ought not to have been done are called sins of commission.

In software testing and debugging, we focus on sins of commission, code that was implemented incorrectly. But according to Robert Glass, the majority of bugs are sins of omission. In Frequently Forgotten Fundamental Facts about Software Engineering, Glass says

Roughly 35 percent of software defects emerge from missing logic paths, and another 40 percent are from the execution of a unique combination of logic paths.

If these figures are correct, three out of four software bugs are sins of omission, errors due to things left undone. These are bugs due to contingencies the developers did not think to handle. Three quarters seems like a large proportion, but it is plausible. I know I’ve written plenty of bugs that amounted to not considering enough possibilities, particularly in graphical user interface software. It’s hard to think of everything a user might do and all the ways a user might arrive at a particular place. (When I first wrote user interface applications, my reaction to a bug report would be “Why would anyone do that?!” If everyone would just use my software the way I do, everything would be OK.)

It matters whether bugs are sins of omission or sins of commission. Different kinds of bugs are caught by different means. Developers have come to appreciate the value of unit testing lately, but unit tests primarily catch sins of commission. If you didn’t think to program something in the first place, you’re not likely to think to write a test for it. Complete test coverage could only find 25% of a project’s bugs if you assume 75% of the bugs come from code that no one thought to write.
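Here is a minimal sketch of how complete coverage can miss a sin of omission. The function and its test are hypothetical, written only to illustrate the point: the test executes every line, yet the missing logic path is still there.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers.

    Hypothetical example: the empty-list case was never considered.
    """
    total = 0
    for v in values:
        total += v
    return total / len(values)

def test_average():
    # Executes every line of average(), so line coverage is 100%.
    assert average([1, 2, 3]) == 2

def omitted_case_fails():
    """Return True if the input nobody thought about raises an error."""
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False

test_average()
print("tests pass; omitted case fails:", omitted_case_fails())
```

No test fails because no test was written for the case that was never imagined. A reviewer reading `average` cold is more likely to ask what happens when the list is empty.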

The best way to spot sins of omission is a fresh pair of eyes. As Glass says

Rigorous reviews are more effective, and more cost effective, than any other error-removal strategy, including testing. But they cannot and should not replace testing.

One way to combine the benefits of unit testing and code reviews would be to have different people write the unit tests and the production code.

Related posts:

The most subtle of the seven deadly sins
Shallow bugs versus reported bugs
Negative space in operating systems

{ 7 comments }

Camtasia as a software deployment tool

by John on January 10, 2010

Last week .NET Rocks mentioned a good idea in passing: start a screencast tool like Camtasia before you do a software install. Michael Learned told the story of a client that asked him to take screen shots of every step in the installation of Microsoft’s Team Foundation Server. Carl Franklin commented “What a great idea to throw Camtasia on there and record the whole process.”

It would be better if the installation process were scripted and not just recorded, but sometimes that’s not practical. Sometimes clicking a few buttons is absolutely necessary or at least far easier than writing a script. And even if you think your entire process is automated with a script, a screencast might be a good idea. It could record little steps you have to do in order to run your script, details that are easily forgotten.

Another way to use this idea would be to have one person do a practice install on a test server while recording the process. Then another person could document and script the process by studying the video. This would be helpful when the person who knows how to do the installation lacks either the verbal skills to explain the process or the scripting skills to automate it.

Related posts:

Rotating programmers
Automated software builds
Programming the last mile

{ 4 comments }

Better tools, less productivity?

by John on January 6, 2010

Can better tools make you less productive? Here’s a quote from Frequently Forgotten Fundamental Facts about Software Engineering by Robert Glass:

Most software tool and technique improvements account for about a 5- to 30-percent increase in productivity and quality. … Learning a new tool or technique actually lowers programmer productivity and product quality initially. You achieve the eventual benefit only after overcoming this learning curve.

If you’re always learning new tools, you may be less productive than if you stuck with your old tools a little longer, even if the new tools really are better. And especially if you’re a part-time developer, you may never reach the point where a new tool pays for itself before you throw it away and pick up a new one. Kathleen Dollard wrote an editorial to this effect in 2004 entitled Save The Hobbyist Programmer.

Miners know they have a significant problem when the canary they keep with them stops singing. Hobbyist/part-time programmers are our industry’s version of the canary, and they have stopped singing. People who program four to eight hours a week are being cut out of the picture because they can’t increase their skills as fast as technology changes. That’s a danger signal for the rest of us.

So what do you do? Learn quickly or change slowly. The first option is to commit to learning a new tool quickly, invest heavily in up-front training, and use the tool as much as you can before the next one comes along.  This is the favored option for ambitious programmers who want to maximize their marketability by always using the latest tools.

The second option is to develop a leapfrog strategy, letting some new things pass you by.  The less time you spend per week programming, the less often you should change tools. Change occasionally, yes, but wait for big improvements.
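A back-of-the-envelope sketch makes the trade-off concrete. It assumes a deliberately crude model: learning a tool costs a fixed number of hours of lost output, after which every hour of programming is more productive by a fixed fraction. The 40-hour learning cost and the weekly hours are invented for illustration; the 15 percent gain is simply a value inside the 5 to 30 percent range Glass cites.

```python
def weeks_to_break_even(learning_hours, gain, hours_per_week):
    """Weeks of use before a new tool repays its learning cost.

    Crude model (an assumption, not a measurement): each hour worked
    with the new tool recovers `gain` hours of the initial loss.
    """
    if gain <= 0 or hours_per_week <= 0:
        raise ValueError("gain and hours_per_week must be positive")
    hours_needed = learning_hours / gain
    return hours_needed / hours_per_week

# Hypothetical numbers: 40 hours to learn the tool, 15% improvement.
full_time = weeks_to_break_even(40, 0.15, hours_per_week=40)
hobbyist = weeks_to_break_even(40, 0.15, hours_per_week=6)

print(f"full-time developer: {full_time:.1f} weeks")
print(f"six hours per week:  {hobbyist:.1f} weeks")
```

Under these assumptions the full-time developer breaks even in under two months, while someone programming six hours a week needs most of a year, by which time the tool may already have been replaced.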

Related posts:

Doing good work with bad tools
Fear of tech commitment
Three-hour per week language

{ 3 comments }

Fear of tech commitment

by John on December 29, 2009

According to the stereotypes, men fear committing to relationships. I find that hard to relate to. But I can relate to fear of technological commitment. I don’t want to take the time to learn something well that’s going to go away in a year. Like anyone else I want to pick the best tool for the job, but sometimes I’ve invested too much time in evaluation.

In a panel discussion on whether software development has become too complex, one of the major complaints was the bewildering number of options. The implicit assumption is that one must evaluate every option. This is an emotional reaction driven by fear of missing out.

Looking back on technologies that have come and gone, the best option was never orders of magnitude better than the second best option. We expect that the choices facing us now matter a great deal, despite knowing that similar decisions in the past didn’t matter that much.

Not only are some of our choices not so important, they don’t last so long either. We act as if we’re picking the technology we’re going to use for the rest of our lives. In reality, we may be picking the technology we’re going to use for the next year.

Very often it’s not worth the deliberation to pick the “best” technology. Pick a good one and don’t look back.

Related posts:

Shallow bugs versus reported bugs
Three-hour per week language
Doing good work with bad tools

{ 2 comments }

The most productive programmers are orders of magnitude more productive than average programmers. But salaries usually fall within a fairly small range in any company. Even across the entire profession, salaries don’t vary that much. If some programmers are 10x more productive than others, why aren’t they paid 10x as much?

Joel Spolsky gave a couple of answers to this question in his most recent podcast. First, programmer productivity varies tremendously across the profession, but it may not vary so much within a given company. Someone who is 10x more productive than his colleagues is likely to leave, either to work with other very talented programmers or to start his own business. Second, extreme productivity may not be obvious. This post elaborates on this second reason.

How can someone be 10x more productive than his peers without being noticed? In some professions such a difference would be obvious. A salesman who sells 10x as much as his peers will be noticed, and compensated accordingly. Sales are easy to measure, and some salesmen make orders of magnitude more money than others. If a bricklayer were 10x more productive than his peers this would be obvious too, but it doesn’t happen: the best bricklayers cannot lay 10x as much brick as average bricklayers. Software output cannot be measured as easily as dollars or bricks. The best programmers do not write 10x as many lines of code and they certainly do not work 10x longer hours.

Programmers are most effective when they avoid writing code. They may realize the problem they’re being asked to solve doesn’t need to be solved, that the client doesn’t actually want what they’re asking for. They may know where to find reusable or re-editable code that solves their problem. They may cheat. But just when they are being their most productive, nobody says “Wow! You were just 100x more productive than if you’d done this the hard way. You deserve a raise.” At best they say “Good idea!” and go on.  It may take a while to realize that someone routinely comes up with such time-saving insights. Or to put it negatively, it may take a long time to realize that others are programming with sound and fury but producing nothing.

The romantic image of an über-programmer is someone who fires up Emacs, types like a machine gun, and delivers a flawless final product from scratch. A more accurate image would be someone who stares quietly into space for a few minutes and then says “Hmm. I think I’ve seen something like this before.”

Related posts:

Writes large correct programs
Experienced programmers and lines of code

{ 96 comments }

Solver Foundation optimization library

by John on December 23, 2009

Microsoft’s Solver Foundation is a numerical optimization library capable of solving problems involving millions of variables and millions of constraints. When I listened to Scott Hanselman interview Nathan Brixius from Microsoft’s Solver Foundation team, I expected Brixius to say that Solver Foundation was written in C++ at its core and had a thin C# veneer to make it callable from .NET applications. Instead, he said that Solver Foundation is entirely written in managed code.

Even in heavy-duty numerical code the bottlenecks may not be numerical. The inner loops of the software would execute faster if they were written in C++, but Solver Foundation solves optimization problems about as quickly as other packages written in lower-level languages.

{ 5 comments }

The virtual machine of the Internet

by John on December 10, 2009

From Douglas Crockford’s talk The State and Future of JavaScript:

There’s pressure to make it [JavaScript] a better compilation target. Now, this is a big surprise. Everybody thought that the Java VM was going to be the VM of the internet, but it turns out that JavaScript language is the VM [virtual machine] of the internet. People are writing in Java, and Python, and lots of other languages, and then translating it into JavaScript because JavaScript, for all of its security problems, actually has a much better security model than everybody else.

Related posts:

Zero-knowledge password management in JavaScript
JavaScript: A picture is worth a thousand words
Programming language subsets

{ 0 comments }

This is one of my favorite quotes from Starbucks’ coffee cups:

When I was young I was misled by flash cards into believing that xylophones and zebras were much more common.

Alphabet books treat every letter as equally important even though letters like X and Z are far less common than letters like E and T. Children need to learn the entire alphabet eventually, and there are only 26 letters, so teaching all the letters at once is not bad. But uniform emphasis doesn’t scale well. Learning a foreign language, or a computer language, by learning words without regard to frequency is absurd. The most common words are far more common than the less common words, and so it makes sense to learn the most common words first.

John Myles White has applied this idea to learning R. He did a keyword frequency analysis for R and showed that the frequency of the keywords follows Zipf’s law or something similar. I’d like to see someone do a similar study for other programming languages.
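For Python, a first pass at such a study might look like the sketch below. This is not White’s method, only one plausible approach using the standard library’s `tokenize` and `keyword` modules; the function name and the sample source are made up for illustration.

```python
import io
import keyword
import tokenize
from collections import Counter

def keyword_frequencies(source):
    """Count how often each Python keyword appears in source code.

    Tokenizing first means keywords inside strings and comments
    are not counted.
    """
    counts = Counter()
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            counts[tok.string] += 1
    return counts

# A tiny stand-in for a real corpus of source files.
sample = '''
def count_signs(numbers):
    neg = 0
    pos = 0
    for n in numbers:  # the word "if" in a comment is ignored
        if n < 0:
            neg += 1
        if n > 0:
            pos += 1
    return neg, pos
'''

freq = keyword_frequencies(sample)
for word, count in freq.most_common():
    print(word, count)
```

Run over a large body of code instead of one small function, the sorted counts could be compared against the rank-frequency pattern that Zipf’s law predicts.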

It would be interesting to write a programming language tutorial that introduces the keywords in approximately the order of their frequency. Such a book might be quite unorthodox, and quite useful.

White points out that when teaching human languages in a classroom, “the usefulness of a word tends to be confounded with its respectability.” I imagine something similar happens with programming languages. Programs that produce lists of Fibonacci numbers or prime numbers are the xylophones and zebras of the software world.

Related posts:

Zebras and xylophones part II: learning Spanish
Rate of regularizing English verbs
Four reasons we don’t apply the 80-20 rule
R, the good parts

{ 5 comments }