Different kinds of software complexity

It has always seemed odd to me that computer science refers to the number of operations necessary to execute an algorithm as the “complexity” of the algorithm. For example, an algorithm that takes O(N3) operations is more complex than an algorithm that takes O(N2) operations. The big-O analysis of algorithms is important, but “complexity” is the wrong name for it.

Software engineering is primarily concerned with mental complexity, the stress on your brain when working on a piece of code. Mental complexity might be unrelated to big-O complexity. They might even be antithetical: a slow, brute-force algorithm can be much easier to understand that a fast, subtle algorithm.

Of course mental complexity is subjective and hard to measure. The McCabe complexity metric, counting the number of independent paths through a chunk of code, is easy to compute and surprisingly helpful, even though it’s a crude approximation to what you’d really like to measure. (SourceMonitor is a handy program for computing McCabe complexity.)

Kolmogorov complexity is another surrogate for mental complexity. The Kolmogorov complexity of a problem is the length of the shortest computer program that could solve that problem. Kolmogorov complexity is difficult to define precisely — are we talking Java programs? Lisp programs? Turing machines? — and practically impossible to compute, but it’s a useful concept.

I like to think of a variation of Kolmogorov complexity even harder to define, the shortest program that “someone who programs like I do” could write to solve a problem. I started to describe this variation as “impractical,” but in some ways it’s very practical. It’s not practical to compute, but it’s a useful idealization to think about as long as you don’t take it too seriously.

One thought on “Different kinds of software complexity

  1. That’s very interesting.

    A long time ago I learned about information theory and saw some fascinating results obtained when examining encoding systems such as the English language and the genome. I was especially interested in the realtionship between efficiency of encoding systems and the probability of errors in transmission. IIRC these or the relationship between them were similar between English and the genome, but it was a long time ago.

    Of couse similar considerations apply in software languages, if you look at the efficiency in encoding algorithms and the probability of bugs.

    In both cases lowered efficiency in encoding information can allow for lowered probability of error. Think about golf-style Perl or assembly as compared to C#. Or even golf style Perl compared to Strict Perl.

Comments are closed.