Aaron Evans condensed a good deal of software engineering experience down to less than 140 characters:
It’s amazing how much cleaner your code looks the third time writing it. First time, hack; Second over-engineer; Third = goldilocks.
Michael Fogus posted on Twitter this morning:
Computing: the only industry that becomes less mature as more time passes.
The immaturity of computing is used to excuse every ignorance. There’s an enormous body of existing wisdom but we don’t care.
I don’t know whether computing is becoming less mature, though it may very well be on average, even if individual developers become more mature.
One reason is that computing is a growing profession, so people are entering the field faster than they are leaving. That lowers average maturity.
Another reason is chronological snobbery, alluded to in Fogus’s second tweet. Chronological snobbery is pervasive in contemporary culture, but especially in computing. Tremendous hardware advances give the illusion that software development has advanced more than it has. What could I possibly learn from someone who programmed back when computers were 100x slower? Maybe a lot.
How many ways can you make change for a dollar? This post points to two approaches to the problem, one computational and one analytic.
SICP gives a Scheme program to solve the problem:
(define (count-change amount)
  (cc amount 5))

(define (cc amount kinds-of-coins)
  (cond ((= amount 0) 1)
        ((or (< amount 0) (= kinds-of-coins 0)) 0)
        (else (+ (cc amount
                     (- kinds-of-coins 1))
                 (cc (- amount
                        (first-denomination kinds-of-coins))
                     kinds-of-coins)))))

(define (first-denomination kinds-of-coins)
  (cond ((= kinds-of-coins 1) 1)
        ((= kinds-of-coins 2) 5)
        ((= kinds-of-coins 3) 10)
        ((= kinds-of-coins 4) 25)
        ((= kinds-of-coins 5) 50)))
Concrete Mathematics explains that the number of ways to make change for an amount of n cents is the coefficient of z^n in the power series for the following:
1/((1 - z)(1 - z^5)(1 - z^10)(1 - z^25)(1 - z^50))
Later on the book gives a more explicit but complicated formula for the coefficients.
Both show that there are 292 ways to make change for a dollar.
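As a rough cross-check of the generating-function view, here is a minimal Python sketch (not from SICP or Concrete Mathematics). It assumes the same five denominations as the Scheme program and builds the power series coefficients one factor at a time: multiplying a series by 1/(1 - z^c) amounts to adding to each coefficient the coefficient c places earlier. The name count_change and its parameters are chosen here for illustration.

```python
def count_change(amount, coins=(1, 5, 10, 25, 50)):
    """Count the ways to make change for `amount` cents.

    ways[n] holds the coefficient of z^n in the product of
    1/(1 - z^c) over the coins processed so far.
    """
    ways = [1] + [0] * amount  # the series for the constant 1
    for coin in coins:
        # Multiply the series by 1/(1 - z^coin), truncated at z^amount.
        for n in range(coin, amount + 1):
            ways[n] += ways[n - coin]
    return ways[amount]


print(count_change(100))  # 292
```

Unlike the tree recursion in the Scheme version, this runs in time proportional to the amount times the number of coin kinds.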
The classical education model is based on the trivium of grammar, logic, and rhetoric. See, for example, Dorothy Sayers’ essay The Lost Tools of Learning.
The grammar stage of the trivium could be literal language grammar, but it also applies more generally to absorbing the basics of any subject and often involves rote learning.
The logic stage is more analytic, examining the relationships between the pieces gathered in the grammar stage. Students learn to construct sound arguments.
The rhetoric stage is focused on eloquent and persuasive expression. It is more outwardly focused than the previous stages, more considerate of others. Students learn to create arguments that are not only logically correct, but also memorable, enjoyable, and effective.
It would be interesting to see a classical approach to teaching programming. Programmers often don’t get past the logic stage, writing code that works (as far as they can tell). The rhetoric stage would train programmers to look for solutions that are not just probably correct, but so clear that they are persuasively correct. The goal would be to write code that is testable, maintainable, and even occasionally eloquent.
Parthenon replica in Nashville, TN.
In the context of programming languages, “magic” is often a pejorative term for code that does something other than what it appears to do.
Programmers seem to have a love/hate relationship with magic. Even people who say they don’t like magic (e.g. because it’s hard to debug) end up using it. The Haskell community prides itself on having a transparent language with no magic, and yet monads are slightly magical. The whole purpose of a monad is to hide explicit data flow, though in a principled way. Haskell’s do notation is more magical, and templates are more magical still. (However, I do hear some Haskellers express disdain for templates.)
People who like magic tend to use the word “automagic” instead. It means about the same thing as “magic” but with a positive connotation.
To conclude with a couple of sweeping generalizations, magic fans tend to be tool-oriented (such as Microsoft developers) while magic detractors tend to be language-oriented (such as Haskell developers).
Update: Someone asked me on Twitter about the difference between abstraction and magic. I’d say abstraction hides details, but magic is actively misleading or ironic.
For a daily dose of computer science and related topics, follow @CompSciFact on Twitter.
From Leslie Lamport:
Every time code is patched, it becomes a little uglier, harder to understand, harder to maintain, bugs get introduced.
If you don’t start with a spec, every piece of code you write is a patch.
Which means the program starts out from Day One being ugly, hard to understand, and hard to maintain.
“The essential virtue of category theory is as a discipline for making definitions, and making definitions is the programmer’s main task in life.”
There’s an old joke from Henny Youngman:
I told the doctor I broke my leg in two places. He told me to quit going to those places.
Sometimes tech choices are that easy: if something is too hard, stop doing it. A great deal of pain comes from using a tool outside its intended use, and often that’s avoidable.
For example, when regular expressions get too hard, I stop using regular expressions and write a little procedural code. Or when Python is too slow, I try some simple ways of speeding it up, and if that’s not good enough I switch from Python to C++. If something is too hard to do in Windows, I’ll do it in Linux, and vice versa.
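As a hypothetical illustration (this example is not from the original post), consider checking whether parentheses are balanced. Classic regular expressions cannot match arbitrarily deep nesting, so rather than fighting the tool, a few lines of procedural Python do the job. The function name is_balanced is made up for this sketch.

```python
def is_balanced(text):
    """Return True if every '(' in text has a matching ')'."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0  # nothing left open at the end


print(is_balanced("(a(b)c)"))  # True
print(is_balanced("(()"))      # False
```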
Sometimes there’s not a better tool available and you just have to slog through with what you have. And sometimes you don’t have the freedom to use a better tool even though one is available. But a lot of technical pain is self-imposed. If you keep breaking your leg somewhere, stop going there.
I found this line from Software Foundations amusing:
… we can ask Coq to “extract,” from a Definition, a program in some other, more conventional, programming language (OCaml, Scheme, or Haskell) with a high-performance compiler.
Most programmers would hardly consider OCaml, Scheme, or Haskell “conventional” programming languages, but they are conventional relative to Coq. As the authors said, these languages are “more conventional,” not “conventional.”
I don’t mean to imply anything negative about OCaml, Scheme, or Haskell. They have their strengths — I briefly mentioned the advantages of Haskell just yesterday — but they’re odd birds from the perspective of the large majority of programmers who work in C-like languages.
I’m reading Real World Haskell because one of my clients’ projects is written in Haskell. Some would say that “real world Haskell” is an oxymoron because Haskell isn’t used in the real world, as illustrated by a recent xkcd cartoon.
It’s true that Haskell accounts for a tiny portion of the world’s commercial software and that the language is more popular in research. (There would be no need to put “real world” in the title of a book on PHP, for example. You won’t find a lot of computer science researchers using PHP for its elegance and nice theoretical properties.) But people do use Haskell on real projects, particularly when correctness is a high priority. In any case, Haskell is “real world” for me since one of my clients uses it. As I wrote about before, applied is in the eye of the client.
I’m not that far into Real World Haskell yet, but so far it’s just what I was looking for. Another book I’d recommend is Graham Hutton’s Programming in Haskell. It makes a good introduction to Haskell because it’s small (184 pages) and focused on the core of the language, not so much on “real world” complications.
A very popular introduction to Haskell is Learn You a Haskell for Great Good. I have mixed feelings about that one. It explains most things clearly and the informal tone makes it easy to read, but the humor becomes annoying after a while. It also introduces some non-essential features of the language up front that could wait until later or be left out of an introductory book.
* * *
Everyone would say that it’s important for their software to be correct. But in practice, correctness isn’t always the highest priority, nor should it necessarily be. As the probability of error approaches zero, the cost of development approaches infinity. You have to decide what probability of error is acceptable given the consequences of the errors.
It’s more important that the software embedded in a pacemaker be correct than the software that serves up this blog. My blog fails occasionally, but I wouldn’t spend $10,000 to cut the error rate in half. Someone writing pacemaker software would jump at the chance to reduce the probability of error so much for so little money.
On a related note, see Maybe NASA could use some buggy software.
Many people have drawn Venn diagrams to locate machine learning and related ideas in the intellectual landscape. Drew Conway’s diagram may have been the first. It has at least been frequently referenced.
By this classification, Hector Cuesta’s new book Practical Data Analysis is located toward the “hacking skills” corner of the diagram. No single book can cover everything, and this one emphasizes practical software knowledge more than mathematical theory or details of a particular problem domain.
The biggest strength of the book may be that it brings together in one place information on tools that are used together but whose documentation is scattered. The book is a great source for sample code. The source code is available on GitHub, though it’s more understandable in the context of the book.
Much of the book uses Python and related modules and tools.
It also uses D3.js (with JSON, CSS, HTML, …), MongoDB (with MapReduce, Mongo Shell, PyMongo, …), and miscellaneous other tools and APIs.
There’s a lot of material here in 360 pages, making it a useful reference.
* * *
For daily tips on data science, follow @DataSciFact on Twitter.
Diomidis Spinellis gives a list of 10 software tool sins in The Tools at Hand episode of his Tools of the Trade podcast. Here are his points, but turned around. For each sin he lists, I give the opposite as a virtue.
10. Maintain API documentation with the source code.
9. Integrate unit testing in development.
8. Track bugs electronically.
7. Let the compiler do what it can do better than you.
6. Learn how to script your tools to work together.
5. Pay attention to compiler warnings and fix them.
4. Use a version control system.
3. Use tools to find definitions rather than scanning for them.
2. Use a debugger.
1. Use tools that eliminate repetitive manual editing.
I turned the original list around because I believe it’s easier to agree that the things above are good than it is to see that their lack is bad. Some items are opposites, like #5: you either pay attention to warnings or you ignore them. But some are not, like #8. Tracking bugs electronically is a good idea, but I wouldn’t call tracking bugs on paper a “sin.”
Related post: Reducing development friction comments on another podcast from Diomidis Spinellis.
Diomidis Spinellis gave an insightful list of ways to reduce software development friction in the Tools of the Trade podcast episode The Frictionless Development Environment Scorecard.
The first item on his list grabbed my attention:
Are my personal settings and preferences consistent on all the computers I’m using? Are they stored under version control? Can I install them on a new computer using a single command?
Listening to the podcast provoked me to finally sync my .emacs files on all my computers so that I now have the exact same file on all computers, maintained under version control. (Xah Lee gave me some sample code for creating the branching logic I needed for a few differences between Windows and Linux.)
Here is a small sample of questions from the podcast.
The last question from the podcast summarizes the whole list:
Do I regularly evaluate my development environment to pinpoint and eliminate the sources of friction? Do I help my colleagues do the same?
When you have an array of things, do you name the array with a plural noun because it contains many things, or do you name it with a singular noun because each thing it contains is singular? For example, if you have a collection of words, should you name it word or words?
Does it make any difference if you’re using some container other than an array? For example, if you have a dictionary (a.k.a. map, hash, associative array, etc.) counting word frequencies, should its name be singular or plural?
I’ve never had a convention that I consciously follow. But I’ve often stopped to wonder which way I should name things. One approach may look right when I declare a variable and another when I use it.
Damian Conway has a reasonable suggestion in his book Perl Best Practices. (There are many things in that book that are good advice for people who never touch Perl.) He recommends using plural names for most arrays and singular names for dictionaries and arrays used like dictionaries.
Because hash entries are typically accessed individually, it makes sense for the hash itself to be named in the singular. That convention causes the individual accesses to read more naturally in the code. … On the other hand, array values are more often processed collectively … So it makes sense to name them in the plural, after the group of items they store. … If, however, an array is to be used as a random-access look-up table, name it in the singular, using the same conventions as a hash.
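Here is a small Python sketch of how Conway’s convention might carry over from Perl. The names word_frequencies, words, and count are hypothetical, chosen only to illustrate the idea: the list is processed collectively and gets a plural name, while the dictionary is accessed one entry at a time and gets a singular name.

```python
from collections import Counter


def word_frequencies(text):
    # Plural name: the list is processed as a group.
    words = text.lower().split()
    # Singular name: entries are accessed individually,
    # so count[word] reads as "the count for this word".
    count = Counter()
    for word in words:
        count[word] += 1
    return count


count = word_frequencies("the cat and the hat")
print(count["the"])  # 2
```

With the singular name, the lookup count["the"] reads naturally, which is the point of the quoted passage.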
Dorothy Parker said “It’s not the tragedies that kill us; it’s the messes.”
Sometimes that’s how I feel about computing. I think of messes such as having to remember that arc tangent is atan in R and Python, but arctan in NumPy and a in bc. Or that C, Python, and Perl use else if, elif, and elsif respectively. Or did I switch those last two?
These trivial but innumerable messes keep us from devoting our full energy to bigger problems.
One way to reduce these messes is to use fewer tools. Then you know less to be confused about. If you only use Python, for example, then elif is just how it is. But knowing more tools is worth the added mess, up to a point. Past some point, however, new tools add more mental burden than utility. You have to find the optimal combination of tools for yourself, and that combination will change over time.
To use fewer tools, you may need to use more complex tools. Maybe you can replace a list of moderately complex but inconsistent tools with one tool that is more complex but internally consistent.