Just-in-case revisited

Just-in-time learning means learning something just when you need it. The alternative is just-in-case, learning something in case you need it. I discussed this in an earlier post, and today I’d like to add a little to that discussion.

There are some things you need to know (or at least be familiar with) before you have a chance to use them. Here’s a variation on that idea: some things you need to have practiced before you need them in order to overcome an effort barrier.

Suppose you tell yourself that you’ll learn to use Photoshop or GIMP when you need to. Then you need to edit a photo. Faced with the prospect of learning either of these software packages, you might decide that the photo in question looks good enough after all.

There are things that in principle you could learn just-in-time, though in practice this is not psychologically feasible. The mental “activation energy” is too high. Some things you need to practice before hand, not because you couldn’t look them up when needed, but because they would be too daunting to learn when needed.

Related post: Bicycle skills

Why a little knowledge is a dangerous thing

Alexander Pope famously said

A little learning is a dangerous thing;
Drink deep, or taste not the Pierian spring:
There shallow draughts intoxicate the brain,
And drinking largely sobers us again.

I’ve been thinking lately about why a little knowledge is often a dangerous thing, and here’s what I’ve come to.

Any complex system has many causes acting on it. Some of these are going to be more legible than others. Here I’m using “legible” in a way similar to how James Scott uses the term. As Venkatesh Rao summarizes it,

A system is legible if it is comprehensible to a calculative-rational observer looking to optimize the system from the point of view of narrow utilitarian concerns and eliminate other phenomenology. It is illegible if it serves many functions and purposes in complex ways, such that no single participant can easily comprehend the whole. The terms were coined by James Scott in Seeing Like a State.

People who have a little knowledge of a subject are only aware of some of the major causes that are acting, and probably they are aware of the most legible causes. They have an unbalanced view because they are aware of the forces pushing in one direction but not aware of other forces pushing in other directions.

A naive view may be unaware of a pair of causes in tension, and may thus have a somewhat balanced perspective. And an expert may be aware of both causes. But someone who knows about one cause but not yet about the other is unbalanced.


When I first started working at MD Anderson Cancer Center, I read a book on cancer called One Renegade Cell. After reading the first few chapters, I wondered why we’re not all dead. It’s easy to see how cancer can develop from one bad cell division and kill you a few weeks later. It’s not as easy to understand why that doesn’t usually happen. The spreading of cancer is more legible than natural defenses against cancer.

I was recently on the phone with a client who had learned enough about data deidentification to become worried. I explained that there were also reasons to not be as worried, but that they’re more complicated, less legible.

What to do

Theories are naturally biased toward causes that are amenable to theory, toward legible causes. Practical experience and empirical data tend to balance out theory by providing some insight into less legible causes.

A little knowledge is dangerous not so much because it is partial but because it is biased; it’s often partial in a particular way, such as theory lacking experience. If you spiral in on knowledge in a more balanced manner, with a combination of theory and experience, you might not be as dangerous along the way.

When theory and reality differ, the fault lies in the theory. More on that in my next post. Theory necessarily leaves out complications, and that’s what makes it useful. The art is knowing which complications can be safely ignored under which circumstances.

Related posts

Make boring work harder

I was searching for something this morning and ran across several pages where someone blogged about software they wrote to help write their dissertations. It occurred to me that this is a pattern: I’ve seen a lot of writing tools that came out of someone writing a dissertation or some other book.

The blog posts leave the impression that the tools required more time to develop than they would save. This suggests that developing the tools was a form of moral compensation, procrastinating by working on something that feels like it’s making a contribution to what you ought to be doing.

Even so, developing the tools may have been a good idea. As with many things in life, it makes more sense when you ask “Compared to what“? If the realistic alternative to futzing around with scripts was to write another chapter of the dissertation, then developing the tools was not the best use of time, assuming they don’t actually save more time than they require.

But if the realistic alternative was binge watching some TV series, then writing the tools may have been a very good use of time. Any time the tools save is profit if the time that went into developing them would otherwise have been wasted.

Software developers are often criticized for developing tools rather than directly developing the code they’re paid to write. Sometimes these tools really are a good investment. But even when they’re not, they may be better than the realistic alternative. They may take time away from Facebook rather than time away from writing production code.

Another advantage to tool building, aside from getting some benefit from time that otherwise would have been wasted, is that it builds momentum. If you can’t bring yourself to face the dissertation, but you can bring yourself to write a script for writing your dissertation, you might feel more like facing the dissertation afterward.

Related post: Automate to save mental energy, not time

A different kind of computational survival

Abandoned shopping mall

Last year I wrote a post about being a computational survivalist, someone able to get their work done with just basic command line tools when necessary. This post will be a different take on the same theme.

I just got a laptop from an extremely security-conscious client. I assume it runs Windows 10 and that I will not be able to install any software without an act of Congress. I don’t know yet because I haven’t booted it up. And in fact I cannot boot it up yet because I don’t have the badge yet to unlock it.

If being able to work with just default command line tools is like wilderness survival, being able to work with only consumer software is like urban survival, like trying to live in an abandoned shopping mall.

There must be some scientific software on the laptop. I imagine I may have to re-implement from scratch some tools that aren’t installed. I’ve been in that situation before.

One time I was an expert witness on a legal case and had to review the other side’s software. I could only work on their laptop, from their attorney’s office, with no network connection and no phone. I could request some software to be installed before I arrived, so I asked them to put Python on the laptop. I could bring books into the room with the laptop, so I brought the Python Cookbook with me.

If you don’t have grep, sed, or awk, but you do have Perl, you and roll your own version of the utilities in a few lines of code. For example, see Perl as a better grep.

I always use LaTeX for writing math, but the equation editor in Microsoft Word supports a large amount of LaTeX syntax. Or at least it did when I last tried it a decade ago.

The Windows command line has more Unix-like utilities than you might imagine. Several times I’ve typed a Unix command at the Windows cmd.exe prompt, thought “Oh wait. I’m on Windows, so that won’t work,” and the command works. The biggest difference between the Windows and Linux command lines is not the utilities per se. You can install many of the utilities, say through GOW.

The biggest difference in command lines is that on Windows, each utility parses its own arguments, whereas on Linux the shell parses the arguments first and passes the result to the utilities. So, for example, passing multiple files to a utility may or may not work on Windows, depending on the capability of the utility. On Linux, this just works because it is the shell itself rather than the utilities launched from the shell that orchestrates the workflow.

I expect this new project will be very interesting, and worth putting up with the minor annoyances of not having my preferred tools at my fingertips. And maybe it won’t be as hard as I imagine to request new software. If not, it can be fun to explore workarounds.

It’s sort of a guilty pleasure to find a way to get by without the right tool for the job. It would be a waste of time under normal circumstances, and not something the client should be billed for, but you can hack with a clear conscience when you’re forced into doing so.

The worst tool for the job

I don’t recall where I read this, but someone recommended that if you need a tool, buy the cheapest one you can find. If it’s inadequate, or breaks, or you use it a lot, then buy the best one you can afford. (Update: Thanks to Jordi for reminding me in the comments that this comes from Kevin Kelly.)

If you follow this strategy, you’ll sometimes waste a little money by buying a cheap tool before buying a good one. But you won’t waste money buying expensive tools that you rarely use. And you won’t waste money by buying a sequence of incrementally better tools until you finally buy a good one.

The advice above was given in the context of tools you’d find in a hardware store, but I’ve been thinking about it in the context of software tools. There’s something to be said for having crude tools that are convenient for small tasks, and sophisticated tools that are appropriate for big tasks, but not investing much in the middle. That’s kind of what I was getting at in my recent post From shell to system.

I’m making a bunch of diagrams for a new project, and the best tool for the job would probably be Adobe Illustrator because professionals routinely use it to make high-quality vector art. But I’m not doing that. I’m drawing ASCII art diagrams, just boxes and arrows drawn in plain text. Something like the drawing below.

  +--------------+ compiles to +---+  
  | Foo language | ----------> | C |  
  +--------------+             +---+  
         | embeds into
    | Bar DSL |

The crude nature of ASCII art is a feature, not a bug. There is no temptation to be precious [*] about the aesthetics since the end product isn’t going to win any design awards in any case. There are compelling incentives to keep the diagrams small and simple. It encourages keeping the focus on content and give up on aesthetics once you hit diminishing return, which occurs fairly quickly.

Drawing ASCII diagrams is clumsy, even with tools that make it easier. Wouldn’t it be faster to use a tool meant for drawing? Well, yes and no. Drawing individual graphic elements would be faster in a drawing tool. But inevitably I’d spend more time on the appearance of the graphs, and so ultimately it would be slower.

The initial motivation for making ASCII diagrams was to keep diagrams and source code in the same file, not to eliminate the temptation to spend too much time tweaking graphics. The latter was a positive unintended consequence.


I’m not doing this completely bare-knuckles. Emacs has tools like artist-mode that make it easier than manually positioning every character. And I’m using  DITAA sometimes to compile the plain text diagrams into graphics more appropriate for pasting into a report. The example above compiles to the image below.

DITAA example

More on how this works here.


[*] Not precious as in valuable, but precious as in affectedly or excessively refined. As in filmmaker Darren Doane’s slogan “We are not precious here.”

Distracted by the hard part

Last night I was helping my daughter with calculus homework. I told her that a common mistake was to forget what the original problem was after getting absorbed in sub-problems that have to be solved. I saw this over and over when I taught college.

Then a few minutes later, we both did exactly what I warned her against. She took the answer to a difficult sub-problem to be the final answer. I checked her work and confirmed that it was correct, until I saw we hadn’t actually answered the original question.

As I was waking up this morning, I realized I was about to make the same mistake on a client’s project. The goal was to write software to implement a function f which is a trivial composition of two other functions g and h. These two functions took a lot of work, including a couple levels of code generation. I felt I was done after testing g and h, but I forgot to write tests for f, the very thing I was asked to deliver.

This is a common pattern that goes beyond calculus homework and software development. It’s why checklists are so valuable. We resist checklists because they insult our intelligence, and yet they greatly reduce errors. Experienced people in every field can skip a step, most likely a simple step, without some structure to help them keep track.

Related posts

One of these days I’m going to figure this out

If something is outside your grasp, it’s hard to know just how far outside it is.

Many times I’ve intended to sit down and understand something thoroughly, and I’ve put it off for years. Maybe it’s a programming language that I just use a few features of, or a book I keep seeing references to. Maybe it’s a theorem that keeps coming up in applications. It’s something I understand enough to get by, but I feel like I’m missing something.

I’ll eventually block off some time to dive into whatever it is, to get to the bottom of things. Then in a fraction of the time I’ve allocated, I do get to the bottom and find out that I wasn’t that far away. It feels like swimming in a water that’s just over your head. Your feet don’t touch bottom, and you don’t try to touch bottom because you don’t know how far away bottom is, but it was only inches away.

A few years ago I wrote about John Conway’s experience along these lines. He made a schedule for the time he’d spend each week working on an open problem in group theory, and then he solved it the first day. More on his story here. I suspect that having allocated a large amount of time to the problem put him in a mindset where he didn’t need a large amount of time.

I’ve written about this before in the context of simplicity and stress reduction: a little simplicity goes a long way. Making something just a little bit simpler can make an enormous difference. Maybe you only reduce the objective complexity by 10%, but you feel like you’ve reduced it by 50%. Just as you can’t tell how far away you are from understanding something when you’re almost there, you also can’t tell how complicated something really is when you’re overwhelmed. If you can simplify things enough to go from being overwhelmed to not being overwhelmed, that makes all the difference.

Quantum leaps

A literal quantum leap is a discrete change, typically extremely small [1].

A metaphorical quantum leap is a sudden, large change.

I can’t think of a good metaphor for a small but discrete change. I was reaching for such a metaphor recently and my first thought was “quantum leap,” though that would imply something much bigger than I had in mind.

Sometimes progress comes in small discrete jumps, and only in such jumps. Or at least that’s how it feels.

There’s a mathematical model for this called the single big jump principle. If you make a series of jumps according to a fat tailed probability distribution, most of your progress will come from your largest jump alone.

Your distribution can be continuous, and yet there’s something subjectively discrete about it. If you have a Lévy distribution, for example, your jumps can be any size, and so they are continuous in that sense. But when the lion’s share of your progress comes from one jump, it feels discrete, as if the big jump counted and the little ones didn’t.

Related posts

[1] A literal quantum leap, such an electron moving from one energy level to another in a hydrogen atom, is on the order of a billionth of a billionth of a joule. A joule is roughly the amount of energy needed to bring a hamburger to your mouth.

The hard part in becoming a command line wizard

I’ve long been impressed by shell one-liners. They seem like magical incantations. Pipe a few terse commands together, et voilà! Out pops the solution to a problem that would seem to require pages of code.

Source http://dilbert.com/strip/1995-06-24

Are these one-liners real or mythology? To some extent, they’re both. Below I’ll give a famous real example. Then I’ll argue that even though such examples do occur, they may create unrealistic expectations.

Bentley’s exercise

In 1986, Jon Bentley posted the following exercise:

Given a text file and an integer k, print the k most common words in the file (and the number of their occurrences) in decreasing frequency.

Donald Knuth wrote an elegant program in response. Knuth’s program runs for 17 pages in his book Literate Programming.

McIlroy’s solution is short enough to quote below [1].

    tr -cs A-Za-z '
    ' |
    tr A-Z a-z |
    sort |
    uniq -c |
    sort -rn |
    sed ${1}q

McIlroy’s response to Knuth was like Abraham Lincoln’s response to Edward Everett at Gettysburg. Lincoln’s famous address was 50x shorter than that of the orator who preceded him [2]. (Update: There’s more to the story. See [3].)

Knuth and McIlroy had very different objectives and placed different constraints on themselves, and so their solutions are not directly comparable. But McIlroy’s solution has become famous. Knuth’s solution is remembered, if at all, as the verbose program that McIlroy responded to.

The stereotype of a Unix wizard is someone who could improvise programs like the one above. Maybe McIlroy carefully thought about his program for days, looking for the most elegant solution. That would seem plausible, but in fact he says the script was “written on the spot and worked on the first try.” He said that the script was similar to one he had written a year before, but it still counts as an improvisation.

Why can’t I write scripts like that?

McIlroy’s script was a real example of the kind of wizardry attributed to Unix adepts. Why can’t more people quickly improvise scripts like that?

The exercise that Bentley posed was the kind of problem that programmers like McIlroy solved routinely at the time. The tools he piped together were developed precisely for such problems. McIlroy didn’t see his solution as extraordinary but said “Old UNIX hands know instinctively how to solve this one in a jiffy.”

The traditional Unix toolbox is full of utilities for text manipulation. Not only are they useful, but they compose well. This composability depends not only on the tools themselves, but also the shell environment they were designed to operate in. (The latter is why some utilities don’t work as well when ported to other operating systems, even if the functionality is duplicated.)

Bentley’s exercise was clearly text-based: given a text file, produce a text file. What about problems that are not text manipulation? The trick to being productive from a command line is to turn problems into text manipulation problems.  The output of a shell command is text. Programs are text. Once you get into the necessary mindset, everything is text. This may not be the most efficient approach to a given problem, but it’s a possible strategy.

The hard part

The hard part on the path to becoming a command line wizard, or any kind of wizard, is thinking about how to apply existing tools to your particular problems. You could memorize McIlroy’s script and be prepared next time you need to report word frequencies, but applying the spirit of his script to your particular problems takes work. Reading one-liners that other people have developed for their work may be inspiring, or intimidating, but they’re no substitute for thinking hard about your particular work.


You get faster at anything with repetition. Maybe you don’t solve any particular kind of problem often enough to be fluent at solving it. If someone can solve a problem by quickly typing a one-liner in a shell, maybe they are clever, or maybe their job is repetitive. Or maybe both: maybe they’ve found a way to make semi-repetitive tasks repetitive enough to automate. One way to become more productive is to split semi-repetitive tasks into more creative and more repetitive parts.

More command line posts

[1] The odd-looking line break is a quoted newline.

[2] Everett’s speech contained 13,607 words while Lincoln’s Gettysburg Address contained 272, a ratio of almost exactly 50 to 1.

[3] See Hillel Wayne’s post Donald Knuth was Framed. Here’s an excerpt:

Most of the “eight pages” aren’t because Knuth is doing LP [literate programming], but because he’s Donald Knuth:

  • One page is him setting up the problem (“what do we mean by ‘word’? What if multiple words share the same frequency?”) and one page is just the index.
  • Another page is just about working around specific Pascal issues no modern language has, like “how do we read in an integer” and “how do we identify letters when Pascal’s character set is poorly defined.”
  • Then there’s almost four pages of handrolling a hash trie.

The “eight pages” refers to the length of the original publication. I described the paper as 17 pages because that the length in the book where I found it.

Stigler’s law and human nature

Stigler’s law of eponymy states that no scientific discovery is named after the first person to discover it. Stephen Stigler acknowledged that he was not the first to realize this.

Of course this is just an aphorism. Sometimes discoveries are indeed named after their discoverers. But the times when this isn’t the case are more interesting.

Stigler’s law reveals something about human nature. The people who get credit for an idea are the ones who make their discovery known. They work in the open rather than in obscurity or secrecy. They may be the first to explain an idea well, or the first to popularize it, or the first to apply it to a problem that people care about. Apparently people value these things more than they value strict accounts of historical priority.

Related post: Stigler’s law and Avogadro’s number