Illegible work

When James Scott uses the word legible, he doesn’t refer to handwriting that is clear enough to read. He uses the word more broadly to mean something that is easy to classify, something that is bureaucrat-friendly. A thing is illegible if it is hard to pigeonhole. I first heard the term from Venkatesh Rao’s essay A Big Little Idea Called Legibility.

Much of the work I do is illegible. If the work were legible, companies would have an employee who does it [1] and they wouldn’t call a consultant.

Here’s a template for a conversation I’ve had a few times:

“We’ve got kind of an unusual problem. It’s related to some things I’ve seen you write about. Have you heard of …?”

“No, what’s that?”

“Never mind. We’d like you to help us with a project. …”

Years ago, when people heard that I worked in statistics they’d ask what programming language I worked in. They expected me to say R or SAS or something like that, but I’d say C++. Not that I recommend doing statistics in C++ [2] in general, but people came to me with unusual projects that they couldn’t get done with standard tools. If an R module would have done what they wanted, they wouldn’t have knocked on my door.

Doing illegible work is a lot of fun, but it’s hard to market. Try typing “Someone who can help with a kinda off the wall math / computer science project” into Google. It’s not helpful. Search engines can only answer questions that are legible to search engines. Illegible work is more likely to come from word of mouth than from a search engine.


[1] Sometimes companies call a consultant because they have occasional need for some skill, something they do not need often enough to justify hiring a full-time employee to do. Or maybe they have the skills in house to do a project but don’t have anyone available. Or maybe they want an outside auditor. But in this post I’m focusing on weird projects.

[2] When I mention C++, I know some people are thinking “But isn’t C++ terribly complex?” Why yes, yes it is. But my colleagues and I already knew C++, and we stuck to a sane subset of the language. It was not unusual to rewrite R code in C++ and make it 100x faster.

“Why don’t you just use C?” These days I’m more likely to write C than C++. Clients don’t want me to write enterprise applications, just small numerical libraries, and they usually ask for C.

What use is mental math in 2022?

Now that most people are carrying around a powerful computer in their pocket, what use is it to be able to do math in your head?

Here’s something I’ve noticed lately: being able to do quick approximations in mid-conversation is a superpower.

When I’m on Zoom with a client, I can’t say “Excuse me a second. Something you said gave me an idea, and I’d like to pull out my calculator app.” Instead, I can say things like “That would require collecting four times as much data. Are you OK with that?”

There’s no advantage to being able to do calculations to six decimal places on the spot like Mr. Spock, and I can’t do that anyway. But being able to come up with one significant figure or even an order-of-magnitude approximation quickly keeps the conversation flowing.

I have never had a client say something like “Could you be more precise? You said between 10 and 15, and our project is only worth doing if the answer is more than 13.2.” If they did say something like that, I’d say “I will look at this more carefully offline and get back to you with a more precise answer.”

I’m combining two closely related but separate skills here. One is the ability to do simple calculations. The other is the ability to know what to calculate, how to do so-called Fermi problems. These problems are named after Enrico Fermi, someone who was known for being able to make rough estimates with little or no data.

A famous example of a Fermi problem is “How many piano tuners are there in New York?” I don’t know whether this goes back to Fermi himself, but it’s the kind of question he would ask. Of course nobody knows exactly how many piano tuners there are in New York, but you could guess about how many piano owners there are, how often a piano needs to be tuned, and how many tuners it would take to service this demand.
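As a rough sketch, the chain of guesses might look like the following. Every number here is an assumption made up for illustration, not data about New York, and the function and parameter names are hypothetical.

```python
def piano_tuner_estimate(
    population=8_000_000,           # assumed: people in the city
    people_per_household=2.5,       # assumed
    households_with_piano=0.02,     # assumed: 1 household in 50 owns a piano
    tunings_per_piano_per_year=1,   # assumed
    tunings_per_tuner_per_day=4,    # assumed
    working_days_per_year=250,      # assumed
):
    """Fermi-style estimate: yearly demand for tunings divided by
    the number of tunings one tuner can do in a year."""
    pianos = population / people_per_household * households_with_piano
    tunings_needed = pianos * tunings_per_piano_per_year
    tunings_per_tuner = tunings_per_tuner_per_day * working_days_per_year
    return tunings_needed / tunings_per_tuner

estimate = piano_tuner_estimate()
print(f"Roughly {estimate:.0f} tuners")
```

With these guesses the answer comes out around 64, so "several dozen" rather than "a handful" or "thousands." The point is the order of magnitude, not the digits.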

The piano tuner example is more complicated than the kinds of calculations I have to do on Zoom calls, but it may be the best-known Fermi problem.

In my work with data privacy, for example, I often have to estimate how common some combination of personal characteristics is. Of course nobody should bet their company on guesses I pull out of the air, but it does help keep a conversation going if I can say on the spot whether something sounds like a privacy risk or not. If a project sounds feasible, then I go back and make things more precise.
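Here is a minimal sketch of that kind of back-of-the-envelope estimate. It assumes the characteristics are independent, which is often only roughly true, and the population and fractions are hypothetical numbers chosen for illustration.

```python
from math import prod

def expected_matches(population, fractions):
    """Expected number of people who share every listed characteristic,
    assuming the characteristics are independent of one another."""
    return population * prod(fractions)

# Hypothetical example: a region of 1,000,000 people, and a record that
# gives sex (about 1/2), birth year (about 1/80), and birthday (about 1/365).
n = expected_matches(1_000_000, [1 / 2, 1 / 80, 1 / 365])
print(f"About {n:.0f} people expected to match")
```

Under these assumptions about 17 people match, which is the sort of one-significant-figure answer that tells you whether a closer look is warranted.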

A Bayesian approach to pricing

Suppose you want to determine how to price a product and you initially don’t know what the market is willing to pay. This post outlines some of the things you might think about, and how Bayesian modeling might help.

This post is not the final word on the subject, or even my final word on the subject. It is essentially a reply to a friend’s question turned into a blog post rather than an email.

Prior information

You must have some idea, however vague, what the market value of your product is. If you had absolutely no idea what a product is worth, you wouldn’t be considering it as a business opportunity.

There is always prior information. As a former colleague would say, when you want to measure the distance to the moon, you know not to pick up a yard stick. Whenever you do an experiment, something motivated you to do the experiment.

This is an ideal application of Bayesian statistics because you have valuable prior information before you have data. Until you have a moderate amount of data, your prior information may be more useful than your data.

Some people will say you should only act on data, not on subjective prior knowledge, but this is impossible. When you offer your product for the first time, you have no data. All you have to go on is prior information. You could hide your prior information in the design of an experiment rather than making it explicit in a prior distribution, but it’s still there.


Assume the market price for some product can be modeled by a random variable X that depends on some parameter θ. I’m not saying that the price is random in any philosophical sense, only that it is useful to model it as random. More on this line of thinking here.

By modeling market price as a random variable rather than a single number, we’re acknowledging that it has some fuzziness to it. Different customers are willing to pay different amounts for the same product. Maybe the prices they’re willing to pay are tightly distributed around some center, or maybe there’s substantial variance.

When we make a sale, or fail to make a sale, we learn something about θ. But notice that we don’t observe X per se, we observe whether a particular sample from X was above or below the offer price. You’re not conducting a survey where you ask “What is the highest price p that you’d be willing to pay” and get a candid answer. You make an offer x, and it is either accepted or rejected. You observe whether x < p or x > p.

This means the likelihood function is similar to what you’d see in modeling survival data, but a little different. When someone dies, you fully observe their survival time. But if you follow up with someone and they’re still alive, you only know a lower bound on their survival time, not the survival time itself. We say the data is censored because we haven’t yet observed everything we want to know.

Survival data is usually asymmetric, censored on one side but not the other. You could have two-sided censoring, but that’s less common.

With pricing your data is always censored in both directions. You either get a lower bound or an upper bound on what someone would have been willing to pay.

After each offer and response, you can update your estimate of θ. Each interaction gives you a better idea of the distribution on θ.
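One way to make the updating concrete is a grid approximation. The sketch below is only an illustration under assumptions not stated above: each customer's maximum price p is modeled as normal with unknown mean θ and a known standard deviation sigma, the prior on θ is a discretized normal, and the offers and responses are made up.

```python
from math import erf, exp, sqrt

def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def update(grid, weights, offer, accepted, sigma):
    """One Bayesian update of a discrete distribution on theta.

    An accepted offer tells us only that offer < p; a rejected offer
    tells us only that offer > p. Either way the data are censored."""
    new_weights = []
    for theta, w in zip(grid, weights):
        p_reject = normal_cdf((offer - theta) / sigma)  # P(p < offer | theta)
        likelihood = (1.0 - p_reject) if accepted else p_reject
        new_weights.append(w * likelihood)
    total = sum(new_weights)
    return [w / total for w in new_weights]

def posterior_mean(grid, weights):
    return sum(t * w for t, w in zip(grid, weights))

# Assumed prior: theta is probably near 100, give or take 20.
grid = [50 + i for i in range(101)]  # candidate values of theta, 50..150
prior = [exp(-0.5 * ((t - 100) / 20) ** 2) for t in grid]
weights = [w / sum(prior) for w in prior]
sigma = 10.0  # assumed spread of willingness to pay

# Hypothetical history of (offer price, accepted?) pairs.
observations = [(90, True), (120, False), (100, True), (110, False)]
for offer, accepted in observations:
    weights = update(grid, weights, offer, accepted, sigma)

print(f"Posterior mean of theta: {posterior_mean(grid, weights):.1f}")
```

A grid keeps the arithmetic visible. With more parameters, such as an unknown spread, you would likely reach for a sampling method instead.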


Now suppose after numerous observations you’re moderately confident in your knowledge of θ. Now what?

One response is “Well then you charge what the market is most likely to bear.” That’s kind of a simplistic optimization. It implicitly assumes you’re OK with a 50% chance of a sale going through. Maybe your business is struggling and you don’t have many leads. Then you want a higher conversion rate. Or maybe you’re doing well, have plenty of leads, and are OK with a low conversion rate. This is especially the case if the distribution on market price has a lot of variance; if the variance is low it makes more sense to think of “the” price as if it were a single number.
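To illustrate the trade-off, here is a small sketch that picks an offer price for a target conversion rate. It assumes willingness to pay is normally distributed and treats θ and the spread as known point estimates, both simplifications; the function name and numbers are hypothetical.

```python
from statistics import NormalDist

def price_for_conversion(theta, sigma, target_rate):
    """Offer price that is accepted with probability target_rate,
    assuming willingness to pay is Normal(theta, sigma)."""
    # An offer x is accepted when x < p, so we need P(p > x) = target_rate,
    # i.e. x is the (1 - target_rate) quantile of the distribution.
    return NormalDist(theta, sigma).inv_cdf(1.0 - target_rate)

for rate in (0.8, 0.5, 0.2):
    price = price_for_conversion(100, 10, rate)
    print(f"target conversion {rate:.0%}: offer about {price:.2f}")
```

With a center of 100 and a spread of 10, an 80% conversion target means offering about 91.58, a 50% target means offering 100, and a 20% target means offering about 108.42. Shrink the spread and all three prices crowd toward 100, which is the sense in which a low-variance market has "the" price.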

So far I have implicitly assumed that the only consequence of asking too much is a lost sale. But if you ask for too much, you might lose future sales, even if you get the current sale. I’ve also assumed that customers always prefer lower prices. That’s not the case. Asking too little for a product can hurt your credibility, for example.

Estimating willingness to pay is complicated, and determining what to do once you’ve made that estimate is complicated as well. This post is just a sketch of the thought process a company might go through.

Wire gauge and user perspective

Wire gauge is a perennial source of confusion: larger numbers denote smaller wires. The reason is that gauge numbers were assigned from the perspective of the manufacturing process. Thinner wires require more steps in production. This is a common error in user interface design and business more generally: describing things from your perspective rather than from the customer’s perspective.


When you order food at a restaurant, the person taking your order may rearrange your words before repeating them back to you. The reason may be that they’re restating it in manufacturing order, the order in which the person preparing the food needs the information.


A rheostat is a device for controlling resistance in an electrical circuit. It would seem natural for an engineer to give a user a control to vary resistance in ohms. But Ohm’s law says

V = IR,

i.e. voltage equals current times resistance. Users expect that when they turn a knob clockwise they get more of something—brighter lights, louder music, etc.—and that means more voltage or more current, which means less resistance. Asking users to control resistance reverses expectations.

If I remember correctly, someone designed a defibrillator once where a knob controlled resistance rather than current. If that didn’t lead to someone dying, it easily could have.


When I worked for MD Anderson Cancer Center, I managed the development of software for clinical trial design and conduct. Our software started out very statistician-centric and became more user-centric. This was a win, even for statisticians.

The general pattern was to move from eliciting technical parameters to eliciting desired behavior. Tell us how you want the design to behave and we’ll solve for the parameters to make that happen. Sometimes we weren’t able to completely automate parameter selection, but we were at least able to give the user a head start in knowing where to look.


Technical people don’t always want to have their technical hat on. Sometimes they want to relax and be consumers. When statisticians wanted to crank out a clinical trial design, they wanted software that was easy to use rather than technically transparent. That’s the backstory to this post.

It’s generally a good idea to conceal technical details, but provide a “service panel” to expose details when necessary.

Black Swan Gratification

Psychologists say that random rewards are more addictive than steady, predictable rewards. But I believe this only applies to relatively frequent feedback. If rewards are too infrequent, there’s no emotional connection between behavior and reward. The connection becomes more intellectual and less visceral as feedback becomes less frequent and less predictable.

Nassim Taleb distinguishes between delayed gratification and random gratification in his foreword to the book Safe Haven by Mark Spitznagel.

There are activities with remote payoff and no feedback that are ignored by the common crowd. … So what this idea is about isn’t delayed gratification but the ability to operate without gratification — or rather, with random gratification.

Choosing a course of action that is certain to pay off a year from now is opting for delayed gratification. Choosing something that is likely to pay off eventually, maybe two years from now, or maybe next week, is opting for random gratification.

Random rewards encourage an addictive response to frequent feedback, and discourage a rational response to infrequent feedback.

The solution is to act on principle, rather than respond like the rats in the psychological studies alluded to above.


Not-to-do list

There is an apocryphal [1] story that Warren Buffett once asked someone to list his top 25 goals in order. Buffett then told him that he should avoid items 6 through 25 at all costs. The idea is that worthy but low-priority goals distract from high-priority goals.

Paul Graham wrote something similar about fake work. Blatantly non-productive activity doesn’t dissipate your productive energy as unimportant work does.

I have a not-to-do list, though it’s not as rigorous as the “avoid at all costs” list that Buffett is said to have recommended. These are not hard constraints, but more like what optimization theory calls soft constraints, more like stiff springs than brick walls.

One of the things on my not-to-do list is work with students. They don’t have money, and they often want you to do their work for them, e.g. to write the statistical chapter of their dissertation. It’s easier to avoid ethical dilemmas and unpaid invoices by simply turning down such work. I haven’t made exceptions to this one.

My softest constraint is to avoid small projects, unless they’re interesting, likely to lead to larger projects, or wrap up quickly. I’ve made exceptions to this rule, some of which I regret. My definition of “small” has generally increased over time.

I like the variety of working on lots of small projects, but it becomes overwhelming to have too many open projects at the same time. Also, transaction costs and mental overhead are proportionally larger for small projects.

Most of my not-to-do items are not as firm as my prohibition against working with students but more firm than my prohibition against small projects. These are mostly things I have pursued far past the point of diminishing return. I would pick them back up if I had a reason, but I’ve decided not to invest any more time in them just-in-case.

Sometimes things move off my not-to-do list. For example, Perl was on my not-to-do list for a long time. There are many reasons not to use Perl, and I agree with all of them in context. But nothing beats Perl for small text-munging scripts for personal use.

I’m not advocating my personal not-to-do list, only the idea of having a not-to-do list. And I’d recommend seeing it like a storage facility rather than a landfill: some things may stay there a while then come out again.

I’m also not advocating evaluating everything in terms of profit. I do lots of things that don’t make money, but when I am making money, I want to make money. I might take on a small project pro bono, for example, that I wouldn’t take on for work. I heard someone say “Work for full rate or for free, but not for cheap” and I think that’s good advice.


[1] Some sources say this story may be apocryphal. But “apocryphal” means of doubtful origin, so it’s redundant to say something may be apocryphal. Apocryphal does not mean “false.” I’d say a story might be false, but I wouldn’t say it might be apocryphal.

More stability, less stress

It’s been eight years since I started my consulting business. Two of the things I love about having my own business are the stability and the reduced stress. This may sound like a joke, but I’m completely serious.

Having a business is ostensibly less stable and more stressful than having a salaried job, but at a deeper level it can be more stable and less stressful.

If you are an employee, you have one client. If you lose that client, you lose 100% of your income. If you have a business with a dozen clients, losing a client or two at the same time is disappointing, but it’s not devastating.

As for stress, I prefer the stress of owning a business to the stresses of employment. My net stress level dropped when I went out on my own. My sleep, for example, improved immediately.

At first I never knew where the next project was coming from. But I found this less stressful than office politics, questioning the value of my work, a lack of correlation between my efforts and my rewards, etc.

If you’re thinking of striking out on your own, I wish you well. Here is some advice I wrote a few years ago that you may find helpful.

Simultaneous projects

I said something to my wife this evening to the effect that it’s best for employees to have one or at most two projects at a time. Two is good because you can switch off when you’re tired of one project or if you’re waiting on input. But with three or more projects you spend a lot of time task switching.

She said “But …” and I immediately knew what she was thinking. I have a lot more than two projects going on. In fact, I would have to look at my project tracker to know exactly how many projects I have going on right now. How does this reconcile with my statement that two projects is optimal?

Unless you’re doing staff augmentation contracting, consulting work is substantially different from salaried work. For one thing, projects tend to be smaller and better defined.

Also, consultants, at least in my experience, spend a lot of time waiting on clients, especially when the clients are lawyers. So you take on more work than you could handle if everyone wanted your attention at once. At least you work up to that if you can. You balance the risk of being overwhelmed against the risk of not having enough work to do.

Working for several clients in a single day is exhausting, but that’s usually not necessary. My ideal is to do work for one or two clients each day, even if I have a lot of clients who are somewhere between initial proposal and final invoice.

Opposite of the Peter Principle

The Peter Principle is an idea that was developed by Lawrence Peter and expanded into a book coauthored with Raymond Hull in 1969. It says that people rise to their level of incompetence. According to the Peter Principle, competent people are repeatedly promoted until they get to a level where they’re not bad enough to fire but not good enough to promote.

I haven’t thought about the Peter Principle in a while, but I was reminded of it when I was reading One Giant Leap and was struck by this line:

He was the opposite of the Peter Principle.

What a great thing to have someone say about you. So what was the context of that line?

Jane Tindall said it about her husband Bill. The title of that chapter in One Giant Leap is “The Man Who Saved Apollo.” The author, Charles Fishman, is saying indirectly that Bill Tindall was the man who saved Apollo by getting the program’s software development effort on track. The previous chapter, “The Fourth Crew Member” explained how Apollo’s guidance computer, primitive as its hardware was by contemporary standards, was absolutely critical to the missions.

Here’s the paragraph containing the line above.

By 1966 Tindall had had years of management experience; one engineer who worked for him said that Tindall liked remaining the deputy in the divisions where he worked because it gave him more actual ability to get things done, more maneuvering room, and considerably less bureaucratic hassle. Said his wife, Jane, “He was the opposite of the Peter Principle.” [1] Tindall had the ability and experience to absorb, understand, and sort out serious technical problems, and that ability earned him the respect of his colleagues, even when they didn’t get the decision they wanted.

[1] No one used the term “Peter Principle” during the Apollo program because Dr. Peter had not yet coined the term. The quote from Jane Tindall came from Fishman interviewing her in 2016.

Scaling up and down

There’s a worn-out analogy in software development that you cannot build a skyscraper the same way you build a dog house. The idea is that techniques that will work on a small scale will not work on a larger scale. You need more formality to build large software systems.

The analogy is always applied in one direction: up. It’s always an exhortation to use techniques appropriate for larger projects.

But the analogy works in the other direction as well: it’s inappropriate to build a dog house the same way you’d build a skyscraper. It would be possible to build a dog house the way you’d build a skyscraper, but it would be very expensive. Amateur carpentry methods don’t scale up, but professional construction methods don’t scale down economically.

Bias for over-engineering

There’s a bias toward over-engineering because it works, albeit inefficiently, whereas under-engineering does not. You can use a sledgehammer to do a hammer’s job. It’ll be clumsy, and you might hurt yourself, but it can work. And there are tasks where a hammer just won’t get the job done.

Another reason for the bias toward over-engineering is asymmetric risk. If an over-engineered approach fails, you’ll face less criticism than if a simpler approach fails. As the old saying goes, nobody got fired for choosing IBM.

Context required

Simple solutions require context to appreciate. If you do something simple, you’re open to the criticism “But that won’t scale!” You have to defend your solution by explaining that it will scale far enough, and that it avoids costs associated with scaling further than necessary.

Suppose a group is debating whether to walk or drive to lunch. Someone advocating driving requires less context to make his point. He can simply say “Driving is faster than walking,” which is generally true. The burden is on the person advocating walking to explain why walking would actually be faster under the circumstances.

Writing prompt

I was using some database-like features in Emacs org-mode this morning and that’s what prompted me to write this post. I can just hear someone say “That won’t scale!” I often get this reaction from someone when I write about a simple, low-tech way to do something on a small scale.

Using a text file as a database doesn’t scale. But I have 88 rows, so I think I’ll be OK. A relational database would be better for storing millions of records, but that’s not what I’m working on at the moment.
