Optimal team size

Kevlin Henney’s keynote at GOTO Copenhagen this year discussed how project time varies as a function of the number of people on the project. The most naive assumption is that the time is inversely proportional to the number of people. That is

t = W/n

where t is the calendar time to completion, W is a measure of how much work is to be done, and n is the number of people. This assumes everything on the project can be done in parallel. Nobody waits for anybody else.

The next refinement is to take into account the proportion of work that can be done in parallel. Call this p. Then we have

t = W[1 – p(n-1)/n].

If everything can be done in parallel, p = 1 and t = W/n as before. But if nothing can be done in parallel, p = 0, and so t = W. In other words, the total time is the same whether one person is on the project or more. This is essentially Amdahl’s law.

With the equation above, adding people never slows things down. And if p > 0, every additional person helps at least a little bit.

Next we add a term to account for communication cost. Assume communication costs are proportional to the number of communication paths, n(n – 1)/2. Call the proportionality constant k. Now we have

t = W[1 – p(n-1)/n + kn(n-1)/2].

If k is small but positive, then at first adding more people causes a project to complete sooner. But beyond some optimal team size, adding more people causes the project to take longer.
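To get a feel for where the optimum lands, here is a minimal Python sketch of the equation above; the values of W, p, and k are illustrative assumptions, not numbers from the talk.

    def completion_time(n, W=100.0, p=0.8, k=0.01):
        """Calendar time t = W[1 - p(n-1)/n + k*n*(n-1)/2]."""
        return W * (1 - p * (n - 1) / n + k * n * (n - 1) / 2)

    # Evaluate team sizes 1 through 20 and pick the one that minimizes time.
    times = {n: completion_time(n) for n in range(1, 21)}
    best = min(times, key=times.get)
    print(f"Optimal team size: {best}, completion time: {times[best]:.1f}")

For these made-up numbers the optimum is a handful of people, and raising the communication constant k pulls the optimal size down, as you’d expect.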

Of course none of this is exact. Project time estimation doesn’t follow any simple formula. Think of these equations more as rough guides or metaphors. It’s certainly true that beyond a certain size, adding more people to a project can slow the project down. Kevlin gave examples of projects that were put back on track by reducing the number of people working on them.

My quibble with the equation above is that I don’t think the cost of more people is primarily communication. Communication paths in a real project are not the simple trees of org charts, but neither are they complete graphs. And if the problem were simply communication, then improved communication would mitigate the cost of adding people to a project, though I imagine it hardly does.

I think the cost of adding people to a project has more to do with Parkinson’s Law which says that people make work for each other. (The aphorism form of Parkinson’s Law says that work expands to the time allowed. But the eponymous book explains why work expands, and it is in part because people make extra work for each other.)

[Dust jacket of Parkinson’s Law and Other Studies in Administration]

I wrote about a similar theme in the blog post Maybe you only need it because you have it. Here’s the conclusion of that post:

Suppose a useless project adds staff. These staff need to be managed, so they hire a manager. Then they hire people for IT, accounting, marketing, etc. Eventually they have their own building. This building needs security, maintenance, and housekeeping. No one questions the need for the security guard, but the guard would not have been necessary without the original useless project.

When something seems absolutely necessary, maybe it’s only necessary because of something else that isn’t necessary.

Grateful for failures

I’ve been thinking lately about different things I’ve tried that didn’t work out and how grateful I am that they did not.

The first one that comes to mind is my academic career. If I’d been more successful with grants and publications as a postdoc, it would have been harder to decide to leave academia. I’m glad I left when I did.

When I was in high school I was a fairly good musician. At one point I decided that if I made the all-state band I would major in music. Thank God I didn’t make it.

I’ve also looked back at projects I had hoped to land and realized it’s a good thing they didn’t come through.

In each of these examples, I’ve been forced to turn away from something I was moderately good at to pursue something that’s a better fit for me.

I wonder what failure I’ll be grateful for next.


Selecting clients

One of the themes in David Ogilvy’s memoir Confessions of an Advertising Man is the importance of selecting good clients. For example, he advises “never take associations as clients” because they have “too many masters, too many objectives, too little money.”

He also recommends not taking on clients that are so large that you would lose your independence and financial robustness by taking them on.

I have never wanted to get an account so big that I could not afford to lose it. The day you do that, you commit yourself to living with fear. Frightened agencies lose the courage to give candid advice; once you lose that you become a lackey.

This is what led me to refuse an invitation to compete for the Edsel account. I wrote to Ford: “Your account would represent one-half of our total billing. This would make it difficult for us to sustain our independence of counsel.” If we had entered the Edsel contest, and if we had won it, Ogilvy, Benson & Mather would have gone down the drain with Edsel.

This sort of thinking was very much on my mind when I was preparing to leave my last job to strike out on my own. As Nassim Taleb discusses in Antifragile, a steady job seems safer than entrepreneurship, but in some ways it’s not. With one big client, i.e. an employer, you are less exposed to small risks but more exposed to big risks. Your income doesn’t vary from month to month, unless it suddenly drops to zero.

In addition to looking for good clients, Ogilvy shares several stories of letting go of bad clients. I have yet to resign from a bad client—I haven’t had any bad clients—but I value the option to do so. The option to resign from a project makes it less likely that you’ll find yourself in a project you wish to resign from.

Formulating applied math problems

Somewhere in school I got the backward idea that solving math problems is hard but that formulating them is easy. I don’t know if anybody ever said that to me. Maybe it was just implied by years of solving problems someone else had formulated.

A related wrong idea that I also picked up was that formulating math problems was not a mathematician’s responsibility. Someone, probably an engineer, would formulate the problem and hand it over to a mathematician. That happens occasionally, but that’s not how it usually works.

Formulating problems is hard, and it’s usually the applied mathematician’s responsibility, ideally with generous input from a domain area expert.

There are a lot of ways to turn a real world problem into a math problem, and maybe several of them would be adequate for the task at hand. Then you might as well choose the easiest one to understand and compute. Knowing several ways to formulate a problem increases your chances of finding an approach that’s tractable. And if you can determine what problem really needs to be solved, not just the problem you first see, you give yourself even more options for how to go about it.

Applied mathematicians don’t need to be experts in every area of application, and of course cannot be. But they do need to meet clients half way (or more). They need to know something about the problem domain. They need to listen well and ask good questions. The questions help the mathematician get going, and they may also give the client something new to think about.

Consulting for consultants

They say that doctors make terrible patients, but in my experience consultants make great consulting clients. The best are confident in their own specialization and respect you in yours. They get going quickly and pay quickly. (I’ve only worked for consultants who have small companies. I imagine large consulting companies are as slow as other companies the same size.)

Sometimes consultants working in software development will ask me to help out with some mathematical part of their projects. And sometimes math/stat folks will ask me to help out with some computational part of their projects.

I started my consulting business three years ago. Since then I’ve gotten to know a few other consultants well. This lets me offer a broader range of services to a client by bringing in other people, and sometimes it helps me find projects.

If you’re a consultant and interested in working together, please send me an email introducing yourself. I’m most interested in meeting consultants who have some overlap with what I do but who also have complementary strengths.

Compressing ten years into six months

The other day I ran across a line from Peter Thiel saying that if you have a plan for where you’d like to be in ten years, ask yourself if you could get there in six months.

I don’t think he’s simply saying see if you can do everything 20 times faster. If you estimate something will take ten days, it probably can’t be done in half a day. We’re better at estimating things on the scale of days than on the scale of years.

If you expect to finish a project in ten days, you’re probably going to go about it the way you’ve approached similar projects before. There’s not a lot of time for other options. But there are a lot of ways to go about a decade-long project.

Since Thiel is well known for being skeptical of college education, I imagine one of the things he had in mind was starting a company in six months rather than going to college, getting an entry level job, then leaving to start your company.

As I wrote in an earlier post, some things can’t be done slowly.

Some projects can only be done so slowly. If you send up a rocket at half of escape velocity, it’s not going to take twice as long to get where you want it to go. It’s going to take infinitely longer.

Some projects have to be done quickly if they are going to be done at all. Software projects are usually like this. If a software project is expected to take two years, I bet it’ll take five, if it’s not cancelled before then. You have to deliver software faster than the requirements change. Of course this isn’t unique to software. To be successful, you have to deliver any project before your motivation or your opportunity go away.

Overestimating the competition

Richard Feynman tells a story in Surely You’re Joking, Mr. Feynman that I’m reminded of periodically when I realize something is smaller and less sophisticated than I imagined.

[Update: A couple people pointed out in the comments that I got the roles of the two characters in this story reversed, so I’ve corrected this.]

Feynman tells the story of a colleague at Los Alamos, Frederic de Hoffman, describing his company’s attempt to plate plastics with metal. De Hoffman said that his company was making progress, but gave up when he saw that another company, Metaplast Corporation, was apparently way ahead of them, based on Metaplast’s advertising. Feynman had worked at Metaplast a few years earlier, but didn’t tell de Hoffman immediately.

Feynman asked de Hoffman how many chemists he thought Metaplast had.

“I would guess they must have twenty‑five or fifty chemists … How the hell could we compete with them?”

Feynman told de Hoffman “You’ll be interested and amused to know that you are now talking to the chief research chemist of the Metaplast Corporation, whose staff consisted of one bottle‑washer!”

I don’t think Feynman was trying to gloat that he was smarter than the staff of chemists at de Hoffman’s company, though he may have been. Feynman knew all the problems his company had and the times they screwed up. They projected a more confident image in their advertising, and the competition bought it.

Learning (needlessly) hard technology

A few years ago, a friend told me he was thinking about learning a certain technology because it was really hard to use. This was not something that had to be complex to solve a complex problem, but something that was unnecessarily complex. Why would anyone do that?

His reasoning was that as a consultant, he could make good money supporting a technology that’s hard to use. My friend would have more integrity than to recommend something that he didn’t think was a good solution. Perhaps he was thinking of saying something like this to a client: “I wouldn’t recommend this technology if you were starting from scratch. But since you’re invested in it, I’ll help you with it or help you migrate to something else.”

That sounds like an unpleasant way to earn a living. It also sounds risky. If something really is unnecessarily complex, better alternatives are likely to arise, perhaps suddenly. (This assumes people are free to choose alternatives, not prohibited by law, for example.)

Learning a technology that’s complex for good reasons could be a smart and ethical move. The work is harder at lower levels of abstraction, but someone has to solve the problems others would rather not think about. And since not as many people can do that work, it should pay better and be more secure.

There are a couple dangers, however, associated with choosing a more difficult technology. One is the temptation to use it where it isn’t needed. The other is that the set of problems where it is needed may shrink over time.


Project lead time

Large companies take longer to start projects. How much longer?

A plausible guess is that project lead time would be proportional to the logarithm of the company size. If a company with n employees has a hierarchy with every manager having m subordinates, the number of management layers would be around log_m(n). If every project has to be approved by every layer of management, lead time should be logarithmic in the company size. This implies huge companies only take a little longer to start projects than medium-sized companies, and that doesn’t match my experience.

In my experience, lead time is proportional to something like the square root of the company size:

T = k√E

where T is lead time, k is a proportionality constant, and E is the number of employees. For example, someone told me that he moved to a company 1000 times bigger and things seem to move about 30 times slower. That would be consistent with a square root rule.

If T is measured in days and k = 0.5, the square root rule would say that a solo entrepreneur could start a project in half a day, and a company of 130,000 employees would take six months. That seems about right. Of course small companies can move slowly, and large companies can move quickly. But it’s a good rule of thumb to say individuals operate on a scale of days, small-to-medium companies on the scale of weeks, and large companies on the scale of months.
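Here is a small Python sketch of the square root rule; the constant k = 0.5 and the company sizes are the illustrative assumptions used above.

    from math import sqrt

    def lead_time_days(employees, k=0.5):
        """Square root rule of thumb: T = k * sqrt(E), with T in days."""
        return k * sqrt(employees)

    for employees in [1, 100, 10_000, 130_000]:
        print(f"{employees:>7} employees: about {lead_time_days(employees):.1f} days")

This puts an individual at a fraction of a day, small-to-medium companies at days to weeks, and the largest companies at months, roughly matching the scales above.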

The reason may be that large companies scale up well, but they don’t scale down well. They can put together large deals fairly quickly, relative to the size of the deal, but not small deals.

Intellectual property is hard to steal

It’s hard to transfer intellectual property. When I was managing software projects, it would take months to fully transfer a project from one person to another. This was with full access to and encouragement from the original developer. This was a transfer between peers, both part of the same environment with all its institutional memory. If it’s this hard to transfer a project to a colleague, how hard must it be for a competitor to make sense of stolen files?

I’m most familiar with intellectual property in the form of software, but I imagine the same applies to many other forms of intellectual property. Some forms of data are easy to understand, such as a list of passwords. But others, such as source code, require a large amount of context beyond the data. One reason acquisitions fail so often is that the physical assets of a company are not enough. The most valuable assets a company has are often intangible.

Of course companies should protect their intellectual property, but a breach is not necessarily a disaster. On the other hand, the loss of institutional memory may be a disaster.

Balancing profit and learning in A/B testing

A/B testing, or split testing, is commonly used in web marketing to decide which of two design options performs better. If you have so many visitors to a site that the number of visitors used in a test is negligible, conventional randomization schemes are the way to go. They’re simple and effective.

But if you have less traffic, so that the number of visitors involved in a test is appreciable, you might be concerned about lost revenue during the test itself. The point of A/B testing is to improve profitability after the test, not during the test. If you also care about profitability during the test, you might want to consider more alternatives.

My experience with testing comes from a context where the stakes are higher than improving conversion on web sites: treating cancer patients. You want to find out which treatments performed better for the sake of future patients, those who were treated after the randomized trial. But you also want to treat the participants in the clinical trial effectively. Two ways we would do that are early stopping rules and adaptive randomization. Both practices are applicable to A/B testing web pages.

A conventional clinical trial might take a few hundred patients and randomize half to one treatment and half to another. But if one treatment appears to be much more effective, at some point it becomes unconscionable to keep assigning the less effective treatment. So you stop the experiment early. You might want to do the same with web designs. If you planned to show two variations of a page to 500 visitors each, but after 100 visitors it’s obvious which version is performing better, you’d like to stop the test and show everyone the better page. On the other hand, if you have so many visitors that you’re not concerned with what happens to the 1000 visitors in the test, just let the test run to completion.

Another approach is to compromise between equal randomization and early stopping. Suppose A is performing better than B, but not so much better that you’re willing to stop and declare A the winner. You might keep randomizing, but increase the probability that the test will assign A. If A really is better, more visitors will see the better page. But if you’re wrong and B is really better, you may still discover this because some visitors are still seeing B. If B keeps performing better, the tide will turn and the test will prefer it. This is called adaptive randomization. The more evidence there is that one version is better, the higher the probability that you’ll show people that version.

One way to use adaptive randomization is to allow variable experiment sizes. Instead of deciding a test size in advance, you test until you’re satisfied that you’ve found a winner. That may require fewer visitors than a conventional A/B test. It may also require more, but only when there’s a good reason to keep testing. The test may go into overtime, so to speak, because the two versions are performing similarly, in which case you’d like to keep testing longer to find out which is better.
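Here is a minimal Python sketch of one common way to implement this, Thompson sampling with Beta posteriors plus a crude stopping rule; the conversion rates, prior, threshold, and sample sizes are illustrative assumptions, not a prescription.

    import random

    def posterior_sample(conversions, misses):
        """Draw from a Beta posterior with a uniform Beta(1, 1) prior."""
        return random.betavariate(conversions + 1, misses + 1)

    def adaptive_test(p_a, p_b, max_visitors=1000, threshold=0.95, sims=2000):
        """Thompson-style adaptive randomization with a simple stopping rule.

        p_a and p_b are the true conversion rates used to simulate visitors
        (unknown in practice). Stop once the estimated probability that one
        version is better than the other exceeds `threshold`.
        """
        stats = {"A": [0, 0], "B": [0, 0]}  # [conversions, misses] per version
        for visitor in range(1, max_visitors + 1):
            # Assign the version whose posterior draw is higher.
            version = "A" if posterior_sample(*stats["A"]) > posterior_sample(*stats["B"]) else "B"
            converted = random.random() < (p_a if version == "A" else p_b)
            stats[version][0 if converted else 1] += 1

            # After a burn-in period, estimate P(A is better) by Monte Carlo.
            if visitor >= 50:
                wins_a = sum(posterior_sample(*stats["A"]) > posterior_sample(*stats["B"])
                             for _ in range(sims))
                prob_a_better = wins_a / sims
                if prob_a_better > threshold:
                    return "A", visitor, stats
                if prob_a_better < 1 - threshold:
                    return "B", visitor, stats
        return None, max_visitors, stats  # no clear winner within the budget

    print(adaptive_test(p_a=0.05, p_b=0.08))

The more one version pulls ahead, the more traffic it receives during the test and the sooner the test can stop, which is exactly the trade-off between profit during the test and learning for after it.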

It’s easy to fall into thinking that the winner of a test will be used forever, whether you’re testing web pages or cancer treatments. But this isn’t the case. The winner will eventually be tested against something else, maybe very soon. This means that you might want to put a little more emphasis on the performance during the test and not just performance after the test, because there may not be much opportunity for performance after the test.


Juggling projects

Yesterday on Twitter I said I was thinking about writing the names of each of my clients and leads on balls so I could literally juggle them. I was only half joking.

I didn’t write my clients and leads on balls, but I did write them on index cards. And it helped a great deal. It’s easier to think about projects when you have physical representations you can easily move around. Moving lines up and down in an org-mode file, or even moving boxes around in 2D in OneNote, doesn’t work as well.

Electronic files are great for storing, editing, and querying ideas. But they’re not the best medium for generating ideas. See Create offline, analyze online. See also Austin Kleon’s idea of having two separate desks, one digital and one analog.

Scientifically valid, practically invalid

In a recent episode of EconTalk, Phil Rosenzweig describes how the artificial conditions necessary to make experiments scientifically valid can also make the results practically invalid.

Rosenzweig discusses experiments designed to study decision making. In order to make clean comparisons, subjects are presented with discrete choices over which they have no control. They cannot look for more options or exercise any other form of agency. The result is an experiment that is easy to analyze and easy to publish, but so unrealistic as to tell us little about real-world decision making.

In his book Left Brain, Right Stuff, Rosenzweig quotes Philip Tetlock’s summary:

Much mischief can be wrought by transplanting this hypothesis-testing logic, which flourishes in controlled lab settings, into the hurly-burly of real-world settings where ceteris paribus never is, and never can be, satisfied.

Another reason we don’t apply the 80-20 rule

I’ve written about the 80-20 rule several times because it keeps coming up. I’d like to believe that each time I revisit it I understand it a little better.

In its simplest form the 80-20 rule says 80% of your outputs come from 20% of your inputs. You might find that 80% of your revenue comes from 20% of your customers, or 80% of your headaches come from 20% of your employees, or 80% of your sales come from 20% of your sales reps. The exact numbers 80 and 20 are not important, though they work surprisingly well as a rule of thumb.

The more general principle is that a large portion of your results come from a small portion of your inputs. Maybe it’s not 80-20 but something like 90-5, meaning 90% of your results come from 5% of your inputs. Or 90-13, or 95-10, or 80-25, etc. Whatever the proportion, it’s usually the case that some inputs are far more important than others. The alternative, assuming that everything is equally important, is usually absurd.
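As a toy illustration (mine, not from the post), here is a Python sketch that draws input contributions from a heavy-tailed Pareto distribution and checks what share of the total comes from the top 20% of inputs; the shape parameter is an assumption chosen to be near the classic 80/20 split.

    import random

    random.seed(42)

    # Pareto shape parameter; about 1.16 gives an 80/20 split in theory.
    alpha = 1.16
    contributions = sorted((random.paretovariate(alpha) for _ in range(10_000)),
                           reverse=True)

    top = contributions[: len(contributions) // 5]  # the top 20% of inputs
    share = sum(top) / sum(contributions)
    print(f"Top 20% of inputs account for {share:.0%} of the output")

Because the tail is so heavy, the computed share bounces around from sample to sample, but it is consistently far larger than the 20% you would see if every input mattered equally.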

The 80-20 rule sounds too good to be true. If 20% of inputs are so much more important than the others, why don’t we just concentrate on those? In an earlier post, I gave three reasons. These were:

  1. We don’t look for 80/20 payoffs. We don’t see 80/20 rules because we don’t think to look for them.
  2. We’re not clear about criteria for success. You can’t concentrate your efforts on the 20% with the biggest returns until you’re clear on how you measure returns.
  3. We enjoy less productive activities more than more productive ones. We concentrate on what’s fun rather than what’s effective.

I’d like to add another reason to this list, and that is that we may find it hard to believe just how unevenly distributed the returns on our efforts are. We may have an idea of how things are ordered in importance, but we don’t appreciate just how much more important the most important things are. We mentally compress the range of returns on our efforts.

Making a list of options suggests the items on the list are roughly equally effective, say within an order of magnitude of each other. But it may be that the best option would be 100 times as effective as the next best option. (I’ve often seen that, for example, in optimizing software. Several ideas would reduce runtime by a few percent, while one option could reduce it by a couple orders of magnitude.) If the best option also takes the most effort, it may not seem worthwhile because we underestimate just how much we get in return for that effort.