Suppose project completion time follows a Pareto (power law) distribution with parameter α. That is, for *t* > 1, the probability that completion time is greater than *t* is *t*^{-α}. (We start the clock at *t* = 1 because that makes the calculations a little simpler.)

Now suppose we know that a project has lasted until *t*_{0} so far. Then, provided α > 1 so that the mean is finite, the expected finish time is α*t*_{0}/(α-1) and so the expected additional time is *t*_{0}/(α-1). Note that both are proportional to *t*_{0}. So the longer it has taken, the longer it will take. If the project is running late, you can expect the time remaining to be even more than the time you expected it to take when the project started. The finish line is moving away from you!

For example, suppose α = 2 (in applications of power laws, α is often between 1 and 3) and you’re measuring time in years. When the project starts at *t* = 1, it is expected to take one year, until *t* = 2. Now suppose you’re starting the second year and the project isn’t done. Now it’s expected to finish at *t* = 4, two more years. When you started, the project was supposed to take a year. One year later, it has taken a year, and should be expected to take two more years. I said “should be expected” rather than “is expected” because no one would believe such an estimate. (Ever heard of the Big Dig? Or other megaprojects?)
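Here is a minimal Python sketch of the arithmetic in this example. The function names are made up for illustration and are not from any library. The simulation relies on the fact that, under this model, completion time given that it exceeds *t*_{0} is *t*_{0} times a standard Pareto variable, which is what `random.paretovariate` draws.

```python
import random


def expected_finish(t0, alpha):
    """Expected finish time given the project has lasted until t0.

    Model from the post: P(T > t) = t**(-alpha) for t > 1.
    The conditional mean is finite only when alpha > 1.
    """
    if alpha <= 1:
        raise ValueError("the conditional mean is infinite unless alpha > 1")
    if t0 < 1:
        raise ValueError("time starts at t = 1 in this model")
    return alpha * t0 / (alpha - 1)


def expected_remaining(t0, alpha):
    """Expected additional time beyond t0, i.e. t0 / (alpha - 1)."""
    return expected_finish(t0, alpha) - t0


def simulate_finish(t0, alpha, n=200_000, seed=0):
    """Monte Carlo estimate of the expected finish time given T > t0."""
    rng = random.Random(seed)
    # paretovariate(alpha) has tail x**(-alpha) for x >= 1; scale it by t0.
    return sum(t0 * rng.paretovariate(alpha) for _ in range(n)) / n


if __name__ == "__main__":
    for t0 in (1, 2, 4):
        print(t0, expected_finish(t0, 2), expected_remaining(t0, 2))
    # alpha = 3 is used for the simulation because the sample mean
    # converges very slowly when alpha = 2 (the variance is infinite).
    print(simulate_finish(2, 3), expected_finish(2, 3))
```

With α = 2 the closed-form functions reproduce the numbers above: an expected finish at *t* = 2 when starting from *t* = 1, and at *t* = 4 once the project has reached *t* = 2.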

Note that we have computed the conditional expectation given only the time it has taken so far, and *no other information*. If you know more, for example that some specific pieces have been completed, then you should use that information.

This is related to the Lindy effect. The longer a cultural artifact has been around, the longer it is expected to last into the future.

* * *

Well that’s depressing. However, do you have any reason to believe that “project completion time follows a Pareto (power law) distribution”, or are you just messing with us?

Can you show us the derivation here?

It depends on whether a project is dominated by thin tail or fat tail events. Thin tail time events are unlikely to go too far past their expected times. Maybe there’s some cutoff where if something takes too long, something new happens that puts a limit on the time, such as dropping a feature that takes too long to develop.

Fat tail events are more open ended. Maybe something is expected to take six months, but there’s nothing to prevent it from taking years. When you have these kinds of risks, then the Pareto model is at least qualitatively reasonable. It’s not the only fat tail distribution, but it’s representative.
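To make the thin tail versus fat tail contrast concrete, here is a small Python sketch. The exponential distribution as the thin-tailed stand-in is an assumption made here for illustration; the discussion above does not name a particular thin-tailed model. The function names are hypothetical as well.

```python
def remaining_pareto(t0, alpha):
    """Expected remaining time at t0 when P(T > t) = t**(-alpha), t > 1."""
    if alpha <= 1:
        raise ValueError("the mean is infinite unless alpha > 1")
    return t0 / (alpha - 1)


def remaining_exponential(t0, mean_remaining):
    """Expected remaining time at t0 under an exponential tail.

    Assumed thin-tailed model: the exponential distribution is memoryless,
    so the expected remaining time does not depend on t0 at all.
    """
    return mean_remaining


if __name__ == "__main__":
    print("t0  pareto(alpha=2)  exponential(mean=1)")
    for t0 in (1, 2, 4, 8):
        print(t0, remaining_pareto(t0, 2), remaining_exponential(t0, 1.0))
```

Under the exponential model the expected time remaining stays put no matter how late the project is, while under the Pareto model it grows in proportion to the time already spent.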

It’s not hard to derive, but I’d rather not write it up.

I should have asked how you get α*t*_{0}/(α-1) from the relation in the first paragraph. A bit of integration?
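One way to sketch the integration, assuming α > 1 and writing *T* for the completion time (the symbol *T* is introduced here and is not used in the post): condition the tail probability on having reached *t*_{0}, then use the fact that the mean of a variable bounded below by *t*_{0} is *t*_{0} plus the integral of its tail probability.

```latex
\begin{align*}
P(T > t \mid T > t_0)
  &= \frac{t^{-\alpha}}{t_0^{-\alpha}}
   = \left(\frac{t}{t_0}\right)^{-\alpha}, \qquad t \ge t_0, \\
E[T \mid T > t_0]
  &= t_0 + \int_{t_0}^{\infty} \left(\frac{t}{t_0}\right)^{-\alpha} dt \\
  &= t_0 + t_0^{\alpha} \cdot \frac{t_0^{1-\alpha}}{\alpha - 1} \\
  &= t_0 + \frac{t_0}{\alpha - 1}
   = \frac{\alpha t_0}{\alpha - 1}.
\end{align*}
```

The second term, *t*_{0}/(α-1), is the expected additional time, and the integral diverges when α ≤ 1.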

Someone should derive the relationship to Hofstadter’s Law: “It always takes longer than you expect, even when you take into account Hofstadter’s Law.”

This is one of the most interesting results that I remember from the queuing theory course that I took in grad school, and the one that gets the most shocked looks when I’m asked how long I think it might take for some project about which I have little or no knowledge to reach completion.

Thanks for making it a little better known.