This post is a set of footnotes to my previous post on the Lindy effect. This effect says that creative artifacts have lifetimes that follow a power law distribution, and hence the things that have been around the longest have the longest expected future.

**Works of art**

The previous post looked at technologies, but the Lindy effect would apply, for example, to books, music, or movies. This suggests the future will be something like a mirror of the present. People have listened to Beethoven for two centuries, the Beatles for about four decades, and Beyoncé for about a decade. So we might expect Beyoncé to fade into obscurity a decade from now, the Beatles four decades from now, and Beethoven a couple centuries from now.

**Disclaimer**

Lindy effect estimates are crude, only considering current survival time and no other information. And they’re probability statements. They shouldn’t be taken too seriously, but they’re still interesting.

**Programming languages**

Yesterday was the 25th birthday of the Perl programming language. The Go language was announced three years ago. The Lindy effect suggests there’s a good chance Perl will be around in 2037 and that Go will not. This goes against your intuition if you compare languages to mechanical or living things. If you look at a 25-year-old car and a 3-year-old car, you expect the latter to be around longer. The same is true for a 25-year-old accountant and a 3-year-old toddler.
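To put rough numbers on this, here is a minimal sketch. It assumes the simplified Pareto model with exponent *c* = 2, which the argument above does not commit to; under that assumption the expected remaining lifetime equals the current age. The year 2012 is inferred from the 2037 figure, and the function name is my own.

```python
def lindy_expected_end_year(age_years, current_year, c=2.0):
    """Expected year of demise under a Pareto lifetime model.

    Conditional on surviving to age b, the expected total lifetime
    is b*c/(c - 1), which requires c > 1.
    """
    if c <= 1:
        raise ValueError("the mean is infinite unless c > 1")
    expected_total = age_years * c / (c - 1)
    birth_year = current_year - age_years
    return birth_year + expected_total

# Ages come from the text; 2012 is inferred from "2037" (an assumption).
perl_end = lindy_expected_end_year(25, 2012)
go_end = lindy_expected_end_year(3, 2012)
print(perl_end, go_end)  # 2037.0 2015.0
```

These are means of a long-tailed distribution, so either language could easily fall well short of, or far outlast, its estimate.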

**Life expectancy**

Someone commented on the original post that for a British female, life expectancy is 81 years at birth, 82 years at age 20, and 85 years at age 65. Your life expectancy goes up as you age. But your expected *additional* years of life do not: they fall from 81 at birth to 62 at age 20 and 20 at age 65. By contrast, imagine a pop song that has a life expectancy of 1 year when it comes out. If it’s still popular a year later, we could expect it to be popular for another couple of years. And if people are still listening to it 30 years after it came out, we might expect it to have another 30 years of popularity.
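The contrast can be made concrete with a little arithmetic. The sketch below uses the life expectancy figures quoted above, and for the Lindy side it assumes a Pareto model with exponent *c* = 2, an illustrative choice under which the expected remaining lifetime equals the current age.

```python
# Life expectancy figures quoted in the comment (British female).
life_expectancy = {0: 81, 20: 82, 65: 85}

def remaining_years_human(age):
    """Expected additional years for the ages in the table above."""
    return life_expectancy[age] - age

def remaining_years_lindy(age, c=2.0):
    """Expected additional years under a Pareto model.

    b*c/(c - 1) - b simplifies to b/(c - 1); c = 2 is an assumption.
    """
    return age / (c - 1)

for age in sorted(life_expectancy):
    print(age, remaining_years_human(age))   # shrinks as age grows

# 30 comes from the pop song example; 10 is an illustrative extra age.
for age in (10, 30):
    print(age, remaining_years_lindy(age))   # grows as age grows
```

For people, the expected remaining years shrink with age. Under the Pareto assumption they grow in proportion to age.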

**Mathematical details**

In my original post I looked at a simplified version of the Pareto density:

*f*(*t*) = *c*/*t*^{c+1}

starting at *t* = 1. The more general Pareto density is

*f*(*t*) = *ca*^{c}/*t*^{c+1}

and starts at *t* = *a*. It follows that if a random variable *X* has a Pareto distribution with exponent *c* and starting time *a*, then the conditional distribution on *X* given that *X* is at least *b* is another Pareto distribution, now with the same exponent but starting time *b*. For *c* > 1, the expected value of *X* *a priori* is *ac*/(*c*-1), but conditional on having survived to time *b*, the expected value is now *bc*/(*c*-1). That is, the expected value has gone up in proportion to the ratio of starting times, *b*/*a*.
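The conditioning claim is easy to check numerically. The survival function of the general Pareto density is P(*X* > *x*) = (*a*/*x*)^c for *x* ≥ *a*, so dividing by P(*X* > *b*) leaves (*b*/*x*)^c, the survival function of a Pareto distribution starting at *b*. The sketch below verifies this at a few points; the function names and the values of *a*, *b*, and *c* are illustrative choices of mine.

```python
def pareto_survival(x, a, c):
    """P(X > x) for a Pareto distribution with exponent c starting at a."""
    return 1.0 if x <= a else (a / x) ** c

def conditional_survival(x, a, c, b):
    """P(X > x | X > b), computed directly from the definition."""
    return pareto_survival(max(x, b), a, c) / pareto_survival(b, a, c)

def pareto_mean(a, c):
    """Mean a*c/(c - 1); finite only for c > 1."""
    if c <= 1:
        raise ValueError("mean is infinite unless c > 1")
    return a * c / (c - 1)

a, c, b = 1.0, 2.0, 10.0   # illustrative values, not from the post
for x in (10.0, 12.0, 20.0, 100.0):
    lhs = conditional_survival(x, a, c, b)   # original Pareto, conditioned
    rhs = pareto_survival(x, b, c)           # fresh Pareto starting at b
    assert abs(lhs - rhs) < 1e-12

print(pareto_mean(b, c) / pareto_mean(a, c))  # ratio of means, equal to b/a
```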

Isn’t there a confirmation bias here? Lisp and Fortran and Forth are still with us, their approximate contemporaries APL and COBOL and PL/I not. The “Lindy effect” hypothesis could be (but hasn’t been) tested against real populations, essentially doing the equivalent of actuarial demographics for computer languages, for musicians’ legacies, etc.

Also, even if musician legacy lifetimes were to obey a power law, it still doesn’t tell us what to expect about Beyoncé, but only about the statistics of a pretty large ensemble of Beyoncés (perish the thought!)

Bach was obscure until Mendelssohn revived his music 150 years ago. Does this mean his music has a lifetime of 150 even though it is 250 years old? That would be odd.

Human life expectancy follows a double exponential law.

Software system lifetime seems to have an exponential distribution (fig 3 of the link).

What process would generate power law lifetimes? If available resources were governed by a fractal process and different entities consumed different ‘size’ objects we would expect lifetime to follow a power law.

Some kind of variation on the game of life where memory was divided into chunks and activity stopped once all the chunks were used up?

Rick: The Lindy effect is a probability statement, not a law. If it were a law, nothing new would ever survive. Also, it only makes predictions about things that are currently alive and well. So you can’t apply it retroactively to things like PL/I.

As for Beyoncé versus a large ensemble of Beyoncés, that’s what probability means (at least the frequentist interpretation of probability). Or from a more Bayesian perspective, it’s a statement not about Beyoncé per se but our uncertain knowledge of her future career, informed by the careers of other artists.

Derek: The example of Bach’s career is interesting, and the Lindy effect is too simple to take such things into account.

I was thinking about Windows in light of the Lindy effect. When did Windows begin? Do you go back to DOS in 1981, or Windows 1.0 in 1985? I’d say there was a pretty sharp discontinuity in its history in 1993 with Windows NT and that current versions of Windows are descendants of NT, not really Windows 1.0 or DOS. Linux, on the other hand, seems like more of a direct descendant of the original Unix versions.

I don’t think your font is small enough.

Normally the font is medium-size. Sometimes when I get a lot of traffic, pages don’t render correctly. Try refreshing the page.

Might a better statement be:

“… So we might expect Beyoncé to fade into obscurity *no sooner than* a decade from now, the Beatles *no sooner than* four decades from now, and Beethoven *no sooner than* a couple centuries from now… ”

Seems like the effect suggests that things will last at least as long as they have, not that they will only last as long as they have?

lens: It’s a mean value. You expect the value of a random variable to be somewhere near its mean, though of course it doesn’t have to be. And particularly for long-tailed distributions like Pareto, there’s a good chance a variable could be far from its mean.

Great post and some good insights into how power laws can actually be helpful. They can be complicated because measures of central tendency are less meaningful (i.e., it’s not unlikely for a value to be very different from the mean in a power law) than for thin-tailed distributions.

One thing I’ve noticed here is that, as with other math problems, the devil is in the definition. When talking about whether a programming language “will be around” in a given period of time, we should really ask what this means. I’m sure that in 25 years, you’ll be able to find a program written somewhere in Perl that is still in use and being maintained. Even if there’s only one such application, does that mean Perl is “in use”? Then there are situations like JavaScript, where a whole bunch of languages and tools essentially boil down to writing JS without actually writing JS. Does that mean it’s “in use”, even if almost no one actually writes JS code directly?

It’s the same with popular music. Look hard enough, and you’ll find someone who is a fan of any given song or artist. Are they still considered popular? What if there’s a one-off tribute festival for some obscure artist one year, with thousands of attendees?

I do agree the statement is a probabilistic statement and not a law. It should also be subject to clearly defined terms, but on the whole, it’s a good guide.

Very interesting topic and interesting blog in general. I have been reading your blog quietly for a while, but now I would like to ask a question, or rather ask for advice.

I am in the process of writing my master’s thesis on customer churn prediction. Can the Lindy effect be used to estimate the chance that a particular customer will switch to another service provider, based on the age of the account?

Or is this simply a matter of customer loyalty, completely unrelated to the Lindy effect?

Thanks,

Jamil

Jamil: One way to approach this would be to first determine what sort of distribution customer loyalty follows. You could try a few distributions and do a goodness of fit test. If you’re lucky, a well-known distribution will fit well enough. If not, you’ll need to do something else, say with a nonparametric model.

If you find a distribution that fits your data, then you can compute the distribution on continued loyalty conditional on loyalty to date. If your data follow a power law, you get a simple distribution, the Lindy effect. But in general you’ll get something else, similar to the Lindy effect if your distribution is similar to a power law.
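As a rough illustration of that workflow, here is a sketch that fits a Pareto exponent by maximum likelihood, measures the fit with a Kolmogorov-Smirnov statistic, and then computes the expected total lifetime conditional on survival to date. The data are synthetic and seeded, not real churn data, and the function names, the known starting point `a_true`, and the 4-year account age are all assumptions made for the example.

```python
import math
import random

def fit_pareto_exponent(samples, a):
    """Maximum-likelihood estimate of c when the starting point a is known."""
    return len(samples) / sum(math.log(x / a) for x in samples)

def ks_statistic(samples, a, c):
    """Largest gap between the empirical CDF and the fitted Pareto CDF."""
    xs = sorted(samples)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - (a / x) ** c
        gap = max(gap, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return gap

def expected_total_given_survival(b, c):
    """Expected total lifetime given survival to b (needs c > 1)."""
    if c <= 1:
        raise ValueError("mean is infinite unless c > 1")
    return b * c / (c - 1)

# Synthetic "account lifetimes" in years, NOT real churn data.
rng = random.Random(0)
a_true, c_true = 1.0, 2.5
lifetimes = [a_true * rng.paretovariate(c_true) for _ in range(10_000)]

c_hat = fit_pareto_exponent(lifetimes, a_true)
d = ks_statistic(lifetimes, a_true, c_hat)
print(round(c_hat, 2), round(d, 3), expected_total_given_survival(4.0, c_hat))
```

Real churn data would need more care than this. Customers who have not yet left are censored observations, and a KS statistic computed with a fitted parameter should not be compared against the standard critical values.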

In the last line, shouldn’t the conditional expectation be bc/(c-1)?

Alex: You’re right. I updated the post.

What’s the distribution of X | X>k? I ask after a thought experiment: if people stopped listening to Beethoven tomorrow, then the expected value of his music’s lifespan that you calculated today (200+200 years) would be very off. In fact, the expectation of X | X>k is the most “wrong” at time k when X turns out to be k+1 — in other words, artists, languages, nations, etc., die just after you expect them to survive for another few years/decades/millennia.

So I wonder if, while the expected value of X | X>k is 2*k, the full density is flat-ish between k+1 and 2*k? Or at least interesting enough to shed some light on why the expectation is the worst predictor just before the extinction. Thanks! I’ll try to make Wolfram Alpha show me the answer in the meantime.

Ah, I reread your post and you answered my question in words. With c=2, the random variable X | X>10, say, has density 200/x^3 and mean 20, as confirmed by http://www.wolframalpha.com/input/?i=ParetoDistribution%5B10%2C+2%5D. This tells me that although Beyoncé’s music has been popular for 10 years, the probability of it “dying” within the next couple of years is 30%, and 55% within five years! So the Lindy effect, to me, isn’t so much about the longevity of a great institution (like Beyoncé or Beethoven) but the precariousness of survival.

But it’s also about the tenacity of the survivors. The probability that people are enjoying Beyoncé in 2113, in a hundred years, is 1% — not too bad odds.

(It becomes even more entertaining to think about when one learns that it’s not just social survival: daily rainfall, city populations, and personal wealth are apparently Pareto-distributed too. If it’s rained 10 inches already today, there’s a 1% probability that it’ll have rained 100 inches by midnight!)
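The figures in the comment above can be reproduced from the conditional survival function P(*X* > *x* | *X* > *b*) = (*b*/*x*)^c. This is a small sketch using the comment's own values, *c* = 2 and a current age of 10 years; the function names are mine.

```python
def prob_ends_within(years, age, c=2.0):
    """P(X <= age + years | X > age) for a Pareto lifetime with exponent c."""
    return 1.0 - (age / (age + years)) ** c

def prob_survives_to(total, age, c=2.0):
    """P(X > total | X > age) for a Pareto lifetime with exponent c."""
    return (age / total) ** c

print(round(prob_ends_within(2, 10), 3))    # 0.306
print(round(prob_ends_within(5, 10), 3))    # 0.556
print(round(prob_survives_to(100, 10), 4))  # 0.01
```

The 1% figure corresponds to a total lifetime of 100 years. Surviving another 100 years, to a total of 110, comes out slightly lower, at about 0.8%.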

I don’t think the model breaks down for living things, but it depends on whether the members are all of one type. Comparing a human accountant to a baby, you expect the baby to outlive the accountant. But compare organisms of arbitrary species: given two eukaryotes of unknown species, one 30 years old and one 30 days old, you would do better to guess that the older one will live another 30 years. So this very much depends on further factors? We are not stating how we are sampling technologies.