Civic duty on StackOverflow

On StackOverflow, users gain reputation points when other users vote up their questions or answers. Voting is considered a civic duty. Voting doesn’t increase your own reputation. The only direct reward for voting is the “Civic Duty” badge for voting 300 times. But voting makes the site work well. Good questions and answers generally rise to the top.

How civic-minded are StackOverflow users? Where do the votes come from? Are people who receive votes more or less likely to give them? That is, do those who have received high reputation scores through other users’ votes also give away reputation points in the form of votes? Jeff Atwood mailed me some data the other day so I could answer these questions.

Sixty percent of StackOverflow users haven’t cast a single vote, but that doesn’t tell the whole story. The site is growing rapidly, so there are always a large number of users who haven’t been on the site long enough to vote much or gain much reputation. Also, there are a large number of users who registered some time ago but hardly participate on the site.

When you compare reputation scores and votes, things get more interesting. For starters, users who are somewhat invested in the site, as indicated by reputation score > 100, have voted 91 times on average. That still doesn’t tell the full story because it averages over a huge range of reputation scores. Here’s the more interesting story: The number of votes users cast is proportional to their reputation.

The graph above shows the average number of answer votes as a function of reputation. I divided reputation into blocks of 100 (i.e. 0 – 99, 100 – 199, etc.) and averaged the number of times users in each range voted up an answer. There are two reasons I only considered answer votes: there are far more answer votes than question votes, and question votes follow a similar pattern to answer votes.
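
The binning described above could be sketched in Python roughly as follows. The input layout, a list of (reputation, answer votes) pairs, the function name, and the sample numbers are all assumptions for illustration; the format of the actual data is not described in this post.

```python
from collections import defaultdict


def average_votes_by_bucket(users, bucket_size=100):
    """Average answer-vote count per reputation bucket.

    `users` is an iterable of (reputation, answer_votes) pairs
    (an assumed layout). Returns {bucket_start: mean_votes}, so the
    key 100 covers reputations 100 through 199.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for reputation, votes in users:
        # Integer division maps each reputation to the start of its bucket.
        start = (reputation // bucket_size) * bucket_size
        totals[start] += votes
        counts[start] += 1
    return {start: totals[start] / counts[start] for start in sorted(totals)}


# Made-up sample data, for illustration only.
sample = [(15, 0), (80, 2), (120, 10), (199, 20), (250, 30)]
averages = average_votes_by_bucket(sample)
```

Buckets with only a handful of users give noisy averages, which is worth keeping in mind when reading the right end of a plot built this way.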

The graph starts to feather out on the right end because there are fewer users in the higher reputation ranges to average over, so there is more random variation. The number of users at each reputation level drops off rapidly according to a power law. Although 99.4% of users have reputation less than 5000, the largest reputation score was 51,313 on the day Jeff collected the data. Here’s a graph from my earlier post, StackOverflow reputation statistics, that shows how quickly the number of people in each reputation range drops.

The graph above was based on data collected at the end of February this year, but the data discussed in this post was collected in April. As you look at higher reputation scores, the curve continues to drop off quickly. Since reputations follow a power law, the decrease is linear on a log-log scale.
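
To see why a power law plots as a straight line, suppose the number of users N(r) with reputation r has the form below, with a constant C and an exponent α > 0. These symbols are introduced here for illustration; the post does not give fitted values. Taking logarithms of both sides gives:

```latex
N(r) = C\, r^{-\alpha}
\quad\Longrightarrow\quad
\log N(r) = \log C - \alpha \log r
```

So log N is a linear function of log r with slope −α, which is why the curve looks straight when both axes are logarithmic.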

Even though users with the highest reputation scores vote the most, most votes come from users with lower reputation scores. That’s just because the large majority of users have lower reputation scores. Users with reputation < 1400 account for a little over half the answer votes cast. They also account for over 96% of all users. If you turn this around, it says that nearly half the votes come from the top 4% of users in reputation. This explains in part why the best answers usually rise to the top: the most knowledgeable users are active voters, assuming reputation and knowledge are correlated.
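
The "turn this around" step can be checked with a short calculation. The sketch below uses the same assumed (reputation, answer votes) layout, and the sample numbers are made up to mimic the proportions described above, not taken from the real data.

```python
def shares_below(users, threshold):
    """Return (user_share, vote_share) for users with reputation < threshold.

    `users` is a non-empty iterable of (reputation, answer_votes) pairs
    with at least one vote in total (an assumed layout).
    """
    users = list(users)
    total_votes = sum(votes for _, votes in users)
    below = [(rep, votes) for rep, votes in users if rep < threshold]
    user_share = len(below) / len(users)
    vote_share = sum(votes for _, votes in below) / total_votes
    return user_share, vote_share


# Made-up data: 96 low-reputation users casting 5 votes each and
# 4 high-reputation users casting 110 votes each.
sample = [(100, 5)] * 96 + [(5000, 110)] * 4
user_share, vote_share = shares_below(sample, 1400)

# The complement gives the share of votes cast by the top users.
top_vote_share = 1 - vote_share
```

With these invented numbers, 96% of users cast a little over half the votes, so the remaining 4% cast a little under half, even though each of them votes far more often.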

(The situation is analogous to that of income taxes. The very wealthy pay the most taxes per person, but the bulk of tax revenue comes from those who are not so wealthy. Even so, the percentage of total tax revenue from the top earners is surprisingly high. According to this site, the top 1% of taxpayers were responsible for about 40% of all income tax revenue in 2008. The analogy holds for good reasons. Wealth, like StackOverflow reputation, follows a power law distribution. And taxes increase roughly linearly with wealth, the same way StackOverflow votes increase with reputation.)

In short, it looks like StackOverflow users are civic-minded. Those who receive the most votes also give the most votes. And users at the lower end of the reputation range cast most of the votes in total even though they cast fewer votes per person.

Related post: StackOverflow reputation statistics

7 thoughts on “Civic duty on StackOverflow”

  1. Isn’t there an alternative explanation that involves a confounding variable? What if both votes and reputation are simply proportional to the amount of time spent on the site?

  2. Hadley, I imagine you’re right, that votes and reputation are highly correlated with time spent on the site, but I didn’t have that data.

  3. I had emailed you after your last post on this topic, saying “I suspect that those Long Tail users without much reputation of their own are actively voting and generating the vast majority of the reputation for the rest of us…”

    I was sufficiently vague, so you can’t really tell how wrong I was. I’m very surprised that only 4% of the users cast half the votes.

  4. Great stuff, John!

    I was wondering about some of these things myself recently. In particular, what are the statistics and patterns behind down votes? What are the ratios of up votes to down votes across the user spectrum? Etc…

    If you have the time, maybe it would make for another interesting blog post?

    Stu
