Twitter follower distribution

A conversation this morning prompted the question of how many Twitter accounts have between 10,000 and 20,000 followers. I hadn’t thought about the distribution numbers of followers in a while and was curious to revisit the topic.

Apparently this question was more popular five years ago. When I did a few searches on the topic, all the results I got back were five or more years old. Also, data about the Twitter network was more available then that it is now.

This post will come up with an estimate based on almost no data, approaching this like a Fermi problem.

Model

I’m going to assume that the number of followers for the nth most followed account is

f = c n

for some constants c and α. A lot of networks approximately follow some distribution like this, and often α is somewhere between 1 and 3.

Two data points

I’ve got two unknowns, so I need two data points. Wikipedia has a list of the 50 most followed Twitter accounts here, and the 50th account has 37.7 million followers. (I chose the 50th account on the list rather than a higher one because power laws typically fit better after the first few data points.)

I believe that a few years ago the median number of Twitter followers was 0: more than half of Twitter accounts had no followers. Let’s assume the median number of followers is 1. But median out of what? I think I read there are about 350 million total Twitter accounts, and about 200 million active accounts. So should we base our median out of 350 accounts or 200 accounts? We’ll split the difference and assume the median account is the 137,500,000th most popular account.

Solve for parameters

So now we have two equations:

c 50 = 37,700,000

c 137500000 = 1

and taking logs gives us two linear equations in two unknowns. We solve this to find α = 1.1 and c = 2.9 × 109. The estimate of α is about the size we expected, so that’s a good sign.

Take this with a grain of salt. We’ve guessed a very simple model and fit it with just two data points, one of which we based on an educated guess.

Final estimate

We assumed f = c n and we can solve this for n. If an account has n followers, we’d estimate its rank n as

n = (c / f)1/α.

So we’d estimate the number of accounts with between 10,000 and 20,000 followers to be

(c / 10000)1/α – (c /20000)1/α.

which is about 40,000. I expect this final estimate is not bad, say within an order of magnitude, despite all the crude assumptions made along the way.

New Twitter account: ElementFact

I started a new Twitter account this morning: @ElementFact. I’m thinking the account will post things like scientific facts about each element but also some history around how the element was discovered and named and other lore associated with the element.

@ElementFact icon

We’ll see how this goes. I’ve started many Twitter accounts over the years. Some have taken off and some have not.

Six months ago I started @TensorFact as an experiment to see what would happen with a narrowly focused account. It did moderately well, but I ran out of ideas for content. This will be the last week for that account.

You can see a list of my current Twitter accounts here.

Alt tags on tweet images

I learned this morning via a comment that Twitter supports alt text descriptions for images. I didn’t think that it did, and said that it didn’t, but someone kindly corrected me.

When I post equations as images on this site, I always include the LaTeX source code in an alt tag. That way someone using a screen reader can determine the content of the equation. It also helps me if I need to go back and change an equation. I’d like to do the same on Twitter.

Unfortunately, it seems support for this feature is inconsistent. Maybe it’s new. The software I use to manage my Twitter accounts apparently doesn’t offer a way to add alt text.

When I use Twitter via its website, I am able to write alt text but not able to read it. Maybe you have to have accessibility features turned on, which would be unfortunate. People who do not use screen readers occasionally benefit from being able to read photo descriptions. I could imagine, for example, that someone might be curious to see the LaTeX code I used to create an equation image.

I tried a couple experiments, one on my personal account and one on my AlgebraFact account. On the latter, I posted an image of the quadratic formula with the text description

    x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}

When I look at the tweet while logged into my AlgebraFact account, I see a little black box in the lower left corner of the image with “ALT” in white letters.

quadratic formula with ALT box in lower left

I do not see the same box on an image I posted from my personal account. When I log into my personal account, I no longer see the ALT box on the AlgebraFact tweet but I see one on an image I posted.

Question and request

It appears that with default settings, users cannot see image descriptions. What do you have to do to see the descriptions?

I intend to always put LaTeX source in the alt tag of equation images on this site. If you run across an equation without alt text, please let me know.

Removing Unicode formatting

Several people responded to my previous post asserting that screen readers would not be able to read text formatted via Unicode variants. Maybe some screen readers can’t handle this, but there’s no reason they couldn’t.

Before I go any further, I’d like to repeat my disclaimer from the previous post:

It’s a dirty hack, and I’d recommend not overdoing it. But it could come in handy occasionally. On the other hand, some people may not see what you intend them to see.

This formatting is gimmicky and there are reasons to only use it sparingly or not at all. But I don’t see why screen readers need to be stumped by it.

In the example below, I format the text “The quick brown fox” by running it through unifont as in the previous post.

If we pipe the output through unidecode then we mostly recover the original text. (I wrote about unidecode here.)

    $ unifont The quick brown fox | unidecode 

            Double-Struck: The quick brown fox
                Monospace: The quick brown fox
               Sans-Serif: The quick brown fox
        Sans-Serif Italic: The quick brown fox
          Sans-Serif Bold: The quick brown fox
   Sans-Serif Bold Italic: The quick brown fox
                   Script: T he quick brown fox
                   Italic: The quick brown fox
                     Bold: The quick brown fox
              Bold Italic: The quick brown fox
                  Fraktur: T he quick brown fox
             Bold Fraktur: T he quick brown fox

The only problem is that sometimes there’s an extra space after capital letters. I don’t know whether this is a quirk of unifont or unidecode.

This isn’t perfect, but it’s a quick proof of concept that suggests this shouldn’t be a hard thing for a screen reader to do.

Maybe you don’t want to normalize Unicode characters this way all the time, but you could have some configuration option to only do this for Twitter, or to only do it for characters outside a certain character range.

How to format text in Twitter

Twitter does not directly provide support for formatting text in bold, italic, etc. But it does support Unicode characters [1], and so a hack to get around the formatting limitation is to replace letters with Unicode variants.

For example, you could tweet

How to include bold or italic text in a tweet.

I cheated in the line above, using bold and italic formatting rather than Unicode characters because some readers might not be able to read it.

Here’s a screenshot of the actual Unicode text in Emacs. You can see the text in the footnotes [2].

This is plain text. I have asked for the details on the ‘b’ in bold, and the bottom windows shows that it is not the common U+0062 for ‘b’ down in the ASCII range, but U+1D5EF up in the Supplementary Multilingual Plane. Similarly, the i in italic above is not U+0069 but U+1D456.

Here’s how the text appears in Twitter:

It’s a dirty hack, and I’d recommend not overdoing it. But it could come in handy occasionally. On the other hand, some people may not see what you intend them to see. Here’s a portion of a screenshot from an Android device:

How to include XXXX or XXXXXX test

As a very rough rule of thumb, characters with smaller Unicode values are more likely to display correctly everywhere. Math symbols like ∞ (U+221E) work everywhere as far as I know. I wouldn’t depend on any Unicode character above 0xFFFF.

Update: Several people have said this formatting poses a problem for speech readers. The next post explains why it shouldn’t. (Maybe it does cause a problem, but it wouldn’t have to.)

How to produce Unicode formatting

I produced the Unicode text above using the programs unifont and unisupers from the Perl module Unicode::Tussle. See this post for how to install the module. Here’s a screenshot of using these utilities from the command line.

To use unifont, type the text you’d like to format after the command. It then shows the text formatted several ways using Unicode characters. I typed “bold” and copied the bold version of the word. The text could be anything; it’s a coincidence that I gave it text that was also a format name. For example, I created the double-struck R and C above with the command

    unifont R C

The unisupers command does not take an argument but instead takes its input from standard input. So I hit return after the command name and then typed ‘n’ to get the superscript n.

Related posts

[1] Twitter supports Unicode characters, but there’s a question of whether readers will have fonts installed to display the characters. I wrote eight years ago about some symbols users were and were not likely to see, but my impression is that the situation has improved quite a bit since then.

[2] Here’s the actual text of the tweet:

How to include  or  text in a tweet.
Weierstrass function.
Im: ℂ -> ℝ
ℝⁿ -> ℝᵐ

(I pasted the text into my blogging software, but it looks like it is deleting the words “bold” and “italic.”)

Data privacy Twitter account

My newest Twitter account is Data Privacy (@data_tip). There I post tweets about ways to protect your privacy, statistical disclosure limitation, etc.

I had a clever idea for the icon, or so I thought. I started with the default Twitter icon, a sort of stylized anonymous person, and colored it with the same blue and white theme as the rest of my Twitter accounts. I think it looked so much like the default icon that most people didn’t register that it had been customized. It looked like an unpopular account, unlikely to post much content.

Now I’ve changed to the new icon below, and the number of followers is increasing.
data tip icon

Related pages

My Twitter graveyard

I ran into The Google Cemetery the other day, a site that lists Google products that have come and gone. Google receives a lot of criticism when they discontinue a product, which is odd for a couple reasons. First, the products are free, so no one is entitled to them. Second, it’s great for a company to try things that might not succeed; the alternative is to ossify and die.

In that spirit, I thought I’d celebrate some of my Twitter accounts that have come and gone.

Dormant accounts

My first daily tip Twitter account was SansMouse, an account that posted one keyboard shortcut per day. I later renamed the account ShortcutKeyTip. I also had an account PerlRegex for Perl regular expressions, DailySymbol for symbols, and BasicStatistics for gardening. Just kidding about that last one; it was for basic statistics.

Donated accounts

I started RLangTip and then handed it over to people who were more enthusiastic about R. I also started MusicTheoryTip and gave it away.

Renamed accounts

The account GrokEM was for electricity and magnetism. I renamed it ScienceTip and broadened the content to be science more generally. I also had an account SedAwkTip for, you guessed it, sed and awk. It also broadened its content and became UnixToolTip.

I renamed StatFact to DataSciFact and got an immediate surge in followers. Statistics sells better when you change the label to data science, machine learning, or artificial intelligence.

Resurrected accounts

I stopped posting to my signal processing account DSP_fact for a while and then started it up again. I also let my FormalFact account lie fallow for a while, then renamed it LogicPractice and restarted it. It’s the only one of my accounts I don’t post to regularly.

Rejected accounts

I had several other ideas for accounts that I never started. They will probably never see the light of day. I have no intention of starting any new accounts at this time, but I might someday. I also might retire some of my existing accounts.

Graphic design

Here’s a collage of the icons for my accounts as eight years ago:

SansMouse icon RegexTip icon TeXtip icon ProbFact icon StatFact icon AlgebraFact icon TopologyFact icon AnalysisFact icon CompSciFact icon

The icons became more consistent over time, and now all my Twitter accounts have similar icons: blue dots with a symbol inside. I’m happy with the current design for now.

Twitter icons