The Truth About Twitter's "Active User" Stats

Analysis done week ending 7 Nov 2013


key points
  • Twitter is ad revenue driven. Thus, key metrics are user engagement and user growth
  • Twitter's S-1 filing says they have 215+ million active monthly users (or 230m), and 100m+ active daily users
  • From a sample of 1.2m accounts, the actual monthly figure is 111m (47m daily)
  • Twitter claims 40% of accounts are active but never tweet. Since this can't be proven (they don't publish login stats), this is an easy way to raise their figures ahead of the IPO
  • Since 2009, spam on Twitter has increased 2300% (spam is now 93% of all signups)
  • Twitter is 1/10th the monthly size of Facebook, but only 1/15th the size on a daily basis
  • Over all time, only 2% of signups become active daily users (5% monthly)
  • Twitter is ephemeral, Facebook much less so (eg wedding photo albums). Comparing Twitter Daily Actives to Facebook Monthly means it's actually 1/20th (901m/47m) the size
  • Facebook floated at $104b (massively overpriced; it took 15 months to return to that level) on 901m users. At 1/10th-1/20th the size, Twitter at $5.3b-$9.7b (~$8-$14/share) would be similarly overvalued
  • Thus, recent valuations ($70/share, p/e > 50, etc) are all sizzle, no steak
[All figures based on the analysis below. Analysis done 28 Oct - 7 Nov 2013]

questions
  • Is someone who signs into Twitter once a month really an active user?
  • Is Twitter going to continue to grow as strongly as Facebook has since their IPO?
  • Is Twitter going to be able to monetize as well (do they have as much user information)?


Twitter profiles downloaded so far: 1,182,444
Profiles analysed: 1,182,444 (100.00%)

raw stats
Total Twitter accounts ever created: 2,186,172,835
Number downloaded so far: 1,182,444
Deleted accounts (404s; mostly spam accounts, killed by Twitter): 726,479 (61%)
Non-deleted: 455,965 (39%)

... and of the accounts left on Twitter (non-deleted), with percentages given against all 1,182,444 analysed profiles:

Never really active*: 261,797 (22%)
Not active in the last 30 days*: 107,336 (9%)
Protected (private) accounts*: 27,123 (2%)
Absent but auto-tweeting (the autobots!)*: 3,129 (0.3%)
Monthly (public) active users: 56,580 (5%)
  plus 5% of the protected accounts: 57,877 (5%)

Daily (public) active users: 23,991 (2%)
  plus 2% of the protected accounts: 24,541 (2%)
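
The two "plus ... of the protected accounts" figures can be reproduced with a few lines of Python. This is only a sketch: it assumes (the page doesn't spell this out) that the share applied to protected accounts is the public active count divided by all analysed profiles, and that the top-up is rounded down. The helper name is made up for illustration.

```python
# Counts taken from the raw stats above.
ANALYSED = 1_182_444
PROTECTED = 27_123
MONTHLY_PUBLIC = 56_580
DAILY_PUBLIC = 23_991


def with_protected_share(public_active: int) -> int:
    """Add the same active share of protected accounts (hypothetical helper).

    Assumption: the share is public_active / ANALYSED, and the protected
    top-up is rounded down to a whole account.
    """
    protected_active = PROTECTED * public_active // ANALYSED
    return public_active + protected_active


monthly_incl_protected = with_protected_share(MONTHLY_PUBLIC)
daily_incl_protected = with_protected_share(DAILY_PUBLIC)
print(monthly_incl_protected, daily_incl_protected)
```

Under those assumptions the sketch gives 57,877 and 24,541, the same as the figures listed above.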



estimates
No. of deleted accounts ever*: 1,343,157,608
No. of non-deleted accounts*: 843,015,227
No. of non-deleted, non-empty accounts*: 358,989,353

Active in last month (incl. protected): 111,224,650 (111m)
Active in last day (incl. protected): 47,161,374 (47m)

For comparison, Facebook has 1.19 billion monthly actives and 727 million daily actives.
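
The size comparisons in the key points follow from simple division. A minimal sketch, using the estimates above and the Facebook figures quoted on this page (the variable and function names are just for illustration):

```python
# Estimated Twitter actives (from this analysis) and Facebook's figures.
TWITTER_MONTHLY = 111_224_650
TWITTER_DAILY = 47_161_374
FACEBOOK_MONTHLY = 1_190_000_000
FACEBOOK_DAILY = 727_000_000
FACEBOOK_MONTHLY_AT_IPO = 901_000_000  # figure quoted in the key points


def times_larger(facebook_users: int, twitter_users: int) -> float:
    """How many times larger the Facebook figure is than the Twitter one."""
    return facebook_users / twitter_users


monthly_ratio = times_larger(FACEBOOK_MONTHLY, TWITTER_MONTHLY)
daily_ratio = times_larger(FACEBOOK_DAILY, TWITTER_DAILY)
ipo_vs_daily_ratio = times_larger(FACEBOOK_MONTHLY_AT_IPO, TWITTER_DAILY)

print(f"monthly: 1/{monthly_ratio:.1f}")
print(f"daily: 1/{daily_ratio:.1f}")
print(f"FB monthly at IPO vs Twitter daily: 1/{ipo_vs_daily_ratio:.1f}")
```

The ratios come out at roughly 10.7, 15.4 and 19.1, which is where the "1/10th", "1/15th" and "1/20th" in the key points come from.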




year by year analysis
If the account is "never really active" - eg zero tweets etc, it's not considered alive (this is 57% of all non-deleted accounts)
year  accounts created  % still alive  est no still alive  % deleted (or spam)
2006           359,423             3%              12,053                  97%
2007        11,308,889             4%             440,815                  96%
2008         6,848,744            37%           2,552,869                  63%
2009        82,347,864            44%          36,385,213                  56%
2010       131,827,727            40%          53,262,077                  60%
2011       219,062,985            37%          82,115,787                  63%
2012       599,681,964            16%          98,245,778                  84%
2013     1,134,735,239             7%          84,358,231                  93%
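
As a quick consistency check on the table, the yearly "accounts created" values should add up to the total number of accounts ever created, and each "% still alive" should be the estimated survivors divided by accounts created. A small sketch (the row values are copied from the table; nothing here comes from Twitter directly):

```python
# (year, accounts created, estimated number still alive), copied from the table.
ROWS = [
    (2006, 359_423, 12_053),
    (2007, 11_308_889, 440_815),
    (2008, 6_848_744, 2_552_869),
    (2009, 82_347_864, 36_385_213),
    (2010, 131_827_727, 53_262_077),
    (2011, 219_062_985, 82_115_787),
    (2012, 599_681_964, 98_245_778),
    (2013, 1_134_735_239, 84_358_231),
]
TOTAL_ACCOUNTS_EVER = 2_186_172_835  # from the raw stats

created_total = sum(created for _, created, _ in ROWS)
alive_total = sum(alive for _, _, alive in ROWS)

# Percentage still alive per year, rounded to whole percent as in the table.
alive_percent = [round(100 * alive / created) for _, created, alive in ROWS]

print(created_total == TOTAL_ACCOUNTS_EVER)
print(alive_total)
print(alive_percent)
```

The yearly creation counts do sum to the 2,186,172,835 total, and the recomputed percentages agree with the "% still alive" column.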






notes
  • Calculating exact numbers is tricky because Twitter doesn't provide information about logins, or protected (private) accounts
  • Using logins to define active behaviour is flawed. If you have any Twitter apps on your phone, they will auto-check Twitter constantly, regardless of any actual interest
  • A VERY common behaviour for spam accounts is to spam, then immediately delete the tweets. The victim sees the tweet, but Twitter won't later identify and kill the spammers
  • never really active = fewer than 5 tweets and no user icon or no bio, or no tweets and few followers
  • % protected (of active) = protected / (older_than_30_days + newer_than_30_days + protected)
    In English: Protected are pulled to one side. As a percent, divide the count by ALL active (analysed) accounts
  • No of deleted accounts = maximum_id * (no_killed / no_analysed)
  • No of non-deleted accounts = maximum_id - no_of_deleted_accounts
  • Non-deleted, non-empty accounts = non-deleted - those considered "never really active"
    In English: whatever is left
  • Active in last 30 days = maximum_id * (newer_than_30 / (analysed - protected - %protected*no_killed))
    In English: those active recently, proportional to all non-protected downloaded accounts
    Note: this is a VERY high estimate, due to assumption 2 below
  • Autobots = anyone that sets up a program (or app) to automatically tweet for them. Some examples: blog post auto-tweeters (twitterfeed etc). Facebook can auto-tweet for you. Ditto Tumblr, Pinterest etc. These accounts look active, but they're not. The users could be dead for all we know. This is only a problem if this is ALL the account is doing.
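
The first three estimates can be sketched in Python by scaling sample counts up to the full ID range. Two things here are assumptions rather than anything stated in the notes: results are rounded down, and the non-empty figure is computed by scaling the sample's non-deleted, non-empty count directly (arithmetically the same idea as "non-deleted minus never really active"). The helper name is hypothetical.

```python
# Inputs taken from the raw stats.
MAXIMUM_ID = 2_186_172_835   # total Twitter accounts ever created
NO_ANALYSED = 1_182_444      # profiles downloaded and analysed
NO_KILLED = 726_479          # deleted accounts (404s) in the sample
NEVER_ACTIVE = 261_797       # "never really active" accounts in the sample


def scale_to_all_accounts(sample_count: int) -> int:
    """Scale a sample count up to all account IDs (hypothetical helper).

    Uses integer arithmetic and rounds down (an assumption).
    """
    return MAXIMUM_ID * sample_count // NO_ANALYSED


deleted = scale_to_all_accounts(NO_KILLED)
non_deleted = MAXIMUM_ID - deleted
non_empty = scale_to_all_accounts(NO_ANALYSED - NO_KILLED - NEVER_ACTIVE)

print(deleted, non_deleted, non_empty)
```

With those assumptions the sketch returns 1,343,157,608, 843,015,227 and 358,989,353, matching the estimates section.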



methodology
  • We select a Twitter profile at random and download it. Twitter IDs are assigned sequentially, so we pick a random number from 1 to the largest current Twitter ID (ie, the most recent account created).
  • We crudely weighted the random selection on a year by year basis to nudge selection closer to reflecting annual account creation numbers
  • That tells us which accounts have been deleted (Twitter returns a 404 error).
  • It also tells us which accounts are protected (ie, private, so we can't see when they tweet).
  • We can then analyse the public accounts and see if they have tweeted in the last month, and in the last day.
  • We can also see if they have simply left a robot auto-tweeting from their account. Ie, they look active but they're not.
    An example of an auto-tweeter would be connecting Facebook so every time they post on Facebook it automatically comes through to Twitter. This is NOT an active user. Ie, they will not be seeing Twitter's adverts, or contributing to their bottom line.
  • We then add in the same percentage of protected accounts as the results with the public accounts (see assumption 1, below) to reach our final figure
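
The sampling and bucketing steps above could look roughly like the sketch below. It is an illustration only, not the code used for this analysis: it makes no real Twitter requests, the function names and inputs (an HTTP status, a protected flag, days since the last tweet) are made up, and it leaves out the year-by-year weighting, the autobot check and the "never really active" rule.

```python
import random
from typing import List, Optional


def pick_random_ids(max_id: int, how_many: int, seed: Optional[int] = None) -> List[int]:
    """Pick account IDs uniformly at random from 1 to max_id inclusive."""
    rng = random.Random(seed)
    return [rng.randint(1, max_id) for _ in range(how_many)]


def classify(status_code: int, protected: bool,
             days_since_last_tweet: Optional[float]) -> str:
    """Put one downloaded profile into a bucket (hypothetical inputs)."""
    if status_code == 404:
        return "deleted"            # Twitter returned a 404
    if protected:
        return "protected"          # private, so tweet times are hidden
    if days_since_last_tweet is None:
        return "inactive"           # no visible tweets at all
    if days_since_last_tweet <= 1:
        return "daily_active"
    if days_since_last_tweet <= 30:
        return "monthly_active"
    return "inactive"


ids = pick_random_ids(max_id=2_186_172_835, how_many=5, seed=2013)
print(ids)
print(classify(404, False, None))
print(classify(200, False, 12.0))
```

In this sketch a daily active user is reported only as "daily_active"; to count monthly actives the way the raw stats do, daily actives would need to be included in the monthly total as well.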



assumptions
  1. That protected accounts behave the same as normal accounts
  2. That as many protected accounts are deleted as normal accounts - this is VERY unlikely. 90+% of deleted accounts are non-protected (ie public), spam accounts
  3. That the random selection (from zero to maximum twitter id) is truly (or significantly enough for our purposes) random



who am I?
I'm Si Dawson. I ran Twit Cleaner from 2009 until 2013.

Twit Cleaner was a behavioural categorisation engine - it identified different types of account behaviour and simplified unfollowing. Put another way, it was an asshole-detector (in all their delightful variations) for Twitter.

This means I've spent a LOT of time getting data out of Twitter and analysing the hell out of it. Eg, over that time I found and reported half a million spam accounts to Twitter - but really I did that just for the fun of it.

(I've also spent 30 years in tech, 15 or so in finance, with time as a quantitative analyst, and done a ton of artificial intelligence development.)

Frankly, I love Twitter. I've met so many incredible people through it.

However, in the last few years Twitter has focused on monetisation at the expense of their users and their external developers (dozens of others have been affected, not just myself). Which means ultimately, Twitter the community has suffered enormously. Prioritising money over users/product/experience is bad enough, but developers are what helped make Twitter great.

This kind of attitude is damaging to the long term viability of the company, but makes perfect sense if you're just trying to ramp up the numbers ahead of an IPO.

Personally, I'm done with developing for Twitter, but what I'm primarily concerned about is less-informed investors getting fleeced.

If you did buy in after the IPO? Good luck. I hope you get out before it costs you too much.

Any questions? twitter@eggsbacon.co.nz

Cheers, Si