Yes Cloud!

Infrastructure Implications of Tweeting

Posted on: March 19, 2009

The Twitter phenomenon, or micro-blogging, has been quite intriguing. Though not yet a regular tweeter myself, I am told that the “aha” moment will come when I start using it actively. So I started tweeting this week on Twitter and Facebook.

As I was warming up, a new tweet popped up in my mind: what are the infrastructure implications of tweeting, in terms of HTTP connection rate, rate of new storage required, and so on? I quickly looked up Twitter stats on tweetstats.com: nearly 2 million tweets per day. What if most of the world starts tweeting from smartphones (very much like SMS today)? To get a better sense of the infrastructure needed for this human urge to tweet, I did a quick back-of-the-envelope calculation.

Assumptions:
Average Tweet Size: 100 bytes
# of Tweets: 10 per tweeter per day
# of Tweeters: 1 billion worldwide (think big!)

Infrastructure Requirements:
Tweet Rate: 10 billion tweets per day
Tweet Storage: 100 Gigabytes per day (1 Terabyte per day raw, reduced by 10:1 compression)

Each tweet is essentially an HTTP transaction (request and response). The tweet rate of 10B/day translates to ~115K HTTP transactions/sec for tweets uniformly distributed throughout the day. Assuming that the compute infrastructure (aggregate of web, application, database servers) can process 1000 transactions/sec/server, about 115 servers are needed. If a peak to average ratio of 3:1 is assumed, then about 350 servers are needed.
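For anyone who wants to replay the arithmetic above, here is a minimal Python sketch. The function and variable names are mine, purely for illustration; the inputs are the assumptions stated in this post (10 billion tweets per day, 1000 transactions/sec/server, 3:1 peak-to-average ratio), not measured figures.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def servers_needed(tweets_per_day, tx_per_sec_per_server, peak_to_avg=1.0):
    """Return (average transactions/sec, servers required at the given peak ratio).

    Assumes one HTTP transaction per tweet and tweets spread uniformly
    across the day, as in the post.
    """
    avg_tps = tweets_per_day / SECONDS_PER_DAY
    peak_tps = avg_tps * peak_to_avg
    # Round up: a fractional server still means one more machine.
    return avg_tps, math.ceil(peak_tps / tx_per_sec_per_server)


tweets_per_day = 10 * 1_000_000_000  # 10 tweets/day x 1 billion tweeters

avg_tps, avg_servers = servers_needed(tweets_per_day, 1000)
_, peak_servers = servers_needed(tweets_per_day, 1000, peak_to_avg=3.0)

print(f"average rate: {avg_tps:,.0f} transactions/sec")
print(f"servers at average load: {avg_servers}")
print(f"servers at 3:1 peak: {peak_servers}")
```

Rounding up gives 116 servers at average load and 348 at peak, in line with the "about 115" and "about 350" quoted above. Changing any one assumption (say, 100 transactions/sec/server) scales the server count proportionally.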

Storage needs also appear quite manageable: 100GB/day means ~37TB/year, which is no sweat in the petabyte world we live in today.
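The storage estimate can be checked the same way. This is only a sketch under the post's assumptions (100-byte tweets, 10:1 compression); the function name is illustrative, and I am assuming decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and ignoring any per-tweet metadata or replication.

```python
def storage_per_day_bytes(tweeters, tweets_per_tweeter, tweet_bytes, compression_ratio):
    """Compressed bytes of new tweet text stored per day."""
    raw_bytes = tweeters * tweets_per_tweeter * tweet_bytes
    return raw_bytes / compression_ratio


daily_bytes = storage_per_day_bytes(
    tweeters=1_000_000_000,   # assumption from the post
    tweets_per_tweeter=10,    # assumption from the post
    tweet_bytes=100,          # assumption from the post
    compression_ratio=10,     # 10:1, assumption from the post
)

daily_gb = daily_bytes / 1e9            # decimal gigabytes
yearly_tb = daily_bytes * 365 / 1e12    # decimal terabytes

print(f"{daily_gb:.0f} GB/day, {yearly_tb:.1f} TB/year")
```

This yields 100 GB/day and 36.5 TB/year, which is the ~37TB/year quoted above.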

Net-net, setting up a tweeting service does not seem to need an onerous compute/storage infrastructure (even if people double or triple their daily tweets). Any techie tweeters out there who can validate/correct the above?

An interesting extension of this would be to estimate the capacity needed to handle all new thoughts of every human being on this planet!

PG.


7 Responses to "Infrastructure Implications of Tweeting"

[…] Original post by Prashant Gandhi […]

10 tweets per day might be on the low side if you take into account one-to-one tweets, which are more like SMS/chat. Eventually, if Twitter becomes one of the prime mediums of communication (partially replacing SMS, scraps in social networking, passing comments, short discussions, idea sharing, etc.), the number of tweets could grow exponentially.

Also, on the storage front, I guess Twitter stores metadata associated with each tweet, and how much depends on what Twitter is interested in capturing. If you account for geo-location information and the like, it would be much more per tweet. Looking at the way Twitter is trying to monetize its data, it might further process the tweets and generate more valuable information out of them, which might need more storage as well.

Ekanth,

Very good points. My analysis has been constrained by my own limited experience with Twitter and Facebook 🙂 Interestingly, there is room for a couple of orders of magnitude of growth on both the compute and storage fronts before the infrastructure implications become more acute…

PG.

Look at this

http://www.techcrunch.com/2009/03/23/the-efficient-cloud-all-of-salesforce-runs-on-only-1000-servers/

As you say, it will take a long time before we feel the pinch of infrastructure limitations. There might also be scope for optimizing the efficiency of data storage and retrieval, which might buy even more time.

[…] informative post on YES Cloud outlines the infrastructure implications of tweeting using Twitter. Using […]

Interesting article. Keep it up.
