
Re: Blogging needs a whole new BAG...




On 18 May 2004, at 5:57 pm, Bob Wyman wrote:


Push vs. Pull: Currently, blogging is all about pull and polling. However, it is clear to many that the current approach simply won't scale. We can't have millions of aggregators polling and pulling from millions of Atom files without introducing massive resource consumption issues. Blogging works today in large part because it simply isn't a very popular or well known system. Unfortunately, it appears that the resource (bandwidth, etc.) that is required to support blogging increases at a rate which is greater than linear in relationship to the number of blogs and bloggers...

Can I rebut this theory? I know a few bloggers with the wrong hosting package have been burned by syndication bandwidth consumption, but when you do the maths for a large organization, the numbers actually look fine for scalability. Starting at the extreme end: Google has 53.3 million unique visitors according to Wired. Suppose every one of them subscribes to a 20 KB feed and polls it once an hour, that they don't bother with conditional GETs or gzip, and that they're all online at the same time. That works out as 2.25 Gbps average bandwidth (not including TCP overhead etc.). Gigabit internet connections are not that rare, though I can't find any online pricing. But I'm sure an organization with 53.3 million subscribed customers can earn enough from them to buy the capacity it needs.
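
As a rough check of that arithmetic, here is a minimal Python sketch. It assumes binary units (1 KB = 1024 bytes, 1 Gbps = 2^30 bits/s), which is what reproduces a figure of roughly 2.25 Gbps; with decimal units the result comes out nearer 2.37 Gbps. The function and variable names are purely illustrative.

```python
def average_poll_bandwidth_bps(subscribers, feed_bytes, poll_interval_s):
    """Average bits/s if every subscriber fetches the full feed once per interval.

    Ignores TCP/HTTP overhead, conditional GETs and gzip, as in the estimate above.
    """
    return subscribers * feed_bytes * 8 / poll_interval_s


SUBSCRIBERS = 53_300_000   # unique visitors quoted from Wired
POLL_INTERVAL_S = 3600     # one poll per hour

# Assumption: binary units (20 KB = 20 * 1024 bytes, 1 Gbps = 2**30 bits/s).
binary_gbps = average_poll_bandwidth_bps(
    SUBSCRIBERS, 20 * 1024, POLL_INTERVAL_S) / 2**30

# Assumption: decimal units (20 KB = 20,000 bytes, 1 Gbps = 10**9 bits/s).
decimal_gbps = average_poll_bandwidth_bps(
    SUBSCRIBERS, 20_000, POLL_INTERVAL_S) / 10**9

print(f"binary units:  {binary_gbps:.2f} Gbps")
print(f"decimal units: {decimal_gbps:.2f} Gbps")
```

Either way the answer is an average of a little over 2 Gbps; peak load would be higher if polls cluster rather than spread evenly over the hour.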


Similarly, a medium-sized organization with a dedicated, unmetered 2 Mbps line for syndication users (such lines are fairly cheap) can theoretically support about 46,000 subscribers polling hourly. Seeing as you could count them all as loyal customers for being subscribed, that works out very well for return on investment.
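
The same calculation can be run in reverse to see how many hourly pollers a given line can carry. This is only a sketch under the same assumptions (full 20 KB fetch every hour, no protocol overhead, polls spread evenly); the helper name is hypothetical.

```python
def max_hourly_subscribers(line_bps, feed_bytes, poll_interval_s=3600):
    """Number of subscribers a line can serve if each fetches the
    full feed once per interval and the load is spread evenly."""
    bits_per_interval = line_bps * poll_interval_s
    bits_per_fetch = feed_bytes * 8
    return bits_per_interval // bits_per_fetch


# Assumption: binary units (2 Mbps = 2 * 2**20 bits/s, 20 KB = 20 * 1024 bytes).
binary_capacity = max_hourly_subscribers(2 * 2**20, 20 * 1024)

# Assumption: decimal units (2 Mbps = 2,000,000 bits/s, 20 KB = 20,000 bytes).
decimal_capacity = max_hourly_subscribers(2_000_000, 20_000)

print(binary_capacity)   # 46080
print(decimal_capacity)  # 45000
```

The binary-unit result matches the 46,000 figure; the decimal-unit result is close enough that the conclusion does not depend on the choice.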

It looks like the numbers might even work better the larger the scale - it's only on cheap hosting packages that they don't. Do we have any examples of a large website offering syndication and being unhappy about the bandwidth cost? I also haven't seen any explanation of how push would reduce anyone's bandwidth consumption (as far as I can tell, it's the same numbers initiated from the other end, plus far more overhead from managing who to push to).

Graham