RE: Blogging needs a whole new BAG...



I am so swamped with stuff that I can't respond in detail, but there are one
or two things I do want to inject into the discussion.

>> Currently, blogging is all about pull and 
>> polling. However, it is clear to many that the current approach simply 
>> won't scale.
>
> I think there's a good chance it'll scale just fine.  

Just as there was a chance that 'the web' would have scaled just fine with
HTTP/1.0, without conditional GET or persistent connections. Just as there
was a chance that CORBA, simply by using IIOP, would have run fine at
Internet scale.

There is a very big difference between 'the web' and the feed/aggregator
systems. The first is designed for and driven by human interaction.
Feed/aggregator systems take the human out of the loop: a 'reload' happens
even if no human is consuming the information at that point in time.
Secondly, the use of aggregators has given the user the feel of 'real-time'
access to new information. Why does slashdot ban you if you try to poll more
than once every 5 minutes? Because there are a lot of people who do indeed
try to poll every minute. We have seen 'patched' aggregators that bypass the
restrictions on the minimum inter-poll period.
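
To make the polling problem concrete, here is a minimal sketch of what a
feed server can do with pull alone: answer unchanged polls with a conditional
GET style "304 Not Modified", and refuse clients that poll more often than a
minimum interval. Everything in it is an assumption for illustration (the
FeedServer class, the poll() method, the in-memory bookkeeping, the use of
status 429); it is not how slashdot or any particular site implements its
limits. Only the 5-minute figure comes from the discussion above.

```python
import hashlib
import time

MIN_POLL_INTERVAL = 300  # seconds; the 5-minute limit mentioned above


class FeedServer:
    """Hypothetical in-memory feed endpoint, for illustration only."""

    def __init__(self, body, clock=time.monotonic):
        self.body = body
        # Entity tag derived from the feed content, so it changes
        # whenever the feed changes.
        self.etag = '"%s"' % hashlib.sha1(body).hexdigest()
        self.clock = clock
        self.last_poll = {}  # client id -> time of last accepted poll

    def poll(self, client_id, if_none_match=None):
        """Return (status, body) for one poll by one client."""
        now = self.clock()
        last = self.last_poll.get(client_id)
        if last is not None and now - last < MIN_POLL_INTERVAL:
            # Polling too often: refuse without sending the feed.
            return 429, b""
        self.last_poll[client_id] = now
        if if_none_match == self.etag:
            # Feed unchanged since the client's copy: headers only.
            return 304, b""
        return 200, self.body
```

Note what the sketch does not fix: every poll is still a request the server
has to accept and answer, whether or not anything changed and whether or not
anyone is reading. The conditional GET saves the payload, not the poll.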

With large groups of new types of users coming online, it is not obvious (at
least not to me) that it will scale 'just fine'.

> I expect that at 
> some point, very popular feeds are going to have to put some sort of 
> specialist polling aggregator between them and their millions of 
> subscribers, just as big websites use Akamai or other caching/staging 
> engines.  Nothing in the design of RSS or Atom gets in the way.  Trying 
> to solve the problem in advance is premature optimization.

I agree. A lot of support architectures are possible, and we can have a long
discussion about the different solutions, or about what the real needs or
desires of users are. For example, one of the drivers for the increased load
on the infrastructure is the desire for real-time access to information, and
users are exploiting the feed/aggregator infrastructure for that. Whether
this can be solved by pull+caching, or direct push, or p2p, or collaborative
overlays, there is room for a lot of discussion and new ventures.

I am not advocating that you should hold up Atom/API work to re-evaluate the
architecture, nor that you re-architect the format and APIs to focus on this.
But it would be good if people working on the format could keep in mind that
there may be alternative delivery mechanisms for the feed data in the future.

And whether a BAG makes sense? I do think there is room for a group to look
at all the different pieces of the syndication architecture, and whether they
interoperate in a coherent manner, and whether all the lessons we have
learned from building large-scale systems are indeed applied consistently
throughout the architecture. Feed formats and publishing protocol could be
part of that.

I apologize for not elaborating more, nor presenting hard numbers (e.g. how
much of BoingBoing's bandwidth is probably wasted because of the current
approach to feed serving). I have to get back to work & deadlines.

--
Werner