Re: Bandwidth concerns
> I have a third reason not to give full feeds - visual design. I put a
> lot of effort into making my blog pretty. And some of the markup I use
> in my feed references a stylesheet that may not be available to the
> aggregator and, even if it were, may not work in the aggregator's context.
>
> I've thought about working on some sort of Atom extension that allows
> full visual markup to be communicated through to the aggregator. But
> it's complicated and, in the complete case, impossible.
Run away! Heh. The rocks below this cliff are littered with the bodies of
those that have tried wrangling with the nightmare of separating markup from
content.
> >Basically I suggest requiring clients to behave somewhat like nntp
> >clients, fetching only newer entries, with fallback to "standard" last
> >15 items if the server only has a static file.
>
> This sounds way too stateful and complex. It's a clever idea, but it
> would require a lot of innovation to get right. I don't think
> Pie/nEcho/Atom is the place to do that.
Sounds too difficult? Please, let's not get into generalizations intended to
stall momentum. We can go argue /that/ in the RSS lists if we want that.
The only complexity here lies in asking things that want full content to make
additional queries. Is it 'better' to use multiple documents all containing the
same stuff? Or is it better to use multiple documents, each with its own data,
linked via a manifest-like document instead?
People complain that a full content file is too big. They then wander off
looking for ways to do differential delivery. I'm arguing it's better to
go with a lightweight manifest document and keep the rich items in their own
external documents.
This way the only 'wasted' bandwidth lies in the consumption of the lightweight
manifest. Say the manifest contains 15 items, only 3 of which are new. Yes,
you're getting 12 items that you've already seen before. However, they're
something like 5 lines each (title, short desc, timestamp, url (as in
identifier)). If you already know about those 12 items there's no point in
making 12 requests to go get them again. You could do a HEAD query to find out
whether they're actually different from the last time you checked, but the
manifest would have already /told/ you this, so there'd be no need to check
again. You're then only faced with getting the 3 items that are new, delivered
up using gzip, timestamps and ETag of course.
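A minimal sketch of the client side of this, in Python. The ManifestEntry
fields, the `seen` mapping, and both function names are assumptions made up for
illustration; nothing here is part of any Pie/nEcho/Atom draft. It only shows
the selection step (which items are worth a request) and the headers for a
conditional, gzip-capable GET.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ManifestEntry:
    """One lightweight manifest item (field names are illustrative)."""
    url: str       # doubles as the item's identifier
    title: str
    summary: str
    updated: str   # timestamp exactly as published in the manifest


def entries_to_fetch(manifest: List[ManifestEntry],
                     seen: Dict[str, str]) -> List[ManifestEntry]:
    """Return entries that are new, or whose timestamp differs from the one
    recorded in `seen` (a mapping of url -> last known timestamp)."""
    return [e for e in manifest if seen.get(e.url) != e.updated]


def conditional_headers(etag: Optional[str] = None,
                        last_modified: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for a gzip-capable conditional GET.

    `etag` and `last_modified` are the ETag and Last-Modified values the
    server sent with the previously cached copy, if there was one.
    """
    headers = {"Accept-Encoding": "gzip"}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers
```

With a 15-item manifest and 12 entries already recorded in `seen`,
entries_to_fetch returns just the 3 new ones, so the client makes 3 item
requests instead of 15.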
The added benefit here is that the environment hosting these rich content items
can use local static resource caching, either by actually using static files or
via local proxying. There's no dynamic CPU load to worry about scaling.
Of course, there ARE places where dynamic processing is going to be
required (such as running commentaries, threads and the like).
Now, for folks that /have/ the resources available, the use of something like
RDF's data model makes it possible to do BOTH. Each reference in the manifest
could point either to a local resource within the document itself or to an
externally resolvable document indicated with a URL.
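To make the "do BOTH" case concrete, here is a rough Python sketch of resolving
one manifest reference. It uses a plain dict as a stand-in and is not RDF; the
"content" and "url" keys, the resolve_item name, and the caller-supplied `fetch`
callable are all assumptions for illustration.

```python
from typing import Callable, Dict, Optional


def resolve_item(ref: Dict[str, Optional[str]],
                 fetch: Callable[[str], str]) -> str:
    """Return the rich content for one manifest reference.

    If the manifest carries the content inline, use it directly; otherwise
    follow the reference's URL using the caller-supplied `fetch` function.
    """
    content = ref.get("content")
    if content is not None:
        return content          # local resource inside the manifest itself
    url = ref.get("url")
    if url is None:
        raise ValueError("reference has neither inline content nor a url")
    return fetch(url)           # externally resolvable document
```

A publisher with resources to spare can inline everything, while one serving
static files can leave only URLs in the manifest; the client code is the same
either way.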
-Bill Kearney