Re: Feeds by date
On Thu, May 13, 2004 at 06:42:21PM +0300, Janne Jalkanen wrote:
> How about using something like ETags or If-Modified-Since to support
> dynamic generation of feeds? I know dynamic generation is a big burden
> on many devices, and you really want to have a static file to serve, but
> in some situations it might come in handy.
There's some discussion of trying to solve this problem in
AggregatorBehaviorRules [2] (at the end). RFC 3229 (content deltas)
may be useful here; there was also an idea of extending HTTP [3], but
that wasn't very popular. In the end the broad consensus seemed to be
to recommend RFC 3229, and to work on a dedicated AggregatorApi
[1]. That stalled, in part because the people in the discussion ran
out of time, but also because at the time the base Atom API, which we
were planning to build on, wasn't very stable. I certainly recommend
reading those pages through.
[1] <http://www.intertwingly.net/wiki/pie/AggregatorApi>
[2] <http://www.intertwingly.net/wiki/pie/AggregatorBehaviorRules>
[3] <http://www.intertwingly.net/wiki/pie/PossibleHTTPExtensionForEfficientFeedTransfer>
> GET /atom.xml HTTP/1.1
> ETag: 2004-05-13T15:32:08Z
>
> The server would then be free to serve only content changed since the
> tag date, or just the 20 last items or so.
I think you mean If-None-Match rather than ETag, since ETag is a
response entity header, not a request header. If so, while it's not
actually forbidden by RFC 2616 (as far as I can tell) to vary the
entity returned based on If-None-Match, I doubt very much that doing
so will be smooth or transparent. In particular, you're almost
certainly harming cacheability, because an If-None-Match conditional
GET should be able to be served out of a caching proxy server; if you
change the semantics of If-None-Match to require a return to the
origin server, I think you'll have to mark all your responses as
uncacheable, which is almost certainly not going to be popular with
people running servers. We really want a solution which won't upset
the balance for servers, but which can still reduce the bandwidth
needed for clients.
(RFC 3229 is similarly not ideal for servers, because they have to go
around generating, and possibly storing, all those deltas.)
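To make the If-None-Match point concrete, here is a minimal sketch of
the usual conditional GET decision. The function name and the tag
values are made up for illustration, and it assumes strong entity tags
and a naive comma split of the header, which a real parser would not
rely on.

```python
def conditional_get(request_headers, cached_etag, cached_body):
    """Answer a GET from a stored entity (hypothetical helper).

    Returns a (status, headers, body) tuple.
    """
    # If-None-Match carries one or more quoted entity tags, or "*".
    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",") if tag.strip()]
    if "*" in candidates or cached_etag in candidates:
        # The client already holds this exact entity: send no body.
        return 304, {"ETag": cached_etag}, b""
    # Otherwise send the full entity along with its validator.
    return 200, {"ETag": cached_etag}, cached_body


# The tag from an earlier response comes back in If-None-Match.
status, headers, body = conditional_get(
    {"If-None-Match": '"v42"'}, '"v42"', b"<feed/>"
)
```

The decision depends only on the request and the stored tag, so anything
holding a copy of the entity, including a caching proxy, can make it.
Choosing a different body based on the tag's value takes that away.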
> In fact, since most aggregators hit a site every hour, it is
> conceivable that a smart server would keep a separate atom file with
> just the items from the last two hours, and serve that in case the
> date matches and there has been new content. In those rare cases,
> where someone asks the feed after two weeks of silence, it's okay to
> regenerate it for them.
But then you're not just transferring entries that are new to the
client, but all entries from a given time period. If you transfer
everything from 2-4pm for a request at 4pm, then a client that requests
the feed every hour will get the 2-3pm entries again, which is more
data than it needs.
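A rough sketch of that overlap, with made-up entries and hypothetical
helper names; it only assumes a fixed two-hour window file and a client
that last fetched exactly one hour earlier.

```python
from datetime import datetime, timedelta


def window_file(entries, now, window=timedelta(hours=2)):
    """Entries a static 'recent items' file would hold at time `now`."""
    return [e for e in entries if now - window < e[0] <= now]


def new_since(entries, last_fetch, now):
    """Entries the client has not yet seen."""
    return [e for e in entries if last_fetch < e[0] <= now]


# Illustrative data only: (published, title) pairs.
entries = [
    (datetime(2004, 5, 13, 14, 20), "entry A"),
    (datetime(2004, 5, 13, 15, 10), "entry B"),
    (datetime(2004, 5, 13, 15, 45), "entry C"),
]
now = datetime(2004, 5, 13, 16, 0)
last_fetch = now - timedelta(hours=1)  # hourly poller, last seen at 3pm

served = window_file(entries, now)            # what the server sends
wanted = new_since(entries, last_fetch, now)  # what the client lacks
redundant = [e for e in served if e not in wanted]
```

Here the window file carries all three entries, while the hourly client
only lacks B and C, so entry A is sent a second time.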
> But if the server does not support dynamic generation, no worries.
> It'll just use the ETag to see whether it should serve a 301
> NOT_MODIFIED or the entire feed...
304 is Not Modified; 301 is Moved Permanently. And you'll never get
304 by passing ETag, but I've mentioned that above.
I'm very much in favour of coming up with a solution to this problem,
but I'm pretty sure that it can't be solved nicely using raw
HTTP.
James
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james@xxxxxxxxxxxx uncertaintydivision.org