
timeless atom




hello everybody.

at http://dret.typepad.com/dretblog/2007/12/timeless-atom.html i have published the following text, and i would be really glad to get some opinions about this, in particular simply about the mere thought of using atom and atompub for things which are not inherently time-ordered.

thanks and kind regards,

erik wilde   tel:+1-510-6432253 - fax:+1-510-6425814
       dret@xxxxxxxxxxxx  -  http://dret.net/netdret
       UC Berkeley - School of Information (ISchool)

----------------------------------------------------------------------


Timeless Atom

Atom evolved out of RSS and as such basically is a well-designed version of RSS. AtomPub adds to that a well-defined protocol for interacting with a server that manages collections of (potentially updateable) Atom entries. RSS was designed for news feeds, which are regarded as time-ordered series of news items. Atom has the same bias, which makes sense.

However, i am wondering how well Atom works (and how well Atom implementations work) in an environment where the entries are not time-ordered. there is quite a bit of time bias in Atom:

* Atom (RFC 4287): atom:entry elements MUST contain exactly one atom:updated element.
* AtomPub (RFC 5023): The Entries in the returned Atom Feed SHOULD be ordered by their app:edited property, with the most recently edited Entries coming first in the document order.
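
to make the first requirement concrete, here is a minimal sketch in python that lists the entries of a feed which do not carry exactly one atom:updated element. the function name, the sample feed, and the urn:example ids are made up for illustration; only the Atom namespace URI comes from the spec.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def entries_missing_updated(feed_xml: str) -> list[str]:
    """Return the atom:id of every entry without exactly one atom:updated."""
    root = ET.fromstring(feed_xml)
    bad = []
    for entry in root.iter(f"{{{ATOM_NS}}}entry"):
        updated = entry.findall(f"{{{ATOM_NS}}}updated")
        if len(updated) != 1:
            bad.append(entry.findtext(f"{{{ATOM_NS}}}id", default="(no id)"))
    return bad

# Hypothetical sample feed: one timed entry, one "timeless" entry.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>urn:example:1</id><title>timed</title><updated>2007-12-01T00:00:00Z</updated></entry>
  <entry><id>urn:example:2</id><title>untimed</title></entry>
</feed>"""

print(entries_missing_updated(SAMPLE))
```

this only checks the one constraint quoted above; it is not a full Atom validator.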

so it is not even allowed to have untimed entries in an Atom feed, and an Atom feed is expected to be ordered by time. so there definitely is some bias that one cannot escape, but what would it take to implement a timeless AtomPub server?

if a collection of atom entries has no time stamps (or at least the time stamps are not considered to be useful as the primary sort key), then there are two possibilities:

1. there may be some other implicit primary sort key, which is used for publishing the feed. this would conflict with AtomPub's SHOULD, but that is allowed.
2. if there is no implicit key, then the GET request has to specify some query that is used for retrieval and ranking.
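
the two possibilities could be sketched roughly as follows. everything here is an assumption made for illustration: the Entry class, the choice of the title as the implicit sort key, and the crude term-overlap score that stands in for a real ranking function.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    id: str
    title: str
    updated: str  # still required by Atom, just not used for ordering here

def list_collection(entries, query=None):
    """Order entries without looking at their time stamps."""
    if query is None:
        # Possibility 1: an implicit, non-temporal sort key (here: the title).
        return sorted(entries, key=lambda e: e.title.lower())

    # Possibility 2: the request carries a query that selects and ranks entries.
    terms = set(query.lower().split())

    def score(e):
        return len(terms & set(e.title.lower().split()))

    matches = [e for e in entries if score(e) > 0]
    return sorted(matches, key=score, reverse=True)

entries = [
    Entry("urn:example:1", "red barn", "2007-12-01T00:00:00Z"),
    Entry("urn:example:2", "blue barn door", "2007-12-02T00:00:00Z"),
    Entry("urn:example:3", "apple tree", "2007-12-03T00:00:00Z"),
]
print([e.id for e in list_collection(entries)])
print([e.id for e in list_collection(entries, "barn door")])
```

in both cases the entries keep their time stamps, so the feed stays valid Atom; only the document order departs from what AtomPub recommends.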

this raises the interesting question whether a feed can be regarded as a queryable collection, rather than an ordered sequence. i don't think anything in Atom or AtomPub disallows this (as long as the atom entries have the required atom:updated element), but it certainly is not the usual way Atom is used.

in a very limited sense, one part of the Atom landscape already does this: feed paging and archiving (RFC 5005) specifies how to navigate paged feeds, and in an example uses http://example.org/index.atom?page=2 as a link to the second page of a feed. this is a query, but a very primitive one.

the interesting question is: how far can you go? it would be conceivable to allow arbitrarily complex query strings in the URI, but due to limitations of URIs, there is only so much you can embed into a URI. assume you have a collection of pictures and want to query that by picture, using a sample picture and getting as a result the ranked list of similar pictures. this does not work as a URI query string. but AtomPub tells us that listing collection members always uses GET, which means we cannot POST the image. we could do this by using GET to transmit the sample picture, but is it allowed for an HTTP GET to have a message body? this question is somewhat undecided: nothing in HTTP explicitly prohibits it, but HTTP also never explicitly mentions it.
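
a rough sketch of the size problem: the paging query from RFC 5005's example is tiny, while a sample picture embedded in the query string is not. the 2000-character limit below is an assumed rule of thumb, not a number taken from any HTTP or URI spec, and the 50 KB of zero bytes merely stand in for a real image; the "sample" parameter name is made up.

```python
import base64
from urllib.parse import urlencode

BASE = "http://example.org/index.atom"
ASSUMED_URI_LIMIT = 2000  # assumption: a practical rule of thumb, not from a spec

def query_uri(params: dict) -> str:
    """Build a collection URI with the given query parameters."""
    return f"{BASE}?{urlencode(params)}"

# The primitive paging query fits easily.
page_uri = query_uri({"page": 2})

# A stand-in for a 50 KB sample picture, base64-encoded into the query string.
fake_image = bytes(50_000)
image_uri = query_uri(
    {"sample": base64.urlsafe_b64encode(fake_image).decode("ascii")}
)

print(page_uri, len(page_uri) <= ASSUMED_URI_LIMIT)
print(len(image_uri), len(image_uri) <= ASSUMED_URI_LIMIT)
```

base64 alone inflates the payload by about a third, so even modest images end up far beyond what one would reasonably put into a URI.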

so, i would be really interested to find out whether the usage of Atom and AtomPub for collections where time is not the primary key is something that is (a) a stupid idea, (b) reasonable but esoteric, or (c) something people are already doing; and if (c) applies, how these collections are queried, in particular in the light of the fact that URI query strings are limited in the data they can carry.