
Option #1: Using WebDAV with Atom




At the community meeting, there was a discussion of different mechanisms for the Atom API, especially regarding multiple resources; e.g., being able to post an Atom entry along with a picture, a PDF, or other associated but distinct resources. There was also some discussion of how this problem relates to that of updating feed entries themselves.


Option #1 on the board was to do nothing, with the presumption that people would find other means to upload images. An elaboration of this was to leverage an existing mechanism, WebDAV <http://www.webdav.org/>, and recommend that it be supported by tools, to promote interoperability.

This note explains how that would happen, and explores the advantages and disadvantages of doing so.

* Uploading Resources

At its simplest, WebDAV is really just HTTP PUT; you can upload some data to a specific URI like this:

---8<---
PUT /stuff/picture.jpg HTTP/1.1
Host: www.example.com
Content-Type: image/jpeg
Content-Length: nnn

... image here ...
-->8--
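As a rough client-side sketch of the request above, using Python's standard urllib: the helper name build_put is made up for illustration, and nothing is sent over the network until the request is handed to urlopen.

```python
from urllib.request import Request


def build_put(uri, data, content_type):
    """Build (but do not send) an HTTP PUT request for the given URI.

    build_put is a hypothetical helper name, not part of any spec.
    """
    req = Request(uri, data=data, method="PUT")
    req.add_header("Content-Type", content_type)
    return req


# Placeholder bytes stand in for real image data.
req = build_put("http://www.example.com/stuff/picture.jpg",
                b"...image bytes...", "image/jpeg")
```

Passing req to urllib.request.urlopen would actually perform the upload; urllib fills in Content-Length from the body at that point.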

Subsequently, a GET to the URI (http://www.example.com/stuff/picture.jpg, in this case) will retrieve the picture. Another useful feature of WebDAV is the ability to get a listing of a "directory", with PROPFIND:

---8<---
PROPFIND /stuff/ HTTP/1.1
Host: www.example.com
Depth: 1
-->8---

This is asking the Web server for the properties of everything inside of "/stuff". The response will be an XML document containing an entry for each such item, listing properties such as its name, owner, last modification date, etc. The format of the XML document is specified by WebDAV.
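To give a feel for what a client does with that response, here is a minimal sketch that pulls the URI and last-modified date out of a multistatus document. The sample XML is invented for illustration (a real server will return more properties), and list_resources is a hypothetical helper name; only the "DAV:" namespace and the element names come from WebDAV.

```python
import xml.etree.ElementTree as ET

NS = {"D": "DAV:"}  # WebDAV's XML namespace

# Invented sample of a PROPFIND response for /stuff/ with Depth: 1.
SAMPLE_MULTISTATUS = """\
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/stuff/</D:href>
    <D:propstat>
      <D:prop>
        <D:getlastmodified>Fri, 04 Jun 2004 10:00:00 GMT</D:getlastmodified>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/stuff/picture.jpg</D:href>
    <D:propstat>
      <D:prop>
        <D:getlastmodified>Fri, 04 Jun 2004 10:05:00 GMT</D:getlastmodified>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>
"""


def list_resources(multistatus_xml):
    """Return (href, last-modified) pairs, one per D:response element."""
    root = ET.fromstring(multistatus_xml)
    results = []
    for response in root.findall("D:response", NS):
        href = response.findtext("D:href", namespaces=NS)
        modified = response.findtext(
            "D:propstat/D:prop/D:getlastmodified", namespaces=NS)
        results.append((href, modified))
    return results


resources = list_resources(SAMPLE_MULTISTATUS)
```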

Furthermore, WebDAV allows you to DELETE, MOVE and COPY resources with simple HTTP methods; e.g.,

--8<--
DELETE /stuff/picture.jpg HTTP/1.1
Host: www.example.com
-->8--

gets rid of the picture. In every case, the HTTP status code of the response tells you whether your request was successful; in the case when your request can affect many resources (e.g., asking to delete a directory full of images), WebDAV defines a 'multistatus' XML format that shows you what happened to each one.

For the purposes of Atom, this allows resources to be uploaded and manipulated in a simple fashion. Depending on our requirements, we could profile WebDAV so that only a subset of its features is required for Atom; the simplest possible such profile would specify only "PUT", relying on the user to interpret an error as a sign that the submitted file needs to be renamed.

One requirement that we'll likely run into is the need for the server to tell the client where it's allowed to upload things. One way to satisfy this would be to define a new type of link:

<link rel="base.upload" href="http://www.example.com/stuff/"/>

that would be used as a base URI for any such uploads; in other words, upon seeing this link tag, clients would know that they can upload anything with a URI that begins "http://www.example.com/stuff/".
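A client might handle such a link roughly as follows. This is only a sketch: the "base.upload" relation is the proposal above rather than anything specified, the sample document omits XML namespaces for brevity, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urljoin

# Invented, namespace-free sample; a real feed would carry more than this.
SAMPLE_FEED = (
    '<feed>'
    '<link rel="base.upload" href="http://www.example.com/stuff/"/>'
    '</feed>'
)


def find_upload_base(xml_text):
    """Return the href of the first rel="base.upload" link, or None."""
    root = ET.fromstring(xml_text)
    for link in root.iter("link"):
        if link.get("rel") == "base.upload":
            return link.get("href")
    return None


def upload_uri(base, filename):
    """Resolve filename against base, refusing URIs outside the base."""
    candidate = urljoin(base, quote(filename))
    if not candidate.startswith(base):
        raise ValueError("URI falls outside the advertised upload base")
    return candidate


base = find_upload_base(SAMPLE_FEED)
target = upload_uri(base, "picture.jpg")
```

The prefix check is there so that a name such as "../elsewhere" cannot resolve to a URI the server never offered.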

* Updating Feeds

Another and somewhat more experimental way to use WebDAV is to update the feed itself. For example, if your feed URI is
http://www.example.com/feed.atom
one might add a new entry by PUTting to
http://www.example.com/feed.atom/entry_id
where 'entry_id' is the identifier for the entry; when a new entry is PUT as a sub-resource of a feed (i.e., as a child of it, from a URI perspective), the feed is automatically updated.


This is attractive because individual entries can be fetched just by GETting them. It's also extremely easy to support from client toolkits; they need only PUT the entry to the correct URI.

One problem to overcome with this approach is the management of the namespace; effectively, the server is giving control of identifiers below the feed URI to the client, and the client needs to be able to pick a new entry identifier without danger of a collision with an existing one.

One way to address this would be to allow the client to fetch a complete list of the identifiers in use, so it can maintain and update that state as needed (not necessarily upon every PUT). WebDAV multistatus, as returned by PROPFIND, would be one format for doing so; we could also adapt Atom itself to this purpose (with GET instead of PROPFIND as the method).

Another, simpler way would be for the client to optimistically choose an identifier and tell the server to fail if it already exists. In HTTP, this is done with If-None-Match:

---8<---
PUT /feed.atom/my_entry HTTP/1.1
Host: www.example.com
If-None-Match: *
Content-Type: application/atom+xml

...entry XML here...
-->8---

This basically says "put this at /feed.atom/my_entry, as long as there isn't something already there." If there is, a 412 Precondition Failed status code will be returned. Without the If-None-Match header, the client would be able to create or update the entry regardless of whether it already exists. My assumption would be that the URI used (http://www.example.com/feed.atom/my_entry, in this case) would also be the unique identifier for that item, as an Atom entry.
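On the server side, the check amounts to very little. The sketch below models the entries as an in-memory dictionary purely for illustration; put_entry is a hypothetical name, and the 201 and 204 codes are the usual HTTP choices for "created" and "updated" rather than anything this note prescribes.

```python
def put_entry(entries, path, body, if_none_match=None):
    """Apply a PUT to an in-memory store and return the HTTP status code.

    entries maps URI paths to entry bodies; it stands in for whatever
    storage a real server would use.
    """
    exists = path in entries
    if if_none_match == "*" and exists:
        return 412  # Precondition Failed: something is already there
    entries[path] = body
    return 204 if exists else 201  # updated vs. newly created


entries = {}
first = put_entry(entries, "/feed.atom/my_entry", "<entry>v1</entry>", "*")
second = put_entry(entries, "/feed.atom/my_entry", "<entry>v2</entry>", "*")
third = put_entry(entries, "/feed.atom/my_entry", "<entry>v3</entry>")
```

A client that gets a 412 back can pick another identifier and try again, without ever needing the full list of identifiers in use.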

Note that there may be additional, client-controlled structure; e.g., the URI could just as easily be 'http://www.example.com/feed.atom/2004/06/04/my_entry'.

* Advantages

If we were to take this route, there would be very little to do, in terms of specifying new behaviours, mechanisms, etc.; HTTP and WebDAV provide much of what we need, and are both well-tested and widely deployed. The mechanisms are simple to understand and implement.

Because everything (including entries, in the latter example) has a unique URI, publishers and users would be able to take full advantage of Web mechanisms such as caching, access control, linking, etc.

Finally, having the client control the namespace on the server also seems to fit in well with current uses of syndication; in my experience, the user wants to control what the URI will be.

* Disadvantages

Full WebDAV support is usually implemented on top of a filesystem. While this would be appropriate for our purposes, we don't want to require that people have WebDAV enabled to use Atom (indeed, I've been looking for a WebDAV hosting service for quite a while, and haven't had much luck beyond Apple's iDisk). Luckily, WebDAV can be implemented on top of CGI quite easily, especially if we profile its functionality; in a minimal case, it's just supporting the PUT and GET methods in a CGI script.

For example, a script called "stuff.cgi" could be placed on the Atom server, so that requests to PUT and GET "/stuff.cgi/picture.jpg" send back the appropriate responses.

The caveat -- and biggest downside -- here is that a CGI implementation would have to handle the GETs as well, potentially leading to load problems.

While this could be avoided through some hackery with mod_rewrite (i.e., by rewriting only PUT requests to a script that writes files in the appropriate place, so that GETs are served directly off the filesystem), or a more efficient mechanism like mod_perl or mod_python, doing so probably isn't within the grasp of a typical novice.
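To show how small the minimal PUT-and-GET profile is, here is a sketch of such a handler. Note the assumptions: it uses Python's WSGI interface rather than CGI proper, it keeps uploads in an in-memory dictionary instead of the filesystem, and it does no access control at all, so it illustrates the shape of the thing rather than something to deploy.

```python
import io
from wsgiref.util import setup_testing_defaults

STORE = {}  # path -> (content type, body); stand-in for the filesystem


def app(environ, start_response):
    """Minimal WSGI handler supporting only PUT and GET."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "")
    if method == "PUT":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        created = path not in STORE
        ctype = environ.get("CONTENT_TYPE") or "application/octet-stream"
        STORE[path] = (ctype, body)
        status = "201 Created" if created else "204 No Content"
        start_response(status, [("Content-Length", "0")])
        return [b""]
    if method == "GET":
        if path not in STORE:
            start_response("404 Not Found", [("Content-Length", "0")])
            return [b""]
        ctype, body = STORE[path]
        start_response("200 OK", [("Content-Type", ctype),
                                  ("Content-Length", str(len(body)))])
        return [body]
    start_response("405 Method Not Allowed",
                   [("Allow", "GET, PUT"), ("Content-Length", "0")])
    return [b""]


def call(method, path, body=b"", ctype=""):
    """Invoke app directly, without a network; returns (status, body)."""
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(body)), "CONTENT_TYPE": ctype,
               "wsgi.input": io.BytesIO(body)}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    chunks = app(environ, start_response)
    return seen["status"], b"".join(chunks)
```

For local experiments, wsgiref.simple_server.make_server can serve app over HTTP; every GET still goes through the script, which is exactly the load concern raised above.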

* My thoughts

Using WebDAV seems like an interesting, conceptually simple way to extend our use of the Web; however, because of the state of support for it and problems that people will likely encounter on the server, it may not be the best solution on its own.

That said, Atom needs to do very little to enable it; in the upload case, it would be as simple as adding a new link tag to the API spec. If servers were advanced enough to support it (e.g., they had mod_dav enabled), they could advertise this fact; clients can choose to take advantage of it or not, depending on whether they support PUT.

As such, I think Atom should specify a simple way to advertise support for PUT uploads (probably *not* full WebDAV, at least at this stage), but it should also specify a non-PUT mechanism, such as the #6 (POST) approach.

Regards,

--
Mark Nottingham     http://www.mnot.net/