Option #1: Using WebDAV with Atom
At the community meeting, there was a discussion of different
mechanisms for the Atom API, especially regarding multiple resources;
e.g., being able to post an atom entry along with a picture, a PDF or
other associated but distinct things. There was also some discussion of
how this problem relates to that of updating feed entries themselves.
Option #1 on the board was to do nothing, with the presumption that
people would find other means to upload images. An elaboration of this
was to leverage an existing mechanism, WebDAV <http://www.webdav.org/>,
and recommend that it be supported by tools, to promote
interoperability.
This note explains how that would happen, and explores the advantages
and disadvantages of doing so.
* Uploading Resources
At its simplest, WebDAV is really just HTTP PUT; you can upload some
data to a specific URI like this:
---8<---
PUT /stuff/picture.jpg HTTP/1.1
Host: www.example.com
Content-Type: image/jpeg
Content-Length: nnn

... image here ...
-->8--
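As a rough illustration, the same PUT could be built with Python's standard library along these lines. The helper name `build_put` and the three sample bytes are placeholders of mine, not anything WebDAV or Atom defines, and the request is only constructed here, not sent.

```python
import urllib.request

def build_put(uri: str, body: bytes, content_type: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT carrying `body` to `uri`."""
    return urllib.request.Request(
        uri,
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )

# Placeholder bytes standing in for real image data.
req = build_put("http://www.example.com/stuff/picture.jpg", b"\xff\xd8\xff", "image/jpeg")
print(req.get_method(), req.full_url)

# Sending it would be urllib.request.urlopen(req), which needs a server
# that actually accepts PUT at that URI.
```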
Subsequently, a GET to the URI
(http://www.example.com/stuff/picture.jpg, in this case) will retrieve
the picture. Another useful feature of WebDAV is the ability to get a
listing of a "directory", with PROPFIND:
---8<---
PROPFIND /stuff/ HTTP/1.1
Host: www.example.com
Depth: 1
-->8--
This is asking the Web server for the properties of everything inside
of "/stuff". The response will be an XML document containing an entry
for each such item, listing properties such as its name, owner, last
modification date, etc. The format of the XML document is specified by
WebDAV.
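To give a feel for what a client does with that response, here is a minimal sketch that pulls each item's URI and last-modification date out of a multistatus body. The sample XML is hand-written for illustration (a real server would return more properties), and `list_resources` is a name I made up.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # WebDAV's XML namespace, in ElementTree's {uri}tag notation

# Hand-written sample of a 207 Multi-Status body; the values are illustrative.
SAMPLE = """\
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/stuff/</D:href>
    <D:propstat>
      <D:prop>
        <D:getlastmodified>Fri, 04 Jun 2004 10:00:00 GMT</D:getlastmodified>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/stuff/picture.jpg</D:href>
    <D:propstat>
      <D:prop>
        <D:getlastmodified>Fri, 04 Jun 2004 10:05:00 GMT</D:getlastmodified>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>
"""

def list_resources(multistatus_xml: str) -> dict:
    """Map each href in a multistatus body to its getlastmodified (or None)."""
    root = ET.fromstring(multistatus_xml)
    listing = {}
    for response in root.findall(DAV + "response"):
        href = response.findtext(DAV + "href")
        modified = response.findtext(f"{DAV}propstat/{DAV}prop/{DAV}getlastmodified")
        listing[href] = modified
    return listing

for href, modified in list_resources(SAMPLE).items():
    print(href, modified)
```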
Furthermore, WebDAV allows you to DELETE, MOVE and COPY resources with
simple HTTP methods; e.g.,
---8<---
DELETE /stuff/picture.jpg HTTP/1.1
Host: www.example.com
-->8--
gets rid of the picture. In every case, the HTTP status code of the
response tells you whether your request was successful; in the case
when your request can affect many resources (e.g., asking to delete a
directory full of images), WebDAV defines a 'multistatus' XML format
that shows you what happened to each one.
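For the multi-resource case, a client might pick out the failures roughly as follows. This is only a sketch: the sample body, its URI and the 423 status are invented for illustration, and `failed_resources` is my own name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample body for a 207 response to a DELETE on a collection;
# the URI and the 423 status are made up for illustration.
SAMPLE_207 = """\
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>http://www.example.com/stuff/locked.jpg</D:href>
    <D:status>HTTP/1.1 423 Locked</D:status>
  </D:response>
</D:multistatus>
"""

def failed_resources(multistatus_xml: str) -> dict:
    """Map href -> numeric status for every resource that reported an error."""
    failures = {}
    for response in ET.fromstring(multistatus_xml).findall("{DAV:}response"):
        href = response.findtext("{DAV:}href")
        status_line = response.findtext("{DAV:}status") or ""
        parts = status_line.split()
        # A status line looks like "HTTP/1.1 423 Locked"; the code is field two.
        if len(parts) >= 2 and int(parts[1]) >= 400:
            failures[href] = int(parts[1])
    return failures

print(failed_resources(SAMPLE_207))
```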
For the purposes of Atom, this allows resources to be uploaded and
manipulated in a simple fashion. Depending on our requirements, we
could profile WebDAV so that only a subset of its features is required
for Atom; the simplest possible such profile would be only to specify
"PUT", relying on the user to interpret errors as the need to rename
the submitted file.
One requirement that we'll likely run into is the need for the server
to tell the client where it's allowed to upload things. One way to
satisfy this would be to define a new type of link:
<link rel="base.upload" href="http://www.example.com/stuff/"/>
that would be used as a base URI for any such uploads; in other words,
upon seeing this link tag, clients would know that they can upload
anything with a URI that begins "http://www.example.com/stuff/".
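A client might turn such a link into a concrete upload URI along these lines. This is a sketch under my own assumptions: the `base.upload` link is only a proposal in this note, `upload_uri` is a hypothetical helper, and refusing anything that escapes the base is my reading of "begins with".

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urljoin

def upload_uri(link_xml: str, filename: str) -> str:
    """Resolve `filename` against a base.upload link, refusing to leave the base."""
    link = ET.fromstring(link_xml)
    if link.get("rel") != "base.upload":
        raise ValueError("not a base.upload link")
    base = link.get("href")
    # Percent-encode the name, then resolve it relative to the advertised base.
    candidate = urljoin(base, quote(filename))
    if not candidate.startswith(base):
        raise ValueError("resolved URI falls outside the advertised base")
    return candidate

link = '<link rel="base.upload" href="http://www.example.com/stuff/"/>'
print(upload_uri(link, "picture.jpg"))
```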
* Updating Feeds
Another and somewhat more experimental way to use WebDAV is to update
the feed itself. For example, if your feed URI is
http://www.example.com/feed.atom
you might add a new entry by PUTting to
http://www.example.com/feed.atom/entry_id
where 'entry_id' is the identifier for the entry; when a new entry is
PUT as a sub-resource of a feed (i.e., as a child of it, from a URI
perspective), the feed is automatically updated.
This is attractive because individual entries can be fetched just by
GETting them. It's also extremely easy to support from client toolkits;
they need only PUT the entry to the correct URI.
One problem to overcome with this approach is the management of the
namespace; effectively, the server is giving control of identifiers
below the feed URI to the client, and the client needs to be able to
pick a new entry identifier without danger of a collision with an
existing one.
One way to address this would be to allow the client to fetch a
complete list of the identifiers in use, so it can maintain and update
that state as needed (not necessarily upon every PUT). WebDAV
multistatus, as returned by PROPFIND, would be a format for doing so;
we could also adapt Atom itself to this purpose (with GET instead of
PROPFIND as the method).
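Once the client holds that list, picking a free identifier is straightforward. A minimal sketch, where the numeric-suffix scheme is purely my assumption (nothing here says how clients should name entries):

```python
def pick_identifier(slug: str, in_use: set) -> str:
    """Return `slug`, or `slug-2`, `slug-3`, ... if the plain slug is taken."""
    if slug not in in_use:
        return slug
    n = 2
    while f"{slug}-{n}" in in_use:
        n += 1
    return f"{slug}-{n}"

# Identifiers the client has cached from an earlier listing (illustrative).
known = {"my_entry", "my_entry-2", "holiday"}
print(pick_identifier("my_entry", known))
```

Note that the cached list can go stale if anything else writes to the feed, so this only narrows the window for collisions rather than closing it.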
Another, simpler way would be for the client to optimistically choose
an identifier and tell the server to fail if it already exists. In
HTTP, this is done with If-None-Match:
---8<---
PUT /feed.atom/my_entry HTTP/1.1
Host: www.example.com
If-None-Match: *
Content-Type: application/atom+xml

...entry XML here...
-->8--
This basically says "put this at /feed.atom/my_entry, as long as there
isn't something already there." If there is, a 412 Precondition Failed
status code will be returned. My assumption would be that the URI used
(http://www.example.com/feed.atom/my_entry, in this case) would also be
the unique identifier for that item, as an Atom entry. Without the
If-None-Match header, the client would be able to update the entry
regardless of whether it exists.
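On the server side, the check amounts to very little. Here is a sketch using an in-memory dict as the store; `handle_put` is a hypothetical function, it only understands the `*` form of If-None-Match (not lists of entity tags), and the 201/204 choices for the success cases are my assumption about what a server would reasonably return.

```python
def handle_put(store: dict, path: str, headers: dict, body: bytes) -> int:
    """Apply a PUT to `store`, honouring `If-None-Match: *`; return the status code."""
    exists = path in store
    if headers.get("If-None-Match") == "*" and exists:
        return 412  # Precondition Failed: leave the existing entry untouched
    store[path] = body
    return 204 if exists else 201  # replaced an entry vs. created a new one

entries = {}
guarded = {"If-None-Match": "*"}
print(handle_put(entries, "/feed.atom/my_entry", guarded, b"<entry>v1</entry>"))
print(handle_put(entries, "/feed.atom/my_entry", guarded, b"<entry>v2</entry>"))
print(handle_put(entries, "/feed.atom/my_entry", {}, b"<entry>v3</entry>"))
```

The first call creates the entry, the second is refused because the identifier is taken, and the third (no precondition) overwrites it.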
Note that there may be additional, client-controlled structure; e.g.,
the URI could just as easily be
'http://www.example.com/feed.atom/2004/06/04/my_entry'.
* Advantages
If we were to take this route, there would be very little to do, in
terms of specifying new behaviours, mechanisms, etc.; HTTP and WebDAV
provide much of what we need, and are both well-tested and widely
deployed. The mechanisms are simple to understand and implement.
Because everything (including entries, in the latter example) has a
unique URI, publishers and users would be able to take full advantage
of Web mechanisms such as caching, access control, linking, etc.
Finally, having the client control the namespace on the server also
seems to fit in well with current uses of syndication; in my
experience, the user wants to control what the URI will be.
* Disadvantages
Full WebDAV support is usually implemented on top of a filesystem.
While this would be appropriate for our purposes, we don't want to
require that people have WebDAV enabled to use Atom (indeed, I've been
looking for a WebDAV hosting service for quite a while, and haven't had
much luck beyond Apple's iDisk). Luckily, WebDAV can be implemented on
top of CGI quite easily, especially if we profile its functionality; in
a minimal case, it's just supporting the PUT and GET methods in a CGI
script.
For example, a script called "stuff.cgi" could be placed on the Atom
server, so that requests to PUT and GET "/stuff.cgi/picture.jpg" send
back the appropriate responses.
The caveat -- and biggest downside -- here is that a CGI implementation
would have to handle the GETs as well, potentially leading to load
problems.
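To show how little such a script needs, here is a sketch of the core of a "stuff.cgi" in Python. It is written as a plain function over the CGI environment variables (REQUEST_METHOD, PATH_INFO, CONTENT_LENGTH) so it can be exercised without a web server; the function name, the flat one-directory layout and the status choices are all my assumptions.

```python
import io
import os
import tempfile

def handle(environ: dict, stdin, root: str):
    """Serve PUT and GET for the name after the script path; return (status, body)."""
    name = environ.get("PATH_INFO", "").lstrip("/")
    # Keep the sketch to a flat namespace and refuse anything path-like.
    if not name or "/" in name or name in (".", ".."):
        return "403 Forbidden", b""
    target = os.path.join(root, name)
    method = environ.get("REQUEST_METHOD")
    if method == "PUT":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        existed = os.path.exists(target)
        with open(target, "wb") as f:
            f.write(stdin.read(length))
        return ("204 No Content" if existed else "201 Created"), b""
    if method == "GET":
        if not os.path.isfile(target):
            return "404 Not Found", b""
        with open(target, "rb") as f:
            return "200 OK", f.read()
    return "405 Method Not Allowed", b""

# Exercise it against a throwaway directory instead of a real server.
root = tempfile.mkdtemp()
put_env = {"REQUEST_METHOD": "PUT", "PATH_INFO": "/picture.jpg", "CONTENT_LENGTH": "3"}
print(handle(put_env, io.BytesIO(b"abc"), root))
print(handle({"REQUEST_METHOD": "GET", "PATH_INFO": "/picture.jpg"}, io.BytesIO(), root))
```

A real script would pass in os.environ and sys.stdin.buffer and print a "Status:" header followed by the body. Every GET still runs through the script, which is exactly the load concern raised above.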
While this could be avoided through some hackery with mod_rewrite
(i.e., by rewriting only PUT requests to a script that writes files to
the appropriate place, so that GETs are served straight off the
filesystem), or by using a more efficient mechanism like mod_perl or
mod_python, doing so probably isn't within the grasp of a typical novice.
* My thoughts
Using WebDAV seems like an interesting, conceptually simple way to
extend our use of the Web; however, because of the state of support for
it and problems that people will likely encounter on the server, it may
not be the best solution on its own.
That said, Atom needs to do very little to enable it; in the upload
case, it would be as simple as adding a new link tag to the API spec.
If servers were advanced enough to support it (e.g., they had mod_dav
enabled), they could advertise this fact; clients can choose to take
advantage of it or not, depending on whether they support PUT.
As such, I think Atom should specify a simple way to advertise support
for PUT uploads (probably *not* full WebDAV, at least at this stage),
but it should also specify a non-PUT mechanism, such as the option #6
(POST) approach.
Regards,
--
Mark Nottingham http://www.mnot.net/