<id> and <link> are broken! (was RE: base and atom:id)
Let me try putting this point a slightly different way (with a more dramatic
subject). Though it's a valiant effort to solve a problem, I don't think
using one URI to identify an entry and another URI to identify a
representation of that entry works. I think we should be looking at a single
URI for the entry, with dereferencing treated as most other http-oriented
URIs. The retrieved representation should be that of the URI in question,
rather than having one URI act as a proxy for another.
If there are other resources related to the entry (such as previous
versions), sure, provide other URIs for them. Once the primary entry
resource and other entry resources are cleanly separated, then maybe (and
only maybe) it might make sense to avoid having a http-retrievable
representation of the former. But if this is the path taken, then it needs
to be absolutely clear in the spec that we are talking about separate
resources - an entry and related resources, even an entry and its
constituent parts.
I've no problem with something being described by means of relationships
between resources (I am a fan of RDF after all...), I just don't think the
way it's being done in the current spec is sound.
Here are a few fragments that suggest that <id> and <link> are at best
poorly defined (though I'd suggest this is a symptom of a more fundamental
breakage in the model):
From the spec: "The 'atom:id' element's content conveys a permanent,
globally unique identifier for the entry." The implication has been that
this may, even should be without a retrievable representation. An entry is a
discrete chunk of information. Ok, it can have a database-style primary key.
But the information is being provided on the web. Why on earth should there
*not* be a http-retrievable representation?!
"The 'atom:link' element is a Link construct that conveys a URI associated
with the entry." Is this the association between an identifier and a
resource, or not? If not, what exactly is it?
The definition of <id> includes the line: "It MUST NOT change over time,
even if other representations of the entry (such as a web representation
pointed to by the entry's atom:link element) are relocated." But nothing is
really located by atom:link - it's a URI, an identifier. If we're saying
that there's some retrievable data by applying a http GET on it, how does
that data relate to the entry? If it's a representation of the entry, it
should be identified as such, i.e. by the URI of the entry. But the entry
identifier is that in <id> - we're also saying that the <id> and <link> can
be different. They identify different resources, by definition. Sure, the
representation of different URIs may be the same - but that's not what's
been said here.
The above is why I think the current model doesn't work against regular web
architecture. But plenty of things that have an unconventional model can
work. So let's try a simple example - I want to put a blog post in an Atom
feed.
title: Penguin Paradise
permalink: http://example.org/blog/post123
text: The <a href="http://arctic-circle.com">Arctic Circle</a> is cool!
What goes in <id>?
The permalink is a "permanent, globally unique identifier" for the blog
post. Is the blog post the same thing as the entry? Maybe. Or should I make
up a new, unrelated identifier and forget this existing one..?
What goes in <link>?
Ok, there are two URIs here, both associated with the entry. "The nature of
the relationship as well as the link itself is determined by the element's
content." The relationships are [the permalink for the blog post
corresponding to the entry] and [the resource that the entry text
describes]. So how do I express these relationships in <link> elements? The
spec does say one of the <link> elements must have a rel attribute value of
"alternate" - so I guess that'll be the permalink. But what if there were
several blog posts (different addresses, same content) corresponding to the
entry, how can that be reconciled with "atom:entry elements MUST NOT contain
more than one atom:link element with a rel attribute value of "alternate"
that has the same type attribute value."
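To make that constraint concrete, here is a minimal Python sketch. It assumes a link can be modelled as a (rel, type, href) record; the Link class, the duplicate_alternates helper, the text/html type and the mirror URL are all made up for illustration and are not defined by the spec.

```python
from collections import Counter
from typing import List, NamedTuple


class Link(NamedTuple):
    # Hypothetical model of an atom:link element, not a spec-defined structure.
    rel: str
    type: str
    href: str


def duplicate_alternates(links: List[Link]) -> List[str]:
    """Return the type values shared by more than one rel="alternate" link."""
    counts = Counter(link.type for link in links if link.rel == "alternate")
    return sorted(t for t, n in counts.items() if n > 1)


# One permalink: satisfies the quoted rule.
single = [Link("alternate", "text/html", "http://example.org/blog/post123")]

# Same content at two addresses (the second URL is invented for illustration).
mirrored = single + [
    Link("alternate", "text/html", "http://mirror.example.org/blog/post123")
]

print(duplicate_alternates(single))    # []
print(duplicate_alternates(mirrored))  # ['text/html']
```

Under this reading, the second address cannot be expressed as another "alternate" link of the same type, which is the reconciliation problem raised above.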
Anyhow, time passes, and I discover the natural history is iffy, so I change
the entry to read:
title: Polar Bear Heaven
permalink: http://example.org/blog/post123
text: The <a href="http://arctic-circle.com">Arctic Circle</a> is cool!
Is this the same entry? Yes. Is it the same resource? Arguably not, because
there's been a significant change to it. Should <id> and <link> remain the
same? The permalink is the same. No idea according to the spec. Reading
between the lines the <id> would probably remain the same, with <link>
changing. But there's a whole range of possibilities for modification. At
one extreme, every single piece of data in the entry could change, at the
other a spelling mistake could be corrected or a bit of extra whitespace
removed. At which point does it become a different entry? At which point
does it become a different resource, and require another <id> URI? Without
clean separation between the resources involved, it's impossible to tell.
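The ambiguity can be sketched in a few lines of Python. This assumes one possible reading, in which the permalink is reused as the <id> and entry identity is decided by <id> alone; the Entry class and both helper functions are hypothetical, not anything the spec defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    # Hypothetical flat model of an entry; field names are illustrative.
    id: str
    link: str
    title: str
    text: str


def same_entry(a: Entry, b: Entry) -> bool:
    """Identity decided purely by the <id> value (an assumed reading)."""
    return a.id == b.id


def same_content(a: Entry, b: Entry) -> bool:
    """Field-by-field equality, i.e. nothing at all has changed."""
    return a == b


TEXT = 'The <a href="http://arctic-circle.com">Arctic Circle</a> is cool!'
PERMALINK = "http://example.org/blog/post123"

before = Entry(PERMALINK, PERMALINK, "Penguin Paradise", TEXT)
after = Entry(PERMALINK, PERMALINK, "Polar Bear Heaven", TEXT)

print(same_entry(before, after))    # True
print(same_content(before, after))  # False
```

The two checks disagree, and nothing in the sketch says how large a change has to be before same_entry ought to return False as well. That threshold is exactly what the spec leaves open.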
Cheers,
Danny.
> > But I do think there still needs to be some more low-level clarification
> > around <id> and <link>. There seems to be a kind of tacit understanding
> > that <id> will be a static URN and <link> will be a URL that might change.
> > I'm not sure this is altogether consistent with general web architecture,
> > at best it's confusing.
>
> At best, it's a perfectly sound idea. There is absolutely no technical
> reason to:
> a) Require each identifier to resolve to an HTML representation of a
> resource.
> b) Require the address of each entry's HTML representation to have all of
> the special properties its identifier must have (persistence, uniqueness,
> etc)
>
> The confusing thing would be to conflate the two functions into one string.
>
> > I've a feeling (which is reflected on the Wiki page) that essentially <id>
> > is an attempt to mimic RSS 2.0's <guid>. The web has a far better
> > construct for the purpose, the URI. GUIDs are fine if you want to make
> > sure things are unique on your LAN, but the network has grown a little
> > since the 1980's.
>
> But aren't we already using URIs?
>
> Graham