Re: Shipping Atom products prematurely
* Paul Hoffman / IMC <phoffman@xxxxxxx> [2004-08-17 19:43-0700]
> - We expect that the extensibility will be cleaner and better defined
> so that small markets will be able to use Atom easier than previous
> RSSs.
Really? That'd be great, but I've not seen much evidence that
the WG takes that view. I'm comparing with RSS 1.0's RDF/XML-based
extensibility framework; the other flavours don't really have one,
except for RSS 2.0's "use namespaces and you're on your own".
Extensibility comes at a price, and designing something cleaner and
better than RDF will be an interesting endeavour. Finding a clean and
syntactically graceful way of mapping _into_ RDF also comes at a price,
as does (of course) simply using full RDF/XML syntax. The pre-IETF
Atom community decided some time ago that they didn't find RDF/XML
an attractive proposition, which I guess means we're going the route of
defining an extensibility model that is somehow better than RSS 1.0's.
Hmm, so what were the key benefits of RDF extensibility in RSS 1.0?
- (fairly) predictable XML notation; RSS 1.0 defined a profile of the
RDF/XML syntax (rather than allowing all of RDF's syntactic
variations), so that namespace-extended feeds all shared a basic
structure.
- Supported free combination of independently developed descriptive
vocabulary (manifested as RDF/XML-based namespaces). RSS 1.0 feeds
can carry extra markup describing things in the world beyond
syndication, such as people, places, movies, bank accounts. Element
names in the markup correspond to classes (categories) and properties
(fields, relations etc) defined by any RDF vocabulary that proves
useful.
- The external vocabularies a feed draws upon do not need to be defined
with RSS 1.0 (or Atom or newsfeed syndication) in mind, or be tightly
coordinated amongst themselves. There is a tightly-defined and
simple-minded (additive) model for explaining how these independent
namespaces interact when deployed together.
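To make the "additive" point concrete, here is a minimal sketch in Python (not anything taken from the RSS 1.0 spec). Each vocabulary contributes a set of (subject, property, value) statements about the same item, and combining them is plain set union. The PRISM namespace URI is a placeholder rather than the real one, the helper names merge and understood are hypothetical, and the sketch ignores blank nodes, which need a little more care when real RDF graphs are merged.

```python
# Namespace URIs. RSS and DC are the published ones; PRISM is a placeholder
# (assumption) so that the sketch does not guess at the real URI.
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
PRISM = "http://example.org/prism#"

article = "http://dx.doi.org/10.1038/nrc1424"

# Three independently designed vocabularies describing the same article.
rss_core = {
    (article, RSS + "title", "TUMOUR VIRUSES: A genetic switch"),
    (article, RSS + "link", article),
}
dublin_core = {
    (article, DC + "creator", "Kristine Novak"),
    (article, DC + "date", "2004-08-01"),
}
prism = {
    (article, PRISM + "volume", "4"),
    (article, PRISM + "startingPage", "572"),
}


def merge(*graphs):
    """Additive combination: the result is the union of all statements."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def understood(graph, known_namespaces):
    """Keep only the statements whose property a given consumer recognises."""
    prefixes = tuple(known_namespaces)
    return {t for t in graph if t[1].startswith(prefixes)}


merged = merge(rss_core, dublin_core, prism)
basic_reader = understood(merged, [RSS])      # a plain RSS reader
library_tool = understood(merged, [RSS, DC])  # a tool that also knows Dublin Core
```

Adding a vocabulary never changes what the other statements mean, so a consumer that only knows the RSS core still gets exactly the core statements back, whatever else the feed carries.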
What were the problems / drawbacks with RSS 1.0's RDF extensibility?
- explaining the XML-level constraints on markup structures amounted
to the need to present a mini-tutorial on RDF's syntax rules, since
RSS 1.0 used RDF's standard XML encoding. This involves unenviable
tasks like explaining RDF's "striped" XML style (see
http://www.w3.org/2001/10/stripes/ ), and trying to summarise the
rules for when you use "rdf:about=" versus "rdf:resource="
attributes.
- when RSS 1.0 shipped (4 years ago) there weren't many RDF
vocabularies, software libraries were less mature, and the RDF
specs hadn't gone through the RDFCore cleanup (which finished Feb'04).
Sites like http://www.schemaweb.info/ show that there are a
growing number of vocabularies, but many are still a bit drafty.
- RSS 1.0 was perhaps a little too minimalist, forcing people to use
extensions for things that a broader syndication-oriented vocab
could have included in a more generous core.
- The only widely used extension for carrying hypertext content in
RSS 1.0 was 'content:encoded', which somewhat opts out of
the XML world. RDF in 2000 was a bit vague on how to deal with
namespaces, xml:base, xml:lang and other canonicalisation issues
relating to "literal XML" content (addressed in the Feb'04 specs).
- the blogging use case (which dominated RSS deployment and evangelism)
didn't have as much to gain from a powerful extensibility framework
as those apps which sometimes get called 'synthetic feeds'.
> None of that might spin your beanie. One big reason many of us are
> working on Atom is that we believe that RSSish syndication can become
> a major communication mechanism for literally decades to come (look
> at RFC 822 and MIME, for instance). If that is true, it should have
> the most polished, most reviewed, and clearest spec possible.
Yes, that's what draws me to RSS 1.0 and its siblings. One lesson from
the non-blogging use cases (job adverts, movie listings, bank account
feeds, etc.) that get some of us enthused is that the most interesting
bits of the markup are those which use (possibly various) non-RSS
namespaces. RSS is the least interesting, and simplest, part of the
document.
I think the recently announced "Nature" RSS 1.0 feeds might repay study,
as a way of thinking about the tradeoffs around AtomPub's extensibility
goals.
See http://www.nature.com/rss/
There are feeds there from scientific journals, and for job listings in
the sciences. The extensions used are both useful and evocative. They
immediately suggest further extension ideas.
eg. http://www.nature.com/nrc/journal/v4/n8/rss.rdf
[[
<item rdf:about="http://dx.doi.org/10.1038/nrc1424">
<title>TUMOUR VIRUSES: A genetic switch</title>
<link>http://dx.doi.org/10.1038/nrc1424</link>
<description>Kristine Novak</description>
<dc:title>TUMOUR VIRUSES: A genetic switch</dc:title>
<dc:creator>Kristine Novak</dc:creator>
<dc:identifier>doi:10.1038/nrc1424</dc:identifier>
<dc:source>Nature Reviews Cancer 4, 572 (2004)</dc:source>
<dc:date>2004-08-01</dc:date>
<prism:publicationName>Nature Reviews Cancer</prism:publicationName>
<prism:publicationDate>2004-08-01</prism:publicationDate>
<prism:volume>4</prism:volume>
<prism:number>8</prism:number>
<prism:section>Highlights</prism:section>
<prism:startingPage>572</prism:startingPage>
</item>
]]
or, more interesting: http://www.nature.com/naturejobs/jobs/biologicalsciences.rdf
[[
<item
rdf:about="http://naturejobs.nature.com/texis/jobsearch/details.html?id=411109c54a01090&amp;lookid=nature">
<title>University of Sheffield: Research Associate</title>
<link>http://naturejobs.nature.com/texis/jobsearch/details.html?id=411109c54a01090&amp;lookid=nature</link>
<description>Research Associate, University of Sheffield, Sheffield,
United Kingdom. Posted on 4 August 2004.</description>
<nj:advertises>
<nj:Job>
<nj:offeredBy>University of Sheffield</nj:offeredBy>
<nj:title>Research Associate</nj:title>
<nj:city>Sheffield</nj:city>
<nj:country>United Kingdom</nj:country>
</nj:Job>
</nj:advertises>
<nj:postedOn>2004-08-04</nj:postedOn>
<nj:expiresOn>2004-09-02</nj:expiresOn>
</item>
]]
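The nesting in that job item is the "striped" pattern mentioned earlier: elements alternate between things (item, nj:Job) and properties of those things (nj:advertises, nj:city). A rough Python sketch of how a consumer might read such stripes into statements follows. It is a simplification, not a conforming RDF/XML parser: it handles only rdf:about, rdf:resource, nested nodes and plain text, and ignores rdf:ID, rdf:nodeID, rdf:parseType and xml:lang. The nj namespace URI, the item URL and the nj:employerPage property are made up for illustration and are not taken from the Nature feed.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A cut-down item modelled on the job example. The nj namespace URI, the
# item URL and nj:employerPage are hypothetical (assumptions).
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:nj="http://example.org/nj#">
  <item rdf:about="http://example.org/jobs/1">
    <title>University of Sheffield: Research Associate</title>
    <nj:advertises>
      <nj:Job>
        <nj:offeredBy>University of Sheffield</nj:offeredBy>
        <nj:city>Sheffield</nj:city>
        <nj:employerPage rdf:resource="http://example.org/sheffield"/>
      </nj:Job>
    </nj:advertises>
    <nj:postedOn>2004-08-04</nj:postedOn>
  </item>
</rdf:RDF>
"""


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into a plain URI string."""
    return tag[1:].replace("}", "", 1)


def walk_node(node, triples, counter):
    """Node stripe: the element name is a class, rdf:about names the thing."""
    subject = node.get(f"{{{RDF}}}about")
    if subject is None:
        # No rdf:about: the thing is anonymous, so invent a local label.
        counter[0] += 1
        subject = f"_:b{counter[0]}"
    triples.append((subject, RDF + "type", uri(node.tag)))
    for prop in node:
        # Property stripe: the value is a reference, a nested node, or text.
        resource = prop.get(f"{{{RDF}}}resource")
        if resource is not None:
            # rdf:resource points at a thing described elsewhere.
            triples.append((subject, uri(prop.tag), resource))
        elif len(prop):
            # A nested element starts the next node stripe.
            child = walk_node(prop[0], triples, counter)
            triples.append((subject, uri(prop.tag), child))
        else:
            triples.append((subject, uri(prop.tag), (prop.text or "").strip()))
    return subject


def parse(doc):
    """Collect (subject, property, value) statements from a striped document."""
    triples = []
    counter = [0]
    for node in ET.fromstring(doc):
        walk_node(node, triples, counter)
    return triples


triples = parse(DOC)
for statement in triples:
    print(statement)
```

In this reading, rdf:about says "this element describes the thing with this URI", while rdf:resource says "the value of this property is the thing with this URI", which is roughly the distinction that takes a mini-tutorial to explain.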
So already we're talking about syndicating online representations of
articles, digital rights, page numbers, authors, cancer, jobs, cities,
locations and expiry dates for job applications. All of those things
are (a) beyond the immediate and future scope of the AtomPub group and (b)
described in different levels of detail by different parties for
different purposes. Eg. jobs are based in places; they have associated
skills, which might be picked out in cross-domain subject schemes or in
domain specific details; places have lat/long info, which can be
modelled in painful detail or very crudely. XML description here is a
task without end, because we're trying to create a marketplace where it
is possible for increasingly rich descriptions to be mixed together
in as sane a fashion as possible.
I'd be interested to take this
http://www.nature.com/naturejobs/jobs/biologicalsciences.rdf feed as a
test for AtomPub's extensibility goals. Would we hope to do a
cleaner/better job than RSS 1.0 here? Perhaps in the future,
scientifically minded job hunters will be able to go do a search on jobs
in their particular speciality and see the results using
geographically-oriented tools. I'd love to see Atom become a transport
for all this...
cheers,
Dan