
Extending Entry Collections [Long]



In the current draft we have two types of Collections, Generic
and Entry. We know that we want other types of Collections,
whether they are defined in this spec or in another.

The two Collections, Generic and Entry, are the most general types of
collections we would want. Any other type of collection is just going
to be a subclass of either 'generic' or 'entries'. That is, a
collection that just accepted images would just be a subclass of
'generic'. A collection of Book or Music entries would just be a
subclass of 'entries'.

Thomas Broyer's PaceCollectionsAcceptedMediaTypes takes care of
defining how to create new Generic Collection subclasses. For
example, think of a Generic Collection as having a mediatype of "*/*";
a collection just for images would have mediatype="image/*".
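
The matching this implies can be sketched in a few lines. This is only
an illustration of the wildcard idea, not anything taken from the Pace
itself; the function names are made up for the example.

```python
def _split(media_type):
    """Split "type/subtype; params" into a lowercase (type, subtype) pair."""
    main = media_type.split(";", 1)[0].strip().lower()
    type_, _, subtype = main.partition("/")
    return type_, subtype


def media_type_accepted(accepted, candidate):
    """True if `candidate` falls inside the collection's `accepted` range."""
    a_type, a_sub = _split(accepted)
    c_type, c_sub = _split(candidate)
    # "*" on either half of the accepted range matches anything.
    type_ok = a_type == "*" or a_type == c_type
    sub_ok = a_sub == "*" or a_sub == c_sub
    return type_ok and sub_ok


# A Generic Collection takes anything; an image collection only images.
print(media_type_accepted("*/*", "image/png"))
print(media_type_accepted("image/*", "text/html"))
```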

That only leaves a method for subclassing Entry Collections. To frame
the discussion, look at the Music, Book, People and Link collections
that the TypePad API implemented:

  http://www.sixapart.com/pronet/docs/typepad_atom_api#Atom_Entry_Elements

When I wrote the draft that the TypePad implementation is based on,
this is exactly the type of extensions I was thinking of. It was
precisely what I wanted to see. At this point I have to admit that I
feel like the kid that wakes up on Christmas morning and gets exactly
the present I asked for, except that I didn't see the "Not a flying
toy" label on the box.

I've been mulling the problem for about a year now, and only recently
figured out what bothers me so much about the SixApart extensions.
[This is not to cast aspersions on anybody at SixApart. They extended
the APP in exactly the way I had intended. And it works. I just think
that there is a better way.]

The problem is that those extensions require boiling the ocean. For
that new type of collection to be used both the client and server need
to be programmed to understand the extension. Both client and server
need to be programmed with the new namespace and elements and what
they are supposed to mean. And then if any of that information gets
syndicated then all the aggregators will need to be upgraded. That's a
serious amount of ocean boiling.

At this point I'd like to bring to your attention an article Tantek just wrote:

  "Avoiding plain XML and presentational markup"
   http://tantek.com/log/2005/07.html#d24t1935

To quote a small bit:

"""The marketing message of XML has been for people
    to develop their own tags to express whatever they wanted, 
   rather than being stuck with the limited predefined tag set in 
   HTML. This approach has often been labeled "plain XML" or
   "generic XML" or "SGML, but easier, better, and designed
   just for the Web".

   The problem with this approach is that while having the
   freedom to make up all your own tags and attributes sounds
   like a huge improvement over the (mostly perceived) limits of  
   HTML, making up your own XML has numerous problems,  
   both for the author, and for users / readers, especially when  
   sharing with others (e.g. anything you publish on the Web) is important."""

The solution, of course, is microformats. To subclass an Entry
Collection you declare what type of microformat must be in the HTML of
the entry:content. Yeah, it's just that simple.

For example, look at the SixApart extensions for Books or Music. Those
are just reviews really. And there's already work being done on an
hReview microformat. There's even a review format for books. To define a
'book' Collection type you just need to define a collection as
requiring that each entry:content must contain an hReview for a Book.

Let's look at actual documents, and try to create an extension to the
basic protocol to handle such a Collection, here one for movie
reviews. The Introspection Document needs to be extended so that for
Collections with contents="entries" there is an additional attribute
"class" which tells which kinds of entries are allowed:

<?xml version="1.0" encoding="utf-8"?>
<service xmlns="http://purl.org/atom/app#">
  <workspace title="Main Site">
    <collection contents="entries" title="My Blog Entries"
      href="http://example.org/reilly/feed" />

    <collection contents="entries"
      title="Movies"
      class="movies"
      href="http://example.org/reilly/movies" />
  </workspace>
</service>

Now the 'class' attribute maps to an IETF registry via a URI in a manner
similar to the 'rel' values for the link element. That registry defines
"http://...some-ietf-registry-name.../movies" to be a Collection that
only accepts entries that contain hReviews of movies. That unique URI
we'll call the Collection Subclass URI.
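
To make the mapping concrete, here is a rough sketch of how a client
might read the 'class' attribute out of an Introspection Document and
expand it to a Collection Subclass URI. The registry base used below is
purely a placeholder, since no registry exists yet, and the function
name is made up for the example.

```python
import xml.etree.ElementTree as ET

APP_NS = "http://purl.org/atom/app#"
# Hypothetical registry base; the real registry name is not defined yet.
REGISTRY_BASE = "http://registry.example/collection-class/"

INTROSPECTION = """<service xmlns="http://purl.org/atom/app#">
  <workspace title="Main Site">
    <collection contents="entries" title="My Blog Entries"
      href="http://example.org/reilly/feed" />
    <collection contents="entries" title="Movies" class="movies"
      href="http://example.org/reilly/movies" />
  </workspace>
</service>"""


def entry_collections(xml_text, registry_base=REGISTRY_BASE):
    """List Entry Collections with their Collection Subclass URI, if any."""
    root = ET.fromstring(xml_text)
    found = []
    for coll in root.iter("{%s}collection" % APP_NS):
        if coll.get("contents") != "entries":
            continue  # Generic Collections are subclassed by media type
        short_name = coll.get("class")
        found.append({
            "title": coll.get("title"),
            "href": coll.get("href"),
            # No 'class' attribute means a plain Entry Collection.
            "subclass_uri": registry_base + short_name if short_name else None,
        })
    return found


for collection in entry_collections(INTROSPECTION):
    print(collection["title"], collection["subclass_uri"])
```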

To create a Movie entry we POST an entry to
http://example.org/reilly/movies. It would look like this:

  <entry>
    <title>Movie Review</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>A review</summary>
    <author> 
        <name>John Doe</name>
    </author> 

    <content type="xhtml" xml:lang="en" xml:base="http://bitworking.org/">
        <div xmlns="http://www.w3.org/1999/xhtml">
         
          <div class="hreview">
              <span class="reviewer">
                  <span class="fn">anonymous</span>, 
                  <abbr class="dtreviewed" title="20050418">April 18th, 2005</abbr>
              </span>
              <div class="item">
                  <a lang="zh" class="url fn"
                     href="http://www.imdb.com/title/tt0299977/">
                      Ying Xiong (<span lang="en">HERO</span>)
                  </a>
              </div>
              <div>Rating: <span class="rating">4</span> out of 5</div>
              <blockquote class="description"><p>
                  This movie has great music and visuals.
                  </p></blockquote>
          </div>
        
      </div>
    </content>
  </entry>
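
For what it's worth, the POST itself needs nothing special. A minimal
sketch, assuming the entry is sent as application/atom+xml and leaving
out authentication entirely; the helper name is made up, and the request
is only built here, not sent.

```python
import urllib.request


def build_entry_post(collection_uri, entry_xml):
    """Build (but do not send) the POST that creates a new entry."""
    return urllib.request.Request(
        collection_uri,
        data=entry_xml.encode("utf-8"),
        # Assumed media type for an Atom entry body.
        headers={"Content-Type": "application/atom+xml"},
        method="POST",
    )


request = build_entry_post(
    "http://example.org/reilly/movies",
    "<entry>...</entry>",  # stand-in for the full entry document
)
# urllib.request.urlopen(request) would actually send it.
print(request.get_method(), request.full_url)
```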

Note that all of the subclassing action takes place in the 'content'
element. This has several advantages:

1. When that content is syndicated, the aggregator need not 
   know about the hReview microformat to render that entry. 
   If it *does* know about the hReview microformat it can 
   then take advantage of that information to do more with that entry,
   such as looking up the entry on Amazon, or formatting the content
   differently.
2. The APP server-side doesn't really need to know about the
   hReview microformat, since it just passes that information
   along in the 'content' element. Just like in the case of the
   aggregator, *if* the server does know about the microformat then
   it can start adding value, but nothing *requires* it.
3. The client doesn't really require explicitly programmed-in
   knowledge of the microformat either. As long as the user can
   enter straight HTML, they should be able to post to any Entry
   Collection subclass, as long as they know the proper microformat.
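
As a sketch of the first point, here is roughly what an aggregator that
*does* know about hReview might do with the content above. It only
looks at class names, handles just the handful of fields used in this
example, and is not a complete hReview parser; all of the function names
are made up.

```python
import xml.etree.ElementTree as ET

CONTENT = """<div xmlns="http://www.w3.org/1999/xhtml">
  <div class="hreview">
    <span class="reviewer">
      <span class="fn">anonymous</span>,
      <abbr class="dtreviewed" title="20050418">April 18th, 2005</abbr>
    </span>
    <div class="item">
      <a lang="zh" class="url fn" href="http://www.imdb.com/title/tt0299977/">
        Ying Xiong (<span lang="en">HERO</span>)
      </a>
    </div>
    <div>Rating: <span class="rating">4</span> out of 5</div>
    <blockquote class="description"><p>
      This movie has great music and visuals.
    </p></blockquote>
  </div>
</div>"""


def find_by_class(root, name):
    """Return the first element (root included) whose class list has `name`."""
    for element in root.iter():
        if name in (element.get("class") or "").split():
            return element
    return None


def text_of(element):
    """All text inside the element, with runs of whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def extract_hreview(content_xml):
    """Return a dict of review fields, or None if there is no hReview."""
    review = find_by_class(ET.fromstring(content_xml), "hreview")
    if review is None:
        return None  # nothing to add; just render the XHTML as usual
    item = find_by_class(review, "item")
    link = find_by_class(item, "url") if item is not None else None
    rating = find_by_class(review, "rating")
    description = find_by_class(review, "description")
    return {
        "item": text_of(item) if item is not None else None,
        "url": link.get("href") if link is not None else None,
        "rating": text_of(rating) if rating is not None else None,
        "description": text_of(description) if description is not None else None,
    }


print(extract_hreview(CONTENT))
```

An aggregator without this code loses nothing: the same content is
still ordinary XHTML that it can render as-is.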

Lots of stuff working, and very little ocean boiling.

Note that the IETF registry could take a large role in making all of
this work smoothly. The Collection Subclass URI could be dereferenced.
It could be an HTML document. In that HTML we could place a 'template'
microformat, which is a container for a template for the microformat
that the collection requires. If an APP client comes across a
Collection Subclass that it knows nothing about, it can always try to
dereference the Collection Subclass URI, retrieve the template for the
microformat to be used, and present that to the user to fill in.

This technique is already working today. For example, rel="tag"
tagging of your entries gets them to appear in the Technorati
'tags' listing:

    http://technorati.com/tag/microformats
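
Picking those tags out of an entry's XHTML content takes very little
code. A rough sketch, assuming well-formed XHTML and taking the tag name
from the last path segment of the link, as rel="tag" defines it; the
function name and sample markup are made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote, urlparse

XHTML_NS = "http://www.w3.org/1999/xhtml"

SAMPLE = """<div xmlns="http://www.w3.org/1999/xhtml">
  <p>More on
    <a rel="tag" href="http://technorati.com/tag/microformats">microformats</a>
    and an <a href="http://example.org/">ordinary link</a>.</p>
</div>"""


def extract_tags(content_xml):
    """Return the tag names carried by rel="tag" links in the content."""
    root = ET.fromstring(content_xml)
    tags = []
    for anchor in root.iter("{%s}a" % XHTML_NS):
        rels = (anchor.get("rel") or "").split()
        href = anchor.get("href")
        if "tag" in rels and href:
            # The tag is the last path segment of the link target.
            path = urlparse(href).path.rstrip("/")
            tags.append(unquote(path.rsplit("/", 1)[-1]))
    return tags


print(extract_tags(SAMPLE))
```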

I also recently demonstrated how to move encrypted content around
using microformats:

   http://www.xml.com/pub/a/2005/07/13/secure-rss.html

Note that neither my blogging software nor my aggregator of choice
needed to be upgraded to achieve my goals.

  Thanks,
  -joe

-- 
Joe Gregorio        http://bitworking.org