
Re: Added proposal to make xml:base generally applicable



On Fri, Feb 27, 2004 at 08:40:22AM -0500, Sam Ruby wrote:

> >>If xml base is everywhere, then how should the following line in my atom 
> >>feed be interpreted?
> >>
> >><issued>2004-02-26T09:15:36-05:00</issued>
> >
> >What's the data type of the contents of <issued/>? A W3CDTF (from
> >memory); xml base doesn't apply to this type.
> 
> Agreed.  However, I think that enumeration would make this clearer and 
> simpler.

We need to enumerate which elements have content of type URI, agreed.
 
> >I think we'd need to specify which parts of the Atom vocabulary are
> >URIs in the spec (and in the schema)? (Which is enumeration.) And then
> >define that the base URI for <content/> data is evaluated using XML
> >base processing on the <content/> element. (Which is by rule.) So
> >"both". Which is possibly what you meant to say.
> 
> It is not simply content, but also title and summary.

Sorry, I meant to say something like "and then list elements such that
the base URI for their content data is evaluated using XML base
processing on that element", with the list being 'content', 'title',
'summary', assuming we don't identify any more.

In my opinion this is the fiddly bit; I would naturally assume that if
an element FOO had data of type URI, in an XML syntax that used XML
base anywhere, then XML base processing would be used for the URI data
of FOO elements. (In particular this means that extension elements
that have data of type URI should get XML base processing.)
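
To illustrate that assumption (this is a rough sketch, not anything
from a spec), here is how I'd expect the base URI for such an element
to be worked out in Python. The 'related' element, the
http://example.org/ns namespace, the feed URI and the helper name
base_uri_for are all made up for the example; only the xml:base
attribute and urljoin are real.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree's expanded name for the xml:base attribute.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
NS = "{http://example.org/ns}"  # hypothetical namespace for this example

def base_uri_for(root, target, document_uri):
    """Return the base URI in effect at `target`, or None if not found."""
    def walk(elem, base):
        # Each xml:base is itself resolved against the inherited base.
        base = urljoin(base, elem.get(XML_BASE, ""))
        if elem is target:
            return base
        for child in elem:
            found = walk(child, base)
            if found is not None:
                return found
        return None
    return walk(root, document_uri)

FEED = """<feed xmlns="http://example.org/ns" xml:base="http://example.org/blog/">
  <entry xml:base="2004/02/">
    <related>26/post.html</related>
    <issued>2004-02-26T09:15:36-05:00</issued>
  </entry>
</feed>"""

root = ET.fromstring(FEED)
target = root.find(".//" + NS + "related")
base = base_uri_for(root, target, "http://example.org/feed.xml")
# Only the URI-typed element is resolved; <issued> is left alone.
resolved = urljoin(base, target.text)
print(resolved)
```

The date in 'issued' is never passed through urljoin, which is the
point of enumerating which elements are of type URI.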
 
> >I don't have time to look up the specs, but are we going to get nasty
> >clashes with BASE processing in HTML, or will the above fit with it
> >nicely?
> 
> This also needs to be spelled out.

Found some time :)

Reading RFC 2396 and the HTML 4.01 spec, the base URI in HTML is
determined by the following (HTML 4.01 12.4.1), in decreasing order of
precedence:

 (1) as set by the BASE element
 (2) as given "by meta data discovered during a protocol interaction"
 (3) the URI of the document (if this exists)

(3) doesn't apply in the case of Atom, because the HTML document
inside, for instance, atom:content, does not have a URI[1]. (1) is
outside the scope of Atom. (2) is where Atom comes in, and is
equivalent to the "base URI from the encapsulating entity" (5.1.2) of
RFC 2396: "base URI of a document is defined by the document's
retrieval context".

So what we /presumably/ need to do is to specify that in Atom, the
retrieval context supplies a base URI using XML base processing on the
encapsulating element. The RFC goes on to say:

> For a document that is enclosed within another entity (such as a
> message or another document), the retrieval context is that entity;
> thus, the default base URI of the document is the base URI of the
> entity in which the document is encapsulated.

There's an annoying problem here in that 'entity' isn't defined in RFC
2396, but I think it's valid in the context of XML to consider the
encapsulating XML element (e.g. atom:content) to be the encapsulating
entity.
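
The precedence for encapsulated HTML would then look something like
the following Python sketch. The function and parameter names are
mine, and it assumes the caller has already extracted any BASE href
from the HTML and computed the XML base of the enclosing element; it
is a reading of the rules above, not normative.

```python
def effective_html_base(base_href=None, context_base=None, document_uri=None):
    """Pick the base URI for encapsulated HTML, highest precedence first.

    base_href    -- href of a BASE element in the HTML, if any
    context_base -- base URI of the encapsulating element (from XML base)
    document_uri -- URI of the HTML document itself (normally absent in Atom)
    """
    for candidate in (base_href, context_base, document_uri):
        if candidate:
            return candidate
    return None

# No BASE element in the HTML: the enclosing element's base URI applies.
print(effective_html_base(None, "http://example.org/blog/2004/02/"))
```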

I don't think there's a conflict here. We need to be very clear in the
spec, of course. Aggregator authors who launch an external HTML
viewer for encapsulated HTML may (depending on the HTML viewer) need
to mangle the HTML to add a BASE element if there isn't one already
present (or if there are relative URI references before the BASE
element, as I understand HTML).
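
As a very rough idea of the kind of mangling I mean, here is a Python
sketch. The name ensure_base is hypothetical, the regular expressions
are a deliberately naive stand-in for a real HTML parser, and it only
covers the "no BASE present" case, not relative references that
appear before an existing BASE element.

```python
import re
from html import escape

def ensure_base(html_text, base_uri):
    """Add <base href=...> to html_text unless a BASE element is present."""
    if re.search(r"<base\b", html_text, re.IGNORECASE):
        return html_text  # respect the document's own BASE element
    tag = '<base href="%s">' % escape(base_uri, quote=True)
    head = re.search(r"<head\b[^>]*>", html_text, re.IGNORECASE)
    if head:
        # Insert immediately after the opening <head> tag.
        return html_text[:head.end()] + tag + html_text[head.end():]
    # Fragment without a HEAD: prepend and rely on the viewer's leniency.
    return tag + html_text

fragment = '<p><a href="26/post.html">post</a></p>'
print(ensure_base(fragment, "http://example.org/blog/2004/02/"))
```

Whether a given viewer honours a BASE element prepended to a bare
fragment would need checking case by case.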

[1] At least without doing something using XPointer, which I don't
think would yield a valid base URI anyway.

James

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james@xxxxxxxxxxxx                               uncertaintydivision.org