RE: text/xhtml+xml vs. application/xhtml+xml

Rather than trying to channel what the authors of RFC 2046 were thinking, it
seems like we should just ask them.  Specifically, Ned Freed has been
instrumental in framing the issues we discuss in
<http://www.imc.org/draft-murata-xml>, is (I believe) the IESG member
presenting the draft, and also co-authored RFC 2046.

Ned, my understanding is that text/html is seen in retrospect as problematic
and has led to some of the user unhappiness with MIME, by presenting
information to users that they are unlikely to be able to deal with.  The
referenced paragraph in the draft was meant to capture some of the
discussion on the ietf-xml-mime@xxxxxxx list, though I am on too slow a
modem to find the references.  I suspect that the distinction of application
versus text comes down to whether you expect most users to be like us (who
can heuristically recognize HTML, save to a file with the right file
extension, and then open in a browser) versus my mother, who just gets
annoyed with all of the cruft and gives up.

At the end of the day, I would evaluate based on failure scenarios.  I think
people would rather see an attachment than the source text, and are more
likely to be able to recover from the former.

In principle, your suggestion of registering both text/xhtml+xml and
application/xhtml+xml could enable the document author to decide based on
the fallback behavior desired.  However, this is unlikely to work as
expected due to the widespread practice of mapping MIME types to file
extensions, which provides insufficient granularity.  Also, if we are having
trouble evaluating the tradeoffs, it seems unlikely that most document
authors would understand the subtlety.  Finally, I also subscribe to section
3.2 of RFC 1958 on Architectural Principles of the Internet, which says, "If
there are several ways of doing the same thing, choose one."
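
To illustrate the granularity problem, here is a minimal sketch using
Python's standard mimetypes module.  The helper name
guess_after_registering and the ".xhtml" extension are assumptions made
for illustration only; nothing in the draft specifies them.  An
extension-to-type table holds one type per extension, so whichever
registration is loaded last wins, and the author's choice is lost once
the document is saved to disk.

```python
import mimetypes


def guess_after_registering(order):
    """Register each type in `order` for '.xhtml', then guess a file's type.

    Hypothetical helper: it only demonstrates that one extension can map
    to a single MIME type at a time.
    """
    db = mimetypes.MimeTypes()  # private table, global state untouched
    for mime_type in order:
        db.add_type(mime_type, ".xhtml")
    guessed, _encoding = db.guess_type("page.xhtml")
    return guessed


# The later registration silently replaces the earlier one.
print(guess_after_registering(["text/xhtml+xml", "application/xhtml+xml"]))
# application/xhtml+xml
print(guess_after_registering(["application/xhtml+xml", "text/xhtml+xml"]))
# text/xhtml+xml
```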

		- dan

P.S.  Mark, I presume you will keep the HTML WG in the loop on this.

Dan Kohn <mailto:dan@xxxxxxxxxxx>
<http://www.dankohn.com>  <tel:+1-650-327-2600>

-----Original Message-----
From: Mark Baker [mailto:mark.baker@xxxxxxxxxxxxxx]
Sent: Wednesday, 2000-10-18 11:41
To: Dan Kohn
Cc: xml-mime-types@xxxxxxx
Subject: Re: text/xhtml+xml vs. application/xhtml+xml

(HTML WG BCCd - the new w3.org spam filter makes it impractical to CC

Hi Dan,

Dan Kohn wrote:
> Mark, I would appreciate if the HTML WG could provide a little more
> on their thinking, perhaps by adding to discussion to the eventual XHTML
> MIME registration.
> First, I'm not convinced that text/ is the correct top-level type.  Section
> 3 of <http://www.imc.org/draft-murata-xml> says:
>    If an XML document -- that is, the unprocessed, source XML document
>    -- is readable by casual users, text/xml is preferable to
>    application/xml. MIME user agents (and web user agents) that do not
>    have explicit support for text/xml will treat it as text/plain, for
>    example, by displaying the XML entity as plain text. Application/xml
>    is preferable when the XML MIME entity is unreadable by casual
>    users. Similarly, text/xml-external-parsed-entity is preferable when
>    an external parsed entity is readable by casual users, but
>    application/xml-external-parsed-entity is preferable when a plain
>    text display is inappropriate.
>       NOTE: Users are in general not used to text containing tags such
>       as <price>, and often find such tags quite disorienting or
>       annoying. If one is not sure, the conservative principle would
>       suggest using application/* instead of text/* so as not to put
>       information in front of users that they will quite likely not
>       understand.

That's interesting.  I guess I hadn't read that section.  Are you
attempting to update RFC 2046 on this subject?

From RFC 2046, Sec 4.1:

"Beyond plain text, there are many formats for representing what might
be known as "rich text". An interesting characteristic of many such
representations is that they are to some extent readable even without
the software that interprets them. It is useful, then, to distinguish
them, at the highest level, from such unreadable data as images, audio,
or text represented in an unreadable form. In the absence of appropriate
interpretation software, it is reasonable to show subtypes of "text" to
the user, while it is not reasonable to do so with most nontextual data.
Such formatted textual data should be represented using subtypes of
"text"."

While this doesn't go into as much depth as draft-murata-xml does, the
HTML WG believes, despite the DOCTYPE/xmlns/HTML-header preamble, that
the bulk (i.e. body) of most XHTML documents will be useful, to "some
extent" (per above), to casual users.

> It seems like application/* is thus the safer bet.  Moreover, section 2.11
> of <http://www.w3.org/TR/REC-xml> already standardizes end-of-line
> so the canonicalization of line endings that text/* supports does not seem
> necessary.

True.  That's a small point against text/* handling.  But we feel that
the text/plain fallback is more valuable.
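
The fallback difference being weighed here can be sketched roughly as
follows.  This is only a model, assuming the RFC 2046 defaults: an
unrecognized text/* subtype is handled like text/plain, and an
unrecognized application/* subtype like application/octet-stream.  The
function name and the action strings are hypothetical.

```python
def fallback_action(content_type, supported):
    """Return what a user agent might do with a body of `content_type`.

    `supported` is the set of media types the agent can render natively.
    The returned strings are illustrative labels, not a real API.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in supported:
        return "render"
    top_level = media_type.split("/", 1)[0]
    if top_level == "text":
        # Unknown text subtype: fall back to text/plain, tags and all.
        return "show source as plain text"
    # Unknown application subtype: fall back to application/octet-stream.
    return "offer as attachment"


legacy_agent = {"text/html", "text/plain"}
print(fallback_action("text/xhtml+xml; charset=utf-8", legacy_agent))
# show source as plain text
print(fallback_action("application/xhtml+xml", legacy_agent))
# offer as attachment
```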

Something we did consider, but didn't really come to consensus on
(AFAIK) at this morning's call, was the possibility of registering both
application/xhtml+xml and text/xhtml+xml, and letting server admins
decide which one wins (or if both are useful).  Any thoughts about that?

> Also, I would like to see some detailed discussion of when to use
> application/xhtml+xml and when to use text/html.  This seems like an
> compatibility challenge of exceeding subtlety, and may deserve more
> attention than it received in your IRC conversation.

I'll follow that up in a separate message, hopefully soon.

> Thanks in advance for any insight you can provide into your and the WG's
> thinking.

No problem.