Re: Well-formedness statistics

On Wed, 23 Jun 2004 11:11:54 -0400
Mark Pilgrim <pilgrim@xxxxxxxxx> wrote:

> This is the situation as best I understand it.  I would be happy if
> someone could provide further clarification on how I should handle XML
> served as "text/plain", "text/css", or "text/javascript".

The most authoritative documents are MIME RFCs.  RFC 2045 says:

   The purpose of the Content-Type field is to describe the data
   contained in the body fully enough that the receiving user agent can
   pick an appropriate agent or mechanism to present the data to the
   user, or otherwise deal with the data in an appropriate manner. The
   value in this field is called a media type.

   HISTORICAL NOTE:  The Content-Type header field was first defined in
   RFC 1049.  RFC 1049 used a simpler and less powerful syntax, but one
   that is largely compatible with the mechanism given here.

   The Content-Type header field specifies the nature of the data in the
   body of an entity by giving media type and subtype identifiers, and
   by providing auxiliary information that may be required for certain
   media types.  After the media type and subtype names, the remainder
   of the header field is simply a set of parameters, specified in an
   attribute=value notation.  The ordering of parameters is not

RFC 2046 says:

    (1)   text -- textual information.  The subtype "plain" in
          particular indicates plain text containing no
          formatting commands or directives of any sort. Plain
          text is intended to be displayed "as-is". No special
          software is required to get the full meaning of the
          text, aside from support for the indicated character

In my interpretation of these paragraphs, MIME entities labelled as
text/plain are plain text and are intended to be displayed "as is".

Further evidence comes from the handling of fragment identifiers.  In
RFC 2396, the syntax and semantics of fragment identifiers depend on
the media type of the retrieved resource.  This also implies that
media types are authoritative.

So, I think that MIME entities labelled as text/plain are not well-formed 
XML documents.
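
To make the rule above concrete, here is a minimal sketch in Python of
how a consumer might decide whether to hand an entity to an XML parser,
based only on its Content-Type label.  The function names are
hypothetical, and the set of XML media types (text/xml, application/xml,
and subtypes ending in "+xml") is my assumption, following the
conventions of RFC 3023; it is not something the quoted RFCs define.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type value into (media type, parameters).

    Uses the standard library's MIME header parsing. The media type is
    returned lowercased as "type/subtype"; parameters are returned as a
    dict of attribute=value pairs.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    media_type = msg.get_content_type()
    # get_params() returns the media type itself as the first item,
    # so skip it and keep only the real parameters.
    params = dict(msg.get_params()[1:])
    return media_type, params


def is_xml_media_type(media_type):
    """Return True for media types assumed here to denote XML."""
    # Assumption: RFC 3023 conventions, not taken from RFC 2045/2046.
    if media_type in ("text/xml", "application/xml"):
        return True
    return media_type.endswith("+xml")


def should_parse_as_xml(header_value):
    """Treat the label as authoritative: parse as XML only if it says so."""
    media_type, _params = parse_content_type(header_value)
    return is_xml_media_type(media_type)
```

Under this sketch, an entity labelled "text/plain" is never passed to
the XML parser, whatever its body looks like; it would simply be
displayed as-is.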


MURATA Makoto (FAMILY Given) <EB2M-MRT@xxxxxxxxxxxxxxx>