
RE: well-formedness error




At 3:35 PM -0700 6/16/04, Dare Obasanjo wrote:
> OK, that's bad. Out of curiosity, what did those three claim to be?

application/octet-stream, text/plain & text/html actually.

I guess I should have expected that. One each of "random guessing".


> That seems irrelevant as long as the XML received had the
> encoding in the ?xml entry correct for the object itself.

Nope. Read RFC 3023. Specifically


"if the media type given in the Content-Type HTTP header is text/xml,
text/xml-external-parsed-entity, or a subtype like
text/AnythingAtAll+xml, then the encoding attribute of the XML
declaration within the document is ignored completely, and the encoding is

1. the encoding given in the charset parameter of the Content-Type HTTP
header, or
2. us-ascii."

All 4 feeds claimed to be text/xml with no charset parameter, meaning
that I was supposed to treat them as us-ascii even though their actual
encodings were UTF-8, Windows-1252 and ISO-8859-1.
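
For illustration, the rule quoted above can be sketched in a few lines of Python. This is only a rough sketch of the text/* case as described in this message, not a complete RFC 3023 implementation; the function name effective_encoding is made up for the example, and the header parsing relies on the standard library's email.message.Message.

```python
from email.message import Message


def effective_encoding(content_type_header):
    """Return the encoding the quoted rule assigns to a text/* XML
    media type, or None when the rule does not apply.

    effective_encoding is a hypothetical helper name used only here.
    """
    # Let the standard library parse the header value and its parameters.
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()    # lowercased "type/subtype"
    charset = msg.get_content_charset()    # lowercased, or None if absent

    # The rule covers text/xml, text/xml-external-parsed-entity,
    # and any text/*+xml subtype.
    covered = (
        media_type in ("text/xml", "text/xml-external-parsed-entity")
        or (media_type.startswith("text/") and media_type.endswith("+xml"))
    )
    if not covered:
        return None  # other media types follow different rules

    # 1. the charset parameter wins; 2. otherwise us-ascii.
    # The encoding attribute of the XML declaration is never consulted.
    return charset if charset else "us-ascii"
```

Under this sketch, a feed served as plain "text/xml" comes out as us-ascii regardless of what its XML declaration says, which is the situation described above.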

Many realize this is a bug in RFC 3023 but no one seems to have
initiated the process to get the RFC fixed.

Very good points. We are going to have to make the specification in the protocol document *very* explicit, and explain clearly what will happen if people get this wrong.


--Paul Hoffman, Director
--Internet Mail Consortium