Re: Well-formedness statistics




Hello Mark,


Many thanks for the statistics. They are very helpful because they
show clear priorities for how to improve things.

But I have to agree with Dare. It is correct that nothing forbids
serving XML documents as text/plain. Indeed, there are very good
use cases for doing so. But serving a document as text/plain means
"show this as text", which in turn means "do not parse this,
even if it looks exactly like an XML document". Cases like this
are the reason for having MIME types in the first place. If we
could always just look at the bytes and decide what to do with
a document, we might not even need MIME types.

[now I see that Makoto has already cited chapter and verse for this]

Regards, Martin.

At 11:11 04/06/23 -0400, Mark Pilgrim wrote:

On Wed, 23 Jun 2004 06:59:08 -0700, Dare Obasanjo <dareo@xxxxxxxxxxxxx> wrote:
> So you count a document served as text/plain as well-formed XML?

I can find nothing in the XML specification stating that XML documents
MUST NOT be served as text/plain.  I can find nothing in RFC 3023
stating that XML documents MUST NOT be served as text/plain.  RFC 2046
states that "us-ascii" is the default character encoding for all
"text/*" content types.  RFC 2646 updates 2046 but does nothing to
change the default character encoding.

Therefore, if an XML document
1. is served as "text/*",
2. does not specify a charset parameter in the HTTP Content-Type, and
3. is parseable by a conformant XML parser using a "us-ascii" encoding
then I count it as well-formed.
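
The three conditions above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the code actually used
to produce the statistics: the function name counts_as_well_formed is
hypothetical, Python's standard-library parser stands in for "a
conformant XML parser", and "us-ascii" is enforced by decoding the bytes
as ASCII before parsing.

```python
from email.message import Message
from typing import Optional
import xml.etree.ElementTree as ET


def counts_as_well_formed(content_type: str, body: bytes) -> Optional[bool]:
    """Apply the text/* rule to one response (hypothetical helper).

    Returns None when the rule does not apply (not text/*, or an explicit
    charset parameter is present); otherwise True or False.
    """
    # Let the stdlib header parser split the media type from its parameters.
    msg = Message()
    msg["Content-Type"] = content_type

    # Condition 1: the document is served as text/*.
    if msg.get_content_maintype() != "text":
        return None

    # Condition 2: no charset parameter, so the us-ascii default is in force.
    if msg.get_param("charset") is not None:
        return None

    # Condition 3: the bytes must be pure ASCII and parse as XML.
    try:
        text = body.decode("ascii")
        ET.fromstring(text)
    except (UnicodeDecodeError, ET.ParseError):
        return False
    return True
```

For example, counts_as_well_formed("text/plain", b"<a>hi</a>") passes,
while the same call with a body containing a raw non-ASCII byte fails at
the decoding step, before the parser ever sees it.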

This is the situation as best I understand it.  I would be happy if
someone could provide further clarification on how I should handle XML
served as "text/plain", "text/css", or "text/javascript".

Note that this does not affect the non-well-formedness of your feed at
http://www.25hoursaday.com/rss10.xml, which is served as "text/xml"
and therefore falls squarely under the jurisdiction of RFC 3023.

--
Cheers,
-Mark