Re: well-formedness error
Dare Obasanjo wrote:
-----Original Message-----
From: Danny Ayers [mailto:danny666@xxxxxxxxxxx]
Sent: Wednesday, June 16, 2004 12:58 PM
To: Dare Obasanjo
Cc: atom-syntax@xxxxxxx
Subject: Re: well-formedness error
It should help to have a single MIME type rather than the 3+
that have been specified around RSS.
But we were talking about the issue of XML well-formedness
rather than any other measure of validity. Correct me if I'm
wrong, but the MIME type itself shouldn't influence that,
unless a false charset declaration is made. Are you
suggesting that this is the case in 25-50% of feeds?
Yes.
Like I just said, I tested 10 feeds at random from
http://www.intertwingly.net/wiki/pie/ListOfFeeds and of the 10:
- 3 would be rejected flat out since their MIME type claimed they weren't
  XML
- 4 would be rejected because their MIME type contained incorrect or
  missing charset information
If I was truly anal, then of the remaining 3, 2 would be rejected because
they used a MIME type of text/xml instead of application/atom+xml, leaving
only 1 out of 10 Atom feeds chosen at random that was actually
well-formed XML.
These sample figures are pretty grim, but it is early days. At this
stage if something is broken it needs a fix, not a workaround.
As far as I am aware, not having the correct MIME *type* does not in
itself constitute ill-formedness. If the charset value is wrong (rather
than missing), sure, but that doesn't sound like what's happening in
your samples. Of the cases you list, how many would have been correctly
interpreted by determining the encoding from somewhere other than the
MIME type? Basically I don't think it's as cut and dried as you're
suggesting - for example, RFC 3023 says:
[[
If users would like to rely on the encoding
declaration or BOM and to hide charset information from protocols,
they may determine not to use the parameter.
]]
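
To make the distinction concrete, here is a minimal Python sketch of how a
consumer might pick the effective encoding under my reading of RFC 3023. The
function names (charset_param, sniff_encoding, effective_encoding) are
hypothetical, the sniffing is deliberately simplified (it does not handle
UTF-16 without a BOM, for instance), and treating every text/* type like
text/xml is an assumption made for brevity.

```python
import re


def charset_param(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(r'charset\s*=\s*"?([^";\s]+)"?', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def sniff_encoding(body):
    """Simplified guess from the BOM or XML declaration; falls back to UTF-8."""
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    decl = re.match(rb'<\?xml[^>]*encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', body)
    return decl.group(1).decode("ascii").lower() if decl else "utf-8"


def effective_encoding(content_type, body):
    """Pick the encoding a consumer should use for an XML entity."""
    media_type = content_type.split(";")[0].strip().lower()
    declared = charset_param(content_type)
    if declared:
        # A charset parameter, when present, takes precedence.
        return declared
    if media_type.startswith("text/"):
        # Assumption: all text/* XML types behave like text/xml,
        # which defaults to us-ascii when charset is omitted.
        return "us-ascii"
    # application/xml and friends: fall back to the BOM / encoding declaration.
    return sniff_encoding(body)
```

Under these rules a text/xml feed served without a charset parameter is read
as us-ascii whatever its XML declaration says, while the same bytes served as
application/atom+xml would be decoded according to the declaration or BOM.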
But in any case, even if every single one of the feeds in ListOfFeeds
was delivered in a manner which meant the XML was ill-formed, this still
doesn't mean a strict approach is a bad thing. It simply means more work
is needed to reach a good standard, and that certainly won't happen by
encouraging pseudo-AI interpreters to enable bad feeds to appear
equivalent to good ones.
Cheers,
Danny.
--
Raw
http://dannyayers.com