
Re: well-formedness error




Dare Obasanjo wrote:


-----Original Message-----
From: Danny Ayers [mailto:danny666@xxxxxxxxxxx]
Sent: Wednesday, June 16, 2004 12:58 PM
To: Dare Obasanjo
Cc: atom-syntax@xxxxxxx
Subject: Re: well-formedness error



It should help to have a single MIME type rather than the 3+ that have been specified around RSS.


But we were talking about the issue of XML well-formedness rather than any other measure of validity. Correct me if I'm wrong, but the MIME type itself shouldn't influence that, unless a false charset declaration is made. Are you suggesting that this is the case in 25-50% of feeds?



Yes.


Like I just said, I tested 10 feeds at random from
http://www.intertwingly.net/wiki/pie/ListOfFeeds and of the 10


3 would be rejected flat out since their MIME type claimed they weren't XML
4 would be rejected because their MIME type contained incorrect or missing charset information


If I was truly anal, of the remaining 3, 2 would be rejected because they
used a MIME type of text/xml instead of application/atom+xml, leaving
only 1 out of 10 Atom feeds chosen at random that was actually
well-formed XML.



These sample figures are pretty grim, but it is early days. At this stage if something is broken it needs a fix, not a workaround.


As far as I am aware, not having the correct MIME *type* does not in itself constitute ill-formedness. If the charset value is wrong (rather than missing), sure, but that doesn't sound like what's happening in your samples. Of the cases you list, how many would have been correctly interpreted by determining the encoding from somewhere other than the MIME type? Basically I don't think it's as cut and dried as you're suggesting - for example, RFC 3023 says:

[[

If users would like to rely on the encoding
declaration or BOM and to hide charset information from protocols,
they may determine not to use the parameter.

]]
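
To make the distinction concrete, here is a rough Python sketch of how a consumer might pick the encoding for a feed. It reflects my reading of RFC 3023 and is not normative: a charset parameter wins when present, text/xml (and text/*+xml) without one falls back to us-ascii, and application/xml (and application/*+xml) without one defers to the BOM or the XML encoding declaration. The function names effective_encoding and sniff_xml_encoding are made up for illustration, and the sniffing is simplified (it will not find an encoding declaration in a UTF-16 document that lacks a BOM).

```python
import codecs
import re
from email.message import Message

# Encoding declaration inside the XML declaration, e.g. encoding="iso-8859-1".
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

# BOMs to check; the UTF-32 entries come first because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_xml_encoding(body):
    """Simplified in-document detection: BOM, then declaration, then UTF-8."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    match = _ENCODING_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


def effective_encoding(content_type, body):
    """Return the encoding to use, or None if the type is not an XML type."""
    msg = Message()
    msg["Content-Type"] = content_type
    mime = msg.get_content_type()
    charset = msg.get_param("charset")

    is_text_xml = mime == "text/xml" or (
        mime.startswith("text/") and mime.endswith("+xml"))
    is_app_xml = mime == "application/xml" or (
        mime.startswith("application/") and mime.endswith("+xml"))

    if not (is_text_xml or is_app_xml):
        return None                  # not labelled as XML at all
    if isinstance(charset, str) and charset:
        return charset.lower()       # an explicit charset is authoritative
    if is_text_xml:
        return "us-ascii"            # text/* default when charset is omitted
    return sniff_xml_encoding(body)  # application/*: look inside the document
```

Under these assumptions a feed served as application/atom+xml with no charset is decoded according to its own encoding declaration, whereas the same bytes served as text/xml with no charset are treated as us-ascii, which is where non-ASCII content runs into trouble.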

But in any case, even if every single one of the feeds in ListOfFeeds was delivered in a manner which meant the XML was ill-formed, this still doesn't mean a strict approach is a bad thing. It simply means more work is needed to reach a good standard, and that certainly won't happen by encouraging pseudo-AI interpreters to enable bad feeds to appear equivalent to good ones.

Cheers,
Danny.

--

Raw
http://dannyayers.com