
Re: well-formedness error




Danny Ayers wrote:

Dare Obasanjo wrote:


-----Original Message-----
From: owner-atom-syntax@xxxxxxxxxxxx [mailto:owner-atom-syntax@xxxxxxxxxxxx] On Behalf Of Danny Ayers
Sent: Wednesday, June 16, 2004 3:43 PM
To: James Robertson
Cc: atom-syntax@xxxxxxx
Subject: Re: well-formedness error


As an aggregator user I would prefer to use a tool with useful features rather than one that sacrificed them for cleaning up bad feeds.

Since when were these mutually exclusive? Quite frankly, it would be more work for me to parse charset headers in HTTP and then reject 7 out of 10 Atom feeds than to continue with the status quo in RSS Bandit.

Aren't you using libraries? Don't they recognise ill-formedness?

I'll answer that: yes; no. At least not at a macro level (considering the possibility of external encoding information), only on the micro level (considering the XML document in isolation).


As far as I know, NONE of the popular libraries, on any of the popular platforms or in any of the popular languages, takes RFC 3023 into consideration. Nor do they provide ANY mechanism for the caller to indicate the "Presence of External Encoding Information" [1].
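To make the "macro level" concrete, here is a rough Python sketch of the decision RFC 3023 asks for and that parsers typically skip. The function name `effective_encoding` is hypothetical, and the sketch is deliberately simplified: it treats only `text/xml` as defaulting to us-ascii, treats every other type like `application/xml`, and ignores byte order marks.

```python
from email.message import Message


def effective_encoding(content_type, declared_encoding):
    """Pick the encoding along the lines of RFC 3023 (simplified sketch).

    content_type      -- raw Content-Type header value from the HTTP response
    declared_encoding -- encoding named in the XML declaration, or None
    """
    # Reuse the stdlib header parser to split media type and parameters.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")

    if charset:
        # An explicit charset parameter overrides the XML declaration.
        return str(charset).lower()
    if msg.get_content_type() == "text/xml":
        # text/xml with no charset parameter is us-ascii,
        # whatever the document itself declares.
        return "us-ascii"
    # application/xml style: fall back to the document's own declaration.
    # Assumption: no BOM sniffing here; default to UTF-8 if nothing is declared.
    return (declared_encoding or "utf-8").lower()
```

A feed served as plain `text/xml` that declares `iso-8859-1` in its prolog comes out as us-ascii under this rule, which is exactly the case where the prolog and the transport disagree.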

The way the feed validator accomplishes this function is to actually open the stream, peek at the first few bytes, determine the declared encoding, and actually REPLACE the prolog if necessary to get these to match.
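As an illustration only (this is not the feed validator's actual code), the peek-and-replace step might look something like the following Python sketch. The name `align_prolog` and the 512-byte peek window are assumptions, and it only handles ASCII-compatible byte streams without a byte order mark.

```python
import re

# Patterns for the XML declaration and its pseudo-attributes (bytes, not str).
DECL = re.compile(rb'^<\?xml\s[^>]*\?>')
VERSION = re.compile(rb'version\s*=\s*(["\'])[^"\']*\1')
ENCODING = re.compile(rb'encoding\s*=\s*(["\'])(?P<enc>[A-Za-z][A-Za-z0-9._-]*)\1')


def align_prolog(data, external):
    """Return data with its XML declaration naming the external encoding."""
    ext = external.encode("ascii")
    # Peek at the start of the stream for an XML declaration.
    decl_match = DECL.match(data[:512])
    if decl_match is None:
        # No declaration at all: prepend one carrying the external encoding.
        return b'<?xml version="1.0" encoding="' + ext + b'"?>' + data

    decl = decl_match.group(0)
    enc = ENCODING.search(decl)
    if enc is not None:
        if enc.group("enc").lower() == ext.lower():
            return data  # prolog and external information already agree
        # Swap only the encoding name, leaving the rest of the prolog alone.
        new_decl = decl[:enc.start("enc")] + ext + decl[enc.end("enc"):]
    else:
        ver = VERSION.search(decl)
        if ver is None:
            return data  # malformed declaration: leave it for the parser
        # Insert an encoding pseudo-attribute right after the version.
        new_decl = decl[:ver.end()] + b' encoding="' + ext + b'"' + decl[ver.end():]
    return new_decl + data[len(decl):]
```

The result can then be handed to an ordinary parser, which will treat the rewritten prolog as authoritative.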

Perhaps somebody out there knows of one or more libraries that actually do provide such support or such an interface; if so, I would be interested in hearing about it. But even if such a library exists, it would not change the reality that the overwhelming majority of applications and libraries consider the XML prolog to be authoritative.

- Sam Ruby

[1] <http://www.w3.org/TR/2000/REC-xml-20001006#sec-guessing-with-ext-info>