Re: well-formedness error
* Tim Bray wrote:
>Having said that, Walter is right and it's useful to distinguish
>between the conditions where the XML content is intrinsically
>ill-formed and those where the header is borked.
If it is possible to specify higher-level protocol information, it is not
possible to distinguish between ill-formedness and a borked header. For
example, for the HTTP message fragment
Content-Type: application/xml
<x> </x>
or
Content-Type: application/xml
<x><!--+APY---></x>
you can say either that the document is ill-formed or that the
header is borked because it lacks a charset="utf-8+names" or
charset="utf-7" parameter, respectively, just like you can say that
Content-Type: application/xml
<?xml version="1.0" encoding="us-ascii"?>
<Björn/>
is not well-formed or that it has a borked header or that it has a
borked encoding declaration. I fail to see why it could be wrong to
say that a document is ill-formed if it is not well-formed and that
a document is well-formed if an XML processor processes the document
without encountering a fatal error.
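As a rough illustration of that operational definition, here is a minimal Python sketch using the standard library's expat-based parser. The helper name is_well_formed and its charset argument are made up for this example, and it assumes the "ö" in the last example was sent as UTF-8 octets. The charset argument stands in for a charset parameter supplied by a higher-level protocol; expat lets such external information override the encoding declaration.

```python
import xml.etree.ElementTree as ET

# Entity body of the last example; assumes "ö" was sent as UTF-8 octets.
BODY = '<?xml version="1.0" encoding="us-ascii"?>\n<Björn/>'.encode("utf-8")


def is_well_formed(octets, charset=None):
    """Return True if the processor reports no fatal error.

    `charset` is a hypothetical stand-in for a charset parameter from a
    higher-level protocol; when given, it overrides the encoding declaration.
    """
    parser = ET.XMLParser(encoding=charset)
    try:
        parser.feed(octets)
        parser.close()
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(BODY))           # encoding declaration is used
    print(is_well_formed(BODY, "utf-8"))  # external charset overrides it
```

The same octets fail or pass depending only on which encoding information the processor is given, so the processor alone cannot tell you whether the content, the header, or the declaration is the borked part.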
If you want to apply the strict definition of well-formedness in the
XML 1.0 Recommendation, you cannot ever talk about the well-formedness
of data objects received through binary transmission or from binary
storage. Well-formedness is a property of character sequences, which
you only get after decoding the octet sequence; that requires knowing
the encoding, which you do not know if you assume that there is anything
wrong with the encoding specification(s).
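To make the decoding step concrete, the following sketch decodes the octets of the second example above under two different encodings before handing the resulting character sequence to a parser. The function name well_formed_as is invented for this example, and Python's built-in codecs are only one possible way to do the decoding.

```python
import xml.etree.ElementTree as ET

# Octets of the second example's entity body.
OCTETS = b"<x><!--+APY---></x>"


def well_formed_as(octets, encoding):
    """Decode the octets with `encoding`, then check the character sequence.

    Returns None when the octets cannot be decoded at all, because then
    there is no character sequence whose well-formedness could be judged.
    """
    try:
        text = octets.decode(encoding)
    except UnicodeDecodeError:
        return None
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for name in ("utf-8", "utf-7"):
        print(name, repr(OCTETS.decode(name)), well_formed_as(OCTETS, name))
```

Read as UTF-8, the comment ends in "--->", which the comment grammar forbids; read as UTF-7, "+APY-" decodes to "ö" and the comment is fine. The octets are identical in both cases.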
If there is any need for new terminology here, this should be brought
to the attention of the W3C XML Core Working Group; Atom should not
invent its own terminology for XML or HTTP concepts.