Re: Some text that may be useful for the update of RFC 2376
"Martin J. Duerst" wrote:
>
> I came up with this for a different purpose, but Dan Connolly
> suggested it might be added to an update of RFC 2376, as a
> quick overview:
Here is my suggested amendment, which removes dubious wiggle room and
weasel words and makes the result purely deterministic:
> - XML sent (e.g. mail, http) as text/xml (or equivalent, e.g. text/vnd.wap.wml):
as text/"anything" in other words
> - Charset parameter is strongly recommended
Charset parameter is required if the charset is not UTF-8 or UTF-16
> - If no charset parameter, default is ASCII. The default of iso-8859-1 in
> HTTP is explicitly overridden in the specification of the charset
> parameter in section 3.1 "Text/xml Registration" of RFC 2376
> (http://www.ietf.org/rfc/rfc2376.txt)
The charset (not default, but THE charset) is UTF-16 (if BOM) or UTF-8 (if
no BOM) and the "default" of iso-8859-1 in HTTP and US-ASCII in mail is
explicitly overridden ...
> - No error handling provisions
> - An encoding declaration, if present, is irrelevant, but when saving a
> received resource as a file, the correct encoding declaration should
> be inserted.
shall be inserted.
[if the application claims to save as XML rather than saving as a bunch of
stuff with pointy brackets. If it fails to do so, then the rules for static
storage explain what happens when the file is next parsed: a WF error. ]
> - XML sent as application/xml (or equivalent):
> - Charset parameter is strongly recommended, and if present,
> it takes precedence.
Charset parameter is *disallowed*.
> - If the charset parameter is omitted, the rules for XML in static storage
> are followed (see below).
The rules for XML in static storage are followed. Such files may be freely
saved to static storage without modification in all cases.
> - XML in static storage without external metainformation (e.g. file):
> - Default is UTF-8, or UTF-16 if there is a BOM
For files without an explicit encoding declaration, the file is in UTF-16
if there is a BOM and UTF-8 if there is not.
> - For other things, there has to be an encoding declaration
> - There is some provision for 'error recovery'. What exactly this
> means is currently under discussion in the XML Core WG, so that
> it can be clarified.
"Some provision"????
There is no provision for error recovery, and if a file does not parse for
whatever reason then it shall be a well-formedness error.
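
To show that the amended rules really are deterministic, here is a rough
Python sketch of them. It is an illustration only, not text for the RFC. The
function names are made up for this example, any media type that is not
text/* is assumed to behave like application/xml, only the two-byte UTF-16
BOMs are recognised, and the encoding declaration is looked for only in
ASCII-compatible files.

```python
import re
from typing import Optional

# The two UTF-16 byte order marks (little-endian and big-endian).
UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")

# Encoding pseudo-attribute of an XML declaration at the very start of the
# data. Assumption: the file is ASCII-compatible up to this point.
_ENCODING_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def has_utf16_bom(data: bytes) -> bool:
    """True if the data starts with a UTF-16 byte order mark."""
    return data[:2] in UTF16_BOMS


def static_storage_encoding(data: bytes) -> str:
    """Rules for XML in static storage (e.g. a file), as amended above."""
    if has_utf16_bom(data):
        return "utf-16"
    match = _ENCODING_DECL.match(data)
    if match:
        # An explicit encoding declaration decides.
        return match.group(1).decode("ascii").lower()
    # No BOM and no declaration.
    return "utf-8"


def effective_encoding(
    data: bytes, media_type: str, charset: Optional[str] = None
) -> str:
    """Pick the encoding of a received XML entity under the amended rules."""
    if media_type.strip().lower().startswith("text/"):
        # text/"anything": the charset parameter wins. Without it the
        # charset is UTF-16 (BOM) or UTF-8 (no BOM); the encoding
        # declaration inside the document is irrelevant.
        if charset is not None:
            return charset.lower()
        return "utf-16" if has_utf16_bom(data) else "utf-8"
    # application/xml or equivalent: no charset parameter is allowed and
    # the static storage rules apply unchanged.
    if charset is not None:
        raise ValueError("charset parameter is disallowed on application/xml")
    return static_storage_encoding(data)
```

Note that the same bytes can give different answers depending on the media
type: a document declaring encoding="ISO-8859-1" is UTF-8 when sent as
text/xml without a charset parameter, but ISO-8859-1 when sent as
application/xml. Decoding failures are not handled here on purpose, since
under the amended rules they are simply well-formedness errors.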
--
Chris