
Re: well-formedness error




On Sat, 19 Jun 2004 13:16:40 +0200, Danny Ayers <danny666@xxxxxxxxxxx> wrote:


Mandating utf-8 might make a simple pragmatic answer to the whole problem.

I think this is a good idea. Or at least don't put constraints upon 'Content-Type: text/plain' that make it impossible to override in the XML PI. Having read Dan Brickley's history search findings, I can't see a problem with mandating either of these:


  If a document is served as 'text/xml' without a corresponding
  'charset' parameter, the character set SHOULD be considered to
  be UTF-8.

  If a document is served as 'text/xml' without a corresponding
  'charset' parameter, the character set SHOULD be considered to
  be US-ASCII unless something else is stated in the XML
  processing instruction.

I think the first option would be the simplest one to deploy (all US-ASCII is valid UTF-8, so it won't break anything), but the second would maybe be the «safest» one. I dunno. What I do know, however, is that I think it's evil that 'text/xml' is US-ASCII, and that it's so frikkin' hard to get web servers to serve the correct character set (dynamically) for different formats.
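To make the difference between the two options concrete, here's a rough Python sketch of how a consumer might pick the character set under each rule. The function names and the regular expressions are my own invention for illustration (not from any spec or library), and the XML declaration sniffing is deliberately naive: it only handles ASCII-compatible encodings and ignores byte order marks.

```python
import re

# Hypothetical helpers for illustration only; names are not from any spec.
_CHARSET_RE = re.compile(r'charset\s*=\s*"?([^";\s]+)"?', re.IGNORECASE)
_XML_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def charset_param(content_type):
    """Return the 'charset' parameter of a Content-Type value, or None."""
    m = _CHARSET_RE.search(content_type)
    return m.group(1) if m else None

def declared_encoding(body):
    """Return the encoding named in the XML declaration, or None (naive)."""
    m = _XML_DECL_RE.match(body)
    return m.group(1).decode("ascii") if m else None

def option_one(content_type, body):
    # First option: no charset parameter means UTF-8.
    return charset_param(content_type) or "utf-8"

def option_two(content_type, body):
    # Second option: no charset parameter means US-ASCII,
    # unless the XML declaration states something else.
    return charset_param(content_type) or declared_encoding(body) or "us-ascii"

doc = '<?xml version="1.0" encoding="iso-8859-1"?><Atøm/>'.encode("iso-8859-1")
print(option_one("text/xml", doc))   # first option ignores the declaration
print(option_two("text/xml", doc))   # second option honours it
```

The "won't break anything" part of the first option follows from every US-ASCII byte sequence also being valid UTF-8, so a document that is pure ASCII decodes identically either way.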

The <Atøm> idea is rather elegant ;-)

Other than that it would make for a really weird pronunciation in Norway, where the 'ø' is regularly used (see my name for examples). :-p


--
Asbjørn Ulsberg         -=|=-        asbjornu@xxxxxxxxxxx
«He's a loathsome offensive brute, yet I can't look away»