Julian Reschke wrote:
Tim Bray wrote:
On Apr 12, 2007, at 9:33 PM, James M Snell wrote:
I would have absolutely no problem requiring that implementations
support UTF-8 encoding with RFC2047. I also have no problem with the
idea of supporting %-encoding of the Slug value. There's actually
nothing in the spec stopping implementations from doing so today other
than not being able to predict how servers will handle it.
Unfortunately, I don't think there's really any chance of getting a
consensus on using the %-encoding approach.
I find Asakura-san very persuasive. He's telling us clearly that what
we have there isn't a basis for interoperation. Why don't we just say
"The slug MUST be %-encoded UTF-8?" or "The slug MUST be
RFC2047-encoded UTF-8?", end of story. Neither of them is rocket
science to implement.
I fully agree that we need one encoding everybody can rely on. For
that, it's sufficient to say that the server must accept UTF-8 encoding
(similarly to XML requiring support for UTF-8/UTF-16).
I don't think you're saying this, but it sounds like you're saying that
you can simply write the UTF-8 byte sequences in the header. For the
record: The problem here is that HTTP defines header fields to be
Latin-1. Coincidentally, I am currently engaged in debugging a problem
in which someone is sending UTF-8 encoded bytes via an HTTP header,
which then get corrupted, somewhere inside either Apache or mod_jk.
(Presumably somebody somewhere is doing something other than a byte
copy.) So you need to define something that sneaks Unicode data past.
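
To illustrate the failure mode described above, here is a minimal Python sketch. It assumes a hypothetical intermediary that reads the header bytes as Latin-1 text and later writes that text back out as UTF-8. Nothing here is known to be what Apache or mod_jk actually does; the slug value is made up for the example.

```python
# Hypothetical slug containing non-ASCII characters ("résumé").
slug = "r\u00e9sum\u00e9"

# The client writes raw UTF-8 bytes into the header value.
utf8_bytes = slug.encode("utf-8")

# Assumed intermediary behaviour: interpret the header bytes as Latin-1 text.
# Every byte maps to some Latin-1 character, so this step never fails.
as_latin1_text = utf8_bytes.decode("latin-1")
print(as_latin1_text)  # rÃ©sumÃ©

# If that text is then re-serialised as UTF-8, the bytes on the wire change.
corrupted = as_latin1_text.encode("utf-8")
print(corrupted != utf8_bytes)  # True

# A pure byte copy (Latin-1 in, Latin-1 out) would have preserved the value.
recovered = as_latin1_text.encode("latin-1").decode("utf-8")
print(recovered == slug)  # True
```

The value survives only if every hop treats it as opaque bytes, which is why an ASCII-only encoding of the slug is the safer choice.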
As for the URI escaping proposal: I made the same proposal a long time
ago, for the same reasons (my doubts of people getting RFC2047 support
right). Back then I was told that there are libraries out there doing
it for us, and that this is no problem :-).
Note that WebDAV Versioning (RFC3253) uses %-escaping in the "Label"
header exactly for that reason
(<http://greenbytes.de/tech/webdav/rfc3253.html#rfc.section.8.3>).
Thanks for the reference!
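
For comparison, the two candidate encodings discussed in this thread can be sketched with the Python standard library. This is only an illustration under assumptions: the helper names are hypothetical, the sample title is invented, and the exact encoded-word form that `email.header.Header` produces (Base64 or Q) is left to the library.

```python
from email.header import Header, decode_header
from urllib.parse import quote, unquote


def slug_percent_encode(title: str) -> str:
    """Percent-encode the UTF-8 bytes of the title (hypothetical helper)."""
    return quote(title, safe="", encoding="utf-8")


def slug_percent_decode(value: str) -> str:
    """Reverse slug_percent_encode."""
    return unquote(value, encoding="utf-8")


def slug_rfc2047_encode(title: str) -> str:
    """Encode the title as an RFC 2047 encoded-word in UTF-8 (hypothetical helper)."""
    return Header(title, charset="utf-8").encode()


def slug_rfc2047_decode(value: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded-words."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            parts.append(data.decode(charset or "ascii"))
        else:
            parts.append(data)  # plain ASCII text is returned as str
    return "".join(parts)


title = "r\u00e9sum\u00e9"  # invented sample slug

pct = slug_percent_encode(title)
print(pct)  # r%C3%A9sum%C3%A9

mime = slug_rfc2047_encode(title)
print(mime)  # an ASCII-only encoded-word of the form =?utf-8?...?=
```

Both results are pure ASCII, so they pass through Latin-1 header handling unchanged; the server then needs to know which of the two decodings to apply, which is the interoperability point raised above.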