* John Panzer wrote:
I don't think you're saying this, but it sounds like you're saying that
you can simply write the UTF-8 byte sequences in the header. For the
record: The problem here is that HTTP defines header fields to be
Latin-1. Coincidentally, I am currently debugging a problem in which
someone is sending UTF-8 encoded bytes via an HTTP header, and those
bytes then get corrupted somewhere inside either Apache or mod_jk.
I would argue RFC 2616 is less than clear in this regard and as far as I
can tell there is little consensus among deployed servers and agents on
how to interpret this. I would certainly hope a future version of RFC
2616 requires servers to use UTF-8 for the protocol-defined text parts
of the messages, and clients to assume a different encoding only if the
text is not valid UTF-8, to accommodate legacy applications as necessary.
Whether or not RFC 2616 is less than clear on this point (I personally
find it clear as mud), there's no ambiguity about what happens when you
send an "X-Foo: eà" header to Apache running mod_jk, which forwards data
to a Tomcat servlet container: it passes the data correctly if you use
the ISO-8859-1 encoding, and it corrupts the data if you use a UTF-8
encoding. At least in our tests. (Note that this happens before the
data leaves the Apache process, so there's not even an opportunity to
fix this at the servlet container level.)
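
To make the difference concrete, here is a minimal Python sketch of what
the two encodings of "eà" look like on the wire, and what a component
that assumes ISO-8859-1 would make of each. This is only an illustration
of one plausible failure mode; it is an assumption, not a description of
what Apache or mod_jk actually do internally, and the helper name
decode_as_latin1 is made up for the example.

```python
# Header value from the "X-Foo: eà" example above.
value = "eà"

# The same two characters, encoded two different ways.
latin1_bytes = value.encode("iso-8859-1")  # b'e\xe0'      (2 bytes)
utf8_bytes = value.encode("utf-8")         # b'e\xc3\xa0'  (3 bytes)


def decode_as_latin1(raw: bytes) -> str:
    """Decode header bytes the way a Latin-1-assuming component would.

    Hypothetical helper: every byte maps to exactly one character,
    so this never fails, it just silently misreads UTF-8 input.
    """
    return raw.decode("iso-8859-1")


# Latin-1 in, Latin-1 assumed: the value round-trips unchanged.
print(repr(decode_as_latin1(latin1_bytes)))

# UTF-8 in, Latin-1 assumed: the two bytes of "à" become two
# separate characters (U+00C3 and U+00A0), i.e. the value is garbled.
print(repr(decode_as_latin1(utf8_bytes)))
```

Because every byte value is a valid ISO-8859-1 character, a misread like
this raises no error anywhere along the way, which is part of what makes
it tedious to track down.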