From: Erland Sommarskog (sommar-usefor@algonet.se)
Date: Sat Jul 14 2001 - 14:23:47 CDT
[By the way, you may have noticed the funky address of kairos.kairos.algonet.se
in my messages. I can only apologize; my ISP has yet to fix this
configuration error.]
Jean-Marc Desperrier <jean-marc.desperrier@certplus.com> writes:
> The true problem is not that the newsreader changes to MIME, but that it
> doesn't know what charset the subject line is in.
That's another problem.
However, if the newsreader truly does not know the charset and then
changes the subject line from:
Subject: Vi gillar räksmörgåsar
To:
Subject: =?iso-8859-1?Q?Vi_gillar_r=E4ksm=F6rg=E5sar?=
(MIME encoding made by hand, may not be completely correct.)
It is actually changing the subject line in violation of a MUST requirement
in GNKSA, as it changes an unknown charset into a known charset. (Which does
not prevent newsreaders that do this from getting the seal. So much for GNKSA.)
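For reference, the RFC 2047 Q-encoding of the example subject can be computed
rather than written by hand. This is only a minimal sketch: it assumes the
subject really is ISO-8859-1, which is exactly what the newsreader in question
does not know, and the helper name q_encode is made up for illustration. It
also ignores the 75-character limit on encoded words.

```python
from email.header import decode_header

# Characters that may appear unencoded in a Q-encoded word
# (a conservative subset, per RFC 2047 section 5).
SAFE = b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!*+-/"

def q_encode(text: str, charset: str) -> str:
    """Hypothetical helper: build a single RFC 2047 Q-encoded word."""
    parts = []
    for byte in text.encode(charset):
        if byte == 0x20:
            parts.append("_")            # space is written as underscore
        elif byte in SAFE:
            parts.append(chr(byte))      # plain ASCII passes through
        else:
            parts.append("=%02X" % byte) # everything else as =XX
    return "=?%s?Q?%s?=" % (charset, "".join(parts))

encoded = q_encode("Vi gillar räksmörgåsar", "iso-8859-1")
print(encoded)

# Round trip with the standard library decoder.
raw, charset = decode_header(encoded)[0]
print(raw.decode(charset))
```

The encoded word names the charset explicitly, which is why producing it from
a subject of unknown charset amounts to asserting something the agent cannot
know.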
> In USEFOR, it will be required to change it to unicode, so it will take the
> unknown charset, assume it is some known charset, and convert that known
> charset to unicode, therefore destructing the title.
In Usefor we do not discuss this, but I believe we should. To wit, if
we again have the subject line:
Subject: Vi gillar räksmörgåsar
There is no way that a followup agent can safely convert this to UTF-8.
It can certainly conclude that it is not UTF-8, but it cannot convert
it to UTF-8 if it does not know the source charset.
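The asymmetry can be illustrated with a small sketch. The byte string below
stands in for a raw subject as received; the list of candidate charsets is an
arbitrary assumption chosen for the demonstration, and is_valid_utf8 is a
made-up helper name.

```python
# Raw subject bytes as they might arrive on the wire. We build them from
# ISO-8859-1 here, but the followup agent is assumed not to know that.
raw = "Vi gillar räksmörgåsar".encode("iso-8859-1")

def is_valid_utf8(data: bytes) -> bool:
    """Detection is possible: invalid UTF-8 byte sequences are rejected."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(raw))

# Conversion is not possible: every 8-bit charset below accepts the same
# bytes without complaint, and each one yields a different subject line.
candidates = ["iso-8859-1", "iso-8859-5", "cp437", "koi8-r"]
guesses = {cs: raw.decode(cs) for cs in candidates}
for cs, text in guesses.items():
    print(cs, "->", text)
```

Nothing in the bytes themselves says which of these readings is the intended
one, so any conversion to UTF-8 rests on a guess.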
Therefore I would suggest this NOTE to section 5.4:
While the user is free to change the subject line at will,
the followup agent should avoid accidental changes, except those
related to the back-reference as discussed above. Particularly, the
encoding of a subject line should not be changed. If the subject
line is encoded according to [RFC2047], the followup agent should
not change this to UTF-8, even though RFC 2047 encoding is deprecated by
this standard. Likewise, if the followup agent can conclude that
the subject line is not in UTF-8 although it contains 8-bit
characters, the followup agent should not make any attempt to guess
the character set and convert it to UTF-8.
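One way a followup agent could honour such a note is to treat the subject as
opaque bytes and touch only the back-reference. The following is a sketch
under assumptions, not wording from the draft: the function name is invented,
and the back-reference is assumed to be the literal prefix "Re: ".

```python
def followup_subject(raw_subject: bytes) -> bytes:
    """Hypothetical helper: add the back-reference, change nothing else.

    The subject is handled as bytes, so RFC 2047 encoded words, valid
    UTF-8 and 8-bit text in an unknown charset all pass through intact.
    """
    if raw_subject.startswith(b"Re: "):
        return raw_subject           # already a followup, leave as is
    return b"Re: " + raw_subject     # prepend the back-reference only

# Three forms of the same subject, each preserved byte for byte.
samples = [
    b"=?iso-8859-1?Q?Vi_gillar_r=E4ksm=F6rg=E5sar?=",  # RFC 2047
    "Vi gillar räksmörgåsar".encode("utf-8"),           # UTF-8
    b"Vi gillar r\xe4ksm\xf6rg\xe5sar",                 # unknown 8-bit
]
for s in samples:
    print(followup_subject(s))
```

Any deliberate change by the user would happen after this step, on whatever
the agent displays, and is outside the scope of the sketch.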
-- Erland Sommarskog, Stockholm, sommar@algonet.se