Re: C.T.E. and message/partial


From: Erland Sommarskog (sommar-usefor@algonet.se)
Date: Sat Jul 14 2001 - 14:23:47 CDT


[By the way, you may have noticed the funky address of kairos.kairos.algonet.se
in my messages. I cannot but apologize, but my ISP has yet to fix this
configuration error.]

Jean-Marc Desperrier <jean-marc.desperrier@certplus.com> writes:
> The true problem is not that the newsreader changes to MIME, but that it
> doesn't know what charset the subject line is in.

That's another problem.

However, if the newsreader truly does not know the charset and then
changes the subject line from:

   Subject: Vi gillar räksmörgåsar

To:

   Subject: =?iso-8859-1?Q?Vi_gillar_r=E4ksm=F6rg=E5sar?=

(MIME-encoding made by hand, may not be completely correct.)

It is actually changing the subject line in violation of a MUST requirement
in GNKSA, as it changes an unknown charset into a known charset. (Which does
not prevent newsreaders that do this from getting the seal. So much for GNKSA.)
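
To illustrate why the label matters, here is a minimal Python sketch. The
helpers q_encode and read_back are hypothetical names made up for this
example, and q_encode is a deliberately simplified Q-encoder, not a full
RFC 2047 implementation. It wraps the same unlabelled bytes in an encoded
word under two different charset labels and shows that a reader then sees
two different subjects:

```python
from email.header import decode_header

# The 8-bit subject bytes as they arrived, with no charset label attached.
raw = b"Vi gillar r\xe4ksm\xf6rg\xe5sar"

def q_encode(data: bytes, charset: str) -> str:
    """Hypothetical, simplified Q-encoder: wrap bytes in one encoded word."""
    out = []
    for b in data:
        if b == 0x20:
            out.append("_")                 # space is written as underscore
        elif chr(b).isascii() and chr(b).isalnum():
            out.append(chr(b))              # plain ASCII letters and digits
        else:
            out.append(f"={b:02X}")         # everything else as =XX
    return f"=?{charset}?Q?{''.join(out)}?="

def read_back(header: str) -> str:
    """Decode a single encoded word the way a reading agent would."""
    (data, charset), = decode_header(header)
    return data.decode(charset)

as_latin1 = q_encode(raw, "iso-8859-1")
as_cyrillic = q_encode(raw, "iso-8859-5")

print(as_latin1)
print(read_back(as_latin1))
print(read_back(as_cyrillic))   # same bytes, different label, different text
```

The bytes are identical in both cases; only the label differs. Adding the
label is therefore a claim about the charset, which is exactly what an agent
that does not know the charset is in no position to make.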

> In USEFOR, it will be required to change it to unicode, so it will take the
> unknown charset, assume it is some known charset, and convert that known
> charset to unicode, therefore destructing the title.

In Usefor we do not discuss this, but I believe we should. To wit, if
we again have the subject line:

   Subject: Vi gillar räksmörgåsar

There is no way that a followup agent can safely convert this to UTF-8.
It can certainly conclude that it is not UTF-8, but it cannot convert
it to UTF-8 if it does not know the source charset.
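
A small Python sketch of this asymmetry. The function name is_valid_utf8 and
the list of candidate charsets are assumptions chosen for illustration: the
agent can reliably tell that the bytes are not UTF-8, but several 8-bit
charsets decode the same bytes without error, so nothing in the data says
which conversion would be the right one.

```python
# Subject bytes as received, without any charset information.
# (They happen to be ISO-8859-1, but the followup agent cannot know that.)
raw = "Vi gillar räksmörgåsar".encode("iso-8859-1")

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Step 1: the agent CAN conclude that this is not UTF-8.
not_utf8 = not is_valid_utf8(raw)

# Step 2: it CANNOT pick the source charset. Each of these candidates
# (an arbitrary selection) accepts the bytes and yields a different string.
candidates = ("iso-8859-1", "iso-8859-5", "cp437")
guesses = {cs: raw.decode(cs) for cs in candidates}

print("not UTF-8:", not_utf8)
for cs, text in guesses.items():
    print(f"{cs:12} -> {text}")
```

Only one of the decoded strings is the subject the poster wrote, and the
agent has no basis for choosing it. Re-encoding any of the wrong guesses as
UTF-8 would replace the original subject with a different one.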

Therefore I would suggest this NOTE to section 5.4:

    While the user is free to change the subject line at his own will,
    the followup agent should avoid accidental changes, except those
    related to the back-reference as discussed above. Particularly, the
    encoding of a subject line should not be changed. If the subject
    line is encoded according to [RFC2047], the followup agent should
    not change this to UTF-8, even though RFC 2047 is deprecated by this
    standard. Likewise, if the followup agent can conclude that the
    subject line is not in UTF-8 even though it contains 8-bit
    characters, the followup agent should not make any attempt to guess
    the character set and correct it to UTF-8.
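
One way a followup agent could follow such a note is to treat the subject as
opaque bytes and touch only the back-reference. The sketch below is an
assumption about how that might look, not text from any draft: the function
name followup_subject is made up, and it assumes the back-reference is the
literal prefix "Re: ".

```python
def followup_subject(raw_subject: bytes) -> bytes:
    """Build the followup subject, changing nothing but the back-reference.

    The subject is handled as bytes and never decoded or re-encoded, so an
    RFC 2047 encoded word stays an encoded word, and 8-bit text in an
    unknown charset stays byte-for-byte what the original poster sent.
    """
    if raw_subject.startswith(b"Re: "):
        return raw_subject              # already a followup, leave as it is
    return b"Re: " + raw_subject        # add the back-reference only

# Unlabelled 8-bit subject: bytes after the prefix are untouched.
print(followup_subject(b"Vi gillar r\xe4ksm\xf6rg\xe5sar"))

# RFC 2047 subject: not converted to UTF-8.
print(followup_subject(b"=?iso-8859-1?Q?Vi_gillar_r=E4ksm=F6rg=E5sar?="))

# Existing back-reference: no second "Re: " is added.
print(followup_subject(b"Re: Vi gillar r\xe4ksm\xf6rg\xe5sar"))
```

The user remains free to edit the subject afterwards; the point is only that
the agent itself makes no accidental change to the encoding.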

--
Erland Sommarskog, Stockholm, sommar@algonet.se




This archive was generated by hypermail 2b29.