Re: UTF-8 and RFC 2047


From: Jean-Marc Desperrier (jean-marc.desperrier@certplus.com)
Date: Mon Jul 01 2002 - 14:48:28 CDT


Charles Lindsey wrote:

>In <3D1CAD61.8040508@certplus.com> Jean-Marc Desperrier <jean-marc.desperrier@certplus.com> writes:
>
>
>>Some agents will just transmit everything directly to the output device,
>>if it's configured to UTF-8, it will go through, but this is not a
>>usable setting in a world as we know the most common usage is other local
>>encoding, but *not* utf-8.
>>
>>
>Yes, but I include the underlying OS under the term "reading agent".
>
I think there's a misunderstanding.
Even if you include the underlying OS, if you choose to set the default
character set to UTF-8, either by telling a sophisticated newsreader
that this is the default charset, or by setting LANG to a UTF-8 locale
for a simple text-based Unix newsreader inside an xterm that can
understand UTF-8, you will lose the ability to decode *any* message
that is not encoded in UTF-8.
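
To make the failure concrete, here is a minimal Python sketch. It is not
taken from any newsreader, and the sample subject is made up; it only
shows what happens to ISO-8859-1 bytes when the reading side assumes
UTF-8:

```python
# A made-up Latin-1 subject ("Réunion d'été") as it would travel raw,
# 8-bit and unlabelled, in a header.
original = "R\u00e9union d'\u00e9t\u00e9"
raw = original.encode("iso-8859-1")

# A strict UTF-8 reader rejects the bytes outright...
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...and a lenient one substitutes U+FFFD for the accented letters.
lenient = raw.decode("utf-8", errors="replace")

# Only the charset the sender actually used recovers the text.
correct = raw.decode("iso-8859-1")

print(strict_ok)
print(lenient)
print(correct)
```

Either way the accented characters are gone: the reader gets an error or
replacement characters, never the original text.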

The survey has shown that today such messages represent 99.9% of the
world.

Setting the default to UTF-8 will work once all messages are in UTF-8,
but it is completely incompatible with the transition period.

>Does OE understand UTF-8 when running under Windows 95/98?
>
>
It can be made to.
You can tell it which of the charsets it understands is the default,
and it will assume that all non-encoded 8-bit data is in that charset.

If you set this default charset to UTF-8, then headers in UTF-8 should
be displayed correctly in the thread panel.

But if you do that, you lose the display of 8-bit headers that are not
in UTF-8.
Also, in every ISO-8859-1 message that does not declare a charset in
the content-type header, you will lose all accented characters in the
content.
The default option of Outlook Express is *not* to send a content-type
header, and maybe 1% of users know they must change that option.
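
The behaviour described above (use the declared charset if there is
one, otherwise fall back to the configured default) can be sketched in
Python with the standard `email` package. This is only an illustration
under that assumption, not OE's actual logic; `body_text` and the sample
messages are hypothetical:

```python
from email import message_from_bytes

def body_text(raw_message: bytes, default_charset: str) -> str:
    """Decode a single-part body, falling back to the reader's default."""
    msg = message_from_bytes(raw_message)
    # get_content_charset() returns None when no charset is declared.
    charset = msg.get_content_charset() or default_charset
    payload = msg.get_payload(decode=True)  # raw body bytes
    return payload.decode(charset, errors="replace")

# Hypothetical ISO-8859-1 message sent without a content-type header.
unlabelled = b"Subject: test\n\ncaf\xe9\n"
# The same body, this time with the charset correctly declared.
labelled = (b"Subject: test\n"
            b"Content-Type: text/plain; charset=iso-8859-1\n"
            b"\ncaf\xe9\n")

lost = body_text(unlabelled, "utf-8").strip()        # accent is lost
kept = body_text(unlabelled, "iso-8859-1").strip()   # accent survives
declared = body_text(labelled, "utf-8").strip()      # declaration wins
print(lost, kept, declared)
```

With a UTF-8 default, only the message that declares its charset keeps
its accented character.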

Worse, if a message in ISO-8859-1 correctly declares the charset, then
with that setting the title will be interpreted as UTF-8 in the thread
list, but displayed as ISO-8859-1 inside the message panel.

Basically unusable, except in a world where nobody sends raw 8-bit
headers in anything other than UTF-8, and nobody sends non-UTF-8
content without the correct content-type header.
I forgot to say that in the case of Japanese, titles will not be
displayed correctly in the thread panel even if they respect RFC 2047.
I don't know why, but on my western OS, OE will only display Japanese
correctly in the thread panel if I set the default charset to
Japanese.
And this can be changed only by going into the options.
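
For comparison, an RFC 2047 encoded-word names its own charset, so in
principle it can be decoded without consulting any default. A small
Python round trip with the standard `email.header` module shows this;
the subject is a made-up example and the sketch says nothing about how
OE itself handles such headers:

```python
from email.header import Header, decode_header, make_header

# Made-up subject: the word "Japanese (language)" written in Japanese.
subject = "\u65e5\u672c\u8a9e"

# Encode as an RFC 2047 encoded-word; the result is pure ASCII and
# carries the charset label inside the header value itself.
wire = Header(subject, charset="iso-2022-jp").encode()

# Decoding needs no default charset: the label says what to use.
decoded = str(make_header(decode_header(wire)))

print(wire)
print(decoded == subject)
```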

All this testing with OE really reminded me *why* I'm using Mozilla:
for good i18n support.

BTW I didn't notice at first the emphasis on Windows 95/98.
There is no significant difference between OE's Unicode support on
Windows 95/98 and on other versions.
When you know what to do, you can get very acceptable Unicode support on
Windows 95/98.



