From: Erland Sommarskog (sommar@algonet.se)
Date: Fri Oct 01 1999 - 17:24:30 CDT
=?ISO-8859-1?Q?Claus_F=E4rber?= (list-ietf-wg-apps-usefor@faerber.muc.de) writes:
> Erland Sommarskog <sommar@algonet.se> wrote:
> Which will or will not help when legacy software is involved. If some
> software says message/news; charset="unknown-8bit" encoded as
> quoted-printable, we still don't know whether it's still utf-8 or has
> been mangled before (ok, unlikely, but who knows what type of gateways
> might be involved). One-person leaf nodes run strange software
> sometimes, including conversion to other formats. Yes, and some of
> them are moderating newsgroups.
I guess there will be a few situations where you are out of luck. However,
put things in perspective. News is normally transported by news servers,
and we expect them to handle UTF-8 a lot better than many of the other
new features we are suggesting.
Sometimes news is transported by mail, either through a gateway or as
mail to a moderator.
Now, groups with ASCII-only names will not be affected, only groups
with UTF-8 chars in their names. Initially that will be a limited
set. They are not likely to occur in the Big-8 for the simple reason
that names in the Big-8 are in English. National hierarchies are another
matter. Now, I don't know of any news-to-mail gateways for Swedish
users; I have never heard of one. I'm brave enough to say that's a
non-issue. Moderated groups might be another matter. However, I don't
think we will go around creating new groups with ÅÄÖ in them the
first day that our Draft becomes Standard. If nothing else, we need
a wide deployment of UTF-8 first. And since we only have one moderated
group so far, I don't think we will start with a moderated one.
> > And when you take into account that mail will eventually move to
> > 8-bit, and then we will be talking UTF-8,
>
> I doubt that, at least for headers, as there are already workarounds
> (RFC 2047), the support of which is more and more widespread.
Support for RFC2047 is fairly limited, appearing mainly in mail and
news tools. Support for UTF-8 will be wider. For instance, I'm using
mailx, which does not understand RFC2047, which is why the attribution
of Claus's name looks so funny. But mailx supports UTF-8 more or less.
It doesn't really understand it, so it might be wrong on character
counts every once in a while. But provided that I have a Telnet client
that supports UTF-8, I have no problem reading UTF-8 without getting
garbled characters.
And mailx is not unique. The same goes for grep, Perl and others.
They will understand UTF-8, but they don't understand RFC2047.
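
To make the difference concrete, here is a minimal Python sketch (an
illustration only, not what mailx or grep do internally). It decodes
the RFC2047 encoded-word from the attribution line above using the
standard library's email.header module, then compares the character
count with the UTF-8 byte count. The helper name decode_rfc2047 is
made up for this example.

```python
from email.header import decode_header


def decode_rfc2047(raw: str) -> str:
    """Decode RFC 2047 encoded-words in a header value (hypothetical helper)."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Unencoded segments come back as bytes with charset None.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded-words is returned as a plain str.
            parts.append(chunk)
    return "".join(parts)


# The raw form a tool without RFC 2047 support displays as-is.
encoded = "=?ISO-8859-1?Q?Claus_F=E4rber?="
name = decode_rfc2047(encoded)

# The same name as raw UTF-8: readable on a UTF-8 terminal without any
# decoding step, but one byte longer than its character count.
utf8_bytes = name.encode("utf-8")

print(name)
print(len(name), "characters,", len(utf8_bytes), "bytes in UTF-8")
```

A tool that merely passes bytes through shows the UTF-8 form correctly
on a UTF-8 terminal, but if it counts bytes as characters it will be
off by one for this name, which is the kind of miscount mentioned above.
The RFC2047 form, by contrast, stays unreadable until something
explicitly decodes it.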
-- Erland Sommarskog, Stockholm, sommar@algonet.se