Re: Backwards compatible


From: Claus Färber (list-ietf-wg-apps-usefor@faerber.muc.de)
Date: Sun Aug 18 2002 - 06:08:00 CDT


Erland Sommarskog <sommar-usefor@algonet.se> wrote:
> Claus Färber (list-ietf-wg-apps-usefor@faerber.muc.de) writes:
>> So you are explicitly claiming that if newsgroups are encoded with a
>> Punycode-like encoding (which only uses characters currently allowed in
>> newsgroup names), existing software will not be able to:
>>
>> . access the newsgroup,
>> . create new postings, and
>> . create followups

> Basically, yes. Because the names will not be understandable to the users,
> the users will not find their way there.

>> But UTF-8 does solve that problem in your opinion, although we know of
>> existing software that does produce the "funniest" results.

> For Latin scripts, UTF-8 names will still be understandable in most
> cases.

Which is true for Punycode, too.

> Note also that some newsreaders are actually capable of presenting UTF-8
> names today without the slightest change. This also applies to software
> such as my non-RFC 2047-capable mail reader.

> To wit, all newsreaders that write to a TTY. They only need a TTY, for
> instance a Telnet client, that is able to present UTF-8.

That's not the whole truth. With UTF-8 you have a non-trivial relation
between octets, characters and columns. Most "dumb" legacy newsreaders
assume that one octet equals one column.
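
To illustrate the point, here is a minimal Python sketch that counts
octets, characters and columns for a string. The helper names columns()
and measure() are hypothetical, and the width rule (0 for combining
marks, 2 for East Asian wide or fullwidth characters, 1 otherwise) is
only a rough approximation of what a terminal does.

```python
import unicodedata


def columns(text: str) -> int:
    """Approximate terminal width of text (an assumption, not a standard)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        # East Asian wide/fullwidth characters usually take two columns
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def measure(text: str) -> tuple[int, int, int]:
    """Return (UTF-8 octets, characters, approximate columns)."""
    return len(text.encode("utf-8")), len(text), columns(text)


for name in ("se.test.räksmörgås", "日本語"):
    octets, chars, cols = measure(name)
    print(f"{name}: {octets} octets, {chars} characters, {cols} columns")
```

A reader that counts octets as columns will misalign any line where the
three numbers differ.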

Further, most newsreaders support only one display charset at a time.
With the dumb newsreader you describe, you won't be able to read any
messages which use the 8bit charsets currently in use if you set your
TTY charset to UTF-8.

The main problem, however, is that many newsreaders are not able to pass
through UTF-8 newsgroup names. The worst thing that can (and will)
happen is that the names are recoded or encoded in RFC 2047.
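
Both failure modes can be sketched in Python. This is only an
illustration: it assumes a reader that misreads UTF-8 octets as
ISO-8859-1, and it uses the standard email.header module to stand in
for software that applies RFC 2047 to the name. The variable names are
made up for the example.

```python
from email.header import Header, decode_header

name = "se.test.räksmörgås"

# Recoding: the UTF-8 octets are misread as ISO-8859-1, so every
# non-ASCII letter turns into two unrelated characters.
recoded = name.encode("utf-8").decode("iso-8859-1")

# RFC 2047: the name is wrapped in an encoded-word, which is no longer
# a usable newsgroup name at all.
encoded = Header(name, "iso-8859-1").encode()

# Only software that knows about RFC 2047 can get the name back.
raw, charset = decode_header(encoded)[0]
restored = raw.decode(charset)

print(recoded)
print(encoded)
print(restored)
```

In either case the string that reaches the server is not the group name
the user asked for.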

>> This shows that you have never had a look at Punycode. Punycode encodes
>> ASCII characters as-is, so for most Western languages the words *are*
>> quite readable. For most non-Western languages, which traditionally
>> don't use UTF-8, there's not much difference between UTF-8 and Punycode.

> So what does se.test.räksmörgås become in Punycode?

se.test.zq--rksmrgs-5wao1o
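
For anyone who wants to reproduce this, a minimal sketch using Python's
built-in "punycode" codec follows. The "zq--" prefix is taken from the
example above; the function names are hypothetical, and applying the
encoding per dot-separated component, with pure-ASCII components left
untouched, is an assumption about how such a scheme would work.

```python
PREFIX = "zq--"  # prefix as used in the example above


def encode_component(label: str) -> str:
    """Punycode-encode one component; pure-ASCII components pass through."""
    if label.isascii():
        return label
    return PREFIX + label.encode("punycode").decode("ascii")


def decode_component(label: str) -> str:
    """Reverse encode_component for components carrying the prefix."""
    if label.startswith(PREFIX):
        return label[len(PREFIX):].encode("ascii").decode("punycode")
    return label


def encode_group(name: str) -> str:
    return ".".join(encode_component(c) for c in name.split("."))


def decode_group(name: str) -> str:
    return ".".join(decode_component(c) for c in name.split("."))


print(encode_group("se.test.räksmörgås"))
print(decode_group("se.test.zq--rksmrgs-5wao1o"))
```

The ASCII letters of the word stay in place and in order, which is why
the encoded form remains recognisable.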

Claus

-- 
------------------------ http://www.faerber.muc.de/ ------------------------
OpenPGP: DSS 1024/639680F0 E7A8 AADB 6C8A 2450 67EA AF68 48A5 0E63 6396 80F0



