Re: 8-bit in newsgroups names


From: Charles Lindsey (chl@clw.cs.man.ac.uk)
Date: Thu Nov 20 1997 - 03:43:25 CST


In local.usenet Leonid Yegoshin <egoshin@genesyslab.com> wrote:

> If I have NEWS-es in MIME-only coding, then in real I am needed to
>change news processing software like nntpd/innd/cnews etc, to adopt
>8-bit transfer of MIME mails body (it is already DONE in most IP providers !).
> Also I can implement very simple a transfer of e-mail to newsgroup
>and back software without any problem of charset encoding.

No, I think you misunderstand what is proposed. Bodies are not affected
(well, almost not). If there is a Content-Type: ... charset=foobar
heading, then the body will be interpreted as a document in foobar, not in
UTF-8. If your newsreader does not know how to display text in foobar
(perhaps foobar is a charset for Korean ideograms) then that is tough, but we are
in that situation already. Likewise, if your Content-Type specifies
charset=UTF-8, the body may contain characters that your newsreader cannot
render. Again that is tough.

If there is no Content-Type header at all, then the body SHOULD be in
US-ASCII. The proposed default is actually UTF-8, though, which does no
harm since any US-ASCII text is also valid UTF-8.
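
To make the rule concrete, here is a rough sketch in Python of how a
newsreader might pick the charset for a body. It is only an illustration
under assumptions: the helper names body_charset and decode_body are
invented for this example, the article is assumed to be single-part, and
the fallback for an unknown charset is just one possible choice.

```python
from email import message_from_bytes

DEFAULT_CHARSET = "utf-8"  # the proposed default when no charset is declared


def body_charset(raw_article: bytes) -> str:
    """Return the charset a reader would use to interpret the body."""
    msg = message_from_bytes(raw_article)
    # Lower-cased charset parameter of Content-Type, or the default if the
    # header or the parameter is missing.
    return msg.get_content_charset(failobj=DEFAULT_CHARSET)


def decode_body(raw_article: bytes) -> str:
    """Decode a single-part body for display (hypothetical helper)."""
    msg = message_from_bytes(raw_article)
    charset = msg.get_content_charset(failobj=DEFAULT_CHARSET)
    octets = msg.get_payload(decode=True)  # raw body octets
    try:
        return octets.decode(charset, errors="replace")
    except LookupError:
        # The reader does not know this charset ("that is tough"):
        # show what can be shown as ASCII and mark the rest.
        return octets.decode("ascii", errors="replace")
```

Note that transport software never calls anything like decode_body; only
the newsreader interprets the octets.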

> Of course, I am needed to support article presentation in non-English
>language but it is needed anycase and also already done.

> But if we approve UTF-8 then there is headache of translation
>between MIME-encoded and/or UTF-8-encoded header Newsgroup: and newsgroup
>list itself. I am needed to have FULL table of translation for
>all languages. To complicate the problem - news processing software
>could have a task to choose - thats type of encoding should be used
>for particular language.

Not so, because MIME-encoded Newsgroups: headers (RFC 2047 style) are not
permitted by the proposal, so UTF-8 is all you have got. If your
newsreader knows how to display the newsgroup name, then it does so;
otherwise, it displays some error characters, or octal representation, or
anything else. Presumably the reader will not care. If he wanted to read a
Korean newsgroup, presumably he would have bought a Korean newsreader. If
he wants to grep in his active file, he should grep in the UTF-8 encoding.
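
As an illustration of that display rule, a minimal Python sketch follows.
The function names, the can_render callback and the group name used in
the usage note are all made up for the example; a real newsreader would
ask its font or terminal what it can show.

```python
def display_newsgroup(raw_name: bytes, can_render=lambda ch: True) -> str:
    """Show a UTF-8 newsgroup name, or fall back to an octal representation."""
    try:
        text = raw_name.decode("utf-8")
    except UnicodeDecodeError:
        text = None
    if text is not None and all(can_render(ch) for ch in text):
        return text
    # Fallback: keep printable ASCII, show every other octet in octal.
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else "\\%03o" % b for b in raw_name
    )


def grep_active(active_lines, group_name: str):
    """Search active-file lines (as bytes) for a name, in its UTF-8 encoding."""
    pattern = group_name.encode("utf-8")
    return [line for line in active_lines if pattern in line]
```

For example, display_newsgroup("local.новости".encode("utf-8")) returns
the name unchanged, while passing can_render=lambda ch: ord(ch) < 128
yields "local." followed by octal escapes. No translation table is
involved in either case.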

For other headings (which may have been gatewayed from email) the MIME
form (RFC2047) is permitted, but deprecated. UTF-8 is the default
otherwise. Even then, news software MUST NOT translate it (until it
arrives at the newsreader for display, that is).
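
A sketch of what the newsreader end might do with such a heading, again
in Python and again only as an assumption-laden illustration:
header_for_display is a hypothetical name, and headings that mix raw
UTF-8 with RFC 2047 encoded-words are not handled here.

```python
from email.header import decode_header, make_header


def header_for_display(raw_value: bytes) -> str:
    """Turn a heading's octets into text, at the newsreader only."""
    # Default interpretation: the octets are UTF-8.
    text = raw_value.decode("utf-8", errors="replace")
    if "=?" in text:
        # Deprecated but permitted: RFC 2047 encoded-words, e.g. from email.
        return str(make_header(decode_header(text)))
    return text
```

Relaying software would pass raw_value through untouched; this decoding
step belongs to the final display only.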

-- 
Charles H. Lindsey ---------At Home, doing my own thing-------------------------
Email:     chl@clw.cs.man.ac.uk   Web:   http://www.cs.man.ac.uk/~chl
Voice/Fax: +44 161 437 4506       Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7  65 E8 64 7E 14 A4 AB A5



