Re: When will News Article Format be approved?


From: Charles Lindsey (chl@clw.cs.man.ac.uk)
Date: Thu Mar 06 2003 - 09:32:01 CST


In <3E661491.504@Sonietta.blilly.com> Bruce Lilly <blilly@erols.com> writes:

>> But I'd rather solve those problems than live
>> with 1 or be ignored as with 2.

>Feel free to try; first step is to come up with an RFC 2277/3066-
>compliant method for language tagging for text strings in header
>fields, compatible with existing 822/2822 parsers, that works with
>8-bit charsets.

Done that. Two methods proposed in fact.

1. Use the language tagging in Unicode (and hence in UTF-8). You insist
that does not satisfy RFC 2277/3066, but it does. Two leading workers in
the Unicode project (one of whom posts to this list) have confirmed that to
me, and your uncorroborated say-so carries no weight at all. They also
confirm that language tagging is not particularly useful or necessary in
such short texts as headers, and that the ability to change language in
mid-text is even less so. However, that method does suffer the
disadvantage that it is to be deprecated in Unicode 4.

2. Invent a header which conveys the tagging information (for all the
headers, naturally).

{3. Revert to RFC 2047 in the very few cases where the above might not be
considered suitable.}
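
For illustration, method 1 can be sketched roughly as follows. This is a
minimal sketch, assuming the Plane 14 tag characters defined by Unicode
(U+E0001 LANGUAGE TAG followed by tag characters that mirror printable
ASCII at an offset of 0xE0000). The helper names tag_language and
strip_tags are hypothetical, and nothing here is taken from the draft
itself.

```python
# Sketch only: Plane 14 language tagging applied to a header value.
# Helper names are hypothetical, chosen for this illustration.

LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000         # tag characters mirror ASCII at this offset


def tag_language(lang: str, text: str) -> str:
    """Prefix text with a Plane 14 language tag such as 'fr' or 'en-GB'."""
    if not all(0x20 <= ord(c) <= 0x7E for c in lang):
        raise ValueError("language tag must be printable ASCII")
    tag = "".join(chr(TAG_OFFSET + ord(c)) for c in lang)
    return LANGUAGE_TAG + tag + text


def strip_tags(text: str) -> str:
    """Remove every Plane 14 tag character, leaving the visible text."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


subject = tag_language("fr", "Résumé")
encoded = subject.encode("utf-8")  # what would travel in an 8-bit header
```

Each tag character costs four bytes in UTF-8, so the overhead for a short
tag is small, and a reader that ignores Plane 14 can recover the plain
text simply by dropping those characters.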

> Second step is to select a charset that is
>universally acceptable, and we already know that there are
>objections to utf-8 (by those who prefer GB18030) and to GB18030
>(by those who prefer utf-8).

There is not a cat in hell's chance that GB18030 will be accepted for that
universal role, and UTF-8 is the only other show in town. There is a real
risk that the Chinese would shut themselves off in a cooperating subnet,
and there is room for some discussion as to whether we make it easier or
harder for them to do so.

> Third step is to deal with negotiation
>and fallback support in the various messaging models and protocols.

Yes, which is why I have proposed that people wishing to post to moderated
I18N groups should negotiate with their injector first. The fallback is
not particularly good (it puts the burden on the poster) and the
consequence is that moderated I18N groups are not likely to happen for
some time. That seems an acceptable compromise if it lets the non-moderated
I18N groups get underway.

>Fourth step is to propose the tagging, charset, and negotiation
>schemes in the appropriate places for adoption by the standards
>bodies responsible for SMTP, NNTP, IMAP, and the message format.

It is unnecessary for SMTP (I am assuming UTF-8 for newsgroups-names only
at the moment) though desirable it should happen for other
reasons. It is unnecessary for NNTP, except to cope with those moderated
I18N groups. If IMAP wants a tag to warn them that 8-bit headers are
present, I have no objection to providing one. And I have no problem joining
discussions for an IMAP extension for use in those IMAP servers which
choose to offer I18N newsgroups to their clients.

>If you're able to accomplish all of that, then, and only then, can
>that charset begin to be used, and you still will have to convince
>users to switch. I don't think there's a chance of doing all of
>that within a decade, but you can prove me wrong by actually doing
>it.

Most of it is already done, or is doable given the will to make it happen.
Your reluctance to cooperate in making it happen is noted.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clw.cs.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5



