Re: When will News Article Format be approved?

From: Bruce Lilly (blilly@erols.com)
Date: Mon Apr 21 2003 - 11:22:14 CDT


Martin Duerst wrote:
> At 11:45 03/03/10 -0500, Bruce Lilly wrote:
>
>> Martin Duerst wrote:
>
>
>>> Up to here, we still are working with the repertoire of characters
>>> in ASCII. To get to the Unicode plane 14 tag codes, we just map the
>>> ASCII codes to some other codes, for our specific protocol.
>>
>>
>> That mapping and subsequent encoding is the problem, because
>> it makes the language tag inaccessible.
>
>
> Inaccessible in what sense? Can you be more specific?
> How important is it, for what kinds of processing, that
> the language codes are in ASCII? And why?

E.g. filtering by language.
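
To make the filtering concern concrete, here is a minimal Python sketch. It is an illustration only, not taken from any implementation, and the helper names to_plane14 and from_plane14 are hypothetical. It assumes the Unicode convention that each tag character sits at U+E0000 plus the corresponding ASCII code, introduced by U+E0001 LANGUAGE TAG, and that the text travels as UTF-8.

```python
LANGUAGE_TAG = 0xE0001  # U+E0001 LANGUAGE TAG
TAG_BASE = 0xE0000      # tag characters mirror ASCII at this offset


def to_plane14(lang):
    """Map an ASCII language tag such as 'en' to Plane 14 tag characters."""
    if not lang.isascii():
        raise ValueError("language tag must be ASCII")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_BASE + ord(c)) for c in lang)


def from_plane14(text):
    """Recover the ASCII language tag from a leading Plane 14 tag sequence."""
    if not text or ord(text[0]) != LANGUAGE_TAG:
        return ""
    out = []
    for ch in text[1:]:
        cp = ord(ch)
        # Tag characters cover U+E0020..U+E007E; anything else ends the tag.
        if not (TAG_BASE + 0x20 <= cp <= TAG_BASE + 0x7E):
            break
        out.append(chr(cp - TAG_BASE))
    return "".join(out)


# What a byte-oriented filter would actually see on the wire.
wire = (to_plane14("en") + "hello").encode("utf-8")
```

After the mapping and the UTF-8 encoding, the ASCII bytes "en" no longer appear in the data, so a filter has to decode UTF-8 and undo the mapping before it can test the language.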

>>> This is done all too often. RFC 2047, with base64 and qp, is a
>>> typical example.
>>
>>
>> RFC 2047 does not encode the language-tag (or charset-tag, or
>> encoding-tag, or delimiters), only the text string under
>> consideration.
>
>
> Well, 2047 doesn't encode a language tag, that's RFC 2231.

2231 amends 2047, and the point remains that the language tag is not encoded.

> The charset-tag is indeed not encoded. The encoding tag is
> encoded, 'base64' becomes 'B' and 'quoted-printable' becomes 'Q'.

No, the encoding *tag* _is_ either "B" or "Q" (case-insensitive) -- it
is not encoded.
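
As a rough sketch of why this matters, the following Python fragment pulls the charset, language and encoding tags out of an encoded-word without decoding the text at all. It assumes the RFC 2231 form =?charset*language?encoding?text?=; the regular expression and the function name are hypothetical, and this is a simplification, not a complete RFC 2047 parser.

```python
import re

# Simplified pattern for an encoded-word with an optional "*language" suffix.
ENCODED_WORD = re.compile(
    r"=\?(?P<charset>[^?*\s]+)(?:\*(?P<lang>[^?\s]+))?"
    r"\?(?P<enc>[BbQq])\?(?P<text>[^?\s]*)\?="
)


def visible_tags(header):
    """Return (charset, language, encoding) for each encoded-word found."""
    found = []
    for m in ENCODED_WORD.finditer(header):
        lang = (m.group("lang") or "").lower()
        # The encoding tag is literally "B" or "Q", in either case.
        found.append((m.group("charset").lower(), lang, m.group("enc").upper()))
    return found


# Sample header in the style of the RFC 2231 example.
tagged = visible_tags("From: =?US-ASCII*EN?Q?Keith_Moore?= <moore@cs.utk.edu>")
untagged = visible_tags("Subject: =?iso-8859-1?q?caf=E9?=")
```

All three tags are readable with plain ASCII pattern matching; only the text between the last two question marks needs base64 or quoted-printable decoding.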

> The delimiters stand by themselves, the question of 'encoding'
> is pretty much irrelevant for them.
>
> Now I would argue that what really counts for most kinds of
> processing is the actual text. Obscuring the actual text
> while keeping language, charset,... visible seems to be
> very much backwards.

The text is encoded, not "obscured". And encoding is the point under
discussion.

>> The issue is that the plane 14 encoding obscures the language
>> tag, whereas RFC 2047 does not. One should rarely see the wire
>> format in either case, it should be handled by the implementation.
>> And that raises another point, viz. that RFC 2047 is widely
>> implemented for text messages, whereas the plane 14 tags are not.
>
>
> The language tags that have been added to RFC 2047 syntax by
> an additional RFC aren't really widely implemented. I'd be glad
> to learn about implementations, I don't know a single one.

Mozilla is a prime example, and as it's open source, you can readily
learn about it.

>>> Do we want to somehow tag it on the side
>>> to make the difference? Do you think you'll get the users to
>>> understand what's going on?
>>
>>
>> It can't hurt, and can only help.
>
>
> Do you think that <en>rec.chat</en> and <fr>rec.chat</fr>
> should be two different newsgroups, or that only one of
> them should be allowed?

That's a political question, not a technical one. No comment.



