From: Bruce Lilly (blilly@erols.com)
Date: Fri Feb 21 2003 - 17:25:28 CST
Charles Lindsey wrote:
> In <3E4FB790.6060404@Sonietta.blilly.com> Bruce Lilly <blilly@erols.com> writes:
>
>
>>One cannot use any untagged 8-bit charset today with any reasonable
>>expectation of it being understood, not only because of lack of
>>support in readers and lack of compatibility with transport, but
>>also because there simply is no way for the reader to determine
>>*which* charset is in use for the typically short sequences that
>>are used in header fields.
>
>
> On the contrary, China has a larger population than any other country on
> earth, and within that population there is a huge number (by Usenet
> standards) of people who are not only using an untagged 8-bit charset, and
> not only expecting it to be understood, but actually understanding it.
News flash: China is not the entire world. _In general_, neither a
(human) reader nor software can reliably determine which untagged
charset is in use. Coincidence is not a "reasonable expectation of
[..] being understood".
> So it is evident that the support in readers _does_ exist, and no lack of
> compatibility with transport seems to be holding them up. And there is a
> very simple way for the reader to determine which charset is in use. If he
> can read it, it is the one he wanted. If he cannot read it, he doesn't
> care.
No, you are confusing charset and language. One might very well
be able to read the language, and the characters presented might
even look familiar, yet be unable to understand the text because
it was decoded with the wrong charset. E.g. iso-8859-5 vs. KOI-8
vs. KOI8-R vs. Windows-1251.
There are over 100 registered charsets, so random guessing and
trial-and-error are not productive.
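
A minimal sketch of the point, assuming Python's built-in codecs
koi8_r, iso8859_5 and cp1251. The sample word, the candidate list and
the helper name decode_all are illustrative only. The same untagged
bytes decode without error under all three charsets, and each result
is made of ordinary Cyrillic letters, so the bytes themselves give no
hint as to which charset was intended:

```python
# Hypothetical sample: a Russian word as a KOI8-R sender would put it
# on the wire, with no charset label attached.
SAMPLE = "привет"
raw = SAMPLE.encode("koi8_r")

# Three Cyrillic charsets a receiver might plausibly guess.
CANDIDATES = ("koi8_r", "iso8859_5", "cp1251")


def decode_all(data, charsets=CANDIDATES):
    """Decode the same bytes under each candidate charset."""
    return {name: data.decode(name) for name in charsets}


results = decode_all(raw)
for name, text in results.items():
    # Every decode succeeds; only one reproduces what was sent.
    print(f"{name:10} {text!r} matches={text == SAMPLE}")
```

Only the koi8_r guess gives back the original word. The other two
yield different strings of valid Cyrillic letters, so no decoding
error tells software that the guess was wrong, and a header field is
usually too short for statistical guessing to be dependable.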