Re: When will News Article Format be approved?


From: Terje Bless (link@pobox.com)
Date: Wed Mar 05 2003 - 02:33:30 CST


Russ Allbery <rra@stanford.edu> wrote:

>Terje Bless <link@pobox.com> writes:
>
>>...these aren't the only differences between UTF-8 and RFC2047. There
>>is a -- real or imagined, it matters nought -- feeling that UTF-8 is
>>just Yet Another Charset whilst RFC2047 is a Transfer Encoding Format.
>
>Yeah, but see, this doesn't make any sense. It's just not true. And
>whether this distinction is real or imagined *does* matter.

No, it does not. RFC2047 wasn't rejected because of empirical evidence that
it does not work. In fact, there is a _lot_ of evidence that it _does_
work. Perfectly.

There were a lot of reasons why it was rejected, but none of them were due
to any real technical impossibility of making it work. IOW, the big hurdle
in making RFC2047 the One True Way is in people's perceptions. That there
are also all sorts of technical issues that /feed/ those perceptions means
that it doesn't matter a whole lot in the real world whether the feeling is
real or imagined; it's /people/ not technology we're talking about here.
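
As a rough illustration of the "charset versus transfer encoding"
distinction, the Python sketch below uses the standard library's
email.header module to wrap a non-ASCII Subject in an RFC 2047 encoded-word
and compares it with the same text sent as raw UTF-8 bytes. The sample
subject is an arbitrary assumption chosen for the example, not something
taken from this thread.

```python
from email.header import Header, decode_header

# Hypothetical Subject text containing non-ASCII characters.
subject = "Bl\u00e5b\u00e6rsyltet\u00f8y"

# RFC 2047: the header travels as pure ASCII, with the charset named
# inside the encoded-word itself.
encoded = Header(subject, "utf-8").encode()

# Decoding recovers the bytes plus the declared charset.
raw_bytes, charset = decode_header(encoded)[0]
decoded = raw_bytes.decode(charset)

# "Just send UTF-8": the same text as 8-bit bytes, with no in-band label.
raw_utf8 = subject.encode("utf-8")

print(encoded)
print(decoded == subject)
print(any(b > 127 for b in raw_utf8))
```

Either way the reader ends up with the same characters; the difference is
whether the wire format is 7-bit ASCII with an explicit label or unlabelled
8-bit data.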

>>RFC2047 won't go away; it's just that it's turned out to not adequately
>>address the needs of a portion of the Netnews population.
>
>When Stanford first rolled out AFS in 1992, it was very buggy. [...]

This anecdote suggests you would like to persuade people to give RFC2047
another try (well, actually it suggests you would like to shrug the whole
issue off, but... ;D). You're most welcome to try.

If that is the final result of this WG then as long as I'm wearing my
Standardization hat I'll support you 100%.

When I take it off and put on my "spends far too much time reading Netnews"
hat, I'll fight you tooth and nail.

>>As regards your statement; yes, I do believe that Unicode will
>>eventually be the native representation of all OSes and applications.
>
>Unicode really doesn't have anything to do with this. Unicode isn't an
>encoding, and therefore doesn't help. We can't "use Unicode." That
>doesn't have any useful meaning.

Thanks for your vote of confidence...

I'd rather not have to spell out each time that different systems use
different parts of Unicode, and in slightly different ways. Saying
"Unicode" as an umbrella term that covers the two different UTF-16
formats, UTF-8, the 4 different normalization forms, and whether the BOM
should be included, all of which are the norm in various places I'm aware
of, seems to me eminently appropriate. Thanks for not nitpicking and
insulting my intelligence. :-)
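
For anyone who wants to see what that umbrella covers, here is a small
Python sketch using only the standard library. It takes one character and
shows it in the four normalization forms and in UTF-8 and UTF-16, with and
without a BOM. The choice of U+00E9 as the sample is an assumption made for
the illustration.

```python
import unicodedata

# Sample character (assumed for illustration): LATIN SMALL LETTER E WITH ACUTE.
text = "\u00e9"

# The four normalization forms. NFC/NFKC keep one code point here,
# NFD/NFKD decompose it into "e" plus a combining acute accent.
forms = {
    name: unicodedata.normalize(name, text)
    for name in ("NFC", "NFD", "NFKC", "NFKD")
}

# The same character in several encoding forms.
utf8 = text.encode("utf-8")
utf16_le = text.encode("utf-16-le")   # little-endian, no BOM
utf16_be = text.encode("utf-16-be")   # big-endian, no BOM
utf16_bom = text.encode("utf-16")     # BOM first, then platform byte order

for name, value in forms.items():
    print(name, [hex(ord(c)) for c in value])
print(utf8, utf16_le, utf16_be, utf16_bom)
```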

>The discussion isn't about Unicode. It's about UTF-8. Which is *not*
>the native encoding of much of anything right now, although it's popular
>in some Linux circles. (It's certainly not the native encoding of
>either Java or of Windows.)

I'm not familiar with Java, and I don't know Windows in much detail, but I
do know that Windows includes facilities for working with UTF-16, and
everything I've read suggests UTF-16 will more and more become the native
encoding on that platform. On Mac OS X, Canonically Decomposed UTF-16 is
the internal representation of the system frameworks, with facilities for
producing NFKC UTF-8. On Linux I see a prevalence of UTF-8 in the places
where Unicode is supported, with the exception of XML-related software,
which seems to prefer UTF-16.

In short, everything I see suggests "Unicode" is where the world is headed
and you'd need to talk pretty fast to convince me that one of its
transformational formats will _not_ be the "native representation" of all
modern OSes (excluding Embedded systems and similar which may have extreme
needs). Are you seriously going to argue that an OS with Unicode facilities
will have trouble dealing with UTF-8 because it uses UTF-16 for preference?
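
A minimal sketch of why that seems unlikely: converting between UTF-16 and
UTF-8 is a lossless round trip, since both encode the same code points. The
helper names and the sample string below are hypothetical, and
little-endian UTF-16 without a BOM is assumed as the "native" form.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Transcode little-endian UTF-16 (no BOM) to UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


def utf8_to_utf16(data: bytes) -> bytes:
    """Transcode UTF-8 to little-endian UTF-16 (no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")


# Sample text (assumed): includes U+1D11E, which lies outside the BMP and
# therefore needs a surrogate pair in UTF-16 and four bytes in UTF-8.
sample = "bl\u00e5b\u00e6r \U0001D11E"

native = sample.encode("utf-16-le")   # what a UTF-16 system holds in memory
wire = utf16_to_utf8(native)          # what would go out as UTF-8
restored = utf8_to_utf16(wire)        # what comes back in

print(restored == native)
```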

-- 
"I don't mind being thought of as a Badguy,
 but it /really/ annoys me to be thought of
 as an *incompetent* Badguy!" -- John Moreno



