From: Russ Allbery (rra@stanford.edu)
Date: Wed Mar 05 2003 - 16:13:25 CST
Terje Bless <link@pobox.com> writes:
> No, the crucial difference between UTF-8 and RFC2047 in this regard is
> that the grunt work of the encoding, normalization, etc., will be
> handled either by the OS or by generic libraries for UTF-8 while RFC2047
> will, at best, be handled by special-purpose libraries for email and
> news or, at worst, be doomed to be reimplemented by each newsreader
> author.
> Offhand, I can think of no fewer than four separate implementations for
> news clients and not a single case where an implementation has been
> shared. And three of the four have a common heritage from an open source
> newsreader!
> And another very significant factor here is that after you've handled
> the RFC2047 encoding, you'll _still_ have to do the Unicode bit if you
> want to be able to handle i18n properly. Unless in among outlawing
> "8bit" UTF-8 in headers we also end up outlawing the sending of
> RFC2047-encoded UTF-8...
I understand your point, but not implementing RFC 2047 is already not an
option, and it's still going to be used in mail twenty years from now.
So....
But at this point I think we're just repeating positions already stated.
If we were designing a new protocol from the ground up, then modulo the
language tagging issues (which I don't really understand the importance of
one way or the other, so can't really comment on), I'd be all in favor of
just using UTF-8. But the world we live in is one where software
has to deal with both untagged 8-bit data and RFC 2047 already, and I
think adding a new untagged 8-bit data representation that few people are
currently using, with known compatibility issues with existing software,
is a major mistake.
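
To make "has to deal with both" concrete, here is a minimal Python sketch
of what a header reader ends up doing. It is an illustration only, not
taken from any actual newsreader: the function name header_to_text and
the latin-1 fallback are assumptions, and guessing UTF-8 first for
unlabeled 8-bit data is just one possible policy.

```python
import unicodedata
from email.header import decode_header


def header_to_text(raw: bytes, fallback: str = "latin-1") -> str:
    """Turn raw header bytes into normalized Unicode text (hypothetical helper)."""
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        # Untagged 8-bit data: nothing says which charset this is, so guess.
        # Trying UTF-8 first and then a fallback is an assumed policy.
        try:
            decoded = raw.decode("utf-8")
        except UnicodeDecodeError:
            decoded = raw.decode(fallback, errors="replace")
        return unicodedata.normalize("NFC", decoded)

    # Pure ASCII: the value may carry RFC 2047 encoded-words, each of which
    # names its own charset.
    parts = []
    for chunk, charset in decode_header(text):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii")
            except (LookupError, UnicodeDecodeError):
                # Unknown or mislabeled charset: fall back rather than fail.
                chunk = chunk.decode(fallback, errors="replace")
        parts.append(chunk)

    # The Unicode step is still needed after RFC 2047 decoding.
    return unicodedata.normalize("NFC", "".join(parts))
```

Note that the RFC 2047 branch is told the charset, while the 8-bit branch
has to guess it, and both end in the same Unicode normalization step.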
-- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>