Re: UTF-8 over RFC 2047 (Re: Call for Usefor to recharter)
[Usefor folks: Apologies for sending this twice, but the first attempt
didn't make it to ietf-822, as I am not a subscriber there.]
Dave Crocker <dcrocker@xxxxxxxxxxxxxxx> writes:
> Hence, UTF-16 and UTF-8 are methods of encoding a larger bit space into a
> smaller representation space, producing variable-length strings. One crams
> the larger space into a 16-bit world. The other crams it into an 8-bit
> world.
>
> Depending on the situation, there can be processing or space
> efficiencies gained by one encoding over another.
>
> But there is no theoretical or aesthetic superiority that can be claimed by
> one over the other.
>
> Cramming those bits into a 7-bit environment is just one more cramming
> effort. It stands equal to the others as an alternative that has benefits
> and detriments.
Why let it stop with 7 bits? Why not cram it into one bit, while you're
at it?
> The confusion on this issue probably stems from the fact that you can use
> existing data viewers -- such as text editors -- to view the result of a
> 7-bit encoding and cannot use such "legacy" services for viewing UTF-8 or
> UTF-16.
>
> If you do not have UTF-8 or UTF-16 tools, you cannot view the data at all.
Incorrect. If you have an 8-bit editor that does not understand UTF-8,
you will see the text, but it will look "ugly".
The same applies if you have an any-bit editor which does not understand
RFC2047.
But:
1) UTF-8 looks less ugly than RFC2047 when incorrectly displayed.
2) UTF-8 editors are commonly available. RFC2047-capable editors are not.
And if you only have a 7-bit editor? Then you are grossly out of date
anyway.
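
To make the comparison in points 1 and 2 concrete, here is a minimal
Python sketch. The sample word "Malmö" and all function names are my own
choices for illustration and come from nowhere in this thread. The sketch
shows what an 8-bit editor assuming ISO 8859-1 would display for UTF-8
text, and what an editor unaware of RFC2047 would display for the same
text as an encoded-word. The Q encoder is deliberately conservative (it
escapes everything except ASCII letters and digits), so it is valid but
not the shortest possible form. Which result is "less ugly" is of course
a judgment call.

```python
import base64


def as_seen_by_latin1_editor(text: str) -> str:
    """Render the UTF-8 bytes of text the way an ISO 8859-1 editor would."""
    return text.encode("utf-8").decode("latin-1")


def rfc2047_b(text: str) -> str:
    """Build an RFC 2047 encoded-word using the B (base64) encoding."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "=?UTF-8?B?" + payload + "?="


def rfc2047_q(text: str) -> str:
    """Build an RFC 2047 encoded-word using a conservative Q encoding."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            out.append(ch)               # ASCII letters and digits pass through
        else:
            out.append(f"={byte:02X}")   # everything else becomes =XX
    return "=?UTF-8?Q?" + "".join(out) + "?="


def is_7bit_clean(data: bytes) -> bool:
    """True if no byte has its high bit set."""
    return all(b < 0x80 for b in data)


if __name__ == "__main__":
    sample = "Malmö"  # hypothetical sample text
    print("UTF-8 in a Latin-1 editor:", as_seen_by_latin1_editor(sample))
    print("RFC 2047, B encoding:     ", rfc2047_b(sample))
    print("RFC 2047, Q encoding:     ", rfc2047_q(sample))
    print("UTF-8 is 7-bit clean:     ", is_7bit_clean(sample.encode("utf-8")))
```

In the misread UTF-8 the ASCII letters survive untouched and only the
non-ASCII character is garbled, whereas the encoded-word wraps the whole
word in charset and encoding markers. The last line shows the price: the
raw UTF-8 bytes are not 7-bit clean, while both encoded-words are.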
--
Erland Sommarskog, Stockholm, sommar@xxxxxxxxxx