
Re: [idn] Re: FYI: BOF on Internationalized Email Addresses (IEA)



Dave Crocker <dhc@xxxxxxxxxxxx> wrote:

> [UTF-8] might be a more efficient encoding, but it is no more "native"
> or "direct" or "raw" than ACE.

I know this is beside the point, but...

UTF-8 is more compact than Punycode only for strings with a lot of
ASCII characters, which is typical of Latin-based scripts.  For small
non-Latin scripts (like Cyrillic and Arabic), Punycode is significantly
more compact than UTF-8 (and for some of them, including all the Indian
scripts, which need three bytes per character in UTF-8, the difference
is quite large).  For large scripts (like Han
and Hangul) Punycode and UTF-8 are comparable, and UTF-16 beats them
both by a wide margin.
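
A rough way to check this is to encode a few sample words and count
bytes.  The sketch below uses the "punycode" codec from Python's
standard library; the helper name encoded_sizes and the sample words
are illustrative assumptions, not anything from the discussion above,
and numbers for such short strings are only indicative.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes of `text` under each encoding.

    Punycode here is the bare encoding, without the "xn--" ACE prefix.
    UTF-16 is measured big-endian so that no byte-order mark is counted.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "punycode": len(text.encode("punycode")),
        "utf-16": len(text.encode("utf-16-be")),
    }


# Sample words chosen for illustration only (assumption).
SAMPLES = {
    "Latin": "example",
    "Cyrillic": "пример",
    "Devanagari": "उदाहरण",
    "Han": "例子",
}

if __name__ == "__main__":
    for script, word in SAMPLES.items():
        sizes = encoded_sizes(word)
        print(f"{script:11s} {sizes}")
```

Longer strings should show the trend more clearly, since Punycode's
cost for the first non-ASCII character is amortized over the rest.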

AMC