
Re: well-formedness error



--On Saturday, June 19, 2004 7:22 AM -0700 Tim Bray <Tim.Bray@xxxxxxx> wrote:
> On Jun 18, 2004, at 12:26 PM, Julian Reschke wrote:
> 
>> How about modifying the proposal to say "either UTF-8 or UTF-16"?
>> 
>> Reasons:
> 
> Another reason: UTF-16 is in some senses the "native" encoding for
> Java and C#. -Tim

Two reasons against adding UTF-16:

1. It triples the test cases (UTF-8, plus UTF-16 in both BE and LE).
2. It is only a little smaller. The ASCII XML syntax (elements
   and attributes) takes twice as much space in UTF-16. That takes
   away from the gains for high-codepoint characters.
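
To make point 2 concrete, here is a minimal Python sketch that
compares byte counts for the same markup in both encodings. The
two sample fragments and the encoded_sizes helper are made up for
illustration; they are not our search results format and not a
measurement of it.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # BE and LE produce the same length; only the byte order differs.
        "utf-16": len(text.encode("utf-16-le")),
    }

# Hypothetical fragments: identical ASCII markup, different title text.
ascii_heavy = '<result rank="1"><title>XML encodings</title></result>'
cjk_heavy = '<result rank="1"><title>日本語のタイトル</title></result>'

for label, sample in (("ascii", ascii_heavy), ("cjk", cjk_heavy)):
    sizes = encoded_sizes(sample)
    print(label, sizes["utf-8"], sizes["utf-16"])
```

In a fragment like the second one, each title character costs
three bytes in UTF-8 and two in UTF-16, but every markup character
costs one byte in UTF-8 and two in UTF-16, so the markup can easily
cancel the savings. How it nets out depends on the ratio of markup
to text in the real format.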

I'll run some quick tests with our XML search results format
and different encodings, but not today.

wunder
--
Walter Underwood
Principal Architect, Verity Ultraseek