Re: UTF-8 syntax

From: Henry Spencer (henry@spsystems.net)
Date: Wed Dec 11 2002 - 09:05:45 CST


On Wed, 11 Dec 2002, Charles Lindsey wrote:
> >The RFC it is replacing pre-dates the 17-planes limit. The Yergeau draft
> >inherited the longer sequences from the RFC, and may lose them before it
> >becomes an RFC.
>
> I think not. I emailed Yergeau, and here is his response...

It seems to me that he's being a bit inconsistent: "yes, 10646 did
eventually say that assignments end at 10ffff, but I choose to believe
that there is still need to encode things up to 7fffffff". However,
talking him out of this may be difficult; as he notes, it is basically a
religious issue.

> >However, Charles has a point in that there is some risk in attempting to
> >anticipate future standards -- we got burned on this with 8-bit characters
> >in headers, remember.
>
> I think that is Yergeau's point too.

Actually, he is doing exactly that: disregarding the clear statements of
today's standards because he thinks they will change! (Although he is
admittedly doing so in a way that has less chance of causing trouble...)

However, my point remains: today's standard is RFC 2279, not Yergeau's
draft. Whatever might happen to the draft, today's standard does include
the 5- and 6-byte sequences, so pretending otherwise is inappropriate.
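
To make the difference concrete, here is a minimal sketch (not taken from
either document's text) of the sequence-length table as RFC 2279 defines it,
with an optional cap at 10ffff. The names utf8_length, encode_rfc2279 and
UNICODE_MAX are made up for illustration, and surrogate code points are not
rejected here.

```python
# Upper bound of each UCS-4 range and the number of bytes used to encode it,
# following the table in RFC 2279.
RFC2279_LIMITS = [
    (0x7F, 1),
    (0x7FF, 2),
    (0xFFFF, 3),
    (0x1FFFFF, 4),
    (0x3FFFFFF, 5),
    (0x7FFFFFFF, 6),
]

UNICODE_MAX = 0x10FFFF  # last code point of the 17 planes


def utf8_length(code_point, max_code_point=0x7FFFFFFF):
    """Return the number of bytes needed to encode code_point.

    Pass max_code_point=UNICODE_MAX to model the restricted form in which
    5- and 6-byte sequences can never occur.
    """
    if code_point < 0 or code_point > max_code_point:
        raise ValueError("code point out of range: %#x" % code_point)
    for upper, length in RFC2279_LIMITS:
        if code_point <= upper:
            return length


def encode_rfc2279(code_point):
    """Encode one code point, allowing the full RFC 2279 range."""
    n = utf8_length(code_point)
    if n == 1:
        return bytes([code_point])
    # Lead byte: n high bits set, then a zero bit, then the top payload bits.
    lead_mark = (0xFF << (8 - n)) & 0xFF
    out = []
    for _ in range(n - 1):
        # Each continuation byte is 10xxxxxx and carries six payload bits.
        out.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    out.append(lead_mark | code_point)
    return bytes(reversed(out))
```

With this table, everything up to 10ffff fits in at most 4 bytes; the 5- and
6-byte forms only appear for values above 1fffff, which is the range in
dispute.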

> What I propose is to leave the syntax as is, but I have added a final
> sentence...
> Is that acceptable to everybody?

That looks okay to me.

                                                          Henry Spencer
                                                       henry@spsystems.net

