Harald Tveit Alvestrand wrote:
> UCS-4 is 1 representation - UTF-8 and UTF-16 are representations of the
> same charset. They have promised (ugh) that it is now only growing, not
> changing.
>
> >maybe? i think different proposals will have answer to this. i think we should
> >leave it open, and not limit to only iso10646 or some other encodings.
My point is that we may not want to make UCS-4 or ISO10646 part of the requirement. Why don't we leave it open, allowing a wider variety of proposals? For example, one possible proposal might use the ISO-2022-X and ISO-8859-X character sets.
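To make the quoted point about representations concrete, here is a minimal Python sketch showing one ISO10646 code point serialized three ways. The choice of U+4E00 is only for illustration, and UTF-32 big-endian is used here as a stand-in for the UCS-4 form.

```python
# One code point, three serializations of the same character.
cp = 0x4E00          # illustrative choice: first code point of the CJK block
ch = chr(cp)

utf8_hex = ch.encode("utf-8").hex()        # variable-length, 3 bytes here
utf16_hex = ch.encode("utf-16-be").hex()   # one 16-bit unit, big-endian
ucs4_hex = ch.encode("utf-32-be").hex()    # fixed 4 bytes, stand-in for UCS-4

print(utf8_hex, utf16_hex, ucs4_hex)

# Decoding any of the three yields the same code point again.
assert bytes.fromhex(utf8_hex).decode("utf-8") == ch
```

A proposal based on ISO-2022-X or ISO-8859-X would differ at the character set level, not just in the byte serialization shown here.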
> >this will be a problem if ISO10646 is used. because of the CJK unification
> >(arggh who is the idiot?), japanese & chinese falls under the same U+4E00
> >code space. if one folds and the other not, i think it is fairly obvious
> >how messy it is going to be.
>
> Is this a fact or a "maybe a problem"?
> I think we need to be as specific as possible here....for each folding
> problem, name a glyph that has the problem, if possible.
Having done a Unicode CJK implementation myself, my answer is that it is "a fact". However, I do not rule out the possibility that I am a lousy programmer/designer :-) Maybe someone can come up with a better design and algorithm.
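As one possible illustration of the folding concern, here is a small Python sketch. The folding table is hypothetical and made up for this example (it is not taken from any proposal); it maps U+570B to U+56FD, two ideographs that both appear in Chinese and in Japanese text. Because the code points are unified, a rule written for one language cannot tell which language a given string is in.

```python
import unicodedata

# Hypothetical folding table, an assumption for illustration only:
# U+570B folded to U+56FD.
HYPOTHETICAL_FOLD = {"\u570b": "\u56fd"}

def in_cjk_unified_block(ch: str) -> bool:
    """True if ch lies in the basic CJK Unified Ideographs block U+4E00..U+9FFF."""
    return 0x4E00 <= ord(ch) <= 0x9FFF

def fold(text: str) -> str:
    """Apply the hypothetical table; the input carries no language information."""
    return "".join(HYPOTHETICAL_FOLD.get(ch, ch) for ch in text)

label = "\u570b"
print(unicodedata.name(label))        # the name says "CJK", not a language
print(in_cjk_unified_block(label))
print(fold(label) == "\u56fd")        # folded regardless of intended language
```

Whether this kind of collision is what actually bites in practice depends on the folding rules a given proposal adopts; the sketch only shows that the code point alone does not carry the language.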
--
Harald Tveit Alvestrand, EDB Maxware, Norway
Harald.Alvestrand@xxxxxxxxxxxxxx