Re: Bidi issues
Adam M. Costello scripsit:
> The motivation behind the bidi restrictions is that there exist very
> different strings that get displayed exactly the same way. I don't know
> exactly why, but it's a consequence of the bidi algorithm. You can try
> reading UAX#9 if you like. Good luck. :)
Here's an example. I will use UPPER CASE to represent Arabic letters
and lower case to represent Latin letters, as is usual in examples of
this kind. If you see, totally out of context, the string
    the arabs = BARA-LA
you cannot tell whether this says "the arabs = AL-ARAB", as would be the
case in an English context, or "AL-ARAB = the arabs", as would be the case
in an Arabic context. In running text, it's possible to disambiguate,
but not in an identifier that has to work correctly out of context.
Consequently, the stringprep rules forbid an identifier that contains
both LTR and RTL characters. It really has nothing to do with the encoding
of the characters, only with their appearance.
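For the curious, here is a rough Python sketch of that kind of check. It is only an illustration, not the stringprep algorithm itself: the function name mixes_directions is made up for this example, and it looks only at the Unicode bidi class of each character ("L" for left-to-right, "R" or "AL" for right-to-left), ignoring the other requirements of the real rules.

```python
import unicodedata

# Bidi classes treated as right-to-left: "R" (e.g. Hebrew) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def mixes_directions(label: str) -> bool:
    """Return True if label contains both LTR and RTL letters.

    Hypothetical helper for illustration only; the real stringprep
    bidi rules impose further conditions that are not checked here.
    """
    classes = {unicodedata.bidirectional(ch) for ch in label}
    has_rtl = bool(classes & RTL_CLASSES)
    has_ltr = "L" in classes
    return has_rtl and has_ltr


if __name__ == "__main__":
    latin = "arabs"
    arabic = "\u0627\u0644\u0639\u0631\u0628"  # five Arabic letters
    print(mixes_directions(latin))
    print(mixes_directions(arabic))
    print(mixes_directions(latin + arabic))
```

An identifier for which this returns True is the kind that can look the same as a differently ordered string once it is displayed out of context.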
--
With techies, I've generally found              John Cowan
If your arguments lose the first round          http://www.reutershealth.com
Make it rhyme, make it scan                     http://www.ccil.org/~cowan
Then you generally can                          jcowan@xxxxxxxxxxxxxxxxx
Make the same stupid point seem profound!           --Jonathan Robie