Re: Comparison of hoffman-idn-reg and jseng-idn-admin
At 9:56 PM +0430 3/31/03, Roozbeh Pournader wrote:
A registry MUST NOT blindly combine multiple tables which have
overlapping equivalences. Instead, the registry MUST carefully analyze
every instance in the combined table where a base character has one or
more different variants and select the desired set of variants for the
base character.
(But unfortunately doesn't suggest any guidelines when doing so.)
Correct. I think that if I give a few suggestions for guidelines, it
will lead readers to think that the problem is simple, which it is
not. Either we list lots and lots, or none.
I will add a note about why I have done none; see below.
Unfortunately, the list ends here. Specifically, there are features that
are *required* for Arabic but are missing in the language of the tables.
And therefore it will be added. (Note to the list: I doubt that
Arabic is the only language that I missed. If you know of others,
please speak up!)
1. Mandatory equivalences as opposed to secondary/variant equivalences.
This feature is necessary for defining equivalences between European and
Arabic-Indic digit shapes in Arabic labels, for example.
Very good point! This is a registry-specific early mapping step that
must be done. I think it should be done before the variants are
checked in the table; do folks here agree?
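A minimal sketch of what such an early mapping step could look like, run before any variant-table lookup. The direction of the mapping (folding Arabic-Indic shapes to European digits), the choice to include the extended Arabic-Indic digits, and the names EARLY_MAP and early_map are all assumptions made for illustration; a registry could just as well fold in the other direction.

```python
# Hypothetical registry-specific early mapping: fold Arabic-Indic digit
# shapes to European digits before the variant table is consulted.
ARABIC_INDIC_ZERO = 0x0660           # U+0660..U+0669
EXTENDED_ARABIC_INDIC_ZERO = 0x06F0  # U+06F0..U+06F9

# Code point -> code point mapping, in the form str.translate() expects.
EARLY_MAP = {}
for i in range(10):
    EARLY_MAP[ARABIC_INDIC_ZERO + i] = ord("0") + i
    EARLY_MAP[EXTENDED_ARABIC_INDIC_ZERO + i] = ord("0") + i


def early_map(label: str) -> str:
    """Apply the mandatory equivalences; other characters pass through."""
    return label.translate(EARLY_MAP)
```

With this in place, two labels that differ only in digit shape become identical before variants are ever checked, which is what distinguishes a mandatory equivalence from a variant.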
2. Clear language about conflict resolution. There need to be clear
guidelines or recommendations for cases where the variant labels
associated with two registered labels intersect.
This will happen with almost any multi-language Arabic-script zone
(e.g. U+0649 vs U+064A vs U+06CC).
I am unclear on how this differs from point #1. If those three
characters are supposed to be represented by only one of them in
names, then the registry-specific early mapping step will take care
of them. Or is that not what you are referring to? Please be more
specific.
3. Clear language with specific guidelines and real-life examples for
merging tables for different languages/locales.
Currently, I believe that there are three possibilities:
- the merging is trivially easy because there is no overlap
- the merging is a policy decision by the registry at the time of
table-making as to which language "wins" for the overlapping
characters
- it is impossible to register without knowing the supposed language
of the registration
I can add more discussion of that, but the third option is not
"merging", it is forcing the problem on the registrant (who might be
sly and use it as a way to make the bundle contain things that the
registry might not have intended). From my reading of the JET
document, they call the third option "merging" when in fact it is
just the opposite: it prevents merging by pointing at one table.
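The first two possibilities can be sketched as follows, assuming a table is held as a mapping from base character to a set of variants. The function name merge_tables and the sample variant sets are hypothetical and chosen only to show an overlap; they are not real registry policy for any language. The sketch merges whatever does not overlap and reports the rest, so the "which language wins" decision stays with a person rather than being made blindly.

```python
def merge_tables(a, b):
    """Merge two variant tables (base char -> set of variant chars).

    Returns (merged, conflicts). A base character that appears in both
    tables with different variant sets is left out of `merged` and
    reported in `conflicts` for a policy decision.
    """
    merged = {}
    conflicts = {}
    for base in a.keys() | b.keys():
        in_a, in_b = a.get(base), b.get(base)
        if in_a is None or in_b is None or in_a == in_b:
            # No overlap, or both tables agree: the merge is trivial.
            merged[base] = set(in_a if in_a is not None else in_b)
        else:
            # Overlapping equivalences: do not combine them blindly.
            conflicts[base] = (set(in_a), set(in_b))
    return merged, conflicts


# Made-up tables for two languages sharing a script (illustration only).
table_one = {"\u06CC": {"\u064A", "\u0649"}}
table_two = {"\u06CC": {"\u064A"}, "\u064A": {"\u0649"}}

merged, conflicts = merge_tables(table_one, table_two)
```

Here U+06CC ends up in conflicts because the two tables disagree about its variants, while U+064A merges without any decision being needed.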
4. Better syntax for the table. Don't you agree that a U+ABCDU+BCDAU+CDAB
syntax is unreadable? Why can't one use a space?
Spaces as separators in tables cause problems going through gateway
programs. I'm happy to add an inter-character separator of "-".
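A rough sketch of reading and writing a field with a "-" separator between characters. The exact field syntax is not settled here, so the U+XXXX token form joined by "-" and the helper names parse_sequence and format_sequence are assumptions for illustration.

```python
def parse_sequence(field: str) -> str:
    """Turn 'U+0649-U+064A' into the two-character string it names."""
    chars = []
    for token in field.split("-"):
        if not token.startswith("U+"):
            raise ValueError(f"expected U+XXXX, got {token!r}")
        # int() raises ValueError on empty or non-hex digits.
        chars.append(chr(int(token[2:], 16)))
    return "".join(chars)


def format_sequence(text: str) -> str:
    """Inverse of parse_sequence: one U+XXXX token per character."""
    return "-".join(f"U+{ord(c):04X}" for c in text)
```

Because "-" cannot appear inside a U+XXXX token, splitting on it is unambiguous, and the field contains no whitespace for a gateway to mangle.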
--Paul Hoffman, Director
--Internet Mail Consortium