Re: model with overlapping variants



Mark Davis <mark.davis@xxxxxxxxx> wrote:

> 1. I have not been following this discussion in detail, but using
> a non-transitive relation is really pretty ugly.  It makes the
> implementation significantly more difficult,

Agreed.

> and I believe will make the matching much harder to understand for
> end-users.

I'm skeptical of that claim.  Look at the end-user reactions to the
handling of <sharp-s> in case folding.  Users find it reasonable that
<sharp-s> matches SS, and that SS matches ss, but then they're surprised
when <sharp-s> matches ss (and they complain about it).  The transitive
closure is needed for technical reasons, not because it meets user
expectations.
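
To make the <sharp-s> case concrete, here is a small sketch using
Python's built-in str.casefold(), which applies Unicode full case
folding.  The helper name fold_match is my own, purely for
illustration; it is not taken from any of the proposals under
discussion.

```python
# Matching by comparing case-folded forms is an equivalence relation,
# so it is transitive whether or not users expect it to be.

def fold_match(a: str, b: str) -> bool:
    """Return True if a and b are equal after Unicode case folding."""
    return a.casefold() == b.casefold()

SHARP_S = "\u00DF"  # LATIN SMALL LETTER SHARP S

# The two matches users find reasonable:
print(fold_match(SHARP_S, "SS"))  # sharp-s vs. SS
print(fold_match("SS", "ss"))     # SS vs. ss

# The match that follows by transitivity and tends to surprise people:
print(fold_match(SHARP_S, "ss"))  # sharp-s vs. ss
```

All three comparisons succeed, because every one of the strings
folds to "ss".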

> Can you point me to the use-cases that people felt were problems with
> this?

Roozbeh gave this example:

U+0649 Arabic letter alef maksura
U+064A Arabic letter yeh
U+06CC Arabic letter Farsi yeh

In Arabic, the first two letters are distinct, and the third is not
used.  In Persian, the first two letters are not used, and the third
looks exactly like one or the other of them, depending on its position
in a word: like yeh in initial and medial position, and like alef
maksura in final and isolated position.

So if a zone wants to support both Arabic and Persian, it needs
to prevent different registrants from having names whose only
difference is that one uses the Persian letter and the other uses the
identical-looking Arabic letter.  But if we require the relation to be
transitive, we make the two distinct Arabic letters equivalent, which
will surprise users, and possibly upset both users and registries
(because names would be blocked for no apparent reason).
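
Here is a rough sketch of the difference, for illustration only.
The table VARIANTS and the helpers conflicts and transitive_closure
are hypothetical names of mine, not part of any proposed model, and
the table holds only the two pairs from Roozbeh's example.

```python
from itertools import combinations

ALEF_MAKSURA = "\u0649"  # ARABIC LETTER ALEF MAKSURA
YEH = "\u064A"           # ARABIC LETTER YEH
FARSI_YEH = "\u06CC"     # ARABIC LETTER FARSI YEH

# Hypothetical variant table: unordered pairs the zone treats as
# confusable.  The relation is symmetric but not transitive.
VARIANTS = {
    frozenset((FARSI_YEH, ALEF_MAKSURA)),
    frozenset((FARSI_YEH, YEH)),
}

def conflicts(a, b, pairs=VARIANTS):
    """True if a and b are identical or listed as a variant pair."""
    return a == b or frozenset((a, b)) in pairs

def transitive_closure(pairs):
    """Add the pair (a, c) whenever (a, b) and (b, c) are present."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        for p, q in combinations(list(closed), 2):
            if p & q:            # the two pairs share a character
                merged = p ^ q   # the two characters they do not share
                if merged not in closed:
                    closed.add(merged)
                    changed = True
    return closed

# Intransitive table: the Persian letter conflicts with each Arabic
# letter, but the two Arabic letters stay distinct.
print(conflicts(FARSI_YEH, YEH), conflicts(ALEF_MAKSURA, YEH))

# After forcing transitivity, the two Arabic letters conflict as well.
closed = transitive_closure(VARIANTS)
print(conflicts(ALEF_MAKSURA, YEH, closed))
```

With the table as given, the Arabic-only pair conflicts only after
the closure is taken, which is the blocking "for no apparent reason"
described above.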

I've heard that there are examples involving simplified and traditional
Chinese characters that motivate intransitive relations, but I'm not
familiar with those.

AMC