
model with overlapping variants



Let's suppose for a moment that it turns out to be essential to use a
relation that is not transitive, as Roozbeh has suggested, and as (I now
think) Paul would agree.  That is, even if labels X and Y are unrelated,
there might exist a label V that is related to both.

But let's assume the relation can be symmetric; that is, there is no
need to distinguish between "X is related to Y" versus "Y is related to
X" versus "X and Y are related".  Someone tell me if you think that's an
unreasonable assumption.

For the moment I'll call the relation "confusability".  Given any two
labels (in no particular order), they are either confusable or not, and
it is possible to compute that boolean value.  The computation might
enumerate all the labels that could be confused with one of the inputs
and check whether the other input is among them, or it might be possible
to be more efficient by using a clever algorithm.  We'll worry about
that later.
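
To make the enumeration idea concrete, here is a minimal sketch in
Python.  The table VARIANTS and the helper names are hypothetical,
invented only for illustration, and the sketch assumes the relation is
built from single-character rows applied position by position.  Real
tables may well be richer than that.

```python
from itertools import product

# Hypothetical table (an assumption, not a proposed one): each
# character maps to the characters it may be confused with.
VARIANTS = {
    "a": {"b"},
    "b": {"c"},
}

def alternatives(ch):
    """Return ch plus every character the table pairs with it, in either direction."""
    alts = {ch} | VARIANTS.get(ch, set())
    alts |= {key for key, values in VARIANTS.items() if ch in values}
    return alts

def confusable_set(label):
    """Enumerate every label that could be confused with the given label."""
    return {"".join(chars) for chars in product(*(alternatives(c) for c in label))}

def confusable(x, y):
    """Symmetric boolean check: is y among the labels confusable with x?"""
    return y in confusable_set(x)
```

With this toy table, "a" and "b" are confusable, and so are "b" and "c",
but "a" and "c" are not, which is exactly the non-transitive situation
described above.  The enumeration grows exponentially with label length,
so this is the naive version, not the clever one.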

In this model, there are no atomic groups of labels.  In a different
model that used an equivalence relation, we could be sure that every
group is either entirely registered or entirely available.  But with
this non-transitive relation, it's possible the set of labels confusable
with a submitted label is partially registered and partially available.

What properties ought a registration policy to have in this model?

 1. Two labels in the zone belonging to different registrants must not
    be confusable.  Corollary:  Two labels in the zone belonging to
    different registration bundles must not be confusable, even if they
    belong to the same registrant (because the registrant could sell
    them to different buyers).

 2. Bundles must not tie together unrelated things in the zone.  (That
    would cheat the registry out of fees for what should be separate
    bundles.)  There are at least three ways to define the relatedness
    property that a bundle must satisfy:

     a. loosely related: For every pair of labels X and Y in the zone
        that are in the same bundle, there must exist a sequence X, Z1,
        Z2, Z3, ..., Zk, Y such that every pair of adjacent labels in
        the sequence is confusable.  In other words, X and Y must be
        related by the transitive closure of the confusability relation.

     b. tightly related: Every pair of labels X and Y in the zone that
        are in the same bundle must be confusable.  In other words, the
        labels in the zone in a particular bundle form a clique.

     c. radially related:  Among the labels in the zone that belong to a
        particular bundle, one is designated as the center, and all the
        others are confusable with it.
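
The three definitions can be sketched as predicates over the labels of
one bundle.  This is only an illustration: the function names are mine,
"confusable" is assumed to be any symmetric two-argument predicate, and
for 2a the "intermediates" argument is an assumed finite pool of
candidate Z labels (the definition itself does not say where the Z
labels come from).

```python
from itertools import combinations

def tightly_related(labels, confusable):
    """2b: every pair of labels in the bundle is confusable (a clique)."""
    return all(confusable(x, y) for x, y in combinations(labels, 2))

def radially_related(labels, center, confusable):
    """2c: every label other than the center is confusable with the center."""
    return center in labels and all(
        confusable(center, x) for x in labels if x != center
    )

def loosely_related(labels, confusable, intermediates=()):
    """2a: all labels are linked by chains of confusable pairs.

    Chains may pass through the bundle's own labels or through the
    assumed pool of intermediate labels.
    """
    labels = set(labels)
    if not labels:
        return True
    pool = labels | set(intermediates)
    start = next(iter(labels))
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        # pool - seen is a fresh set, so updating seen inside the loop is safe.
        for other in pool - seen:
            if confusable(current, other):
                seen.add(other)
                frontier.append(other)
    return labels <= seen
```

Note that a bundle satisfying 2b also satisfies 2c for any choice of
center, and a bundle satisfying 2c also satisfies 2a.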

I think property 1 plus one of 2a, 2b, or 2c is all we need.  Am I
overlooking anything?

Notice that the properties speak only of labels in the zone.  They place
no constraints on labels that are not in the zone.  Nobody really cares
what sort of behind-the-scenes bookkeeping the registry might be doing
with the labels that are not in the zone, as long as the labels in the
zone are well behaved.

2a looks difficult to enforce, and might lead to bundles that are too
large, so let's put that one aside for now.

2b and 2c both look reasonable to me.

How could a registry enforce those properties?  Here's one general
approach:

Each bundle contains a set of related labels, all of which are in the
zone.  Bundles do not contain "blocked" labels.  For 2c, one of the
labels is flagged as the primary label and cannot be removed from the
bundle.

When a registrant asks to create a new bundle, containing a single
initial label, the request is denied if the label is confusable with any
label already in the zone.

When a registrant asks to add a label to their existing bundle, the
request is denied if the label is confusable with any label in any other
bundle.  The request is also denied if the label is not confusable with
every label already in the bundle (under 2b) or with the primary label
(under 2c).

When a registrant asks to remove a label from their existing bundle, the
request is denied if it is the only label in the bundle (under 2b) or
the primary label (under 2c).

A request by a registrant to remove one of their entire bundles is never
denied.
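
The four rules above can be sketched as a toy registry.  Everything here
is an assumption made for illustration: the class and method names, the
integer bundle ids, the convention that the first label of a bundle is
its primary label under 2c, and the extra check that refuses to add a
label the bundle already holds.  The confusability predicate is passed
in, so the sketch says nothing about how it is computed.

```python
class Registry:
    """Toy registry; bundles hold only labels that are in the zone."""

    def __init__(self, confusable, mode="2b"):
        self.confusable = confusable  # symmetric predicate on two labels
        self.mode = mode              # "2b" (clique) or "2c" (radial)
        self.bundles = {}             # bundle id -> labels; index 0 is primary under 2c
        self._next_id = 0

    def _clashes(self, label, skip=None):
        """True if label is confusable with a label in any bundle other than skip."""
        return any(
            self.confusable(label, other)
            for bid, labels in self.bundles.items() if bid != skip
            for other in labels
        )

    def create_bundle(self, label):
        """Return a new bundle id, or None if the request is denied."""
        if self._clashes(label):
            return None
        bid = self._next_id
        self._next_id += 1
        self.bundles[bid] = [label]
        return bid

    def add_label(self, bid, label):
        labels = self.bundles[bid]
        if label in labels or self._clashes(label, skip=bid):
            return False
        # 2b: must be confusable with every member; 2c: with the primary only.
        required = labels if self.mode == "2b" else labels[:1]
        if not all(self.confusable(label, other) for other in required):
            return False
        labels.append(label)
        return True

    def remove_label(self, bid, label):
        labels = self.bundles[bid]
        if label not in labels:
            return False
        if self.mode == "2b" and len(labels) == 1:
            return False  # the only label cannot be removed
        if self.mode == "2c" and label == labels[0]:
            return False  # the primary label cannot be removed
        labels.remove(label)
        return True

    def remove_bundle(self, bid):
        """Removing an entire bundle is never denied."""
        del self.bundles[bid]
```

Each request scans every label in the zone, which is fine for a sketch
but is exactly the cost that precomputation or indexing would have to
avoid.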

We do not need to specify whether a registry keeps records regarding
labels that are not in the zone.  It might do so, as a precomputation to
help with the required checks, or there might be clever pattern-matching
algorithms and clever indexing data structures that allow the registry
to perform the required checks without extra storage.

If this model looks promising, the big question is whether there
is a confusability relation (parameterized by tables) that is both
expressive enough to be useful and tractable enough to permit a feasible
implementation of the model.

With the sort of tables that have been proposed so far, the checks to be
performed are effectively regular expression matches on the zone file
or on bundles.  And not arbitrary regular expressions, but a restricted
class of regular expressions built from rows of the mapping table.  It
wouldn't surprise me if there were tricks that could be played, but I
have no expertise in this area.
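
As a rough illustration of what I mean by a restricted regular
expression, here is a sketch that turns a submitted label into a pattern
and matches it against the zone.  The table and the function names are
hypothetical, and the sketch assumes every row maps a single character
to single characters.  It is not meant to reflect any table actually
proposed.

```python
import re

# Hypothetical single-character mapping table (an assumption).
TABLE = {
    "a": {"b"},
    "b": {"c"},
}

def alternatives(ch):
    """Return ch plus every character the table pairs with it, in either direction."""
    alts = {ch} | TABLE.get(ch, set())
    alts |= {key for key, values in TABLE.items() if ch in values}
    return alts

def label_pattern(label):
    """Build one alternation group per character of the submitted label."""
    groups = []
    for ch in label:
        choices = "|".join(re.escape(a) for a in sorted(alternatives(ch)))
        groups.append("(?:" + choices + ")")
    return re.compile("".join(groups))

def zone_conflicts(label, zone):
    """Return the labels in the zone that the pattern for label matches in full."""
    pattern = label_pattern(label)
    return [entry for entry in zone if pattern.fullmatch(entry)]
```

For the label "ab" the pattern is (?:a|b)(?:a|b|c), a concatenation of
small alternations with no repetition operators, which is the kind of
restricted shape that might admit the tricks mentioned above.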

AMC