RE: reconciliation of mutually exclusive operations
[Kurt]
If the entry (contents trimmed):
dn: cn=entry, dc=example, dc=net
cn: entry
boolean: FALSE
is held by N servers using multi-master replication, and
N independent, properly authorized clients each issue the
following operation to a different (willing and able) server:
dn: cn=entry, dc=example, dc=net
changetype: modify
delete: boolean
boolean: FALSE
-
add: boolean
boolean: TRUE
-
One client should get a success result code.
N-1 clients should get a non-success result code.
I believe that any replication specification which cannot ensure
this behavior is not acting in accordance with X.500(93) as required
by LDAPv3.
[Albert]
Correct, but unfortunately the "should" behaviour is not possible with
ANY form of multi-master replication, since it would require contacting
other servers before returning the response code, and by definition,
multi-master does not do that. The only "standard" way to do it would
be to unilaterally disconnect after making the change, without giving a
response code, since the standard does tell DUAs not to assume anything
in that situation.
I'm not suggesting disconnection would be a good idea, but it is an option.
(The standard does permit reads conflicting with what has just been written,
since that happens with single master too, as users aren't supposed to be
aware of which particular DSA they are asking).
For multi-master, there is NO other way to avoid them all getting a
"success" response when they all perform the operation around the same
time at different replicas.
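As a rough illustration of the point above, here is a toy Python model (not taken from any specification; the class and function names are made up for this sketch, and "noSuchAttribute" is assumed as the result for deleting a value that is no longer present). Each replica checks the modify against its own local copy only, so every client sees "success" until replication catches up, whereas a single master serializes the requests.

```python
# Toy model only: class and function names are hypothetical, not from any spec.

class Replica:
    """One server holding its own copy of cn=entry,dc=example,dc=net."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.entry = {"cn": {"entry"}, "boolean": {"FALSE"}}

    def modify(self, attr, delete_value, add_value):
        """Apply 'delete value, then add value' against the local copy only."""
        values = self.entry.get(attr, set())
        if delete_value not in values:
            # Assumed result name for deleting a value that is not present.
            return "noSuchAttribute"
        values.remove(delete_value)
        values.add(add_value)
        self.entry[attr] = values
        return "success"


def multi_master(n):
    """N clients, each contacting a different replica before any replication."""
    replicas = [Replica(i) for i in range(n)]
    return [r.modify("boolean", "FALSE", "TRUE") for r in replicas]


def single_master(n):
    """N clients, all serialized through the same server."""
    master = Replica(0)
    return [master.modify("boolean", "FALSE", "TRUE") for _ in range(n)]


print(multi_master(4))
print(single_master(4))
```

In this model only the single-master run gives the "one success, N-1 failures" outcome Kurt describes; the multi-master run cannot, because no replica knows about the others' changes when it answers.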
There may well be applications which rely on the currently expected
behaviour, eg to ensure that one, and only one, of those N DUAs
performs some action.
MDCR could also follow up with revocation notifications to all but one of
the DUAs, and a confirmation notification to one, just to inform them of the
conflict. They could wait for the notification before performing the action,
as a sort of delayed response, while other applications would act on the
immediate operation response. But I doubt that would be very useful in
practice, other than as a reminder that they should not use multi-master in
this situation.
Perhaps it could be useful - for some applications that need high
availability but not necessarily local availability with its longer
convergence delay.
We won't actually know without feedback.
This sort of behaviour contrary to that currently expected certainly
requires at least formal justification (eg in the requirements doc),
but there really isn't much more we can do about it, apart from
warning people not to use multi-master in that way.
If some of the DUAs had attempted to simply delete the attribute "boolean"
(assumed optional), both URP and MDCR would report "success" both to
those that now expect the attribute to be absent and to those that
expect it to be TRUE. But MDCR would at least follow up with a
notification to those which have the wrong expectation as a result, and
an (optional) confirmation to those that have the right expectation. Again,
this might not help much, but at least users and administrators would find
out promptly by email that something odd was going on as a result of
inappropriate use of multi-master, rather than just be left wondering why
applications dependent on standard behaviour have intermittent failures
(dependent on the replication schedule and which servers happen to be
contacted, and therefore about as relevant to users in understanding an
intermittent problem as the phase of the moon).
It's rather more serious for a non-boolean attribute. If the current value
of "color" is YELLOW and some try to set it to GREEN and others to RED, by
deleting YELLOW first and then adding GREEN or RED respectively, they
will again all get a "success" response, which confirms the inapplicability
of multi-master replication to traffic control applications ;-)
With MDCR, convergence would eventually be to just GREEN, with a
notification to those that thought it was now RED, if a GREEN DUA got in
first (according to the clock at that replica), or vice versa. The ordering
is essentially arbitrary since clocks are not synchronized. If all N clocks
happened to be identical (whether or not the "real" time was identical), it
would be the color at the replica with the highest replica number, which is
just as arbitrary, but no more so.
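One reading of that ordering rule, sketched in Python purely for illustration (the `Change` record and `resolve` function are hypothetical names, and this is not MDCR's actual algorithm): the change with the earliest local timestamp wins, a tie on the timestamp goes to the highest replica number, and every DUA that set a different color is due a notification.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """A 'delete YELLOW, add <color>' accepted at one replica (hypothetical record)."""
    timestamp: int   # local clock reading at the accepting replica
    replica: int     # replica number
    color: str       # value the DUA tried to set


def resolve(changes):
    """Pick the surviving color and list the changes whose DUAs must be notified.

    Assumed rule: earliest timestamp wins; on equal timestamps the
    highest replica number wins.
    """
    winner = min(changes, key=lambda c: (c.timestamp, -c.replica))
    notify = [c for c in changes if c.color != winner.color]
    return winner.color, notify


changes = [
    Change(timestamp=100, replica=1, color="RED"),
    Change(timestamp=90, replica=2, color="GREEN"),
    Change(timestamp=95, replica=3, color="RED"),
]
color, notify = resolve(changes)
print(color, [c.replica for c in notify])
```

Since the timestamps come from unsynchronized clocks, the winner here is no less arbitrary than the tie-break on replica number; the sketch only shows that the outcome is at least deterministic and the losers are identifiable.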
With URP, the same would be true for a single-valued color attribute, except
that there would be no notifications.
For a multi-valued color attribute the eventual result of URP would be both
GREEN and RED, contrary to the expectations of ALL the DUAs. When they try to
fix it by deleting the color they think shouldn't be there, the result would
of course be empty, and a schema violation if the color was mandatory, but
they would again all be told "success" and with no follow up notification.
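The multi-valued case can be sketched as plain set operations. This is a deliberately simplified Python model, not URP itself: it merges per-value adds and deletes, ignores the per-value timestamps a real protocol would carry, and `merge_value_ops` is a made-up name.

```python
def merge_value_ops(initial, ops):
    """Merge per-value operations from all replicas into one value set.

    Simplifying assumption: deleting an absent value is a no-op at merge
    time, and no value is both added and deleted in the same round, so
    the order in which the operations arrive does not matter.
    """
    values = set(initial)
    for action, value in ops:
        if action == "add":
            values.add(value)
        elif action == "delete":
            values.discard(value)
        else:
            raise ValueError(f"unknown action: {action}")
    return values


# Round 1: some DUAs replace YELLOW with GREEN, others with RED.
round1 = [
    ("delete", "YELLOW"), ("add", "GREEN"),
    ("delete", "YELLOW"), ("add", "RED"),
]
after_round1 = merge_value_ops({"YELLOW"}, round1)

# Round 2: each side "fixes" the result by deleting the other side's color.
round2 = [("delete", "RED"), ("delete", "GREEN")]
after_round2 = merge_value_ops(after_round1, round2)

print(sorted(after_round1), sorted(after_round2))
```

Every operation in both rounds would have been acknowledged with "success" at the replica that accepted it, yet the merged attribute ends up with no values at all.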
With the multi-valued attribute ldapACI, which IS used for a sort of
"traffic control" by the directory itself, it gets rather interesting when
DUAs concurrently delete an old value and replace it with different new
values. I'm looking forward to Steve's response to the request for an
example in item 9.
These sorts of issues and examples should be clearly explained in the
requirements document to get feedback on how actual existing applications
would be affected.
BTW can anyone do a quick unofficial description of the outcome of the
Pittsburgh discussion re the requirements document, prior to the official
minutes?