[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
6. "Implementation and Performance Issues"
Steve,
Here's the response to item 6, which I summarized as:
6. "Implementation and Performance Issues". Disagreement as to whether there
is any significant difference in implementation difficulty or performance.
http://www.imc.org/ietf-ldup/mail-archive/msg00614.html
***
Comments follow original below.
***
> The tree is especially important for "sporadically-connected"
> replicas since
> they are the most likely to generate conflicts. In most cases
> the tree would
> be trivial or very small, but if it is not recognized as a tree, that
> "simplification" makes everything else become incredibly complicated.
>
> I don't see a large tree as an implementation problem in itself - the
> overhead is only about 12 bytes of RAM per conflicting change.
[Steve]
You would also need to store enough information to reconstruct the
previous versions of an entry on each strand. To change the current
entry version from one strand to another requires unwinding index changes,
etc, back to the common branching point and then applying the saved
change requests on the new strand. URP never has to go backwards.
[Albert]
Performance is dominated by the normal case of no concurrent changes. If
concurrent changes are more than a small proportion of updates, the
application is just not suitable for the Directory as configured. Concurrent
changes are just an "unavoidable evil".
Even if the mechanism for dealing with that "unavoidable evil" was highly
inefficient, it would not have much effect on overall performance because it
is rarely used - it is just there to avoid the far more inefficient
alternative of manual intervention by system administrators.
For that reason there is no significant difference in performance between
MDCR and URP.
When there is no conflict, the tree would be trivial and there would be no
"going backwards".
When there is conflict, for the same reasons, performance is dominated by
the case of just 1 conflict. In that case "unwinding ... and then
applying..." is exactly equivalent to replacing the original entry with the
replacement, which happens to be precisely what URP does. The difference
being that MDCR does it based on the actual conflict tree whereas URP uses
roughly equivalent components of the lexical ordering to pretend that
changes occurred on a linear time scale even though they did not. There is
no difference in the complexity of actually implementing either ordering,
just a greater level of abstraction required in visualizing the tree rather
than succumbing to the illusion of sequential changes.
In the very rare cases when there is more than one conflict with a longer
single strand or a branching tree, MDCR actually has to do less work than
URP because URP makes each change visible as it arrives and therefore has to
update indexes (including "unwinding" indexes), whereas MDCR leaves some
arriving changes invisible and stores them only in case they are needed
later, so it does not have to update those indexes.
MDCR does have to store every arriving change report, but this can simply be
appending them to a log file, which can be very efficient for sequential
seeks on a separate log disk spindle since not even random seeks for primary
key indexing is required. Actual details are of course up to the
implementation.
Usually the log records would still be in RAM buffers if they need to be
read back within a reasonably short replication latency, but would normally
have to be written to disk for safety.
The guesstimate of 12 bytes of RAM overhead per tree entry was based on a 32
bit (4 byte) version number, 2 x 16 bit replica number for up to 16K
replicas per replicated area, and a 32 bit pointer into a log file up to 4GB
long. Note that only these pointers are needed to maintain the shape of the
tree in RAM, the actual contents of the changes required to unwind and wind
forwards can be read back from the pointer into the log file.
An implementation might choose to break down a switch from one part of the
tree to another into "primitives" as you suggest, unwinding the index
entries a step at a time and then rewinding them a step at a time. In this
case that choice would not really matter as the frequency is negligible
anyway. I would of course be more inclined to do it "atomically" ;-) as a
single deletion of the currently visible entry and replacement by the new
visible entry.
In general I think implementing trees etc is not a big issue for software
development. Unfortunately they are harder to describe in standards, which
looks like a bigger problem at the moment.
PS I will deal with your example of a long strand when returning to item 5.