Re: URI canonicalization
On Jan 31, 2005, at 7:10 PM, Martin Duerst wrote:
> 5) Add a note saying something like "Comparison functions
> provided by many URI classes/implementations make additional
> assumptions about equality that are not true for Identity
> Constructs. Atom processors therefore should use simple
> string functions for comparing Identity Constructs."
> I think such a note could be a good balance to the normalization
> advice.
That would be a falsehood. Identifiers are not subject to
"simplification" -- they are either equivalent or not. We can
add all of the implementation requirements we like to prevent
software from detecting false negatives, but that doesn't change
the fact that equivalent identifiers always identify the same
resource. It is the author's responsibility to use URIs
(or IRIs) that are actually different, not the responsibility
of the protocol or implementation.
I am disappointed that a MUST requirement was added to IRI in the
last draft without working group review. This part
    Applications using IRIs as identity tokens with no relationship to a
    protocol MUST use the Simple String Comparison (see section 5.3.1).
    All other applications MUST select one of the comparison practices
    from the Comparison Ladder (see section 5.3 or, after IRI-to-URI
    conversion, select one of the comparison practices from the URI
    comparison ladder in [RFC3986], section 6.2)
is completely missing the point of the ladder. The identifiers may
or may not be equivalent and there is absolutely no reason for
protocols to require inaccurate comparisons. The reason for
simplification of comparison is ONLY that false negatives are
an acceptable fact of life and their elimination is an
implementation-specific decision that has no impact on
interoperable use of identifiers. That is why there is no such
requirement for URIs.
....Roy
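
The false-negative point above can be illustrated with a minimal sketch in Python. It is not taken from the IRI or URI specifications: the function names (simple_equal, normalize, normalized_equal) are hypothetical, and normalize covers only part of the syntax-based normalization described in [RFC3986], section 6.2.2 (case of scheme and host, and percent-encoding), leaving out dot-segment removal and scheme-specific rules. The sketch shows that simple string comparison can report two equivalent identifiers as different, while a step further up the ladder detects the equivalence without ever merging identifiers that really differ.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Unreserved characters per RFC 3986, section 2.3.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def _normalize_pct(text: str) -> str:
    """Decode percent-encoded unreserved characters; uppercase other escapes."""
    def fix(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, text)


def simple_equal(a: str, b: str) -> bool:
    """Simple string comparison: cheap, but may yield false negatives."""
    return a == b


def normalize(uri: str) -> str:
    """Partial syntax-based normalization (illustrative, not complete)."""
    parts = urlsplit(uri)  # urlsplit already lowercases the scheme
    # Lowercase only the host[:port] part; userinfo is case-sensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit((
        parts.scheme,
        netloc,
        _normalize_pct(parts.path),
        _normalize_pct(parts.query),
        _normalize_pct(parts.fragment),
    ))


def normalized_equal(a: str, b: str) -> bool:
    """Comparison after normalization: fewer false negatives."""
    return normalize(a) == normalize(b)


a = "HTTP://Example.COM/%7Euser"
b = "http://example.com/~user"
print(simple_equal(a, b))      # equivalent identifiers, reported as different
print(normalized_equal(a, b))  # equivalence detected after normalization
```

Both comparisons are conservative in the same direction: neither reports two non-equivalent identifiers as equal, for example "http://example.com/a%2fb" and "http://example.com/a/b" stay distinct because "/" is not an unreserved character. They differ only in how many equivalent pairs they fail to recognize.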