RE: URIs vs Strings
Dare Obasanjo wrote:
> It's one thing to make an honest mistake about an
> arcane technology like URI canonicalization and
> another to pretend you didn't say something to mask
> ignorance about a particular topic.
You seem so intent on proving me wrong that you continue to
mischaracterize what I have written. It's sort of like listening to
republicans on a Sunday morning TV talk show...
I have never suggested that canonicalization would impact the
case of characters in the path component of URIs. I have only mentioned
case when speaking of the protocol and domain fields. As to the path
component, there are still many alternate ways to encode a single
string. For instance, everyone is familiar, I think, with the
problems with "%20" etc... If not, please review C14N. You should not
have any difficulty coming up with at least 2^6 different ways to encode
"foobar" when it appears in a path. However, only one of those
semantically equivalent encodings would be permitted by C14N rules.
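
To make the 2^6 figure concrete, here is a rough sketch in Python. It
assumes each of the six characters in "foobar" is written either
literally or as an uppercase-hex percent-escape; the helper name
`encodings` is made up for illustration. Allowing lowercase hex digits
as well would only add more variants.

```python
from itertools import product
from urllib.parse import unquote

def encodings(segment):
    """Return every spelling of `segment` in which each character is
    either left literal or replaced by its %XX escape (uppercase hex)."""
    choices = [(ch, "%{:02X}".format(ord(ch))) for ch in segment]
    return ["".join(combo) for combo in product(*choices)]

variants = encodings("foobar")

# Six characters, two spellings each: 2**6 distinct strings.
print(len(set(variants)))

# Every variant decodes back to the same path segment.
print(all(unquote(v) == "foobar" for v in variants))
```

A byte-for-byte string comparison treats all of these as different
keys, which is exactly the situation a canonical form is meant to
avoid.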
>First of all it contains bogus statements like "Windows systems
> are typically much more liberal in string matches than are
> Unix/Linux based systems when it comes to case-sensitivity"
File names in Windows systems have always been case-insensitive.
This goes back to Windows' roots in DOS and probably reaches back to
DOS's roots in the non-Unix operating systems (many from Digital) that
influenced its early definition. On the other hand, Unix has always had
case-sensitive file names. I believe that the case-insensitivity of DOS
and Windows file names has influenced DOS/Windows developers to
implement case-insensitivity as the default string match in other areas
over the years.
> Does it mean that String.Equals in the .NET
> Framework or strcmp in Visual C++ work differently
> than in Java or C on Unix systems?
I was speaking of Windows itself -- not code that has been
layered on top of it in recent years.
It remains the case that canonicalization drastically reduces
the number of ways in which any particular URI can be written. In some
cases, this is done by restricting case; in others, it is done by
restricting or requiring the use of a particular method of encoding a
character. The reduction in variety *will* result in fewer opportunities
for error. That is the whole point of C14N.
bob wyman
-----Original Message-----
From: Dare Obasanjo [mailto:kpako@xxxxxxxxx]
Sent: Saturday, August 28, 2004 9:54 PM
To: bob@xxxxxxxx; 'Julian Reschke'
Cc: 'Atom Syntax'
Subject: RE: URIs vs Strings
--- Bob Wyman <bob@xxxxxxxx> wrote:
> Dare Obasanjo wrote:
> > It seems you don't understand how canonicalization
> works.
> > To spell it out, Julian's point is that the
> canonicalization
> > of capitalizations does not apply to the path
> component
> > of HTTP URLs.
>
> It seems that you are too quick to jump on
> potential error... If
> you read my note you'll see I made *no* reference to capitalization
> when I mentioned "different ways to encode" the *path*
> component. My comments
> on case were restricted to the protocol and domain
> fields. There are
> many alternative encodings of the path element that
> are eliminated by
> canonicalization.
Then what is the point of your example in your mail at
http://www.imc.org/atom-syntax/mail-archive/msg09113.html
First of all it contains bogus statements like
"Windows systems are typically much more
liberal in string matches than are Unix/Linux based
systems when it
comes to case-sensitivity" which I have no idea how to
parse. Does it mean that String.Equals in the .NET
Framework or strcmp in Visual C++ work differently
than in Java or C on Unix systems?
Anyway specifically in your mail you wrote
"Then, newly discovered atom:id's are
tested for uniqueness by doing a lookup in the
database -- using a
string key. All sorts of problems will arise if the
string key is
case-insensitive."
Are you now claiming that you did not write this or I
somehow misunderstood that what this implies about
your belief in case insensitivity and
canonicalization? Seriously, what does that statement
mean if it doesn't imply that somehow URI
canonicalization will make it possible to do case
insensitive searches on URIs without problems?
It's one thing to make an honest mistake about an
arcane technology like URI canonicalization and
another to pretend you didn't say something to mask
ignorance about a particular topic.
> > Your case of applications generating IDs from
> database fields
> > where the same ID has different cases will not be
> helped by
> canonicalization.
> Read the example again. Canonicalization will
> require lowercase
> for protocol and domain components. Given that these
> are part of the ID,
> the probability of case-related errors is reduced --
> even though, as I
> said in my note, it is not eliminated.
WTF? So part of the ID will have normalized case. How
exactly will that prevent case related errors in
anything but HIGHLY CONTRIVED scenarios? Your argument
makes sense if canonicalization converted all text to
a single case but it doesn't which tremendously
weakens your argument.
=====
THINGS TO DO IF I BECOME AN EVIL OVERLORD #23
I will keep a special cache of low-tech weapons and train my troops in
their use. That way -- even if the heroes manage to neutralize my power
generator and/or render the standard-issue energy weapons useless -- my
troops will not be overrun by a handful of savages armed with spears and
rocks.