Re: A simpler proposal: PaceAtomIDAsString
At 12:18 AM +0100 8/4/04, Graham Parks wrote:
I agree 100% that the id should always be treated as a string. My
problem is that with no rules on format, a publisher can only
guess that your identifier is unique, which isn't a problem with,
say, tag URIs.
As has been said on this list before, publishers are "only guessing"
that for any ID, including one that is a tag: URI.
At 1:30 AM +0200 8/4/04, Bjoern Hoehrmann wrote:
Hmm, it would make more sense to me to specify that the content must
be for example the uc(md5_hex()) of some reasonably unique data
That's good if you want opaque identifiers, but I haven't heard any
need (or even desire!) for those so far.
and
that implementations must not change the id without the user's consent,
specifically not if [common cases].
That wording would be true for any atom:id proposal. But it's kind of
obvious, isn't it? Do we really need to state it?
Your proposal just obfuscates the issue, as common
practice will likely be to use URIs anyway, which would likely lead
to all the problems we have when the spec says it is a more special
kind of string.
I'm not sure why you think it will be common practice. It's easy to
do the right thing here and munge together a DNS name, a time, and a
random string.
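As a minimal sketch of that "munge together" approach, the following Python joins a DNS name, a UTC timestamp, and a random string. The function name `make_atom_id`, the `:` separator, and the timestamp layout are assumptions made for illustration; the pace itself prescribes no format.

```python
import secrets
from datetime import datetime, timezone


def make_atom_id(dns_name, when=None):
    """Join a DNS name, a UTC timestamp, and a random string into one ID.

    Hypothetical helper: the layout is illustrative, not taken from any spec.
    """
    if when is None:
        when = datetime.now(timezone.utc)
    # Normalize to UTC so the trailing "Z" is accurate.
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # 8 random bytes rendered as 16 hexadecimal characters.
    nonce = secrets.token_hex(8)
    return f"{dns_name}:{stamp}:{nonce}"


if __name__ == "__main__":
    print(make_atom_id("example.org"))
```

The result is only ever meant to be stored and compared as an opaque string; nothing in it is intended to be parsed back out.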
At 1:11 AM +0100 8/4/04, Bill de hÓra wrote:
According to the pace, it's not plain text for the rest of us, it's
Unicode for the rest of us.
Those are *identical*.
If the comparison rules are to be character based over Unicode,
that would suggest the Unicode encoding must also be specified so we
know how many bytes are being used per character and so on.
Nope, exactly wrong. It is character-by-character, not byte-by-byte.
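To illustrate the distinction, here is a small Python sketch, assuming the IDs have already been decoded from whatever encoding each document uses. The helper name `same_atom_id` and the sample identifiers are made up for this example.

```python
def same_atom_id(a: str, b: str) -> bool:
    """Compare two already-decoded IDs character by character.

    Python str values are sequences of Unicode code points, so == is a
    character comparison. No case folding or URI-style normalization
    is applied.
    """
    return a == b


ident = "tag:example.org,2004:café"

# The same characters serialized in two different document encodings.
utf8_bytes = ident.encode("utf-8")
utf16_bytes = ident.encode("utf-16-le")

# The byte sequences differ ...
assert utf8_bytes != utf16_bytes
# ... but once decoded, the characters are identical.
assert same_atom_id(utf8_bytes.decode("utf-8"), utf16_bytes.decode("utf-16-le"))

# IDs that merely look like equivalent URIs are still different strings.
assert not same_atom_id("http://Example.org/1", "http://example.org/1")
```

The number of bytes per character never enters the comparison, which is why the document encoding does not need to be fixed for the rule to work.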
But:
"Even if a particular atom:id instance looks like a URI, it SHOULD
NOT be treated as one."
doesn't seem consistent with the idea that we can have URIs if we
want. The pace is saying (to me) something along these lines "if it
looks like a HTTP URL you shouldn't run GET on it; if it looks like
a NewsML URI you shouldn't infer from the versioning bits".
Correct.
We have experience that suggests people will not be able to oblige
themselves to this kind of constraint. IMO it should be struck.
How do others feel about this? I thought the discussion trend was
going towards "do not dereference atom:ids".
May I suggest this wording for the time being:
"The "atom:id" element's content conveys a permanent,
globally unique identifier for the feed. It MUST NOT change over
time, even if the feed is relocated. An atom:head element MAY
contain an atom:id element, but MUST NOT contain more than one. The
content of this element, when present, MUST be a string of Unicode
characters encoded as UTF-8. When atom:id elements are compared,
they MUST be compared on a character-by-character basis.
I don't think that would work in a document whose encoding is
anything other than UTF-8.
It is not a goal that atom:id be usable for retrieval of information.
I'm OK with that wording.
--Paul Hoffman, Director
--Internet Mail Consortium