
Re: msg-id




Thorfinn <thorfinn@xxxxxxxxxxxxxx>:

>Indeed there sometimes is, but often it's buried amongst so much
>extraneous random stuff that it's hard to find the point. 

I tend to respond to two or three things at one time. If the other two 
things don't interest you, fine, but they are not "random" and not 
"extraneous". 

As for "random" and "extraneous" ...

>Apart from all the quoted stuff below, I note that message-id in email
>appears to be *optional*, 

That's nice. Also irrelevant. Maybe even "random" and certainly 
"extraneous". 

>And even with that circumstance, such collisions could easily happen
>even *without* a past owner, since there are several places (and thus
>several different algorithms in use) where a message ID could be
>generated on the *same* host. 

The fact that there might be several different processes that create 
message ids does not mean that there are several different algorithms 
involved. It does not mean there will be collisions, either.

>And then we have the problem that a *lot* of user-agents (both mail and
>news ones) are operating behind firewalls, and do *not* have actual
>valid Internet domain names to use as the RHS,

Excuse me, but "firewall" and "no valid domain name" are not synonymous. 

>nor do they even have valid unique Internet IP addresses, 

Random and extraneous, I think you called it. So what? Where is the 
requirement for a unique IP address?

>Anyway, it is *entirely* feasible for those several different
>algorithms, even when used on a host with a proper FQDN, to generate
>colliding message-ids.

Had I proposed several different algorithms for use at the same time, 
this repeated discussion of several different algorithms would certainly 
be valid. But since I did not ...

>The point is, some of us *do* accept that even a "MUST" *does* bow to
>practicalities. 

The "practicality" in this case, as a reminder, is an algorithm that is
known not only to produce ids that look the same as those of some other
algorithm (sauce for the goose), but also to produce duplicates all by
itself. Since there are known algorithms that do NOT produce duplicates all
by themselves (the one I proposed being just one), this "practicality" is
just another way of saying "MUST doesn't mean MUST". Otherwise, you'd say
"it's broken".
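
For illustration only, here is a minimal sketch of the kind of generator
that does not collide with itself. It is not necessarily the algorithm
proposed earlier in this thread, which is not restated in this message. The
function name make_message_id, the id layout, and the host name
news.example.org are all assumptions. It also assumes that the host really
owns its FQDN and that a process id is not reused within the same second.

```python
import itertools
import os
import time

# Per-process sequence number; never repeats within one process.
_counter = itertools.count()


def make_message_id(fqdn, pid=None, now=None):
    """Return an id of the assumed form <time.pid.seq@fqdn>.

    Hypothetical layout: the time and pid separate processes on one
    host, the sequence number separates ids within one process, and
    the FQDN separates hosts.
    """
    pid = os.getpid() if pid is None else pid
    now = int(time.time()) if now is None else now
    seq = next(_counter)
    return f"<{now}.{pid}.{seq}@{fqdn}>"


ids = [make_message_id("news.example.org") for _ in range(1000)]
```

Every process on the host can run this same code, so "several places"
generating ids does not imply several algorithms.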

>So, I guess we come to the meat of it:

Were I to have taken offense at your "random" and "extraneous" comments, 
I'd make a pithy remark here, something about getting to "the meat of it" 
after such a long wind-up.

>I think that a unique message id is a *theoretically impossible* thing
>to generate. I don't care *what* algorithm you use, someone can design
>a counter algorithm, which also generates "unique" ids, which collides
>with the previous algorithm. 

It does not matter what algorithm YOU design, it will not change the 
product of MY algorithm. Now, you can deliberately create duplicates by 
using your algorithm, but that does not disprove mine, only yours. Design 
an algorithm that is: 1) get the latest message in group 'news.groups'; 
2) copy the message id from it. That doesn't mean the algorithm you are 
copying from is broken, it means YOURS is broken.
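
The two-step "counter algorithm" can be sketched as follows. The function
name copy_latest_id and the sample ids are made up for illustration; the
list stands in for the articles already present in the group.

```python
def copy_latest_id(group_ids):
    """Deliberately broken 'generator' that reuses an existing id."""
    latest = group_ids[-1]  # 1) get the latest message in the group
    return latest           # 2) copy the message id from it


# Hypothetical ids already present in news.groups.
in_group = ["<100.7.0@a.example>", "<100.7.1@a.example>"]

copied = copy_latest_id(in_group)
is_duplicate = copied in in_group
```

The duplicate comes entirely from the copier. The ids already in the list
are unchanged, and so is whatever produced them.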

>Now, whilst we *can* override the mail standards to say that the
>uniqueness is now a "SHOULD" rather than a "MUST"...

When did we do that? Who has even suggested it?

"Charles Lindsey" <chl@xxxxxxxxxxxxxxxx>:

>So what does "MUST be unique" actually mean, when it comes to be
>interpreted in the Real World (TM)? It means that, if ever you find a
>system that has generated a duplicate message-id, you have a perfect
>excuse to go and beat him about the head for being in clear violation of
>the standard.

This is going to be fun. I'm going to write a system that posts a flood of 
messages using message ids from Charles's message id space, and when he 
finds his system producing a "duplicate", he'll beat himself up. After 
all, the duplicate is clearly his fault.

>In practical terms, if a site is persistently generating bad message-ids,
>then complaining to its provider/ISP/upstream is likely to be effective,

Since duplicates are not accepted at injection, it would be pretty hard 
for anyone but the person running the injecting agent or the posting agent 
to ever see one. Why would they complain to their upstream news site, or 
to their ISP?
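
A minimal sketch of that check, assuming the injecting agent keeps a
history of the ids it has already accepted. The class name History and its
offer method are hypothetical and do not describe any particular server.

```python
class History:
    """Record of message ids already accepted by this agent."""

    def __init__(self):
        self._seen = set()

    def offer(self, message_id):
        """Accept a new id (True) or refuse one already seen (False)."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


history = History()
first = history.offer("<100.7.0@a.example>")   # new id, accepted
second = history.offer("<100.7.0@a.example>")  # same id again, refused
```

The refusal goes back to whoever offered the second article, which is why
nobody further downstream ever sees the duplicate.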

>... because even the more clueless providers can see that interoperability is
>being affected.

Excuse me? If my incoming gateway correctly detects and prevents a loop by 
using the message id in the incoming message, and that results in the 
injecting agent not being able to inject the article, how has 
"interoperability" been affected? Isn't that the "Right Thing" to do?