
Re: #1416: USEPRO 3.9: Reinjection and Injection-Date



In <8764b1gpsk.fsf@xxxxxxxxxxxxxxxxxxxxx> Russ Allbery <rra@xxxxxxxxxxxx> writes:

>First, let's go back to the Netnews protocol as it exists today and look
>at how injection and reinjection work without Injection-Date.

OK, so we stop firing obscure (but possible) scenarios back and forth
while we examine these possibilities.

>Every article has a unique identity in its message identifier, but
>...........  Therefore, an article's functional identity is
>the Date and message identifier pair.

Yes, modulo some "fuzziness" in the role of the Date, as Forrest has
pointed out.

Agreed paragraphs snipped ......

>Note that for proto-articles ........ it's best practice for a posting
> agent injecting an article at multiple injection agents to provide the
> complete identity of the article (both Message-ID and Date header fields)
.....

More than that, both Usepro-06 and -07 REQUIRE the posting agent to do
that before multiple injection.

>  Reinjection similarly can be assured of not creating loops simply
>by preserving the identity of the article (both Message-ID and Date header
>fields) when reinjecting.

Hence it follows that if we were to make the rules for Injection-Date
exactly the same as the current rules for Date, it would result in a
system no different from the present situation, except for an increase in
the staleness margin for late-injected articles.
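
To make the "increase in the staleness margin" concrete, here is a minimal
Python sketch, not taken from any draft or server. It assumes the staleness
check reads Injection-Date when present and falls back to Date otherwise.
The 10-day STALE_AFTER value, the function names and the sample dates are
all invented for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumed cutoff for illustration only; real servers choose their own value.
STALE_AFTER = timedelta(days=10)

def protocol_date(headers):
    """Date used for the staleness check: Injection-Date if present, else Date."""
    raw = headers.get("Injection-Date") or headers["Date"]
    return parsedate_to_datetime(raw)

def is_stale(headers, now):
    """True if the article is older than the assumed cutoff."""
    return now - protocol_date(headers) > STALE_AFTER

now = datetime(2007, 3, 20, 12, 0, tzinfo=timezone.utc)

# Composed on 5 March, judged on Date alone: 15 days old, so stale.
composed_only = {"Date": "Mon, 05 Mar 2007 09:00:00 +0000"}

# Same composition date, but injected on 19 March: judged on Injection-Date.
late_injected = dict(composed_only)
late_injected["Injection-Date"] = "Mon, 19 Mar 2007 09:00:00 +0000"

print(is_stale(composed_only, now), is_stale(late_injected, now))
```

The only behavioural difference from today is the second case: the article
written long before it was injected gets its staleness clock started at
injection rather than at composition.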

>....  So currently, reinjection breaks down if the
>reinjection cannot happen within the staleness period of the Date header
>....  The only difference is that reinjecting
>agents are generally the sole path to reach a particular site, so if the
>reinjection is delayed, articles go missing.

Eh? ITYM "reinjecting agents are generally the sole path for injected
articles from a particular site to reach the wider Usenet, so if ...".

>  Because of this, some
>reinjecting agents change the identity of the message by regenerating the
>Date header.

OK, let us coin the term "Magic Exception" for that practice. Clearly it
is an "Exception" to the laid-down rules, but why "Magic"? Because:

. It is hard to define exactly when the practice is safe, beyond "you
would know one of these situations when you see it".
. It is certainly only for those who "really REALLY" know what they are
doing, and certainly "Kiddies, don't try this at home" applies.
. It may not be documented in any standard, but people "in the know" will
recognize and condone it so long as it is wisely used - a bit like we
discussed for the $alz convention.

>  More on the risks of that in a moment.

>Now, the problem with this is that the Date header field carries two
>meanings.  It's part of the identity of the article, and it's also
>human-readable information about when the message was composed.  Some
>posting agents always treat it as the former and either let the injecting
>agent generate it or only generate it at injection time.  Some posting
>agents treat it as the latter and generate it at message composition time.

And we want to encourage the latter, for conformity with RFC 2822, and
contrary to the advice in s-o-1036, so explicit words of encouragement
would be in order.

>The working group therefore introduced a new header, Injection-Date, which
>serves *only* the protocol function and separates the protocol function
>from the user presentation function.  Date is now of interest only to
>humans and contains the composition time, which serves no protocol
>function.  Injection-Date is part of the identity of the message.

Right!

>However, as part of this change, we also changed who was responsible for
>establishing the identity of a message.  Now, the injecting agent *always*
>establishes the identity of a proto-article, and the posting agent isn't
>permitted to do so.

Yes, Usepro-06 did not permit that but, in the light of this discussion, I
am thinking it went too far. I have no problem with letting the posting
agent do it in appropriate circumstances.

>  Only when a proto-article is injected does it acquire
>a unique identity in practice; prior to that, it will only have a message
>identifier, which is not sufficient alone to prevent duplication of the
>message.

Look at it this way. It is the "act of injection" which requires an
Injection-Date to be created. Whether it is created by the injector (the
posting agent) or by the injectee (the injecting agent) is a secondary
matter, so long as one of them does it. Of course, there are good reasons
why normal practice should be for the injecting agent to do it, but
special cases such as multiple- or re-injection could be different.

But, from the posting agent's POV, the "act of injection" comes later than
the "act of authorship". Typical current posting agents work as follows.
The poster writes an article and presses "send". At that point, a Date
header is created and the posting agent attempts to make an NNTP
connection to a server. If that fails (because the server cannot be
reached), it puts the article into an "Outbox" where it sits until a
server can be reached. Maybe it tries every 5 minutes, or maybe there is a
mechanism for it to be activated when next the user dials in. Either way,
only then should the Injection-Date be added (assuming the posting agent
is going to add it, which would probably not be the usual case anyway).
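
The workflow just described can be sketched as follows. This is a toy model
under stated assumptions, not the behaviour of any actual newsreader: the
PostingAgent class, its method names and the add_injection_date switch are
hypothetical, and the NNTP connection is reduced to a boolean.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

class PostingAgent:
    """Toy posting agent: Date at "send", Injection-Date only at injection."""

    def __init__(self, add_injection_date=False):
        # Hypothetical switch: most posting agents would leave this off and
        # let the injecting agent supply Injection-Date.
        self.add_injection_date = add_injection_date
        self.outbox = []

    def send(self, body, now, server_reachable):
        # "Act of authorship": the Date header is fixed when the user presses send.
        self.outbox.append({"Date": format_datetime(now), "body": body})
        return self.flush(now, server_reachable)

    def flush(self, now, server_reachable):
        # "Act of injection": happens only once a server can be reached.
        if not server_reachable:
            return []
        ready, self.outbox = self.outbox, []
        if self.add_injection_date:
            for article in ready:
                article["Injection-Date"] = format_datetime(now)
        return ready

agent = PostingAgent(add_injection_date=True)
written = datetime(2007, 3, 5, 9, 0, tzinfo=timezone.utc)
online = datetime(2007, 3, 8, 18, 30, tzinfo=timezone.utc)

agent.send("hello", written, server_reachable=False)  # goes to the Outbox
injected = agent.flush(online, server_reachable=True)  # retried three days later
print(injected[0]["Date"], "|", injected[0]["Injection-Date"])
```

The point of the sketch is only the ordering: Date is never touched after
composition, and Injection-Date does not exist until the article actually
leaves the Outbox.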

>This isn't a problem only for reinjection.  It's a problem for every case
>where an article is touched by multiple injecting agents.  In particular,
>since the introduction of Injection-Date, a proto-article that's injected
>at multiple injecting agents will be assigned a slightly different
>identity by each one.  In the normal case, these identities will only vary
>by seconds or at most minutes, which in practice is highly unlikely to
>cause problems, but theoretically we still broke the identity model.

So if posting agents are permitted to add Injection-Date, you might
recommend them to do so when multi-injecting, though it is hardly
necessary with those short delays (delays of several days are a different
matter of course, meriting at least a warning in the document). But
typical current posting agents don't do multi-injecting, so this is only
likely to arise with savvy users who have written their own injecting
scripts (or, more likely, are running their own private news servers).

>Since the introduction of Injection-Date, time-delayed multiple injection
>is no longer safe against duplication. ... Prior to Injection-Date,
>it could make sure that all copies had the same identity and the later
>injection would never duplicate, just possibly suffer from poor
>propagation because it's stale.

Unless it took advantage of the Magic Exception, which you say some of
them did.
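
A small Python model of why preserving the identity is safe and the Magic
Exception is not. It is a deliberate simplification and rests on several
assumptions: the Relay class and the 10-day CUTOFF are invented, and history
entries are expired by article date, whereas real servers keep history in
their own ways.

```python
from datetime import datetime, timedelta, timezone

CUTOFF = timedelta(days=10)  # assumed staleness/history window

class Relay:
    """Toy relaying agent: remembers message identifiers only within CUTOFF."""

    def __init__(self):
        self.history = {}  # message identifier -> date of the accepted article

    def offer(self, msgid, date, now):
        # Forget history entries that have aged out of the window.
        self.history = {m: d for m, d in self.history.items() if now - d <= CUTOFF}
        if now - date > CUTOFF:
            return "stale"
        if msgid in self.history:
            return "duplicate"
        self.history[msgid] = date
        return "accepted"

relay = Relay()
t0 = datetime(2007, 3, 1, tzinfo=timezone.utc)
soon = t0 + timedelta(days=2)
late = t0 + timedelta(days=20)

first = relay.offer("<a@example>", t0, t0)       # original injection
again = relay.offer("<a@example>", t0, soon)     # second copy, same identity
kept = relay.offer("<a@example>", t0, late)      # late copy, Date preserved
magic = relay.offer("<a@example>", late, late)   # late copy, Date regenerated
print(first, again, kept, magic)
```

With the identity preserved, the late copy is refused one way or the other;
with the Date regenerated, it is accepted a second time, which is exactly
the duplicate the rules are meant to prevent.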

>Of course, serial multiple injection, reinjection, suffers the most, since
>it is the most likely to happen some time later.  So in usepro-06,
>reinjection was distinguished from injection and only in the reinjection
>case (as determined by the injecting agent) was the article permitted to
>retain its original Injection-Date.

Actually, it did say that an existing Injection-Date (however arising)
MUST NOT be altered. But it was never my intention that reinjection would
require prior agreement from the injecting agent (which is not to say that
an injecting agent has not the usual right to refuse articles which it
detects to be reinjections).


>So, we have some competing goals:

Though actually the areas of competition are not all that
great, leading us to hope that they can be bridged.

........

>So far, we have a couple of different solutions.

> * The current draft, my approach, basically takes the stance of "well, we
>   asked for it, so we take the consequences."........ the agent
>   doing the reinjection is required to verify as well as possible that it
>   won't create the chance of duplicates.  In practice, the drawbacks are
>   hopefully limited.

The chief drawback is the difficulty for the reinjecting agent to do that
verification.

>  There's no way for that check to be perfect, but
>   hopefully it's no worse than the other case of multiple injection that
>   we're already dealing with.  As a nice side effect, this also means
>   that delays in reinjection don't cause articles to be lost.

> * usepro-06 makes reinjection a special case with a different set of
>   rules that require that the identity of the article be preserved.  In
>   other to do this, negotiation between the posting agent and the
>   injecting agent is required .....

No, there was no necessity for negotiation.

>  ..........  The
>   core point is that reinjection preserves article identity, but in the
>   process maintains the current drawback that delays in reinjection may
>   cause articles to be lost.  There may be a bit of an attractive
>   nuisance here, the same one that's present in the existing protocol,
>   for reinjecting agents to drop the Injection-Date header anyway so as
>   to not lose articles, thereby reintroducing the problems the current
>   draft causes.

Which would be a Magic Exception, neither better nor worse than that
practised by current Magicians.

>Forrest has proposed only allowing reinjection between disjoint networks.

Apparently he did not intend to propose that. Anyway, it would be
dangerous and unenforceable.

>...........  The
>difficulty here, as Charles correctly points out, is that there's no
>general way of establishing that two networks are disjoint.  One can
>sometimes determine for individual articles that they passed through a
>host on network A and on network B and therefore the networks are not
>disjoint, but to be sure would require full knowledge of at least one of
>the networks.  There are some common cases where this is possible, but
>it's not possible in general, and it's possible to think that you know the
>networks are disjoint and be wrong.

Exactly! That is an excellent summary of my position. It is even possible
not to be aware that you are reinjecting.

>There is another solution that require modifying USEFOR:

> * Drop Injection-Date entirely and go back to the current protocol.  This
>   solves this problem and reintroduces the problem that we were trying to
>   fix originally.

No. If, as a minimum, Injection-Date follows exactly the same rules as the
existing Date, then you are never any worse than the present situation.
OTOH, you are sometimes better because you can gain some extra time before
the staleness check hits you.

>There is another solution that arguably requires modifying USEFOR:

Actually, I think the existing USEFOR works here just fine. What it says
is:

   "This header field MUST be inserted whenever an article is injected."

No mention of whether it is the injector or the injectee that is
responsible for that, so we are free to specify the mechanism as seems
best [1]. So the first sentence in what follows is not needed.

> * Change the definition of the Injection-Date header field so that the
>   posting agent can provide it, and indeed MUST provide it if they wish
>   Date to be treated as a comment rather than a protocol element.
>   Require posting agents doing multiple injection to include either a
>   Date or (preferrably) a Posting-Date header.

ITYM s/Posting-Date/Injection-Date/. Note that we already require posting
agents to provide Date and Message-ID when multi-injecting. We might
choose to add Injection-Date to that.

>                                                Don't allow injecting
>   agents to set Injection-Date at all if Date was provided.

No No! "Don't allow injecting agents to set Injection-Date at all if
Injection-Date was already provided". If Date alone was already provided,
the injecting agent should presume it was an authorship date and add a
proper Injection-Date (even if it was only a few seconds later). That
satisfies the MUST in USEFOR.
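
As a sketch of the rule I am proposing (and only a sketch: the inject
function and the dict representation of headers are my own invention for
illustration, not text from any draft):

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def inject(headers, now):
    """Toy injecting agent: add Injection-Date only if none was supplied.

    An existing Injection-Date is never altered, and Date is never touched;
    a Date arriving on its own is presumed to be the authorship date.
    """
    out = dict(headers)
    if "Injection-Date" not in out:
        out["Injection-Date"] = format_datetime(now)
    return out

now = datetime(2007, 3, 8, 18, 30, tzinfo=timezone.utc)

# Date alone: the injecting agent adds a proper Injection-Date.
date_only = inject({"Date": "Mon, 05 Mar 2007 09:00:00 +0000"}, now)

# Injection-Date already supplied (multi-injection or reinjection): left as is.
supplied = inject(
    {"Date": "Mon, 05 Mar 2007 09:00:00 +0000",
     "Injection-Date": "Tue, 06 Mar 2007 10:00:00 +0000"},
    now,
)
print(date_only["Injection-Date"], "|", supplied["Injection-Date"])
```

Either way the article leaves the injecting agent with an Injection-Date,
whoever supplied it.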

>  This has the
>   advantage of solving the multiple injection problem (present in both my
>   draft and in usepro-06), and allows reinjecting agents to assert the
>   same article identity in the same way as is possible with the current
>   protocol.  Basically, this would make Injection-Date *entirely* the
>   same as the protocol purpose of the current Date header, rather than
>   mostly the same.  It has the same difficulties with delayed
>   reinjection.

Which brings us back to the Magic Exception.

Yes, with the above provisos, I would support that proposal.

>One could make the argument that we could do this now, that nothing in the
>current USEFOR definition says that the posting agent can't provide this
>header.  The current text implies it, but doesn't say so outright.

Nor could it, because the terms "posting agent" and "injecting agent" are
not even defined in USEFOR.

>  The
>header name is confusing if the posting agent is providing it, but we
>could live with that.

Not confusing at all if you regard "injection" as an interaction between
an injector and an injectee.

>The drawback of this approach, of course, is that it still doesn't deal
>with the issue of delayed reinjection.

Right! So now we need to take a serious look at the "Magic Exception".
Firstly, we need to understand in which circumstances it would be safe.
For sure, it is safe in the case of a user who runs a private news server
on his own machine and never interacts with users outside that machine
except via proper Usenet injecting agents. Fortunately, that is the
commonest case in which reinjection is likely to arise (next most common
will be exotic gatewaying situations, and after that I am not sure beyond
people who are reinjecting without realising it).

Are there any other situations where it is guaranteed safe? Perhaps a small
private network which is a 'peninsula' to the rest of Usenet, with the
reinjecting server on the 'isthmus'. But can you be sure it is really a
'peninsula'? Perhaps you can if it is all behind a firewall that blocks
all unauthorized outgoing NNTP. But the further you go beyond that, the
more "iffy" it becomes.

So, if/when we feel we understand that, do we then write some form of
Magic Exception into the draft, and if so how? Or do we leave it as a
piece of Magic only to be practised by Magicians who "really REALLY" know
what they are up to? Because, for sure, such magicians will indeed exist
and do it whatever we say (merely claiming, if asked, that their "private
server" is really just an exotic user agent from the POV of the rest of
Usenet).

>  Some reinjecting agents are going
>to want to change the identity of a message so that it will still be
>accepted even though it's old.  Doing so inherently runs the risk of
>creating duplicates, but doing so is going to be a common desire --
>gateways go down or off-line for a while, networks go out, people want to
>pull down older news to bootstrap or seed a new Netnews network, etc.  If
>we just outlaw it, we have no control over how people do it if they choose
>to break the protocol.  If we can come up with something useful to say
>about *how* to change the identity of the message, maybe we can reduce the
>risk.

Yes, I think that is a good summary of what we would need to look at. But
all those issues already arise, in much the same form, on the existing
network, and the predicted "Death of Usenet" still has not happened.


[1] OK, there is a hint that Injection-Date is done by news servers in an
example in the definition of "generate", but it would be a minor tweak to
fix that.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@xxxxxxxxxxxxxxxx      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5