Re: Injection-Date and reinjection



In <87fyayf8fc.fsf@xxxxxxxxxxxxxxxxxxxxx> Russ Allbery <rra@xxxxxxxxxxxx> writes:

>First, one issue that's *not* directly part of this.  Even if we keep the
>current USEPRO method, we need to have a staleness check on the Date
>header field at injection.

Well there we have the first major disagreement. The whole point of
introducing Injection-Date was to remove the need for staleness checks
based on Date (except in cases where older software had not provided an
Injection-Date).

The Date header now indicates when the article was composed (following RFC
2822), so it can arise that the poster carries it around on his laptop for
several days before it actually gets injected. That could reasonably be a
week later (whereas Russ proposes to declare it stale after 72 hours,
which is far too short). Maybe even a fortnight if he writes the article
and then goes off on holiday for two weeks and omits to dial in to his
server before he goes. Clearly, leaving it for several months would be
stupid, but such timing issues are a matter of sensible practice, and not
a matter for the protocol.
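
To make the intended rule concrete, here is a rough sketch in Python of a
staleness check that consults Injection-Date and falls back to Date only
when older software has not supplied one. The names reference_date and
is_stale, the cutoff parameter, and the choice to treat an article with
neither field as stale are my own assumptions for illustration; nothing
here is taken from the USEPRO text.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def reference_date(headers):
    """Pick the date a staleness check should look at.

    Injection-Date wins when present; Date is only a fallback for
    articles injected by software that did not add Injection-Date.
    Returns None if neither field exists.
    """
    raw = headers.get("Injection-Date") or headers.get("Date")
    if raw is None:
        return None
    parsed = parsedate_to_datetime(raw)
    if parsed.tzinfo is None:
        # Assumption: a date without a usable zone is read as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_stale(headers, now, cutoff):
    """Return True if the article is older than `cutoff` relative to `now`."""
    ref = reference_date(headers)
    if ref is None:
        # Assumption: with no date at all, err on the side of rejecting.
        return True
    return now - ref > cutoff
```

With a 72-hour cutoff, an article composed a week ago but carrying an
Injection-Date from an hour ago is not stale, whereas the same article
without Injection-Date is.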

>  We can't make a single leap to allowing stale
>Date header fields from not having any Injection-Date header field since
>most existing software is going to ignore the Injection-Date header field
>entirely and impose staleness checks on Date after injection.  Therefore,
>an article with a stale Date is going to propagate very poorly, and
>accepting such an article just to let it be dropped silently later by most
>of the Netnews network would be a poor interoperability choice.

OK, so let it propagate poorly. In fact, I don't think its propagation
will be all that bad. For sure, the fast transit backbone may decline it,
but it will still flood its way through the system via serving agents with
normal expiry times. I would have expected it to have reached most sites
within one day. And as Injection-Date becomes more widely implemented,
things will of course get better.

But this is a thing that is easily testable. I have just posted a series
of articles "1 day stale", "2 days stale" up to "9 days stale" to
misc.test, and I will see what the autoresponders tell me (and it would be
useful for members of this list to look out for them also). I expect to
see some long and unusual Paths for the staler ones, but I expect them
all to arrive eventually. We shall see.

>Now, let's look at the conditions under which this situation arises.


>....  The INN
>code to permit people to inject posts via IHAVE is something I committed
>only under protest, and I'm entirely content to have its behavior declared
>non-conformant by a standard.  It is a *very* specialized configuration
>for strange cases where the people involved know just what they're doing.

Not really. I regard it as a method of avoiding unnecessary wastage of
server resources in cases where articles are injected at more than one
server. If one of the servers has already received it, then you save the
trouble of transmitting the whole thing again only for it then to be
thrown away.

>That being said, there are disjoint Netnews networks that people want to
>connect, and multiple injection or proxying injection isn't always the
>best approach (for one, it requires a real-time connection with the remote
>injecting agent).  My definition of two disjoint Netnews networks is two
>Netnews networks between which there is no relaying agent to relaying
>agent connection.

Not so. You also have to ensure that there is at most one *-agent to
injecting agent route.

>  Significant examples are:

> * The many people who do not have any relaying agreement with a large
>   Netnews network such as Usenet, only a reading agent relationship, but
>   still want the features of a full-fledged news server (longer article
>   retention, multiple users sharing the same spool, faster service to
>   reading agents, local newsgroups without requiring clients use multiple
>   servers, off-line operation).  This has become increasingly common over
>   the last five years for both personal and small organization use.

That is probably safe for personal use, but in the case of small
organizations, and increasingly as those organizations get larger, there
is the possibility that some well-meaning user in the organization will
create some surreptitious path/leak to the outer world, even if only for
some supposedly safe group that suddenly becomes unsafe because of an
unanticipated crosspost. It is a sad fact of life that such leaks DO
happen.

> * Ad hoc connections to a stand-alone news server that serves only a
>   particular hierarchy. ...

And again these can lead to unanticipated leaks.


>The goal is to handle these cases with a solution that has the
>following properties:

> * Netnews loop detection and prevention is preserved.

> * Articles are stamped with trustworthy injection information.  Since,
>   particularly in the first and third cases, the reason why no relaying
>   agreement is in place is precisely because the original injection
>   information can't be trusted, this requires replacing injection
>   information when the message is gated from one Netnews network to
>   another.

This is certainly true in the case of Injection-Info. The case in dispute
is Injection-Date which, where it already exists, can probably be trusted
to have an accurate date, or at worst a date too far into the past which
is a safe error so far as loop avoidance is concerned.

>My realization when reviewing the draft was that all or nearly all of the
>cases where this arises and is not simply a bug (such as treating relaying
>and injecting as interchangeable) can be meaningfully treated as disjoint
>Netnews networks.

Yes, I see now where you are coming from as treating local networks at the
edges of Usenet as "disjoint", but my concern is that such disjointedness
is too hard to police.

>  The case which Charles raises of a news server that
>acts as a relaying agent to one server and a posting agent to another
>server on the same network is an interesting possible exception; it's
>rather pathological, but I can see the argument that the end results, if
>done properly, are not distinguishable from multiple injection.

Yes, I accept "pathological" as a valid description of what I do, but my
view is that most cases of reinjection will turn out to be "pathological"
to some degree, and that it is impossible to foresee all the strange
circumstances that might be involved.

>  But let's
>put that aside for a moment and look at the alternatives before us in the
>more typical case.

There are essentially two possibilities:

1. Injection-Dates are ALWAYS rewritten by injecting agents (or
alternatively articles containing them are ALWAYS dropped).

2. Once an article has acquired an Injection-Date (whatever network put it
there), it ALWAYS retains that Injection-Date wherever it subsequently
goes.

#2 (which I advocate) is evidently safe against causing loops, but might
cause articles to be lost. So one MIGHT allow an exception for admins who
really REALLY knew what they were up to.

#1 is not safe against loops unless the admin who allows the rewriting
really REALLY knows what he is up to.

>Option 1 (my current draft)

>    Proto-articles may not contain Injection-Date.  Injecting agents must
>    reject any message that contains Injection-Date.  If an agent needs to
>    reinject a message, it must think of itself as a gateway, remove the
>    injection header fields (including Injection-Date), perform whatever
>    checks it needs to ensure that it's not creating a loop, and then
>    reinject the message, at which point it gets a new Injection-Date
>    header field with the current date.  This gateway is fully responsible
>    for not creating loops.

>Option 2 (current USEPRO draft)

>    For a given post, an injecting agent may either be doing injection or
>    reinjection based on whether injection header fields are already
>    present.  It may decline to do reinjection.  If it doesn't, it may
>    rename an existing Injection-Info header to some other name or remove
>    it and must retain any existing Injection-Date header field.  It may
>    perform a staleness check against the Date header field if no
>    Injection-Date header field is present.  (The current USEPRO draft
>    doesn't say an injecting agent can perform staleness checks against an
>    existing Injection-Date header, but I presume that's simply an
>    oversight.)

Not an oversight. It was presumed that the immediately following serving
or relaying agent would do that check.

>    The injecting agent is responsible for not creating
>    loops, not the agent that offers the article for reinjection.

If the injecting agent retains the Injection-Date (maybe it should check
that it is not in the future) then it has fulfilled its loop-avoiding
responsibility. If the offering agent has removed the Injection-Date, then
that is another matter.
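
A minimal sketch of what I have in mind for the injecting agent under
Option 2, again in Python. The function inject and the exception
FutureInjectionDate are hypothetical names of mine; handling of
Injection-Info (renaming or removal) is deliberately left out.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


class FutureInjectionDate(ValueError):
    """Raised when an offered article claims an injection time after 'now'."""


def inject(headers, now):
    """Return a copy of `headers` with Injection-Date handled as in Option 2.

    - No Injection-Date yet: this is a first injection, so stamp `now`.
    - Injection-Date present: this is a reinjection, so keep the field
      untouched, after checking that it is not in the future.
    """
    out = dict(headers)
    existing = out.get("Injection-Date")
    if existing is None:
        out["Injection-Date"] = format_datetime(now)
        return out

    stamped = parsedate_to_datetime(existing)
    if stamped.tzinfo is None:
        # Assumption: a date without a usable zone is read as UTC.
        stamped = stamped.replace(tzinfo=timezone.utc)
    if stamped > now:
        raise FutureInjectionDate(existing)
    return out  # original Injection-Date retained unchanged
```

Because the original field survives every pass through an injecting agent,
the staleness check at the next serving or relaying agent still sees the
article's true age, however many times it has been reinjected.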

>I've already talked about why I think the first option is cleaner and
>clearer from a protocol description and implementation standpoint.  Let me
>instead focus on failure modes.

>The primary risk of the first option is that it could create a slow loop
>in the presence of bidirectional gatewaying as follows:

Which is why USEPRO required an outgoing gateway to retain the
Injection-Date if possible, i.e. if the other medium had some place to
record it. If the other medium cannot record it anyway, then the two
Options are equivalent, and both must rely on the circularity checking by
the gateways (which SHOULD be provided anyway).


>The second option does not have this same failure mode, and indeed it
>opens the door to fewer additional problems as long as we still have to
>enforce fresh Date header fields.  The failure mode with the second option
>comes when we want to drop the staleness check on a Date header field.
>Suppose that a stand-alone news server B wants to pull articles from a
>Netnews network A for local consumption.  B has a power failure and is
>off-line for an extended period of time (a week, say).  When it comes back
>up, it wants to catch up on all of the news from A that it missed while it
>was down.  Now, as long as we have to enforce Date staleness, it can only
>do this by ignoring a SHOULD at reinjection, since those articles will
>likely now have stale Dates, but it can be configured to do that.
>However, under option 2, catching up on gatewaying has now become
>impossible because it is REQUIRED to retain the Injection-Date header of
>the original message but that Injection-Date header is now stale and will
>therefore be rejected by any relaying or serving agent in B.

Well that is hardly a common scenario, and for sure B is going to have to
bypass some checks somewhere whichever Option we use. For sure, it no
longer cares about relaying stuff onwards (the rest of the net will have
seen those articles long since), so it only cares about its own serving
agent(s). Now for serving agents, staleness is determined by "the earliest
articles of which it keeps record" (in its history file). Since the expiry
time of serving agents is typically longer than a week, it may not even
need to bypass the staleness check at all, but if not then a temporary
change to that "earliest articles of which it keeps record" is all that is
needed. That might indeed let in a few articles it had actually seen
before the breakdown but which got sent again from A as part of the mixup,
but that is a small price to pay for a breakdown and will not lead to any
loops so long as the original Injection-Date is still retained, and
provided it does not promptly break down again for a further week. Indeed,
it is that magic "72 hours" which seems to be built into Option 1 that is
likely to be more of a problem than Injection-Date staleness.
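
For illustration, a toy model in Python of the serving-agent behaviour just
described. The class ServingAgent, its retention attribute and its offer
method are inventions of mine, and real history files are of course not
Python dictionaries; the sketch only shows why a retention period longer
than the outage needs no bypass, and what the "temporary change" amounts to
otherwise.

```python
from datetime import datetime, timedelta, timezone


class ServingAgent:
    def __init__(self, retention):
        self.retention = retention  # how far back the history reaches
        self.history = {}           # message-id -> retained Injection-Date

    def earliest_recorded(self, now):
        """Oldest injection date for which the history is still authoritative."""
        return now - self.retention

    def offer(self, message_id, injection_date, now):
        """Return True if the article is accepted, False if rejected."""
        horizon = self.earliest_recorded(now)
        # Expire history entries that have fallen behind the horizon.
        self.history = {m: d for m, d in self.history.items() if d >= horizon}
        if message_id in self.history:
            return False  # duplicate: already seen
        if injection_date < horizon:
            return False  # stale: history can no longer vouch for it
        self.history[message_id] = injection_date
        return True
```

With a 14-day retention, an article injected 8 days ago is accepted after
the outage and refused as a duplicate if offered again. An article injected
20 days ago is refused until retention is temporarily raised (to 30 days,
say). In neither case can the article loop, because its retained
Injection-Date keeps ageing towards the horizon.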

>Now, my basic argument, apart from the simplicity of option 1, is that the
>failure scenario for option 2 is worse than the failure scenario for
>option 1.

Which doesn't seem to be borne out by my analysis.


>Okay, phew.  Now back to Charles's case.

>Suppose we have a new server with a relaying agreement to another
>unreliable news server and a posting agreement with a much more reliable
>news server, and that news server wants to send outgoing articles via both
>paths.  Under option 2, the injecting agent that this server talks to must
>support reinjection (it's an optional feature in the current USEPRO
>draft), must enable it for that client, and then takes responsibility for
>doing the appropriate transformation, but Injection-Date is retained.
>Under option 1, this news server must either support multiple injection
>(far and away the best option) and inject at the remote server at the same
>time as the message is injected locally, or it must both relay the message
>and gateway it to the remote injecting agent.  The gatewaying is more
>complex, but can be done entirely locally and the article will then look
>to the remote server like a regular proto-article.  No special support on
>the remote server is therefore required; all of the work is borne by the
>agent doing something unusual.  In either case, two slightly different
>copies of the message will exist with different trace headers, but this is
>acceptable; the same is true for multiple injection.

I agree that the only case in which Option 2 could be problematical is
with a privately operated local server such as my own. In all other cases,
as I have tried to show, it works at least as well as, and often better
than, Option 1.

Now for a local server there are two problems:

1. It may operate offline, so there will be a delay between when it is
injected locally and when it is relayed (whether by proper relaying or by
posting), though it would be unusual for that delay to be long enough to
cause premature staleness. That is why I mentioned the possibility of
allowing exceptions for admins who "really REALLY" knew what they were
about. In this case, "really REALLY" means being quite sure that there is
no possibility of unofficial leaks to the outer world (easy if he is the
sole user of the system, harder in a small organization where other users
are not so easily controlled). But in that case, a reasonable "cheat"
would be to delay adding the Injection-Date until the time came to make
the outgoing connection.
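
That "cheat" might look something like the following Python sketch, in
which OfflineSpool, post and flush are hypothetical names of mine. It is
only safe under the condition stated above, namely that nothing can leak
out of the local system before flush is called.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


class OfflineSpool:
    """Holds locally posted articles unstamped until the uplink comes up."""

    def __init__(self):
        self.pending = []

    def post(self, headers):
        """Queue a locally composed article; it must not be stamped yet."""
        if "Injection-Date" in headers:
            raise ValueError("local posts are queued without Injection-Date")
        self.pending.append(dict(headers))

    def flush(self, now):
        """Stamp every queued article with `now` and hand it to the feed."""
        stamp = format_datetime(now)
        outgoing = [{**h, "Injection-Date": stamp} for h in self.pending]
        self.pending = []
        return outgoing
```

The Date field still records when the article was composed; only the
Injection-Date reflects the moment the outgoing connection was made.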

2. If he only has posting privileges at his outgoing feed (or for some of
his outgoing feeds) then he has to be sure that they will accept what is
technically a reinjection. Again, the obvious "cheat" is to omit the
Injection-Date, and for that he has to be just as "really REALLY" sure as
in the first case.

Yes, the burden of preventing nasty things happening rests with the agent
wanting to do the unusual things.

But note also that the alternative scenario you have suggested for
injecting only to the outgoing feed, or to the outgoing feed and the local
system simultaneously, only works if you have an "always on" connection to
the outside.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@xxxxxxxxxxxxxxxx      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5