
Re: #1416: USEPRO 3.9: Reinjection and Injection-Date



Forrest J Cavalier <mibsoft@xxxxxxxxxxxxxxx> writes:

> Russ, I think your lengthy analysis is accurate, except for two points.

> 1. The article identity (for preventing duplicates) is not actually a
> 2-tuple of Message-ID, Date.  It is Message-ID + fuzzy arrival date.
> Every date within the past 3 days is probably as good as any other date
> in that range.  The header fields do not always set the roll-off from
> message-id history.  It is arrival order.  That changes your conclusions
> a bit too, I think.

> Overconstraining the solutions to guaranteeing (Message, Date) is not
> strictly necessary, and I think you throw out some potentially rewarding
> arrangements.

This is, of course, entirely correct, and as soon as I read this, I went
"doh, yes."  However, after poking at this idea for a few minutes, I
couldn't come up with an effect of the distinction that would change the
conclusions.  Could you elaborate a bit more?  Having history expiration
be by arrival rather than by header means that in practice servers will
keep a longer history than strictly necessary and could check some older
articles, but if the Date header (by whatever name) is stale, the article
still loses.
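
To make that concrete, here is a rough sketch of the acceptance logic I have in mind. Everything in it is an assumption for illustration: the `History` class, the three-day retention, and the choice to make the staleness cutoff equal to the retention period are mine, not anything the draft specifies. The point it shows is that an arrival-based history only ever outlasts the Date window, so a copy with a stale Date is rejected either way.

```python
from datetime import datetime, timedelta, timezone

# Assumed values: "3 days" echoes the figure quoted above; a real server
# would make both of these configurable.
HISTORY_RETENTION = timedelta(days=3)
STALE_CUTOFF = timedelta(days=3)


class History:
    """Hypothetical Message-ID history whose entries roll off by arrival time."""

    def __init__(self):
        self._arrived = {}  # Message-ID -> arrival time

    def expire(self, now):
        # Drop entries that arrived more than HISTORY_RETENTION ago.
        expired = [m for m, t in self._arrived.items()
                   if now - t > HISTORY_RETENTION]
        for msgid in expired:
            del self._arrived[msgid]

    def offer(self, msgid, date_header, now):
        """Return 'accept', 'duplicate', or 'stale' for an offered article."""
        self.expire(now)
        if msgid in self._arrived:
            return "duplicate"
        if now - date_header > STALE_CUTOFF:
            return "stale"
        self._arrived[msgid] = now
        return "accept"
```

Because an article cannot arrive before its own Date, the history entry is still present for at least as long as the Date would pass the staleness check.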

(Plus, I think it's actually conforming to our current specification to
write a news server that expires history entries based on the
Injection-Date header of the article rather than by arrival time.  INN
optionally can do this, and I think such an implementation still fulfills
the requirements of the standard.  That, of course, doesn't change your
point, but does mean that if we want to use the normal arrival-based
expiration, we need to require it somewhere.)

> 2. I think you present all branches of your fault tree as if they were
> all equally likely in practice.  I disagree with that approach when we
> admit we don't have a solution for all cases for all people.

Yeah, that was partly a side effect of trying to look at this at a
theoretical level rather than analyzing practical examples.  I'm worried
that we're not agreeing on the overall theoretical analysis structure
before discussing what we can do for individual cases.

It's true that not all of these problems are equal.  For example, we're
theoretically breaking our uniqueness model with multiple injection a
minute apart.  This opens the door to duplicates.  However, to get a
duplicate in practice, the later copy of the article would have to arrive
less than a minute after the earlier copy had expired, something that in
practice is hideously unlikely.
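
As a back-of-the-envelope check, the size of that window can be computed directly. This is only a sketch under assumptions of my own: two copies share a Message-ID, the staleness cutoff equals the history retention period, and `duplicate_window` is a hypothetical helper, not part of any server.

```python
from datetime import datetime, timedelta, timezone


def duplicate_window(arrival_first, date_second, retention):
    """Length of time during which the second copy could be accepted
    as a duplicate.

    The window opens when the first copy rolls out of history
    (arrival_first + retention) and closes when the second copy's date
    goes stale (date_second + retention, assuming cutoff == retention).
    """
    opens = arrival_first + retention
    closes = date_second + retention
    return max(closes - opens, timedelta(0))
```

With the first copy arriving at the instant it was dated and the second copy dated a minute later, the window is one minute; if the first copy took even a few minutes to arrive, the window closes before it opens.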

> I prefer to take the most likely of situations, standardize something
> that works, and then ignore the least likely branches.  They fall
> outside our standard, or they can go look at Duties of a Gateway and
> figure out something.

> If we disallow "late (re)injection", don't most of the problems go away?

If we disallow late injection in general, both injection and reinjection,
then we don't need Injection-Date.  If we can drop Injection-Date, we get
back the current Netnews protocol, which we know works and which we can
make statements about based on current practice.  In this case, we can add
a requirement that posting agents doing multiple injection always provide
the same Message-ID and Date header fields in all copies and that
reinjection always preserves both of those headers, and everything works
reliably and without duplicates except for late injection and reinjection.
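
The requirement on posting agents amounts to a very small invariant. A sketch, assuming each copy is represented as a plain dict of header fields (my representation, purely for illustration):

```python
def copies_consistent(copies):
    """True if every copy carries the same Message-ID and Date values.

    `copies` is assumed to be a list of dicts mapping header field
    names to their raw string values.
    """
    identities = {(c["Message-ID"], c["Date"]) for c in copies}
    return len(identities) <= 1
```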

We may or may not want to say something about how to work around late
reinjection, or we could just leave it to the gateways section for someone
who really knows what they're doing.

This, however, reverses a decision made in USEFOR, which we were treating
as sacrosanct, which is why I was trying not to push that route.  (It's
been my preference for a while.  I think Injection-Date introduces more
problems than it solves and the complexity isn't worth it.)

> Drop late injection.  It almost requires a flag day, not because we get
> duplicates (we would), but because the contents of a newsgroup are going
> to be different at different sites, depending on whether they
> implemented expiry/incoming relay on Injection-Date or Date.  And isn't
> that the problem that it was intended to prevent?  If an article is
> re-injected late, we have the amusing situation of an article appearing,
> not appearing, and appearing as a duplicate on various servers.

> And all this complexity about late injections for what?  A handful of
> posters per day on Usenet?  Hours later is not going to matter.  How
> many posters per day on Usenet will be posting days later than authoring?
> Does anyone have an estimate?

> Aren't we knocking ourselves out over mere annoyance?  Just tell those
> posters that their message had better have a current Date: header if
> they want it to propagate (that's reality, and telling them different
> is not helpful), and if a late Date header seems like deception, they can
> note it in the body of their message.  If they want a sooner expiry, add
> a header field.  Voila!  Backwards compatible too.

This is basically the line of reasoning behind why I put Date staleness
checking back in for injecting agents.  I think that if we're introducing
Injection-Date, we need to provide a transition during which it's clear
that Injection-Date still isn't going to be a panacea, and in that
transition period, I don't think we're doing posters any favors accepting
articles that most people will keep ignoring.  It may be that checking at
the injecting agent so that we can return an error message isn't the best
transitional plan, but it feels better to me than not giving people any
warning at all.
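
For illustration, the check at the injecting agent could look something like the sketch below. The three-day cutoff, the function name, and the wording of the error are assumptions on my part; 441 is the NNTP "posting failed" response code, and the date parsing uses Python's standard `email.utils`.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

MAX_DATE_AGE = timedelta(days=3)  # assumed cutoff, not taken from the draft


def check_date_at_injection(date_value, now):
    """Return an error string if the Date header is stale, else None."""
    posted = parsedate_to_datetime(date_value)
    if posted.tzinfo is None:
        # A "-0000" zone parses as naive; treat it as UTC.
        posted = posted.replace(tzinfo=timezone.utc)
    age = now - posted
    if age > MAX_DATE_AGE:
        return (f"441 Date header is {age.days} days old; "
                "article is unlikely to propagate")
    return None
```

The poster gets an immediate rejection they can act on, rather than an article that is silently dropped by most of the network.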

One point on which I think Charles is definitely right, though, is that
Injection-Date is a different thing than Injection-Info and the other
non-proto-article headers like Xref.

-- 
Russ Allbery (rra@xxxxxxxxxxxx)             <http://www.eyrie.org/~eagle/>