From: Charles Lindsey (chl@clerew.man.ac.uk)
Date: Tue Feb 03 2004 - 10:56:51 CST
In <87fzduy3wp.fsf@windlord.stanford.edu> Russ Allbery <rra@stanford.edu> writes:
>It's unfortunately going to be very difficult to change the effective
>semantics of Date on Usenet since it's an integral part of the acceptance
>algorithms of pretty much every news server out there. That being said,
>the ideal would be to preserve the e-mail semantics of Date for
>compatibility and introduce a new header, like Injection-Date, which is
>used for the accept/reject algorithm that's dominated by history size.
>This would also make gatewaying far easier, since one could preserve the
>original Date header and just add the new Usenet-specific header.
>I'm very skeptical of our ability to get there from here, though.
Actually, I think it is relatively straightforward, and certainly not as
complicated as Bill's scheme involving a succession of standards track
documents.
There are two dates to consider, the "composition-date" and the
"injection-date".
1. The official Date-header is the composition-date according to RFC 2822,
and according to many current posting agents which manufacture it on that
basis (though others leave it to the injecting agent to fill in). Also, it
is useful to see the time at which the poster composed it, because you can
then see the perspective from which he was writing (in the context of a
rapidly changing thread). Also, the consensus I detect amongst those
contributing to this thread is to accept that situation (and it is also
what our draft currently says).
2. OTOH, the injection-date would be a very useful thing to have in the
article in order to allow detection of stale articles, to allow history
files not to keep too much stuff around, and so on.
3. In practice, if the composition-date and the injection-date are
sufficiently close together, it does not really matter which one is used
in any staleness check. 24 hours is "sufficiently close" for that purpose.
4. Current software uses the Date-header, which is usually the
composition-date, and this works well enough in practice that it need not
be considered seriously broken. However, it does make things awkward if
injection/moderation is delayed for more than 24 hours. We would like to
extend that period (and IF a figure has to be placed on it, then 72 hours
seems about right).
5. We would like to devise a scheme that could be gradually introduced
into the existing Usenet without disruption, leading to a gradual
improvement in the long term.
6. I therefore propose the following:
Agents doing staleness checks SHOULD/MUST use either the
composition-date or the injection-date, WHICHEVER IS THE LATER.
Injecting agents/moderators/etc SHOULD/MUST include the injection-date
in the article if it is more than 24 (72/whatever) hours later than the
composition-date and MAY/SHOULD include it regardless.
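To make the "whichever is the later" rule concrete, here is a minimal Python sketch of a staleness check. It is only an illustration under stated assumptions: the header name Injection-Date is one of the candidates under discussion, the 14-day cutoff (STALE_AFTER) is a made-up figure, and the function names are hypothetical rather than taken from any existing news server.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical cutoff; a real server would tie this to its history retention.
STALE_AFTER = timedelta(days=14)

def effective_date(headers):
    """Return the later of the composition-date and the injection-date.

    `headers` is a plain dict of header name -> value. Both dates are
    assumed to carry an explicit numeric time zone offset such as +0000,
    so that they parse to comparable (timezone-aware) datetimes.
    """
    dates = [parsedate_to_datetime(headers["Date"])]
    if "Injection-Date" in headers:
        dates.append(parsedate_to_datetime(headers["Injection-Date"]))
    return max(dates)

def is_stale(headers, now):
    """True if the article is older than the cutoff on BOTH dates."""
    return now - effective_date(headers) > STALE_AFTER
```

An article with only a Date header is judged exactly as it is today, so servers that adopt the check remain compatible with articles from injectors that never add the new header.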
That leaves some details to be worked out:
A. How to indicate the injection-date. Possibilities are the Injector-Info
header and a new Injection-Date header. I suppose even the old
NNTP-Posting-Date could be used as a stopgap.
B. Which bits are MAY/SHOULD/MUST?
C. 24 hours or 72 hours or whatever?
D. What happens to articles that get injected twice? It is not supposed to
happen, except in some rather rare gatewaying situations, and with some
particular BOFHish injectors, but it needs to be covered. I suspect the
2nd injector needs to overwrite whatever Injection-Date was present before
(which is what we currently prescribe for Injector-Info).
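The injecting side, including the overwrite suggested under D, might look roughly like the following Python sketch. Everything specific in it is an assumption for illustration: the Injection-Date header name, the 24-hour MAX_GAP (point C is still open), the `always` flag standing in for the MAY/SHOULD case, and the choice that a second injector always writes a fresh value when it finds an earlier one.

```python
from datetime import datetime, timedelta, timezone
from email.message import Message
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical threshold; the thread mentions 24 or 72 hours.
MAX_GAP = timedelta(hours=24)

def stamp_injection_date(article, now, always=False):
    """Add or overwrite Injection-Date on `article` (an email.message.Message).

    The header is written when the gap since the composition-date exceeds
    MAX_GAP, when `always` is set, or when an earlier injector already
    left one (so that a second injection replaces it rather than adding
    a duplicate). Date is assumed to carry an explicit time zone offset.
    """
    composed = parsedate_to_datetime(article["Date"])
    had_previous = "Injection-Date" in article
    # Deleting a header that is absent is a no-op for Message objects.
    del article["Injection-Date"]
    if always or had_previous or now - composed > MAX_GAP:
        article["Injection-Date"] = format_datetime(now)
    return article
```

The Date header itself is never touched, which is the property that keeps the composition-date intact for readers and for gateways.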
>> 2. Bases for expiration of an article from a storage system.
I think we are agreed this is a Red Herring for this discussion.
>> 3. Source of delay between composition/editing, injection, and arrival
>> at a given system. Offline composition has been mentioned.
>> Obviously, delays in the moderation process also cause delay between
>> composition and injection. And clearly there are transport delays --
>> it is not unusual for propagation of an article to take several days.
>Actually, that is pretty unusual on Usenet. Not unheard of, but you have
>to be behind a very slow initial link for it to take anywhere near that
>long, and at some point those cases start looking a lot like off-line
>composition rather than a true case of slow propagation.
But once in a blue moon, some disaster strikes. All the major relayers get
taken down with some worm/DOS-attack/whatever, and all the back channels
have to cope with the whole propagation load. We need to retain that
resilience.
>Parsing Injector-Info in its current form in order to get that information
>is, I believe, a non-starter. It's certainly code I have no interest in
>writing.
I don't think it would be that bad, but I hear preferences for a separate
new header.
-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5