#1416 Injection-Date: proposed diff
It's been a while since we discussed this, so here again is my current
proposed diff. If there are parts of this we can agree on independently,
I can start merging it into the main draft.
--- usepro.xml 2007-07-01 19:38:06.000000000 -0700
+++ usepro-1416.xml 2007-07-16 13:43:23.000000000 -0700
@@ -18,6 +18,8 @@
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2822.xml'>
<!ENTITY rfc3629 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3629.xml'>
+ <!ENTITY rfc3798 PUBLIC ''
+ 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3798.xml'>
<!ENTITY rfc3977 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3977.xml'>
<!ENTITY rfc4234 PUBLIC ''
@@ -165,8 +167,9 @@
<t>"Injecting" an article is the processing of a proto-article by
an injecting agent. Normally this action is done once and only
- once for a given article. "Reinjecting" an article is passing an
- already-injected article to an injection agent.</t>
+ once for a given article. "Multiple injection" is passing the
+ same article to multiple injecting agents, either serially or in
+ parallel.</t>
<t>A "gateway" is software which receives news articles and
converts them to messages of some other kind (such as <xref
@@ -452,6 +455,66 @@
</section>
</section>
+ <section anchor="history"
+ title="Article History and Duplicate Suppression">
+ <t>Netnews normally uses a flood-fill algorithm for propagation of
+ articles in which each news server offers articles it accepts to
+ multiple peers and each news server may be offered the same
+ article from multiple other news servers. Accordingly, duplicate
+ suppression is key; if a news server accepted every article it was
+ offered, it could needlessly accept (and then potentially
+ retransmit) dozens of copies of every article.</t>
+
+ <t>Relaying and serving agents therefore MUST keep a record of
+ articles they have already seen and use that record to reject
+ additional offers of the same article. This record is called the
+ "history" file or database.</t>
+
+ <t>Each article is uniquely identified by its message identifier,
+ so a relaying or serving agent could satisfy this requirement by
+ storing a record of every message identifier that agent has ever
+ seen. Such a history database would grow without bound, however,
+ so it is common and permitted to optimize based on the
+ Injection-Date or Date header field of an article as follows. (In
+ the following discussion, the "date" of an article is defined to
+ be the date represented by its Injection-Date header field if
+ present, otherwise its Date header field.)
+ <list style="symbols">
+ <t>Agents MAY select a cutoff interval and reject any article
+ with a date farther in the past than that cutoff interval. If
+ this interval is shorter than the time it takes for an article
+ to propagate through the network, the agent may reject an
+ article it had not yet seen, so it ought not be aggressively
+ short. For Usenet, for example, a cutoff interval of no less
+ than seven days is conventional.</t>
+
+ <t>Agents that enforce such a cutoff MAY then drop records of
+ articles that had dates older than the cutoff from their
+ history databases. If such an article were offered to the
+ agent again, it would be rejected due to the cutoff date, so
+ the history record is no longer required to suppress the
+ duplicate.</t>
+
+ <t>Alternatively, agents MAY drop history records according to
+ the date when the article was first seen by that agent rather
+ than the date of the article. In this case, the history
+ retention interval MUST be at least 24 hours longer than the
+ cutoff interval to allow for articles dated in the future.
+ This interval matches the allowable error in the date of the
+ article (see <xref target="injecting" />).</t>
+ </list>
+ </t>
+
+ <t>These are just two implementation strategies for article
+ history, albeit the most common ones. Relaying and serving agents
+ are not required to use these strategies, only to meet the
+ requirement of not accepting an article more than once. However,
+ these strategies are safe and widely deployed and implementors are
+ encouraged to use one of them, especially if they do not have
+ extensive experience with Netnews and the subtle effects of its
+ flood-fill algorithm.</t>
+ </section>
+
<section anchor="posting" title="Duties of a Posting Agent">
<t>A posting agent is the component of a user agent that assists a
poster in creating a valid proto-article and forwarding it to an
@@ -459,9 +522,33 @@
<t>Posting agents SHOULD ensure that proto-articles they create
are valid according to <xref target="USEFOR" /> and any other
- applicable policies. They MUST NOT create any Injection-Date or
- Injection-Info header fields; these headers will be added by the
- injecting agent.</t>
+ applicable policies. They MUST NOT create any Injection-Info
+ header field; this header field will be added by the injecting
+ agent.</t>
+
+ <t>If the proto-article already contains both Message-ID and Date
+ header fields, posting agents MAY add an Injection-Date header
+ field to that proto-article immediately before passing that
+ proto-article to an injecting agent. They SHOULD do so if the
+ Date header field (representing the composition time of the
+ proto-article) is more than a day in the past at the time of
+ injection. If the proto-article is being submitted to more than
+ one injecting agent, see <xref target="multi-injection" />.</t>
+
+ <t>The Injection-Date header field is new in this revision of the
+ Netnews protocol and is designed to allow the Date header field to
+ hold the composition date (as recommended in section 3.6.1 of
+ <xref target="RFC2822" />), even if the proto-article is not
+ injected for some time after its composition. However, note that
+ all implementations predating this specification ignore the
+ Injection-Date header field and use the Date header field in its
+ stead for rejecting articles older than their cutoff (see <xref
+ target="history" />), and injecting agents predating this
+ specification do not add an Injection-Date header field. Articles
+ with a Date header field substantially in the past will still be
+ rejected by implementations predating this specification,
+ regardless of the Injection-Date header field, and hence may
+ suffer poorer propagation.</t>
<t>Contrary to <xref target="RFC2822" />, which implies that the
mailbox or mailboxes in the From header field should be that of
@@ -484,48 +571,73 @@
agent.</t>
<t>A proto-article has the same format as a normal article
- except that the Injection-Date, Injection-Info, and Xref header
- fields MUST NOT be present; the Path header field MUST NOT
- contain a "POSTED" <diag-keyword>; and any of the following
- mandatory header fields MAY be omitted: Message-ID, Date, and
- Path. In all other respects, a proto-article MUST be a valid
- Netnews article. In particular, the header fields which may be
- omitted MUST NOT be present with invalid content.</t>
+ except that the Injection-Info and Xref header fields MUST NOT
+ be present; the Path header field MUST NOT contain a "POSTED"
+ <diag-keyword>; and any of the following mandatory header
+ fields MAY be omitted: Message-ID, Date, and Path. In all other
+ respects, a proto-article MUST be a valid Netnews article. In
+ particular, the header fields which may be omitted MUST NOT be
+ present with invalid content.</t>
<t>If a posting agent intends to offer the same proto-article to
- multiple injecting agents, the header fields Message-ID and Date
- MUST be present and identical in all copies of the
- proto-article.</t>
+ multiple injecting agents, the header fields Message-ID, Date,
+ and Injection-Date MUST be present and identical in all copies
+ of the proto-article. See <xref target="multi-injection" />.</t>
</section>
- <section anchor="reinjection" title="Reinjection of Articles">
- <t>A given article SHOULD be processed by an injecting agent
- once and only once. The Injection-Date or Injection-Info
- header fields are added by an injecting agent and are not
- permitted in a proto-article. Their presence (or the presence
- of other unstandardized or obsolete trace headers such as
- NNTP-Posting-Host, NNTP-Posting-Date, or X-Trace) indicates
- that the proto-article is instead an article and has already
- been processed by an injecting agent. A posting agent SHOULD
- normally reject such articles.</t>
-
- <t>In the exceptional case that an article needs to be
- reinjected for some reason (such as transferring an article from
- one Netnews to another where those networks have no relaying
- agreement), the posting agent doing the reinjection MUST convert
- the article back into a proto-article before passing it to an
- injecting agent (such as by renaming the Injection-Info and
- Injection-Date header fields and removing any Xref header field)
- and MUST perform the date checks on the existing Injection-Date
- or Date header fields that would otherwise be done by the
- injecting agent.</t>
-
- <t>Reinjecting articles may cause loops, loss of trace
- information, and other problems and should only be done with
- care and when there is no available alternative. A posting
- agent that does reinjection is a limited type of gateway and as
- such is subject to all of the requirements of an incoming
- gateway in addition to the requirements of a posting agent.</t>
+ <section anchor="multi-injection"
+ title="Multiple Injection of Articles">
+ <t>Under some circumstances (posting to multiple disjoint
+ networks, injecting agents with spotty connectivity, or for
+ redundancy, for example), a posting agent may wish to offer the
+ same article to multiple injecting agents. In this unusual
+ case, the goal is to not create multiple independent articles
+ but rather to inject the same article at multiple points and let
+ the normal duplicate suppression facility of Netnews (see <xref
+ target="history" />) ensure that any given agent accepts the
+ article only once.</t>
+
+ <t>Whenever possible, multiple injection SHOULD be done by
+ offering the same proto-article to multiple injecting agents.
+ The posting agent MUST supply the Message-ID, Date, and
+ Injection-Date header fields, and the proto-article as offered
+ to each injecting agent MUST be identical.</t>
+
+ <t>In some cases, offering the same proto-article to all
+ injecting agents may not be possible (such as when gatewaying,
+ after the fact, articles found on one Netnews network to
+ another, supposedly unconnected one). In this case, the posting
+ agent MUST convert the article back into a proto-article before
+ passing it to another injecting agent, but it MUST retain
+ unmodified the Message-ID, Date, and Injection-Date header
+ fields. It MUST NOT add an Injection-Date header field if it is
+ missing from the existing article. It MUST remove any Xref
+ header field and either rename or remove any Injection-Info
+ header field and other trace fields.
+ <list style="empty">
+ <t>NOTE: Multiple injection inherently risks duplicating
+ articles. Multiple injection after the fact, by converting
+ an article back to a proto-article and injecting it again,
+ additionally risks loops, loss of trace information,
+ unintended repeat injection into the same network, and other
+ problems. It should be done with care and only when there
+ is no alternative. The requirement to retain Message-ID,
+ Date, and Injection-Date header fields minimizes the
+ possibility of a loop and ensures that the newly injected
+ article is not treated as a new, separate article.</t>
+ </list>
+ </t>
+
+ <t>Multiple injection of an article listing one or more
+ moderated newsgroups in its Newsgroups header field SHOULD only
+ be done by a moderator and MUST only be done after the
+ proto-article is approved for all moderated groups to which it
+ is to be posted and has an Approved header field (see <xref
+ target="moderator" />). Multiple injection of an unapproved
+ article intended for moderated newsgroups will normally only
+ result in the moderator receiving multiple copies, and if the
+ newsgroup status is not consistent across all injecting agents,
+ may result in duplication of the article or other problems.</t>
</section>
<section anchor="followups" title="Followups">
@@ -650,23 +762,27 @@
<t>It MUST reject any proto-article that does not have the
proper mandatory header fields for a proto-article; that has
- Injection-Date, Injection-Info, or Xref header fields; that
- has a Path header field containing the "POSTED"
- <diag-keyword>; or that is not syntactically valid as
- defined by <xref target="USEFOR" />. It SHOULD reject any
- proto-article which contains a header field deprecated for
- Netnews. It MAY reject any proto-article that contains trace
- header fields indicating that it was already injected by an
- injecting agent that did not add Injection-Info or
- Injection-Date.</t>
-
- <t>It SHOULD reject any article whose Date header field is
- more than 24 hours into the future (and MAY use a margin less
- than 24 hours). It SHOULD reject any article whose Date
- header appears to be stale (more than 72 hours into the past,
- for example, or too old to still be recorded in the database
- of a relaying agent the injecting agent will be using) since
- not all news servers support Injection-Date.</t>
+ Injection-Info or Xref header fields; that has a Path header
+ field containing the "POSTED" <diag-keyword>; or that is
+ not syntactically valid as defined by <xref target="USEFOR"
+ />. It SHOULD reject any proto-article which contains a
+ header field deprecated for Netnews (see, for example, <xref
+ target="RFC3798" />). It MAY reject any proto-article that
+ contains trace header fields (e.g., NNTP-Posting-Host)
+ indicating that it was already injected by an injecting agent
+ that did not add Injection-Info or Injection-Date.</t>
+
+ <t>It SHOULD reject any article whose Injection-Date or Date
+ header field is more than 24 hours into the future (and MAY
+ use a margin less than 24 hours). It SHOULD reject any
+ article whose Injection-Date header field is too far in the
+ past (older than the cutoff interval of a relaying agent the
+ injecting agent is using, for example). It SHOULD similarly
+ reject any article whose Date header field is too far in the
+ past, since not all news servers support Injection-Date and
+ only the injecting agent can provide a useful error message to
+ the posting agent. In either case, this interval SHOULD NOT
+ be any shorter than 72 hours.</t>
<t>It SHOULD reject any proto-article whose Newsgroups header
field does not contain at least one <newsgroup-name> for a
@@ -710,8 +826,14 @@
the source of the article and possibly other trace information
as described in Section 3.2.8 of <xref target="USEFOR" />.</t>
- <t>The injecting agent MUST then add an Injection-Date header
- field containing the current date and time.</t>
+ <t>If the proto-article already had an Injection-Date header
+ field, it MUST NOT be modified or replaced. If the
+ proto-article had both a Message-ID header field and a Date
+ header field, an Injection-Date header field MUST NOT be
+ added, since the proto-article may have been multiply injected
+ by a posting agent that predates this standard. Otherwise,
+ the injecting agent MUST add an Injection-Date header field
+ containing the current date and time.</t>
<t>Finally, the injecting agent forwards the article to one or
more relaying agents, and the injection process is
@@ -806,18 +928,18 @@
field or Message-ID header field, or without either an
Injection-Date or Date header field.</t>
- <t>It MUST reject any article that has already been
- successfully sent to it, based on the Message-ID header field
- of the article. To satisfy this requirement, a relaying agent
- normally keeps a database of message identifiers it has
- already accepted.</t>
-
<t>It MUST examine the Injection-Date header field or, if
absent, the Date header field, and reject the article if that
- date predates the earliest articles of which it keeps record
- or if that date is more than 24 hours into the future. It MAY
- reject articles with dates in the future with a smaller margin
- than 24 hours.</t>
+ date is more than 24 hours into the future. It MAY reject
+ articles with dates in the future with a smaller margin than
+ 24 hours.</t>
+
+ <t>It MUST reject any article that has already been accepted.
+ If it implements the mechanism described in <xref
+ target="history" />, this means that it MUST reject any
+ article whose date falls outside the cutoff interval since it
+ won't know whether such articles had been accepted previously
+ or not.</t>
<t>It SHOULD reject any article that does not include all the
mandatory header fields. It MAY reject any article that
@@ -891,16 +1013,16 @@
<t>It MUST examine the Injection-Date header field or, if
absent, the Date header field, and reject the article if that
- date predates the earliest articles of which it keeps record
- or if that date is more than 24 hours into the future. It MAY
- reject articles with dates in the future with a smaller margin
- than 24 hours.</t>
-
- <t>It MUST reject any article that has already been
- successfully sent to it, based on the Message-ID header field
- of the article. To satisfy this requirement, a relaying agent
- normally keeps a database of message identifiers it has
- already accepted.</t>
+ date is more than 24 hours into the future. It MAY reject
+ articles with dates in the future with a smaller margin than
+ 24 hours.</t>
+
+ <t>It MUST reject any article that has already been accepted.
+ If it implements the mechanism described in <xref
+ target="history" />, this means that it MUST reject any
+ article whose date falls outside the cutoff interval since it
+ won't know whether such articles had been accepted previously
+ or not.</t>
<t>It SHOULD reject any article that matches an
already-received and honored cancel message or Supersedes
@@ -1008,8 +1130,7 @@
for reasons understood by the moderator (such as delays in the
moderation process) in which case they MAY substitute the
current date. Any Injection-Date, Injection-Info, or Xref
- header fields already present (though there should be none)
- MUST be removed.</t>
+ header fields already present MUST be removed.</t>
<t>Any Path header field MUST either be removed or truncated
to only those entries following its "POSTED"
@@ -2042,6 +2163,7 @@
&rfc1036;
&rfc2045;
&rfc2606;
+ &rfc3798;
&rfc3977;
<reference anchor="USEAGE">
<front>
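
The cutoff and history rules in the new "history" section can be illustrated with a small Python sketch. This is only one possible reading of the proposed text, not part of the diff and not a reference implementation. The class name History, its methods, and the seven-day CUTOFF value are assumptions chosen for illustration; the 24-hour future margin and the "Injection-Date if present, otherwise Date" rule come from the proposed wording.

```python
from datetime import datetime, timedelta, timezone

CUTOFF = timedelta(days=7)            # assumed cutoff interval (illustrative)
FUTURE_MARGIN = timedelta(hours=24)   # allowable error for dates in the future


class History:
    """Hypothetical history database keyed by message identifier."""

    def __init__(self):
        self._seen = {}  # message identifier -> article date

    def offer(self, message_id, date, injection_date=None, now=None):
        """Return True if the article is accepted, False if rejected."""
        now = now or datetime.now(timezone.utc)
        # The "date" of an article is its Injection-Date if present,
        # otherwise its Date.
        article_date = injection_date or date
        if article_date > now + FUTURE_MARGIN:
            return False  # dated too far in the future
        if article_date < now - CUTOFF:
            return False  # older than the cutoff; history may be gone
        if message_id in self._seen:
            return False  # duplicate offer
        self._seen[message_id] = article_date
        return True

    def expire(self, now=None):
        """Drop records whose article date is older than the cutoff."""
        now = now or datetime.now(timezone.utc)
        limit = now - CUTOFF
        self._seen = {m: d for m, d in self._seen.items() if d >= limit}
```

Because offer() rejects anything older than the cutoff before consulting the table, expire() can drop those records without letting a duplicate back in. An agent that expires by first-seen time instead would need to retain records for at least 24 hours longer than the cutoff, as the third bullet of the proposed text requires.
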
--
Russ Allbery (rra@xxxxxxxxxxxx) <http://www.eyrie.org/~eagle/>