#1416 Injection-Date: current wording proposal (corrected)
Gr. Saving the file before regenerating the diff is useful. Please
ignore the previous version of this message and use this one instead.
Here's the updated wording proposal taking Frank's review into account.
This time I'll try a diff of the XML source, which I think is a little
more readable (and way easier to prepare) than trying to extract meaning
from the text diff with all its section numbering and pagination changes.
I think it may be worthwhile to reach consensus on several parts of this
separately. For example, if we decide that the whole current history
section is fine, I can incorporate that into the draft and then stop
including it in further diffs for discussion.
--- usepro.xml 2007-02-19 22:20:06.000000000 -0800
+++ usepro-1416.xml 2007-04-28 19:30:11.000000000 -0700
@@ -165,8 +165,9 @@
<t>"Injecting" an article is the processing of a proto-article by
an injecting agent. Normally this action is done once and only
- once for a given article. "Reinjecting" an article is passing an
- already-injected article to an injection agent.</t>
+ once for a given article. "Multiple injection" is passing the
+ same article to multiple injecting agents, either serially or in
+ parallel.</t>
<t>A "gateway" is software which receives news articles and
converts them to messages of some other kind (such as <xref
@@ -446,6 +447,68 @@
</section>
</section>
+ <section anchor="history"
+ title="Article History and Duplicate Suppression">
+ <t>Netnews normally uses a flood-fill algorithm for propagation of
+ articles in which each news server offers articles it accepts to
+ multiple peers and each news server may be offered the same
+ article from multiple other news servers. Accordingly, duplicate
+ suppression is key; if a news server accepted every article it was
+ offered, it might needlessly accept (and then potentially
+ retransmit) dozens of copies of every article.</t>
+
+ <t>Relaying and serving agents therefore MUST keep a record of
+ articles they have already seen and use that record to reject
+ additional offers of the same article. This record is called the
+ history file or database.</t>
+
+ <t>Each article is uniquely identified by its message identifier,
+ so a relaying or serving agent could satisfy this requirement by
+ storing a record of every message identifier that agent has ever
+ seen. Such a history database would grow without bound, however,
+ so it is common and permitted to optimize based on the
+ Injection-Date or Date header field of an article as follows. (In
+ the following discussion, the "date" of an article is defined to
+ be the date represented by its Injection-Date header field if
+ present, otherwise its Date header field.)
+ <list style="symbols">
+ <t>Agents MAY select a cutoff interval and reject any article
+ with a date farther in the past than that cutoff interval. If
+ this interval is shorter than the time it takes for an article
+ to propagate through the network, the agent may reject an
+ article it had not yet seen, so it ought not be aggressively
+ short. For Usenet, for example, a cutoff interval of no less
+ than seven days is conventional.</t>
+
+ <t>Agents that enforce such a cutoff MAY then drop records of
+ articles that had dates older than the cutoff from their
+ history databases. If such an article were offered to the
+ agent again, it would be rejected due to the cutoff date, so
+ the history record is no longer required to suppress the
+ duplicate.</t>
+
+ <t>As an optimization for easier history database
+ manipulation, agents MAY instead drop history records written
+ longer ago than the cutoff interval plus one day. If this
+ retention mechanism is used, the history retention period MUST
+ be longer than the cutoff interval to allow for articles dated
+ in the future unless the agent rejects all articles dated in
+ the future. One day is the maximum allowed error into the
+ future for article dates, so it is a convenient and safe
+ extension for the retention interval.</t>
+ </list>
+ </t>
+
+ <t>This is just one implementation strategy for article history,
+ albeit the most common one. Relaying and serving agents are not
+ required to use this strategy, only to meet the requirement of not
+ accepting an article more than once. However, implementors of
+ general-purpose Netnews relaying and serving agents who do not
+ have extensive experience with Netnews and the subtle effects of
+ its flood-fill algorithm are encouraged to use the above algorithm
+ by default.</t>
+ </section>
+
<section anchor="posting" title="Duties of a Posting Agent">
<t>A posting agent is the component of a user agent that assists a
poster in creating a valid proto-article and forwarding it to an
@@ -453,9 +516,33 @@
<t>Posting agents SHOULD ensure that proto-articles they create
are valid according to <xref target="USEFOR" /> and any other
- applicable policies. They MUST NOT create any Injection-Date or
- Injection-Info header fields; these headers will be added by the
- injecting agent.</t>
+ applicable policies. They MUST NOT create an Injection-Info
+ header field; this header field will be added by the injecting
+ agent.</t>
+
+ <t>If a proto-article contains both Message-ID and Date header
+ fields, posting agents MAY add an Injection-Date header field to
+ that proto-article immediately before passing that proto-article
+ to an injecting agent. They SHOULD do so if the Date header field
+ (representing the composition time of the proto-article) is more
+ than a day in the past at the time of injection. If the
+ proto-article is being sent to more than one injecting agent, see
+ <xref target="multi-injection" />.</t>
+
+ <t>The Injection-Date header field is new in this revision of the
+ Netnews protocol and is designed to permit the Date header field
+ to hold the composition date (as defined in section 3.6.1 of <xref
+ target="RFC2822" />), even if the proto-article is not injected
+ for some time after its composition. However, note that all
+ implementations predating this specification ignore the
+ Injection-Date header field and use the Date header field in its
+ stead for rejecting articles older than their cutoff (see <xref
+ target="history" />), and injecting agents predating this
+ specification do not add an Injection-Date header field. Articles with
+ a Date header field substantially in the past will still be
+ rejected by implementations predating this specification,
+ regardless of the Injection-Date header field, and may suffer poor
+ propagation.</t>
<t>Contrary to <xref target="RFC2822" />, which implies that the
mailbox or mailboxes in the From header field should be that of
@@ -478,48 +565,70 @@
agent.</t>
<t>A proto-article has the same format as a normal article
- except that the Injection-Date, Injection-Info, and Xref header
- fields MUST NOT be present; the Path header field MUST NOT
- contain a "POSTED" <diag-keyword>; and any of the following
- mandatory header fields MAY be omitted: Message-ID, Date, and
- Path. In all other respects, a proto-article MUST be a valid
- Netnews article. In particular, the header fields which may be
- omitted MUST NOT be present with invalid content.</t>
+ except that the Injection-Info and Xref header fields MUST NOT
+ be present; the Path header field MUST NOT contain a "POSTED"
+ <diag-keyword>; and any of the following mandatory header
+ fields MAY be omitted: Message-ID, Date, and Path. In all other
+ respects, a proto-article MUST be a valid Netnews article. In
+ particular, the header fields which may be omitted MUST NOT be
+ present with invalid content.</t>
<t>If a posting agent intends to offer the same proto-article to
- multiple injecting agents, the header fields Message-ID and Date
- MUST be present and identical in all copies of the
- proto-article.</t>
+ multiple injecting agents, the header fields Message-ID, Date,
+ and Injection-Date MUST be present and identical in all copies
+ of the proto-article. See <xref target="multi-injection" />.</t>
</section>
- <section anchor="reinjection" title="Reinjection of Articles">
- <t>A given article SHOULD be processed by an injecting agent
- once and only once. The Injection-Date or Injection-Info
- header fields are added by an injecting agent and are not
- permitted in a proto-article. Their presence (or the presence
- of other unstandardized or obsolete trace headers such as
- NNTP-Posting-Host, NNTP-Posting-Date, or X-Trace) indicates
- that the proto-article is instead an article and has already
- been processed by an injecting agent. A posting agent SHOULD
- normally reject such articles.</t>
-
- <t>In the exceptional case that an article needs to be
- reinjected for some reason (such as transferring an article from
- one Netnews to another where those networks have no relaying
- agreement), the posting agent doing the reinjection MUST convert
- the article back into a proto-article before passing it to an
- injecting agent (such as by renaming the Injection-Info and
- Injection-Date header fields and removing any Xref header field)
- and MUST perform the date checks on the existing Injection-Date
- or Date header fields that would otherwise be done by the
- injecting agent.</t>
-
- <t>Reinjecting articles may cause loops, loss of trace
- information, and other problems and should only be done with
- care and when there is no available alternative. A posting
- agent that does reinjection is a limited type of gateway and as
- such is subject to all of the requirements of an incoming
- gateway in addition to the requirements of a posting agent.</t>
+ <section anchor="multi-injection"
+ title="Multiple Injection of Articles">
+ <t>Under some circumstances (posting to multiple disjoint
+ networks, injecting agents with spotty connectivity, or
+ redundancy, for example), a posting agent may wish to offer the
+ same article to multiple injecting agents. In this unusual
+ case, the goal is to not create multiple independent articles
+ but rather to inject the same article at multiple points and let
+ the normal duplicate suppression facility of Netnews (see
+ <xref target="history" />) ensure that any given agent only
+ accepts the article once.</t>
+
+ <t>Whenever possible, multiple injection SHOULD be done by
+ offering the same proto-article to multiple injecting agents.
+ The posting agent MUST supply the Message-ID, Date, and
+ Injection-Date header fields, and the proto-article offered to
+ each injecting agent MUST be identical.</t>
+
+ <t>In some cases, offering the same proto-article to all
+ injecting agents may not be possible (such as when transferring,
+ after the fact, articles found on one Netnews network to
+ another, unconnected one). In this case, the posting agent MUST
+ convert the article back into a proto-article before passing it
+ to another injecting agent, but it MUST retain unmodified the
+ Message-ID, Date, and Injection-Date header fields. It MUST NOT
+ add an Injection-Date header field if it is missing from the
+ existing article. It MUST remove any Xref header field and
+ either rename or remove any Injection-Info header field and
+ other trace fields.</t>
+
+ <t>Multiple injection inherently risks duplicating articles.
+ Multiple injection after the fact, by converting an article back
+ to a proto-article and injecting it again, additionally risks
+ loops, loss of trace information, and other problems and should
+ be done with care and only when there is no alternative. The
+ requirement to retain Message-ID, Date, and Injection-Date
+ header fields minimizes the possibility of a loop and ensures
+ that the newly injected article is not treated as a new,
+ separate article.</t>
+
+ <t>Multiple injection of an article listing one or more
+ moderated newsgroups in its Newsgroups header field SHOULD only
+ be done by a moderator and MUST only be done after the
+ proto-article is approved for all moderated groups to which it
+ is posted and has an Approved header field. Multiple injection
+ of an unapproved article intended for moderated newsgroups will
+ normally only result in the moderator receiving multiple copies,
+ and if the newsgroup status is not consistent across all
+ injecting agents, may result in duplication of the article or
+ other problems.</t>
</section>
<section anchor="followups" title="Followups">
@@ -644,23 +753,24 @@
<t>It MUST reject any proto-article that does not have the
proper mandatory header fields for a proto-article; that has
- Injection-Date, Injection-Info, or Xref header fields; that
- has a Path header field containing the "POSTED"
- <diag-keyword>; or that is not syntactically valid as
- defined by <xref target="USEFOR" />. It SHOULD reject any
- proto-article which contains a header field deprecated for
- Netnews. It MAY reject any proto-article that contains trace
- header fields indicating that it was already injected by an
- injecting agent that did not add Injection-Info or
- Injection-Date.</t>
+ Injection-Info or Xref header fields; that has a Path header
+ field containing the "POSTED" <diag-keyword>; or that is
+ not syntactically valid as defined by <xref target="USEFOR"
+ />. It SHOULD reject any proto-article which contains a
+ header field deprecated for Netnews. It MAY reject any
+ proto-article that contains trace header fields indicating
+ that it was already injected by an injecting agent that did
+ not add Injection-Info or Injection-Date.</t>
<t>It SHOULD reject any article whose Date header field is
more than 24 hours into the future (and MAY use a margin less
than 24 hours). It SHOULD reject any article whose Date
- header appears to be stale (more than 72 hours into the past,
- for example, or too old to still be recorded in the database
- of a relaying agent the injecting agent will be using) since
- not all news servers support Injection-Date.</t>
+ header field is too far into the past (older than the cutoff
+ interval of a relaying agent the injecting agent is using, for
+ example), since not all news servers support Injection-Date
+ and only the injecting agent can provide a useful error
+ message to the posting agent. This interval SHOULD NOT be
+ shorter than 72 hours.</t>
<t>It SHOULD reject any proto-article whose Newsgroups header
field does not contain at least one <newsgroup-name> for a
@@ -704,8 +814,14 @@
the source of the article and possibly other trace information
as described in Section 3.2.8 of <xref target="USEFOR" />.</t>
- <t>The injecting agent MUST then add an Injection-Date header
- field containing the current date and time.</t>
+ <t>If the proto-article already had an Injection-Date header
+ field, it MUST NOT be modified or replaced. If the
+ proto-article had both a Message-ID header field and a Date
+ header field, an Injection-Date header field MUST NOT be
+ added, since the proto-article may have been multiply injected
+ by a posting agent that predates this standard. Otherwise,
+ the injecting agent MUST add an Injection-Date header field
+ containing the current date and time.</t>
<t>Finally, the injecting agent forwards the article to one or
more relaying agents, and the injection process is
@@ -801,18 +917,15 @@
field or Message-ID header field, or without either an
Injection-Date or Date header field.</t>
- <t>It MUST reject any article that has already been
- successfully sent to it, based on the Message-ID header field
- of the article. To satisfy this requirement, a relaying agent
- normally keeps a database of message identifiers it has
- already accepted.</t>
-
<t>It MUST examine the Injection-Date header field or, if
absent, the Date header field, and reject the article if that
- date predates the earliest articles of which it keeps record
- or if that date is more than 24 hours into the future. It MAY
- reject articles with dates in the future with a smaller margin
- than 24 hours.</t>
+ date is more than 24 hours into the future. It MAY reject
+ articles with dates in the future with a smaller margin than
+ 24 hours.</t>
+
+ <t>It MUST reject any article that has already been
+ successfully sent to it or that is dated older than its cutoff
+ date, as described in <xref target="history" />.</t>
<t>It SHOULD reject any article that does not include all the
mandatory header fields. It MAY reject any article that
@@ -886,16 +999,13 @@
<t>It MUST examine the Injection-Date header field or, if
absent, the Date header field, and reject the article if that
- date predates the earliest articles of which it keeps record
- or if that date is more than 24 hours into the future. It MAY
- reject articles with dates in the future with a smaller margin
- than 24 hours.</t>
+ date is more than 24 hours into the future. It MAY reject
+ articles with dates in the future with a smaller margin than
+ 24 hours.</t>
<t>It MUST reject any article that has already been
- successfully sent to it, based on the Message-ID header field
- of the article. To satisfy this requirement, a relaying agent
- normally keeps a database of message identifiers it has
- already accepted.</t>
+ successfully sent to it or that is dated older than its cutoff
+ date, as described in <xref target="history" />.</t>
<t>It SHOULD reject any article that matches an
already-received and honored cancel message or Supersedes
@@ -1003,8 +1113,7 @@
for reasons understood by the moderator (such as delays in the
moderation process) in which case they MAY substitute the
current date. Any Injection-Date, Injection-Info, or Xref
- header fields already present (though there should be none)
- MUST be removed.</t>
+ header fields already present MUST be removed.</t>
<t>Any Path header field MUST either be removed or truncated
to only those entries following its "POSTED"
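
For anyone who finds code easier to review than prose, here is a rough
Python sketch of the history strategy described in the new "Article
History and Duplicate Suppression" section of the diff above: reject by
cutoff interval, reject duplicates by message identifier, and drop
history records written longer ago than the cutoff plus one day. It is
only an illustration and not part of the proposed wording. The names
History, article_date, CUTOFF, MAX_FUTURE, and RETENTION are invented
for this example, headers are assumed to be a plain case-sensitive dict,
dates without a usable time zone are assumed to be UTC, and the
in-memory dict stands in for a real history database.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumed values: seven days is the conventional Usenet minimum cited in
# the text, and one day is the maximum allowed error into the future.
CUTOFF = timedelta(days=7)
MAX_FUTURE = timedelta(days=1)
RETENTION = CUTOFF + MAX_FUTURE  # cutoff interval plus one day


def article_date(headers):
    """Return the article's "date": Injection-Date if present, else Date."""
    raw = headers.get("Injection-Date") or headers.get("Date")
    if raw is None:
        return None
    parsed = parsedate_to_datetime(raw)
    if parsed.tzinfo is None:
        # Assumption for this sketch: treat zone-less dates as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


class History:
    """Hypothetical in-memory history database keyed by message identifier."""

    def __init__(self):
        self._seen = {}  # message identifier -> time the record was written

    def offer(self, headers, now):
        """Return True to accept the offered article, False to reject it."""
        msgid = headers.get("Message-ID")
        date = article_date(headers)
        if msgid is None or date is None:
            return False  # cannot identify or date the article
        if date > now + MAX_FUTURE:
            return False  # dated too far into the future
        if date < now - CUTOFF:
            return False  # older than the cutoff interval
        if msgid in self._seen:
            return False  # duplicate: already accepted
        self._seen[msgid] = now
        return True

    def expire(self, now):
        """Drop records written longer ago than the retention period."""
        limit = now - RETENTION
        self._seen = {m: t for m, t in self._seen.items() if t >= limit}


now = datetime(2007, 4, 28, 12, 0, tzinfo=timezone.utc)
history = History()
article = {
    "Message-ID": "<example-1@example.invalid>",
    "Date": "Sat, 28 Apr 2007 10:00:00 +0000",
}
first_offer = history.offer(article, now)   # accepted
second_offer = history.offer(article, now)  # rejected as a duplicate

# Nine days later the record has been expired, but the article is now
# older than the cutoff, so it is still rejected.
later = now + timedelta(days=9)
history.expire(later)
late_offer = history.offer(article, later)
```

The extra day in RETENTION is what makes expiring by record age safe in
this sketch: an accepted article is dated at most one day after its
record was written, so by the time the record is dropped the article's
date is already older than the cutoff.
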
--
Russ Allbery (rra@xxxxxxxxxxxx) <http://www.eyrie.org/~eagle/>