From: Charles Lindsey (chl@clerew.man.ac.uk)
Date: Thu May 08 2003 - 14:47:34 CDT
Whilst looking into splitting Section 5 for USEAGE, I spotted a few
syntactic issues that still required attention. And I also thought it
would be better to move some of the special syntax that distinguishes us
from RFC 2822 out of the particular headers involved and into section
2.4.2 (since most of those differences affect more than one header, and
it seemed better to have all such syntactic oddities in one place). And
it seemed easier to do all this before the USEAGE split.
The actual changes to the syntax made at this time are:

   msg-id now prohibits ">" anywhere within it (this is essential to be
   consistent with the NNTPEXT draft, and with current practice).

   I allow "From: Joe Q. Public <joe@public.example>" on a "MUST accept,
   SHOULD NOT generate yet" basis, to bring us in line with RFC 2822
   (Bruce raised this some while back, and the idea seemed to be agreed).

So here is the new 2.4.2. Most of the new text has been moved here from
its original places, so the overall size is much as it was.

2.4.2. Syntax adapted from Email and MIME

Much of the syntax of Netnews Articles is based on the corresponding
syntax defined in [RFC 2822]. Therefore, wherever in this standard
the syntax is stated to be taken from [RFC 2822], it is to be
understood, unless explicitly stated to the contrary, as the syntax
defined by [RFC 2822], but NOT including any syntax defined in
section 4 ("Obsolete syntax") of [RFC 2822]. Software compliant with
this standard MUST NOT generate any of the syntactic forms defined in
that Obsolete Syntax, although it MAY accept such syntactic forms.
Likewise, certain syntax from the MIME specifications [RFC 2045] et
seq. is also considered to have been incorporated into this standard
(see 6.21).
However, there are some differences arising from some special
requirements of Netnews, and the following syntactic rules therefore
supersede the corresponding rules given in [RFC 2822].
NOTE: Netnews parsers historically have been much less
permissive than Email parsers, and this is reflected in the
modifications referred to, and in some further specific rules.
In contradistinction to [RFC 2822], an unstructured header (e.g. a
Subject-header) MUST contain at least one non-whitespace character
(see also remarks about empty headers in 4.2.6).
      unstructured    = 1*( [FWS] ( utext / encoded-word ) ) [FWS]
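
As an illustration only, the non-empty requirement can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, it tests only for the presence of a non-whitespace character, and it does not validate utext or encoded-word.

```python
def unstructured_has_content(body: str) -> bool:
    """Return True if an unstructured header body holds at least one
    character that is not SP, HTAB, CR or LF.

    Hypothetical helper: it checks only the "at least one
    non-whitespace character" requirement, not the full grammar.
    """
    return any(ch not in " \t\r\n" for ch in body)
```

A Subject body of "Re: test" passes this check, whereas an empty body, or one made up only of spaces and tabs, does not.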
Extended-phrases (known somewhat confusingly in [RFC 2822] as obs-
phrases) are introduced to allow headers such as
      From: Joe Q. Public <joe@public.example>
without the necessity of using a quoted-string. They MUST be accepted
by compliant software, but they SHOULD NOT be generated until
software capable of accepting them has become widely deployed.
      phrase          = 1*( [CFWS] encoded-word [CFWS] / word ) /
                        extended-phrase
      extended-phrase = ( [CFWS] encoded-word [CFWS] / word )
                        *( [CFWS] encoded-word [CFWS] / word /
                           [CFWS] "." [CFWS] )
[ [RFC 2822] had
obs-phrase = word *( word / "." / CFWS )
Please can Pete check that what I have is equivalent?]
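
The "accept but do not yet generate" asymmetry for extended-phrases can be illustrated with Python's standard email.utils module. This is only a sketch of the intended behaviour using a general-purpose Email library. It is not a Netnews implementation, and it does not check the ABNF above.

```python
from email.utils import formataddr, parseaddr

# Accepting side: a display name containing an unquoted "." is parsed
# into a (display-name, address) pair.
name, addr = parseaddr("Joe Q. Public <joe@public.example>")

# Generating side: formataddr wraps a display name containing "." in a
# quoted-string, so no extended-phrase is emitted.
generated = formataddr((name, addr))
```

Here name is 'Joe Q. Public', addr is 'joe@public.example', and generated is '"Joe Q. Public" <joe@public.example>'.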
Within a date-time, two of the obs-zones from [RFC 2822] are retained
because of current widespread usage.
      zone            = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
The forms "UT" and "GMT" (indicating universal time) are to be
regarded as obsolete synonyms for "+0000". They MUST be accepted, and
passed on unchanged, by all agents, but they MUST NOT be generated as
part of new articles by posting and injecting agents.
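
The split between what is accepted and what may be generated could be sketched in Python as follows. The function names are hypothetical, and the sketch covers only the zone rule above, not the rest of date-time.

```python
import re

# Everything the zone rule accepts: a numeric offset, "UT" or "GMT".
ACCEPTED_ZONE = re.compile(r"[+-][0-9]{4}|UT|GMT")
# What posting and injecting agents may generate: numeric offsets only.
GENERATED_ZONE = re.compile(r"[+-][0-9]{4}")


def accepts_zone(zone: str) -> bool:
    """True if the zone is syntactically acceptable to any agent."""
    return ACCEPTED_ZONE.fullmatch(zone) is not None


def may_generate_zone(zone: str) -> bool:
    """True if the zone may appear in a newly generated article."""
    return GENERATED_ZONE.fullmatch(zone) is not None


def zone_offset_minutes(zone: str) -> int:
    """Offset from universal time in minutes; "UT" and "GMT" mean +0000."""
    if not accepts_zone(zone):
        raise ValueError("not a permitted zone: %r" % zone)
    if zone in ("UT", "GMT"):
        return 0
    sign = -1 if zone[0] == "-" else 1
    return sign * (int(zone[1:3]) * 60 + int(zone[3:5]))
```

Note that the sketch only interprets the zone. An agent passing an article on would leave "UT" or "GMT" in the header unchanged rather than rewriting it as "+0000".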
Msg-ids are redefined to be a "normalized" subset of those defined by
[RFC 2822], ensuring that no string of characters is quoted unless
strictly necessary (it must contain at least one mqspecial) and no
single character is prefixed by a "\" in the form of a quoted-pair
unless strictly necessary, and moreover there is no possibility for
">" or WSP to occur inside a msg-id, whether quoted or not. Thus,
whereas under [RFC 2822]
      <abcd@example.com>
      <"abcd"@example.com>
      <"ab\cd"@example.com>
would be considered semantically equivalent, only the first of them
is syntactically permitted by this standard, and hence a simple
comparison of octets will always suffice to determine the identity of
two msg-ids.
      msg-id          = "<" id-left "@" id-right ">"
      id-left         = dot-atom-text / no-fold-quote
      id-right        = dot-atom-text / no-fold-literal
      no-fold-quote   = DQUOTE
                           *( mqtext / "\\" / "\" DQUOTE )
                           mqspecial
                           *( mqtext / "\\" / "\" DQUOTE )
                        DQUOTE
      mqtext          = NO-WS-CTL /          ; all of <text> except
                        %d33 /               ; SP, HTAB, "\", ">"
                        %d35-61 /            ; and DQUOTE
                        %d63-91 /
                        %d93-126
      mqspecial       = "(" / ")" /          ; same as specials except
                        "<" /                ; "\" and DQUOTE quoted
                        "[" / "]" /          ; and ">" omitted
                        ":" / ";" /
                        "@" / "\\" /
                        "," / "." /
                        "\" DQUOTE
      no-fold-literal = "[" *( mdtext / "\[" / "\]" / "\\" ) "]"
      mdtext          = NO-WS-CTL /          ; Non white space controls
                        %d33-61 /            ; The rest of the US-ASCII
                        %d63-90 /            ; characters not including
                        %d94-126             ; ">", "[", "]", or "\"
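
For anyone wanting to experiment with the msg-id rules, the following Python sketch transcribes them into a regular expression. It is a hand translation under some assumptions: atext and NO-WS-CTL are taken from [RFC 2822], and the name is_valid_msg_id is hypothetical. Treat it as a testing aid rather than a reference implementation.

```python
import re

# atext and dot-atom-text as in [RFC 2822] (assumed, not restated above).
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"
DOT_ATOM_TEXT = ATEXT + r"+(?:\." + ATEXT + r"+)*"

# NO-WS-CTL = %d1-8 / %d11 / %d12 / %d14-31 / %d127
NO_WS_CTL = r"\x01-\x08\x0b\x0c\x0e-\x1f\x7f"

# mqtext: %d33 / %d35-61 / %d63-91 / %d93-126, plus NO-WS-CTL
MQTEXT = "[" + NO_WS_CTL + r"\x21\x23-\x3d\x3f-\x5b\x5d-\x7e]"
# The two permitted quoted-pairs: "\\" and "\" DQUOTE
QUOTED_PAIR = r'\\\\|\\"'
MQ_ITEM = "(?:" + MQTEXT + "|" + QUOTED_PAIR + ")"
MQSPECIAL = r"(?:[()<\[\]:;@,.]|" + QUOTED_PAIR + ")"
# At least one mqspecial must occur between the DQUOTEs.
NO_FOLD_QUOTE = '"' + MQ_ITEM + "*" + MQSPECIAL + MQ_ITEM + '*"'

# mdtext: %d33-61 / %d63-90 / %d94-126, plus NO-WS-CTL
MDTEXT = "[" + NO_WS_CTL + r"\x21-\x3d\x3f-\x5a\x5e-\x7e]"
NO_FOLD_LITERAL = r"\[(?:" + MDTEXT + r"|\\\[|\\\]|\\\\)*\]"

MSG_ID = re.compile(
    "<(?:" + DOT_ATOM_TEXT + "|" + NO_FOLD_QUOTE + ")"
    "@(?:" + DOT_ATOM_TEXT + "|" + NO_FOLD_LITERAL + ")>"
)


def is_valid_msg_id(candidate: str) -> bool:
    """True if the whole string matches the msg-id rule sketched above."""
    return MSG_ID.fullmatch(candidate) is not None
```

Of the three msg-ids shown earlier, only <abcd@example.com> is accepted by this sketch. The form <"abcd"@example.com> fails because the quoted part contains no mqspecial, and <"ab\cd"@example.com> fails because "\" may only be followed by "\" or DQUOTE. A quoted id-left such as <"ab@cd"@example.com> is accepted, since "@" is an mqspecial.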
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133 Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5