From: Charles Lindsey (chl@clw.cs.man.ac.uk)
Date: Fri Feb 08 2002 - 13:56:14 CST
In <Gr09nL.AHI@clw.cs.man.ac.uk> chl@clw.cs.man.ac.uk (Charles Lindsey) writes:
>In <7cKPhbFkxmW8QAmq@pillar.turnpike.com> Paul Overell <paulo@turnpike.com> writes:
>>I don't want to get into a discussion of the various merits of different
>>syntactic meta-languages. ABNF may not be perfect but it has become the
>>syntactic meta-language of choice for RFCs, it is a proposed standard.
>>We should use it as is, without extensions. The above is not ABNF.
>No, it is not strict ABNF, but it has been in the draft for two, maybe
>three, years now, and this is the first time somebody has complained. Does
>anybody else on this list want this to be reviewed?
Nobody has responded to this, so I propose to leave the mechanism in the
draft, but with a few tweaks in response to this discussion.
>I will grant you that a short explanatory paragraph in section 2.4 would
>help, and I will write one.
I now have:
2.4. Syntax
2.4.1. Syntax Notation
This standard uses the Augmented Backus Naur Form described in [RFC
2234]. Additionally, some syntax rules are given in the form of
schemata from each of which several rules in the [RFC 2234] format
can be derived. For example, the schema
   {USENET}-header = {USENET}-name ":" 1*SP {USENET}-content
                     *( [CFWS] ";" ( {USENET}-parameter /
                        other-parameter ) )
(see section 4.1) implies the existence of a large number of rules,
one for each header defined by this standard. Substituting the
template "{USENET}" by, for example, "Archive" thus gives rise to the
actual rule
   Archive-header = Archive-name ":" 1*SP Archive-content
                    *( [CFWS] ";" ( Archive-parameter /
                       other-parameter ) )
[Observe the templates now enclosed in {...} rather than <...>. In our
previous drafts they were not enclosed at all, but I think they need some
specific notation to make them stand out.]
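To make the substitution concrete, here is a minimal Python sketch of how a schema gives rise to an actual rule. It is an illustration only and not part of the draft; the names `SCHEMA` and `instantiate` are hypothetical.

```python
# The schema from section 2.4.1, with "{USENET}" as the template.
SCHEMA = (
    '{USENET}-header = {USENET}-name ":" 1*SP {USENET}-content\n'
    '                  *( [CFWS] ";" ( {USENET}-parameter /\n'
    '                     other-parameter ) )'
)


def instantiate(schema: str, template: str, name: str) -> str:
    """Replace every occurrence of "{template}" in the schema by name."""
    return schema.replace("{" + template + "}", name)


# Substituting "Archive" for the template yields the Archive-header rule.
rule = instantiate(SCHEMA, "USENET", "Archive")
print(rule)
```

The same call with "From", "Subject" and so on would give the corresponding rules for each of the other headers.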
>>From: paulo@turnpike.com
>>Because without an explicit production for the from header there is
>>nothing in your syntax to link "From" with from-content. The best it
>>could do is the general header-content.
Yes. There is now a full set of rules for {USENET}-name, such as
>Foo-name = "Foo"
Here now is the full Basic Format syntax:
4. Basic Format
4.1. Syntax of News Articles
The overall syntax of a news article is:
article = 1*( header CRLF ) separator body
header = {USENET}-header / other-header
{USENET}-header = {USENET}-name ":" 1*SP {USENET}-content
*( [CFWS] ";" ( {USENET}-parameter /
other-parameter ) )
{USENET}-name = <a header-name defined in this standard
(or an extension of it) for a specific
{USENET}-header>
header-name = 1*name-character *( "-" 1*name-character )
name-character = ALPHA / DIGIT
{USENET}-content = <the content of a header defined in this
standard (or an extension of it) for a
specific {USENET}-header>
{USENET}-parameter= <an other-parameter defined in this standard
(or an extension of it) for a specific
{USENET}-header>
other-parameter = attribute "=" value
attribute = {USENET}-token / iana-token / x-token
{USENET}-token = <A token defined in this standard for
use in conjunction with a specific
{USENET}-parameter>
iana-token = <a token defined in an experimental
or standards-track RFC and registered
with IANA>
x-token = [CFWS] "x-" token-core [CFWS]
token = [CFWS] token-core [CFWS]
token-core = 1*<any (US-ASCII) CHAR except SP, CTLs,
or tspecials>
tspecials = "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / DQUOTE /
"/" / "[" / "]" / "?" / "="
value = token / quoted-string
other-header = header-name ":" 1*SP other-content
other-content
= <the content of a header defined by some
other standard>
separator = CRLF
body = *( *998text CRLF )
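As a rough check of the basic header shape (header-name, then ":", then 1*SP, then the content), the following Python sketch may help. It is only an approximation under stated assumptions: the content is left unparsed and may be empty, and the names `HEADER_NAME`, `HEADER_LINE` and `split_header` are hypothetical.

```python
import re

# header-name    = 1*name-character *( "-" 1*name-character )
# name-character = ALPHA / DIGIT
HEADER_NAME = r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*"

# name ":" 1*SP content ; the content is not examined further here.
# DOTALL lets a folded content (containing CRLF) pass through unchanged.
HEADER_LINE = re.compile(r"(" + HEADER_NAME + r"):( +)(.*)", re.DOTALL)


def split_header(line: str):
    """Return (name, content), or None if the line lacks the basic shape."""
    m = HEADER_LINE.fullmatch(line)
    if m is None:
        return None
    return m.group(1), m.group(3)


print(split_header("Archive: no; filename=foo"))
print(split_header("Subject:no space"))  # rejected: 1*SP is required
```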
And finally, here is the complete Collected Syntax as it now stands:
Appendix B - Collected Syntax
In the following syntactic rules, numbers in the left hand margin
indicate rules taken from other documents, specifically:
2 from [RFC 2822] with the exception of those elements described
therein as "obsolete";
4 from [RFC 2234];
5 from [RFC 2045].
Where the number is followed by an asterisk ('*'), it indicates that
the rule in question has been modified for the purposes of this
standard.
B.1 Characters, Atoms and Folding
4 ALPHA = %x41-5A / ; A-Z
%x61-7A ; a-z
2 CFWS = *([FWS] comment) (([FWS] comment) / FWS )
4 CR = %x0D ; carriage return
4 CRLF = CR LF
4 DIGIT = %x30-39 ; 0-9
4 DQUOTE = %d34 ; quote mark
2 FWS = ([*WSP CRLF] 1*WSP); Folding whitespace
4 HTAB = %x09 ; horizontal tab
4 LF = %x0A ; line feed
2 NO-WS-CTL = %d1-8 / ; US-ASCII control characters
%d11 / ; which do not include the
%d12 / ; carriage return, line feed,
%d14-31 / ; and whitespace characters
%d127
4 SP = %x20 ; space
4 WSP = SP / HTAB ; Whitespace characters
UTF8-xtra-2-head = %xC2-DF
UTF8-xtra-3-head = %xE0 %xA0-BF / %xE1-EC %x80-BF /
%xED %x80-9F / %xEE-EF %x80-BF
UTF8-xtra-4-head = %xF0 %x90-BF / %xF1-F7 %x80-BF
UTF8-xtra-5-head = %xF8 %x88-BF / %xF9-FB %x80-BF
UTF8-xtra-6-head = %xFC %x84-BF / %xFD %x80-BF
UTF8-xtra-char = UTF8-xtra-2-head 1( UTF8-xtra-tail ) /
UTF8-xtra-3-head 1( UTF8-xtra-tail ) /
UTF8-xtra-4-head 2( UTF8-xtra-tail ) /
UTF8-xtra-5-head 3( UTF8-xtra-tail ) /
UTF8-xtra-6-head 4( UTF8-xtra-tail )
UTF8-xtra-tail = %x80-BF
2 atext = ALPHA / DIGIT /
"!" / "#" / ; Any character except
"$" / "%" / ; controls, SP, and specials.
"&" / "'" / ; Used for atoms
"*" / "+" /
"-" / "/" /
"=" / "?" /
"^" / "_" /
"`" / "{" /
"|" / "}" /
"~"
2 atom = [CFWS] 1*atext [CFWS]
2 ccontent = ctext / quoted-pair / comment
2 comment = "(" *([FWS] ccontent) [FWS] ")"
2* ctext = NO-WS-CTL / ; all of <text> except
%d33-39 / ; SP, HTAB, "(", ")"
%d42-91 / ; and "\"
%d93-126 /
UTF8-xtra-char
2 dcontent = dtext / quoted-pair
2 dot-atom = [CFWS] dot-atom-text [CFWS]
2 dot-atom-text = 1*atext *( "." 1*atext )
2 dtext = NO-WS-CTL / ; Non white space controls
%d33-90 / ; The rest of the US-ASCII
%d94-126 ; characters not including
; "[", "]", or "\"
2 phrase = 1*word
2 qcontent = qtext / quoted-pair
2* qtext = NO-WS-CTL / ; all of <text> except
%d33 / ; SP, HTAB, "\" and DQUOTE
%d35-91 /
%d93-126 /
UTF8-xtra-char
2 quoted-pair = "\" text
2 quoted-string = [CFWS] DQUOTE
*( [FWS] qcontent ) [FWS]
DQUOTE [CFWS]
2 specials = "(" / ")" / ; Special characters used in
"<" / ">" / ; other parts of the syntax
"[" / "]" /
":" / ";" /
"@" / "\" /
"," / "." /
DQUOTE
strict-qcontent = strict-qtext / strict-quoted-pair
strict-quoted-pair = "\" strict-text
strict-quoted-string
= [CFWS] DQUOTE
*( [FWS] strict-qcontent ) [FWS]
DQUOTE [CFWS]
strict-qtext = NO-WS-CTL / ; qtext restricted to
%d33 / ; US-ASCII
%d35-91 /
%d93-126
strict-text = %d1-9 / ; text restricted to
%d11-12 / ; US-ASCII
%d14-127
2* text = %d1-9 / ; all UTF-8 characters except
%d11-12 / ; US-ASCII NUL, CR and LF
%d14-127 /
UTF8-xtra-char
5 tspecials = "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / DQUOTE /
"/" / "[" / "]" / "?" / "="
2* utext = NO-WS-CTL / ; Non white space controls
%d33-126 / ; The rest of US-ASCII
UTF8-xtra-char
2 word = atom / quoted-string
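The UTF8-xtra-char rules above can be transcribed fairly directly into a byte-level regular expression. The Python sketch below does that, purely as an illustration; the names `TAIL`, `HEAD2` to `HEAD6`, `UTF8_XTRA_CHAR` and `is_utf8_xtra_char` are hypothetical. It follows the rules as written, including the 5- and 6-octet forms, which Python's own UTF-8 decoder does not accept.

```python
import re

TAIL = rb"[\x80-\xBF]"   # UTF8-xtra-tail
HEAD2 = rb"[\xC2-\xDF]"  # UTF8-xtra-2-head
HEAD3 = (rb"(?:\xE0[\xA0-\xBF]|[\xE1-\xEC][\x80-\xBF]"
         rb"|\xED[\x80-\x9F]|[\xEE-\xEF][\x80-\xBF])")
HEAD4 = rb"(?:\xF0[\x90-\xBF]|[\xF1-\xF7][\x80-\xBF])"
HEAD5 = rb"(?:\xF8[\x88-\xBF]|[\xF9-\xFB][\x80-\xBF])"
HEAD6 = rb"(?:\xFC[\x84-\xBF]|\xFD[\x80-\xBF])"

# UTF8-xtra-char: each head followed by the stated number of tails.
UTF8_XTRA_CHAR = re.compile(
    b"|".join([
        HEAD2 + TAIL,
        HEAD3 + TAIL,
        HEAD4 + TAIL + b"{2}",
        HEAD5 + TAIL + b"{3}",
        HEAD6 + TAIL + b"{4}",
    ])
)


def is_utf8_xtra_char(data: bytes) -> bool:
    """True if data is exactly one UTF8-xtra-char."""
    return UTF8_XTRA_CHAR.fullmatch(data) is not None


print(is_utf8_xtra_char("\u00e9".encode("utf-8")))  # two-octet character
print(is_utf8_xtra_char(b"\xc0\x80"))               # overlong form
```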
B.2 Basic Forms
{USENET}-header = {USENET}-name ":" 1*SP {USENET}-content
*( [CFWS] ";" ( {USENET}-parameter /
other-parameter ) )
2 addr-spec = local-part "@" domain
2 address = mailbox / group
2 address-list = address *( "," address )
2 angle-addr = [CFWS] "<" addr-spec ">" [CFWS]
article = 1*( header CRLF ) separator body
attribute = {USENET}-token / iana-token / x-token
body = *( *998text CRLF )
2 display-name = phrase
2 date = day month year
2 date-time = [ day-of-week "," ] date FWS time [CFWS]
2 day = [FWS] 1*2DIGIT
2 day-name = "Mon" / "Tue" / "Wed" / "Thu" /
"Fri" / "Sat" / "Sun"
2 day-of-week = [FWS] day-name
2 domain = dot-atom / domain-literal
2 domain-literal = [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS]
2 group = display-name ":" [ mailbox-list / CFWS ] ";"
[CFWS]
header = {USENET}-header / other-header
header-name = 1*name-character *( "-" 1*name-character )
2 hour = 2DIGIT
iana-token = <A token defined in an experimental
or standards-track RFC and registered
with IANA>
2* local-part = dot-atom / strict-quoted-string
2 mailbox = name-addr / addr-spec
2 mailbox-list = mailbox *( "," mailbox )
2 minute = 2DIGIT
2 month = FWS month-name FWS
2 month-name = "Jan" / "Feb" / "Mar" / "Apr" /
"May" / "Jun" / "Jul" / "Aug" /
"Sep" / "Oct" / "Nov" / "Dec"
2 name-addr = [display-name] angle-addr
name-character = ALPHA / DIGIT
other-header = header-name ":" 1*SP other-content
other-content
= <the content of a header defined by some
other standard>
other-parameter
= attribute "=" value
2 second = 2DIGIT
separator = CRLF
2 time = time-of-day FWS zone
2 time-of-day = hour ":" minute [ ":" second ]
5* token = [CFWS] token-core [CFWS]
5* token-core = 1*<any (US-ASCII) CHAR except SP, CTLs,
or tspecials>
5 value = token / quoted-string
x-token = [CFWS] "x-" token-core [CFWS]
2 year = 4*DIGIT
2* zone = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
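The effect of the modified zone rule (numeric offsets, "UT" and "GMT" only) can be seen in a small Python sketch. This is an approximation under assumptions: FWS is simplified to spaces and tabs with no folding, and the names `ZONE`, `TIME_OF_DAY`, `TIME` and `parse_time` are hypothetical. ABNF literal strings are case-insensitive, hence the IGNORECASE flag.

```python
import re

# zone        = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
ZONE = r"(?:[+-][0-9]{4}|UT|GMT)"
# time-of-day = hour ":" minute [ ":" second ]
TIME_OF_DAY = r"[0-9]{2}:[0-9]{2}(?::[0-9]{2})?"
# time        = time-of-day FWS zone   (FWS simplified to 1*WSP here)
TIME = re.compile(rf"({TIME_OF_DAY})[ \t]+({ZONE})", re.IGNORECASE)


def parse_time(text: str):
    """Return (time-of-day, zone), or None if text does not match."""
    m = TIME.fullmatch(text)
    if m is None:
        return None
    return m.group(1), m.group(2)


print(parse_time("13:56:14 +0000"))
print(parse_time("13:56:14 CST"))  # alphabetic zones other than UT/GMT fail
```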
B.3 Headers
B.3.1 Template definitions
{CONTROL}-verb = <the verb defined in this standard
(or an extension of it) for a specific
{CONTROL} message>
{CONTROL}-arguments = <the arguments defined in this standard
(or an extension of it) for a specific
{CONTROL} message>
{USENET}-content
= <the content of a header defined in this
standard (or an extension of it) for a
specific {USENET}-header>
{USENET}-name
= <a header-name defined in this standard
(or an extension of it) for a specific
{USENET}-header>
{USENET}-parameter
= <an other-parameter defined in this standard
(or an extension of it) for a specific
{USENET}-header>
{USENET}-token = <a token defined in this standard for
use in conjunction with a specific
{USENET}-parameter>
B.3.2 Template instantiations
Approved-content = From-content
Approved-name = "Approved"
Archive-content = [CFWS] ( "no" / "yes" ) [CFWS]
Archive-name = "Archive"
Archive-parameter = Filename-token "=" value
Cancel-arguments = CFWS msg-id
Cancel-verb = "cancel"
Checkgroup-arguments = [ chkscope ] [ chksernr ]
Checkgroup-verb = "checkgroups"
Complaints-To-content= address-list
Complaints-To-name = "Complaints-To"
Control-content = {CONTROL}-verb {CONTROL}-arguments
Control-name = "Control"
Date-content = date-time
Date-name = "Date"
Distribution-content = distribution *( dist-delim distribution )
Distribution-name = "Distribution"
Expires-content = date-time
Expires-name = "Expires"
Filename-token = [CFWS] "filename" [CFWS]
Followup-To-content = Newsgroups-content / "poster"
Followup-To-name = "Followup-To"
From-content = mailbox-list
From-name = "From"
Ihave-arguments = *( msg-id SP ) relayer-name
Ihave-verb = "ihave"
Injector-Info-content= path-identity
Injector-Info-name = "Injector-Info"
Injector-Info-parameter
= posting-host-parameter /
posting-account-parameter /
posting-sender-parameter /
posting-logging-parameter /
posting-date-parameter
Keywords-content = phrase *( "," phrase )
Keywords-name = "Keywords"
Lines-content = [CFWS] 1*DIGIT
Lines-name = "Lines"
Mail-Copies-To-content
= copy-addr / "nobody" / "poster"
Mail-Copies-To-name = "Mail-Copies-To"
Message-ID-content = msg-id
Message-ID-name = "Message-ID"
Mvgroup-arguments = CFWS newsgroup-name CFWS newsgroup-name
[ CFWS newgroup-flag ]
Mvgroup-verb = "mvgroup"
Newgroup-verb = "newgroup"
Newgroup-arguments = CFWS newsgroup-name [ CFWS newgroup-flag ]
Newsgroups-content = newsgroup-name
*( *FWS ng-delim *FWS newsgroup-name )
*FWS
Newsgroups-name = "Newsgroups"
Organization-content
= 1*( [FWS] utext )
Organization-name = "Organization"
Path-content = *( path-identity [FWS] path-delimiter [FWS] )
tail-entry *FWS
Path-name = "Path"
Posted-And-Mailed-content
= "yes" / "no"
Posted-And-Mailed-name
= "Posted-And-Mailed"
Posting-Account-token= "posting-account"
Posting-Date-token = "posting-date"
Posting-Host-token = "posting-host"
Posting-Logging-token= "logging-data"
Posting-Sender-token = "sender"
References-content = msg-id *( CFWS msg-id )
References-name = "References"
Reply-To-content = address-list
Reply-To-name = "Reply-To"
Rmgroup-arguments = CFWS newsgroup-name
Rmgroup-verb = "rmgroup"
Sender-content = mailbox
Sender-name = "Sender"
Sendme-arguments = Ihave-arguments
Sendme-verb = "sendme"
Subject-content = [ back-reference ] pure-subject
Subject-name = "Subject"
Summary-content = 1*( [FWS] utext )
Summary-name = "Summary"
Supersedes-content = msg-id
Supersedes-name = "Supersedes"
User-Agent-content = product-token *( CFWS product-token )
User-Agent-name = "User-Agent"
Xref-content = [CFWS] server-name 1*( CFWS location )
Xref-name = "Xref"
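To show how one of these instantiations hangs together, here is a rough Python sketch for the Archive header with its optional filename parameter. It is deliberately simplified and rests on assumptions: [CFWS] is reduced to spaces and tabs (no comments, no folding), the value is taken to be a bare token (no quoted-string), other-parameters are not handled, and the names `TOKEN_CORE`, `ARCHIVE` and `parse_archive` are hypothetical.

```python
import re

# token-core: US-ASCII printable characters other than SP and tspecials.
TOKEN_CORE = r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+"

# Archive-name ":" 1*SP Archive-content [ ";" Filename-token "=" value ]
ARCHIVE = re.compile(
    r"Archive: +[ \t]*(?P<flag>no|yes)[ \t]*"
    r"(?:;[ \t]*filename[ \t]*=[ \t]*(?P<filename>" + TOKEN_CORE + r")[ \t]*)?",
    re.IGNORECASE,  # ABNF literal strings are case-insensitive
)


def parse_archive(line: str):
    """Return {"flag": ..., "filename": ...}, or None if there is no match."""
    m = ARCHIVE.fullmatch(line)
    if m is None:
        return None
    return {"flag": m.group("flag").lower(), "filename": m.group("filename")}


print(parse_archive("Archive: yes; filename=misc.txt"))
print(parse_archive("Archive: maybe"))
```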
B.3.3 Other header rules
arguments = *( CFWS value )
article-locator = 1*( %x21-7E ) ; US-ASCII printable characters
article-size = 1*DIGIT
back-reference = %x52.65.3A.20
; which is a case-sensitive "Re: "
batch = 1*( batch-header article )
batch-header = "#!" SP rnews SP article-size CRLF
checkgroups-body = *( valid-group CRLF )
chkscope = 1*( CFWS ["!"] newsgroup-name )
chksernr = CFWS "#" 1*DIGIT
combiner-ASCII = DIGIT / ALPHA / "+" / "-" / "_"
combiner-base = combiner-ASCII / combiner-extended
combiner-extended = <any character with a Unicode code value of
0080 or greater and a combining class of 0,
but excluding any character in Unicode
categories Cc, Cf, Cs, Zs, Zl, and Zp>
combiner-mark = <any character with a Unicode code value of
0080 or greater and a combining class other
than 0>
component = 1*component-glyph
component-glyph = combiner-base *combiner-mark
copy-addr = address-list
date-value = 1*DIGIT [ ":" date-time ]
dist-delim = ","
distribution = positive-distribution /
negative-distribution
distribution-name = ALPHA 1*distribution-rest
distribution-rest = ALPHA / "+" / "-" / "_"
groupinfo-body = [ newsgroups-tag CRLF ]
newsgroups-line CRLF
host-value = dot-atom /
[ dot-atom ":" ]
( dotted-quad / ; see [RFC 820]
ipv6-numeric ) ; see [RFC 2373]
2 id-left = dot-atom-text / no-fold-quote
2 id-right = dot-atom-text / no-fold-literal
ihave-body = *( msg-id CRLF )
location = newsgroup-name ":" article-locator
moderation-flag = %x28.4D.6F.64.65.72.61.74.65.64.29
; case sensitive "(Moderated)"
2 msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]
negative-distribution
= *FWS "!" distribution-name *FWS
newgroup-flag = "moderated"
newsgroup-description
= 1*( [WSP] utext)
newsgroup-name = component *( "." component )
newsgroups-line = newsgroup-name
[ 1*HTAB newsgroup-description ]
[ 1*WSP moderation-flag ]
newsgroups-tag = %x46.6F.72 SP %x79.6F.75.72 SP
%x6E.65.77.73.67.72.6F.75.70.73 SP
%x66.69.6C.65.3A
; case sensitive
; "For your newsgroups file:"
ng-delim = ","
2* no-fold-literal = "[" *( dtext / strict-quoted-pair ) "]"
2* no-fold-quote = DQUOTE *( strict-qtext / strict-quoted-pair ) DQUOTE
path-delimiter = "/" / "?" / "%" / "," / "!"
path-identity = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
positive-distribution
= *FWS distribution-name *FWS
posting-account-parameter
= [CFWS] Posting-Account-token [CFWS] "=" value
posting-date-parameter
= [CFWS] Posting-Date-token [CFWS] "=" [CFWS]
( date-value /
DQUOTE date-value DQUOTE ) [CFWS]
posting-host-parameter
= [CFWS] Posting-Host-token [CFWS] "=" [CFWS]
( host-value /
DQUOTE host-value DQUOTE ) [CFWS]
posting-logging-parameter
= [CFWS] Posting-Logging-token [CFWS] "=" value
posting-sender-parameter
= [CFWS] Posting-Sender-token [CFWS] "=" [CFWS]
( sender-value /
DQUOTE sender-value DQUOTE ) [CFWS]
product-token = value [ "/" product-version ]
product-version = value
pure-subject = 1*( [FWS] utext )
relayer-name = path-identity
rnews = %x72.6E.65.77.73 ; case sensitive "rnews"
sender-value = ( mailbox / "verified" )
sendme-body = ihave-body
server-name = path-identity
tail-entry = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
valid-group = newsgroups-line
verb = token
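The case-sensitive strings above are written as %x terminals, with the intended text given only in a comment. A short Python sketch can confirm that the octets agree with the comments; it is an illustration only, and the name `decode_hex_literal` is hypothetical.

```python
def decode_hex_literal(terminal: str) -> str:
    """Decode an ABNF terminal such as "%x52.65.3A.20" into its string."""
    if not terminal.startswith("%x"):
        raise ValueError("expected a %x terminal")
    # The part after "%x" is a "."-separated list of hexadecimal octets.
    return "".join(chr(int(part, 16)) for part in terminal[2:].split("."))


back_reference = decode_hex_literal("%x52.65.3A.20")
moderation_flag = decode_hex_literal("%x28.4D.6F.64.65.72.61.74.65.64.29")
rnews = decode_hex_literal("%x72.6E.65.77.73")

# newsgroups-tag is four %x terminals separated by SP.
newsgroups_tag = " ".join(
    decode_hex_literal(t)
    for t in ("%x46.6F.72", "%x79.6F.75.72",
              "%x6E.65.77.73.67.72.6F.75.70.73", "%x66.69.6C.65.3A")
)

print(repr(back_reference), moderation_flag, rnews, newsgroups_tag)
```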
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clw.cs.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5