Re: Collected syntax

From: Charles Lindsey (chl@clw.cs.man.ac.uk)
Date: Fri Feb 08 2002 - 13:56:14 CST


In <Gr09nL.AHI@clw.cs.man.ac.uk> chl@clw.cs.man.ac.uk (Charles Lindsey) writes:

>In <7cKPhbFkxmW8QAmq@pillar.turnpike.com> Paul Overell <paulo@turnpike.com> writes:

>>I don't want to get into a discussion of the various merits of different
>>syntactic meta-languages. ABNF may not be perfect but it has become the
>>syntactic meta-language of choice for RFCs, it is a proposed standard.
>>We should use it as is, without extensions. The above is not ABNF.

>No, it is not strict ABNF, but it has been in the draft for two, maybe
>three, years now, and this is the first time somebody has complained. Does
>anybody else on this list want this to be reviewed?

Nobody has responded to this, so I propose to leave the mechanism in the
draft, but with a few tweaks in response to this discussion.

>I will grant you that a short explanatory paragraph in section 2.4 would
>help, and I will write one.

I now have:

2.4. Syntax
 
2.4.1. Syntax Notation

   This standard uses the Augmented Backus Naur Form described in [RFC
   2234]. Additionally, some syntax rules are given in the form of
   schemata from each of which several rules in the [RFC 2234] format
   can be derived. For example, the schema

      {USENET}-header = {USENET}-name ":" 1*SP {USENET}-content
                             *( [CFWS] ";" ( {USENET}-parameter /
                                             other-parameter ) )

   (see section 4.1) implies the existence of a large number of rules,
   one for each header defined by this standard. Substituting the
   template "{USENET}" by, for example, "Archive" thus gives rise to the
   actual rule

      Archive-header = Archive-name ":" 1*SP Archive-content
                             *( [CFWS] ";" ( Archive-parameter /
                                             other-parameter ) )

[Observe the templates now enclosed in {...} rather than <...>. In our
previous drafts they were not enclosed at all, but I think they need some
specific notation to make them stand out.]
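
As an illustration only (it is not part of the draft), the substitution
described in 2.4.1 can be sketched as plain textual replacement. The names
SCHEMA and instantiate() are hypothetical and exist only for this example.

```python
# Sketch: derive an actual rule from a schema by replacing the template.
# SCHEMA and instantiate() are illustrative names, not part of the draft.
SCHEMA = (
    '{USENET}-header = {USENET}-name ":" 1*SP {USENET}-content\n'
    '                       *( [CFWS] ";" ( {USENET}-parameter /\n'
    '                                       other-parameter ) )'
)


def instantiate(schema: str, template: str, name: str) -> str:
    """Replace every occurrence of '{template}' in the schema by name."""
    return schema.replace("{" + template + "}", name)


rule = instantiate(SCHEMA, "USENET", "Archive")
print(rule)
```

Every occurrence of the template is replaced, so the one schema yields
Archive-header, Archive-name, Archive-content and Archive-parameter together.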

>>From: paulo@turnpike.com

>>Because without an explicit production for the from header there is
>>nothing in your syntax to link "From" with from-content. The best it
>>could do is the general header-content.

Yes. There is now a full set of rules for {USENET}-name, such as

>Foo-name = "Foo"

Here now is the full Basic Format syntax:

4. Basic Format
 
4.1. Syntax of News Articles

   The overall syntax of a news article is:

      article = 1*( header CRLF ) separator body
      header = {USENET}-header / other-header
      {USENET}-header = {USENET}-name ":" 1*SP {USENET}-content
                             *( [CFWS] ";" ( {USENET}-parameter /
                                             other-parameter ) )
      {USENET}-name = <a header-name defined in this standard
                             (or an extension of it) for a specific
                             {USENET}-header>
      header-name = 1*name-character *( "-" 1*name-character )
      name-character = ALPHA / DIGIT
      {USENET}-content = <the content of a header defined in this
                             standard (or an extension of it) for a
                             specific {USENET}-header>
      {USENET}-parameter= <an other-parameter defined in this standard
                             (or an extension of it) for a specific
                             {USENET}-header>
      other-parameter = attribute "=" value
      attribute = {USENET}-token / iana-token / x-token
      {USENET}-token = <A token defined in this standard for
                             use in conjunction with a specific
                             {USENET}-parameter>
      iana-token = <a token defined in an experimental
                             or standards-track RFC and registered
                             with IANA>
      x-token = [CFWS] "x-" token-core [CFWS]
      token = [CFWS] token-core [CFWS]
      token-core = 1*<any (US-ASCII) CHAR except SP, CTLs,
                             or tspecials>
      tspecials = "(" / ")" / "<" / ">" / "@" /
                          "," / ";" / ":" / "\" / DQUOTE /
                          "/" / "[" / "]" / "?" / "="
      value = token / quoted-string
      other-header = header-name ":" 1*SP other-content
      other-content
                        = <the content of a header defined by some
                             other standard>
      separator = CRLF
      body = *( *998text CRLF )
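
To make the shape of the article rule concrete, here is a minimal sketch
of a splitter for it. It is an illustration under simplifying assumptions:
it checks only the header-name and the ":" 1*SP part of each header, treats
any line starting with SP or HTAB as a folded continuation, and does not
validate header content or the 998-character limit. The function name
split_article() is hypothetical.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
HEADER_NAME = r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*"
# header-name ":" 1*SP content (content itself is not validated here)
HEADER_LINE = re.compile(r"(" + HEADER_NAME + r"): +(.*)")


def split_article(raw: str):
    """Split an article into a list of (name, content) pairs and a body."""
    # The first empty line is the separator between headers and body.
    head, sep, body = raw.partition("\r\n\r\n")
    if not sep:
        raise ValueError("no separator line found")
    headers = []
    for line in head.split("\r\n"):
        if line[:1] in (" ", "\t"):
            # Folded continuation: attach it to the previous header.
            if not headers:
                raise ValueError("continuation line before any header")
            name, content = headers[-1]
            headers[-1] = (name, content + "\r\n" + line)
            continue
        match = HEADER_LINE.fullmatch(line)
        if match is None:
            raise ValueError("malformed header line: %r" % line)
        headers.append((match.group(1), match.group(2)))
    return headers, body


example = (
    "From: someone@example.invalid\r\n"
    "Subject: Re: Collected\r\n"
    " syntax\r\n"
    "\r\n"
    "Hello\r\n"
)
parsed_headers, parsed_body = split_article(example)
print(parsed_headers)
print(repr(parsed_body))
```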

And finally, here is the complete Collected Syntax as it now stands:

Appendix B - Collected Syntax

   In the following syntactic rules, numbers in the left-hand margin
   indicate rules taken from other documents, specifically:
     2 from [RFC 2822] with the exception of those elements described
       therein as "obsolete";
     4 from [RFC 2234];
     5 from [RFC 2045].

   Where the number is followed by an asterisk ('*'), it indicates that
   the rule in question has been modified for the purposes of this
   standard.

B.1 Characters, Atoms and Folding

4 ALPHA = %x41-5A / ; A-Z
                          %x61-7A ; a-z
2 CFWS = *([FWS] comment) (([FWS] comment) / FWS )
4 CR = %x0D ; carriage return
4 CRLF = CR LF
4 DIGIT = %x30-39 ; 0-9
4 DQUOTE = %d34 ; quote mark
2 FWS = ([*WSP CRLF] 1*WSP); Folding whitespace
4 HTAB = %x09 ; horizontal tab
4 LF = %x0A ; line feed
2 NO-WS-CTL = %d1-8 / ; US-ASCII control characters
                          %d11 / ; which do not include the
                          %d12 / ; carriage return, line feed,
                          %d14-31 / ; and whitespace characters
                          %d127
4 SP = %x20 ; space
4 WSP = SP / HTAB ; Whitespace characters
   UTF8-xtra-2-head = %xC2-DF
   UTF8-xtra-3-head = %xE0 %xA0-BF / %xE1-EC %x80-BF /
                          %xED %x80-9F / %xEE-EF %x80-BF
   UTF8-xtra-4-head = %xF0 %x90-BF / %xF1-F7 %x80-BF
   UTF8-xtra-5-head = %xF8 %x88-BF / %xF9-FB %x80-BF
   UTF8-xtra-6-head = %xFC %x84-BF / %xFD %x80-BF
   UTF8-xtra-char = UTF8-xtra-2-head 1( UTF8-xtra-tail ) /
                          UTF8-xtra-3-head 1( UTF8-xtra-tail ) /
                          UTF8-xtra-4-head 2( UTF8-xtra-tail ) /
                          UTF8-xtra-5-head 3( UTF8-xtra-tail ) /
                          UTF8-xtra-6-head 4( UTF8-xtra-tail )
   UTF8-xtra-tail = %x80-BF
2 atext = ALPHA / DIGIT /
                          "!" / "#" / ; Any character except
                          "$" / "%" / ; controls, SP, and specials.
                          "&" / "'" / ; Used for atoms
                          "*" / "+" /
                          "-" / "/" /
                          "=" / "?" /
                          "^" / "_" /
                          "`" / "{" /
                          "|" / "}" /
                          "~"
2 atom = [CFWS] 1*atext [CFWS]
2 ccontent = ctext / quoted-pair / comment
2 comment = "(" *([FWS] ccontent) [FWS] ")"
2* ctext = NO-WS-CTL / ; all of <text> except
                          %d33-39 / ; SP, HTAB, "(", ")"
                          %d42-91 / ; and "\"
                          %d93-126 /
                          UTF8-xtra-char
2 dcontent = dtext / quoted-pair
2 dot-atom = [CFWS] dot-atom-text [CFWS]
2 dot-atom-text = 1*atext *( "." 1*atext )
2 dtext = NO-WS-CTL / ; Non white space controls
                          %d33-90 / ; The rest of the US-ASCII
                          %d94-126 ; characters not including
                                           ; "[", "]", or "\"
2 phrase = 1*word
2 qcontent = qtext / quoted-pair
2* qtext = NO-WS-CTL / ; all of <text> except
                          %d33 / ; SP, HTAB, "\" and DQUOTE
                          %d35-91 /
                          %d93-126 /
                          UTF8-xtra-char
2 quoted-pair = "\" text
2 quoted-string = [CFWS] DQUOTE
                             *( [FWS] qcontent ) [FWS]
                             DQUOTE [CFWS]
2 specials = "(" / ")" / ; Special characters used in
                          "<" / ">" / ; other parts of the syntax
                          "[" / "]" /
                          ":" / ";" /
                          "@" / "\" /
                          "," / "." /
                          DQUOTE
   strict-qcontent = strict-qtext / strict-quoted-pair
   strict-quoted-pair = "\" strict-text
   strict-quoted-string
                        = [CFWS] DQUOTE
                             *( [FWS] strict-qcontent ) [FWS]
                             DQUOTE [CFWS]
   strict-qtext = NO-WS-CTL / ; qtext restricted to
                          %d33 / ; US-ASCII
                          %d35-91 /
                          %d93-126
   strict-text = %d1-9 / ; text restricted to
                          %d11-12 / ; US-ASCII
                          %d14-127
2* text = %d1-9 / ; all UTF-8 characters except
                          %d11-12 / ; US-ASCII NUL, CR and LF
                          %d14-127 /
                          UTF8-xtra-char
5 tspecials = "(" / ")" / "<" / ">" / "@" /
                          "," / ";" / ":" / "\" / DQUOTE /
                          "/" / "[" / "]" / "?" / "="
2* utext = NO-WS-CTL / ; Non white space controls
                          %d33-126 / ; The rest of US-ASCII
                          UTF8-xtra-char
2 word = atom / quoted-string
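
As a cross-check of the UTF8-xtra-* rules in B.1, the following sketch
transcribes them into a regular expression over bytes. It is illustrative
only; the names TAIL, UTF8_XTRA_CHAR and is_utf8_xtra_char() are
hypothetical. Note that the grammar as written admits the 5- and 6-octet
forms of older UTF-8 definitions, which a modern UTF-8 decoder would reject.

```python
import re

# UTF8-xtra-tail = %x80-BF
TAIL = rb"[\x80-\xBF]"

# Each alternative is one of the UTF8-xtra-N-head rules followed by the
# number of tail octets given in the UTF8-xtra-char rule.
UTF8_XTRA_CHAR = re.compile(
    rb"[\xC2-\xDF]" + TAIL
    + rb"|(?:\xE0[\xA0-\xBF]|[\xE1-\xEC][\x80-\xBF]"
    + rb"|\xED[\x80-\x9F]|[\xEE-\xEF][\x80-\xBF])" + TAIL
    + rb"|(?:\xF0[\x90-\xBF]|[\xF1-\xF7][\x80-\xBF])" + TAIL + rb"{2}"
    + rb"|(?:\xF8[\x88-\xBF]|[\xF9-\xFB][\x80-\xBF])" + TAIL + rb"{3}"
    + rb"|(?:\xFC[\x84-\xBF]|\xFD[\x80-\xBF])" + TAIL + rb"{4}"
)


def is_utf8_xtra_char(octets: bytes) -> bool:
    """True if octets form exactly one UTF8-xtra-char."""
    return UTF8_XTRA_CHAR.fullmatch(octets) is not None


print(is_utf8_xtra_char(b"\xc3\xa9"))      # two-octet character
print(is_utf8_xtra_char(b"\xc0\x80"))      # overlong form, head %xC0 not allowed
print(is_utf8_xtra_char(b"\xed\xa0\x80"))  # %xED must be followed by %x80-9F
```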

B.2 Basic Forms

   {USENET}-header = {USENET}-name ":" 1*SP {USENET}-content
                             *( [CFWS] ";" ( {USENET}-parameter /
                                             other-parameter ) )

2 addr-spec = local-part "@" domain
2 address = mailbox / group
2 address-list = address *( "," address )
2 angle-addr = [CFWS] "<" addr-spec ">" [CFWS]
   article = 1*( header CRLF ) separator body
   attribute = {USENET}-token / iana-token / x-token
   body = *( *998text CRLF )
2 display-name = phrase
2 date = day month year
2 date-time = [ day-of-week "," ] date FWS time [CFWS]
2 day = [FWS] 1*2DIGIT
2 day-name = "Mon" / "Tue" / "Wed" / "Thu" /
                          "Fri" / "Sat" / "Sun"
2 day-of-week = [FWS] day-name
2 domain = dot-atom / domain-literal
2 domain-literal = [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS]
2 group = display-name ":" [ mailbox-list / CFWS ] ";"
                             [CFWS]
   header = {USENET}-header / other-header
   header-name = 1*name-character *( "-" 1*name-character )
2 hour = 2DIGIT
   iana-token = <A token defined in an experimental
                             or standards-track RFC and registered
                             with IANA>
2* local-part = dot-atom / strict-quoted-string
2 mailbox = name-addr / addr-spec
2 mailbox-list = mailbox *( "," mailbox )
2 minute = 2DIGIT
2 month = FWS month-name FWS
2 month-name = "Jan" / "Feb" / "Mar" / "Apr" /
                          "May" / "Jun" / "Jul" / "Aug" /
                          "Sep" / "Oct" / "Nov" / "Dec"
2 name-addr = [display-name] angle-addr
   name-character = ALPHA / DIGIT
   other-header = header-name ":" 1*SP other-content
   other-content
                        = <the content of a header defined by some
                             other standard>
   other-parameter
                        = attribute "=" value
2 second = 2DIGIT
   separator = CRLF
2 time = time-of-day FWS zone
2 time-of-day = hour ":" minute [ ":" second ]
5* token = [CFWS] token-core [CFWS]
5* token-core = 1*<any (US-ASCII) CHAR except SP, CTLs,
                             or tspecials>
5 value = token / quoted-string
   x-token = [CFWS] "x-" token-core [CFWS]
2 year = 4*DIGIT
2* zone = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
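
The token-core, x-token and other-parameter rules can be illustrated with
the rough sketch below. It makes strong simplifying assumptions: CFWS is
reduced to optional SP/HTAB (comments are not handled), and a quoted-string
is accepted whenever the value starts and ends with DQUOTE, without checking
its contents. All function names are hypothetical.

```python
# tspecials, as listed in the grammar
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token_core(text: str) -> bool:
    """1*<any US-ASCII CHAR except SP, CTLs, or tspecials>."""
    return bool(text) and all(
        0x21 <= ord(ch) <= 0x7E and ch not in TSPECIALS for ch in text
    )


def is_x_token(text: str) -> bool:
    """x-token, with CFWS simplified to surrounding SP/HTAB."""
    core = text.strip(" \t")
    # Literal strings in ABNF are case-insensitive, so "X-" also matches.
    return core[:2].lower() == "x-" and is_token_core(core[2:])


def split_parameter(text: str):
    """Split attribute "=" value; returns (attribute, value)."""
    attribute, eq, value = text.partition("=")
    attribute = attribute.strip(" \t")
    value = value.strip(" \t")
    if not eq or not is_token_core(attribute):
        raise ValueError("malformed attribute in %r" % text)
    quoted = len(value) >= 2 and value[0] == '"' and value[-1] == '"'
    if not (quoted or is_token_core(value)):
        raise ValueError("malformed value in %r" % text)
    return attribute, value


print(split_parameter('filename="archive.txt"'))
print(is_x_token("x-foo"))
```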

B.3 Headers

B.3.1 Template definitions

   {CONTROL}-verb = <the verb defined in this standard
                             (or an extension of it) for a specific
                             {CONTROL} message>
   {CONTROL}-arguments = <the arguments defined in this standard
                             (or an extension of it) for a specific
                             {CONTROL} message>
   {USENET}-content
                        = <the content of a header defined in this
                             standard (or an extension of it) for a
                             specific {USENET}-header>
   {USENET}-name
                        = <a header-name defined in this standard
                             (or an extension of it) for a specific
                             {USENET}-header>
   {USENET}-parameter
                        = <an other-parameter defined in this standard
                             (or an extension of it) for a specific
                             {USENET}-header>
   {USENET}-token = <a token defined in this standard for
                             use in conjunction with a specific
                             {USENET}-parameter>

B.3.2 Template instantiations

   Approved-content = From-content
   Approved-name = "Approved"
   Archive-content = [CFWS] ( "no" / "yes" ) [CFWS]
   Archive-name = "Archive"
   Archive-parameter = Filename-token "=" value
   Cancel-arguments = CFWS msg-id
   Cancel-verb = "cancel"
   Checkgroup-arguments = [ chkscope ] [ chksernr ]
   Checkgroup-verb = "checkgroups"
   Complaints-To-content= address-list
   Complaints-To-name = "Complaints-To"
   Control-content = {CONTROL}-verb {CONTROL}-arguments
   Control-name = "Control"
   Date-content = date-time
   Date-name = "Date"
   Distribution-content = distribution *( dist-delim distribution )
   Distribution-name = "Distribution"
   Expires-content = date-time
   Expires-name = "Expires"
   Filename-token = [CFWS] "filename" [CFWS]
   Followup-To-content = Newsgroups-content / "poster"
   Followup-To-name = "Followup-To"
   From-content = mailbox-list
   From-name = "From"
   Ihave-arguments = *( msg-id SP ) relayer-name
   Ihave-verb = "ihave"

   Injector-Info-content= path-identity
   Injector-Info-name = "Injector-Info"
   Injector-Info-parameter
                        = posting-host-parameter /
                          posting-account-parameter /
                          posting-sender-parameter /
                          posting-logging-parameter /
                          posting-date-parameter
   Keywords-content = phrase *( "," phrase )
   Keywords-name = "Keywords"
   Lines-content = [CFWS] 1*DIGIT
   Lines-name = "Lines"
   Mail-Copies-To-content
                        = copy-addr / "nobody" / "poster"
   Mail-Copies-To-name = "Mail-Copies-To"
   Message-ID-content = msg-id
   Message-ID-name = "Message-ID"
   Mvgroup-arguments = CFWS newsgroup-name CFWS newsgroup-name
                             [ CFWS newgroup-flag ]
   Mvgroup-verb = "mvgroup"
   Newgroup-verb = "newgroup"
   Newgroup-arguments = CFWS newsgroup-name [ CFWS newgroup-flag ]
   Newsgroups-content = newsgroup-name
                             *( *FWS ng-delim *FWS newsgroup-name )
                             *FWS
   Newsgroups-name = "Newsgroups"
   Organization-content
                        = 1*( [FWS] utext )
   Organization-name = "Organization"
   Path-content = *( path-identity [FWS] path-delimiter [FWS] )
                             tail-entry *FWS
   Path-name = "Path"
   Posted-And-Mailed-content
                        = "yes" / "no"
   Posted-And-Mailed-name
                        = "Posted-And-Mailed"
   Posting-Account-token= "posting-account"
   Posting-Date-token = "posting-date"
   Posting-Host-token = "posting-host"
   Posting-Logging-token= "logging-data"
   Posting-Sender-token = "sender"
   References-content = msg-id *( CFWS msg-id )
   References-name = "References"
   Reply-To-content = address-list
   Reply-To-name = "Reply-To"
   Rmgroup-arguments = CFWS newsgroup-name
   Rmgroup-verb = "rmgroup"
   Sender-content = mailbox
   Sender-name = "Sender"
   Sendme-arguments = Ihave-arguments
   Sendme-verb = "sendme"
   Subject-content = [ back-reference ] pure-subject
   Subject-name = "Subject"
   Summary-content = 1*( [FWS] utext )
   Summary-name = "Summary"
   Supersedes-content = msg-id
   Supersedes-name = "Supersedes"
   User-Agent-content = product-token *( CFWS product-token )
   User-Agent-name = "User-Agent"
   Xref-content = [CFWS] server-name 1*( CFWS location )
   Xref-name = "Xref"

B.3.3 Other header rules

   arguments = *( CFWS value )
   article-locator = 1*( %x21-7E ) ; US-ASCII printable characters
   article-size = 1*DIGIT
   back-reference = %x52.65.3A.20
                                  ; which is a case-sensitive "Re: "
   batch = 1*( batch-header article )
   batch-header = "#!" SP rnews SP article-size CRLF
   checkgroups-body = *( valid-group CRLF )
   chkscope = 1*( CFWS ["!"] newsgroup-name )
   chksernr = CFWS "#" 1*DIGIT
   combiner-ASCII = DIGIT / ALPHA / "+" / "-" / "_"
   combiner-base = combiner-ASCII / combiner-extended
   combiner-extended = <any character with a Unicode code value of
                           0080 or greater and a combining class of 0,
                           but excluding any character in Unicode
                           categories Cc, Cf, Cs, Zs, Zl, and Zp>
   combiner-mark = <any character with a Unicode code value of
                           0080 or greater and a combining class other
                           than 0>
   component = 1*component-glyph
   component-glyph = combiner-base *combiner-mark
   copy-addr = address-list
   date-value = 1*DIGIT [ ":" date-time ]
   dist-delim = ","
   distribution = positive-distribution /
                             negative-distribution
   distribution-name = ALPHA 1*distribution-rest
   distribution-rest = ALPHA / "+" / "-" / "_"
   groupinfo-body = [ newsgroups-tag CRLF ]
                             newsgroups-line CRLF
   host-value = dot-atom /
                          [ dot-atom ":" ]
                            ( dotted-quad / ; see [RFC 820]
                              ipv6-numeric ) ; see [RFC 2373]
2 id-left = dot-atom-text / no-fold-quote
2 id-right = dot-atom-text / no-fold-literal
   ihave-body = *( msg-id CRLF )
   location = newsgroup-name ":" article-locator
   moderation-flag = %x28.4D.6F.64.65.72.61.74.65.64.29
                             ; case sensitive "(Moderated)"
2 msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]
   negative-distribution
                        = *FWS "!" distribution-name *FWS
   newgroup-flag = "moderated"
   newsgroup-description
                        = 1*( [WSP] utext)
   newsgroup-name = component *( "." component )
   newsgroups-line = newsgroup-name
                             [ 1*HTAB newsgroup-description ]
                             [ 1*WSP moderation-flag ]

   newsgroups-tag = %x46.6F.72 SP %x79.6F.75.72 SP
                             %x6E.65.77.73.67.72.6F.75.70.73 SP
                             %x66.69.6C.65.3A
                             ; case sensitive
                             ; "For your newsgroups file:"
   ng-delim = ","
2* no-fold-literal = "[" *( dtext / strict-quoted-pair ) "]"
2* no-fold-quote = DQUOTE *( strict-qtext / strict-quoted-pair ) DQUOTE
   path-delimiter = "/" / "?" / "%" / "," / "!"
   path-identity = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
   positive-distribution
                        = *FWS distribution-name *FWS
   posting-account-parameter
                        = [CFWS] Posting-Account-token [CFWS] "=" value
   posting-date-parameter
                        = [CFWS] Posting-Date-token [CFWS] "=" [CFWS]
                            ( date-value /
                              DQUOTE date-value DQUOTE ) [CFWS]
   posting-host-parameter
                        = [CFWS] Posting-Host-token [CFWS] "=" [CFWS]
                            ( host-value /
                              DQUOTE host-value DQUOTE ) [CFWS]
   posting-logging-parameter
                        = [CFWS] Posting-Logging-token [CFWS] "=" value
   posting-sender-parameter
                        = [CFWS] Posting-Sender-token [CFWS] "=" [CFWS]
                            ( sender-value /
                              DQUOTE sender-value DQUOTE ) [CFWS]
   product-token = value [ "/" product-version ]
   product-version = value
   pure-subject = 1*( [FWS] utext )
   relayer-name = path-identity
   rnews = %x72.6E.65.77.73 ; case sensitive "rnews"
   sender-value = ( mailbox / "verified" )
   sendme-body = ihave-body
   server-name = path-identity
   tail-entry = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
   valid-group = newsgroups-line
   verb = token
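
Several rules above (back-reference, moderation-flag, newsgroups-tag,
rnews) spell their strings as hexadecimal sequences because quoted ABNF
strings are case-insensitive. As an illustration, the following sketch
decodes such a terminal back into the string given in its comment. The
function name decode_hex_terminal() is hypothetical.

```python
def decode_hex_terminal(terminal: str) -> str:
    """Decode an ABNF terminal of the form %xHH.HH.HH into its characters."""
    if not terminal.startswith("%x"):
        raise ValueError("not a hexadecimal terminal: %r" % terminal)
    return "".join(chr(int(octet, 16)) for octet in terminal[2:].split("."))


print(repr(decode_hex_terminal("%x52.65.3A.20")))
print(repr(decode_hex_terminal("%x72.6E.65.77.73")))
print(repr(decode_hex_terminal("%x28.4D.6F.64.65.72.61.74.65.64.29")))
```

A matcher for these rules would then compare octets exactly, rather than
case-insensitively as it would for a quoted string such as "moderated".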

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clw.cs.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

