Re: Collected syntax

From: Charles Lindsey (chl@clw.cs.man.ac.uk)
Date: Fri Feb 22 2002 - 12:10:35 CST


In <3xIcKzJT34Z8QA9I@pillar.turnpike.com> Paul Overell <paulo@turnpike.com> writes:

>Since all {USENET}-content starts with [CFWS], and we only require a
>single space after the colon, this should be

> {USENET}-header = {USENET}-name ":" SP {USENET}-content
> *( [CFWS] ";" ( {USENET}-parameter /
> other-parameter ) )

>(This just removes a trivial parsing ambiguity, it is not a material
>change to the syntax).

Well that looked simple enough, but on inspection I found that some
contents did not start with CFWS :-( .

So there followed a long trawl to fix that bug in umpteen places. I have
now established the following invariant:

        NOTE: It may be observed that every {USENET}-content begins and
        ends with an optional CFWS (or FWS in the case of the
        Newsgroups-, Distribution-, Path- and Followup-To-headers).
        Moreover, every {USENET}- or other-parameter also begins and
        ends with an optional CFWS.

The complete Collected Syntax as fixed is reproduced below. Will Paul
Overell please check it carefully to see that this problem is truly fixed?

That still leaves the question of the use of templates such as {USENET}. I
have been looking into this, and my present inclination is to change
things as you requested. I am able to do this by using the little known
"Incremental Alternatives" feature in RFC 2234, which allows you to say

        header =/ Foo-header

in order to add an extra alternative to the already-defined header rule.
This will actually make one or two other things simpler.
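
As a rough illustration of what "=/" does, here is a minimal Python sketch
that models a rule table. The names rules, define and extend are
hypothetical helpers invented for this example and are not part of RFC 2234;
the check that a rule must already exist before "=/" is applied is likewise
an assumption of the sketch.

```python
# Hypothetical mini-model of an ABNF rule table (illustration only).
rules = {}

def define(name, *alternatives):
    """Model `name = a / b`; this sketch treats redefinition as an error."""
    if name in rules:
        raise ValueError(f"rule {name!r} is already defined")
    rules[name] = list(alternatives)

def extend(name, *alternatives):
    """Model `name =/ c`: append alternatives to an already-defined rule."""
    if name not in rules:
        raise KeyError(f"rule {name!r} must be defined before '=/' is used")
    rules[name].extend(alternatives)

# header = other-header
define("header", "other-header")
# header =/ Foo-header
extend("header", "Foo-header")
# header =/ Bar-header
extend("header", "Bar-header")
```

Each header defined by the standard could then contribute its own "=/"
line next to its definition, instead of relying on a {USENET} template.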

But I have not done it yet, because I prefer to make such huge changes to
the Syntax one at a time. So let's get this CFWS business corrected first.

Appendix B - Collected Syntax

Appendix B.1 - Characters, Atoms and Folding

   In the following syntactic rules, numbers in the left hand margin
   indicate rules taken from other documents, specifically:
     2 from with the exception of those elements described therein as
       "obsolete";
     4 from;
     5 from.

   Where the number is followed by an asterisk ('*'), it indicates that
   the rule in question has been modified for the purposes of this
   standard.

4 ALPHA = %x41-5A / ; A-Z
                          %x61-7A ; a-z
2 CFWS = *([FWS] comment) (([FWS] comment) / FWS )
4 CR = %x0D ; carriage return
4 CRLF = CR LF
4 DIGIT = %x30-39 ; 0-9
4 DQUOTE = %d34 ; quote mark
2 FWS = ([*WSP CRLF] 1*WSP); Folding whitespace
4 HTAB = %x09 ; horizontal tab
4 LF = %x0A ; line feed
2 NO-WS-CTL = %d1-8 / ; US-ASCII control characters
                          %d11 / ; which do not include the
                          %d12 / ; carriage return, line feed,
                          %d14-31 / ; and whitespace characters
                          %d127
4 SP = %x20 ; space
4 WSP = SP / HTAB ; Whitespace characters
   UTF8-xtra-2-head = %xC2-DF
   UTF8-xtra-3-head = %xE0 %xA0-BF / %xE1-EC %x80-BF /
                          %xED %x80-9F / %xEE-EF %x80-BF
   UTF8-xtra-4-head = %xF0 %x90-BF / %xF1-F7 %x80-BF
   UTF8-xtra-5-head = %xF8 %x88-BF / %xF9-FB %x80-BF
   UTF8-xtra-6-head = %xFC %x84-BF / %xFD %x80-BF
   UTF8-xtra-char = UTF8-xtra-2-head 1( UTF8-xtra-tail ) /
                          UTF8-xtra-3-head 1( UTF8-xtra-tail ) /
                          UTF8-xtra-4-head 2( UTF8-xtra-tail ) /
                          UTF8-xtra-5-head 3( UTF8-xtra-tail ) /
                          UTF8-xtra-6-head 4( UTF8-xtra-tail )
   UTF8-xtra-tail = %x80-BF
2 atext = ALPHA / DIGIT /
                          "!" / "#" / ; Any character except
                          "$" / "%" / ; controls, SP, and specials.
                          "&" / "'" / ; Used for atoms
                          "*" / "+" /
                          "-" / "/" /
                          "=" / "?" /
                          "^" / "_" /
                          "`" / "{" /
                          "|" / "}" /
                          "~"
2 atom = [CFWS] 1*atext [CFWS]
2 ccontent = ctext / quoted-pair / comment
2 comment = "(" *([FWS] ccontent) [FWS] ")"
2* ctext = NO-WS-CTL / ; all of <text> except
                          %d33-39 / ; SP, HTAB, "(", ")"
                          %d42-91 / ; and "\"
                          %d93-126 /
                          UTF8-xtra-char
2 dcontent = dtext / quoted-pair
2 dot-atom = [CFWS] dot-atom-text [CFWS]
2 dot-atom-text = 1*atext *( "." 1*atext )
2 dtext = NO-WS-CTL / ; Non white space controls
                          %d33-90 / ; The rest of the US-ASCII
                          %d94-126 ; characters not including
                                           ; "[", "]", or "\"
2 phrase = 1*word
2 qcontent = qtext / quoted-pair
2* qtext = NO-WS-CTL / ; all of <text> except
                          %d33 / ; SP, HTAB, "\" and DQUOTE
                          %d35-91 /
                          %d93-126 /
                          UTF8-xtra-char
2 quoted-pair = "\" text
2 quoted-string = [CFWS] DQUOTE
                             *( [FWS] qcontent ) [FWS]
                             DQUOTE [CFWS]
2 specials = "(" / ")" / ; Special characters used in
                          "<" / ">" / ; other parts of the syntax
                          "[" / "]" /
                          ":" / ";" /
                          "@" / "\" /
                          "," / "." /
                          DQUOTE
   strict-qcontent = strict-qtext / strict-quoted-pair
   strict-quoted-pair = "\" strict-text
   strict-quoted-string
                        = [CFWS] DQUOTE
                             *( [FWS] strict-qcontent ) [FWS]
                             DQUOTE [CFWS]
   strict-qtext = NO-WS-CTL / ; qtext restricted to
                          %d33 / ; US-ASCII
                          %d35-91 /
                          %d93-126
   strict-text = %d1-9 / ; text restricted to
                          %d11-12 / ; US-ASCII
                          %d14-127
2* text = %d1-9 / ; all UTF-8 characters except
                          %d11-12 / ; US-ASCII NUL, CR and LF
                          %d14-127 /
                          UTF8-xtra-char
5 tspecials = "(" / ")" / "<" / ">" / "@" /
                          "," / ";" / ":" / "\" / DQUOTE /
                          "/" / "[" / "]" / "?" / "="
2* utext = NO-WS-CTL / ; Non white space controls
                          %d33-126 / ; The rest of US-ASCII
                          UTF8-xtra-char
2 word = atom / quoted-string
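
The UTF8-xtra-* rules above can be transcribed fairly directly into a
byte-level regular expression. The following Python sketch is only an
illustration of that transcription; the names TAIL, HEAD2 to HEAD6 and
is_utf8_xtra_char are invented for this example.

```python
import re

TAIL = rb"[\x80-\xBF]"  # UTF8-xtra-tail
HEAD2 = rb"[\xC2-\xDF]"
HEAD3 = (rb"(?:\xE0[\xA0-\xBF]|[\xE1-\xEC][\x80-\xBF]"
         rb"|\xED[\x80-\x9F]|[\xEE\xEF][\x80-\xBF])")
HEAD4 = rb"(?:\xF0[\x90-\xBF]|[\xF1-\xF7][\x80-\xBF])"
HEAD5 = rb"(?:\xF8[\x88-\xBF]|[\xF9-\xFB][\x80-\xBF])"
HEAD6 = rb"(?:\xFC[\x84-\xBF]|\xFD[\x80-\xBF])"

# UTF8-xtra-char: each head followed by the stated number of tail octets.
UTF8_XTRA_CHAR = re.compile(
    b"|".join([
        HEAD2 + TAIL,
        HEAD3 + TAIL,
        HEAD4 + TAIL + b"{2}",
        HEAD5 + TAIL + b"{3}",
        HEAD6 + TAIL + b"{4}",
    ])
)

def is_utf8_xtra_char(data: bytes) -> bool:
    """Return True if data is exactly one UTF8-xtra-char as defined above."""
    return UTF8_XTRA_CHAR.fullmatch(data) is not None
```

Note that the 3- to 6-octet heads already include the first tail octet,
which is how the rules exclude overlong encodings and the surrogate range.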

Appendix B.2 - Basic Forms

   {USENET}-header = {USENET}-name ":" SP {USENET}-content
                             *( ";" ( {USENET}-parameter /
                                      other-parameter ) )

2 addr-spec = local-part "@" domain
2 address = mailbox / group
2 address-list = address *( "," address )
2 angle-addr = [CFWS] "<" addr-spec ">" [CFWS]
   article = 1*( header CRLF ) separator body
5* attribute = {USENET}-token / iana-token / x-token
   body = *( *998text CRLF )
2 display-name = phrase
2 date = day month year
2 date-time = [ day-of-week "," ] date FWS time [CFWS]
2 day = [FWS] 1*2DIGIT
2 day-name = "Mon" / "Tue" / "Wed" / "Thu" /
                          "Fri" / "Sat" / "Sun"
2 day-of-week = [FWS] day-name
2 domain = dot-atom / domain-literal
2 domain-literal = [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS]
2 group = display-name ":" [ mailbox-list / CFWS ] ";"
                             [CFWS]
   header = {USENET}-header / other-header
   header-name = 1*name-character *( "-" 1*name-character )
2 hour = 2DIGIT
5* iana-token = <A token defined in an experimental
                             or standards-track RFC and registered with
                             IANA>
2* local-part = dot-atom / strict-quoted-string
2 mailbox = name-addr / addr-spec
2 mailbox-list = mailbox *( "," mailbox )
2 minute = 2DIGIT
2 month = FWS month-name FWS
2 month-name = "Jan" / "Feb" / "Mar" / "Apr" /
                          "May" / "Jun" / "Jul" / "Aug" /
                          "Sep" / "Oct" / "Nov" / "Dec"
2 name-addr = [display-name] angle-addr
   name-character = ALPHA / DIGIT
   other-header = header-name ":" 1*SP other-content
   other-content
                        = <the content of a header defined by some
                             other standard>
   other-parameter
                        = attribute "=" value
2 second = 2DIGIT
   separator = CRLF
2 time = time-of-day FWS zone
2 time-of-day = hour ":" minute [ ":" second ]
5* token = [CFWS] token-core [CFWS]
5* token-core = 1*<any (US-ASCII) CHAR except SP, CTLs,
                             or tspecials>
5 value = token / quoted-string
5* x-token = [CFWS] "x-" token-core [CFWS]
2 year = 4*DIGIT
2* zone = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
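
To show how the "name, colon, single SP, content" shape of a header might
be checked in practice, here is a small Python sketch. It assumes an
unfolded header line with the trailing CRLF already removed; the names
HEADER_NAME, HEADER_LINE and split_header are hypothetical. Only the one
mandatory SP is consumed, so any further white space stays in the content,
where the optional leading CFWS (or FWS) accounts for it.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
# name-character = ALPHA / DIGIT
HEADER_NAME = r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*"

# name ":" SP content; DOTALL lets the content span folded lines.
HEADER_LINE = re.compile(rf"({HEADER_NAME}): (.*)", re.DOTALL)

def split_header(line: str):
    """Split one header into (name, content), requiring ':' followed by SP."""
    m = HEADER_LINE.fullmatch(line)
    if m is None:
        raise ValueError(f"not a well-formed header: {line!r}")
    return m.group(1), m.group(2)
```

For example, split_header("Message-ID: <a@b.example>") yields the name
"Message-ID" and the content "<a@b.example>", while a line such as
"Subject:no-space" is rejected.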

Appendix B.3 - Headers

Appendix B.3.1 - Template definitions

   {CONTROL}-verb = <the verb defined in this standard
                             (or an extension of it) for a specific
                             {CONTROL} message>
   {CONTROL}-arguments = <the arguments defined in this standard
                             (or an extension of it) for a specific
                             {CONTROL} message>
   {USENET}-content
                        = <the content of a header defined in this
                             standard (or an extension of it) for a
                             specific {USENET}-header>
   {USENET}-name
                        = <a header-name defined in this standard
                             (or an extension of it) for a specific
                             {USENET}-header>
   {USENET}-parameter
                        = <an other-parameter defined in this standard
                             (or an extension of it) for a specific
                             {USENET}-header>
   {USENET}-token = <a token defined in this standard for
                             use in conjunction with a specific
                             {USENET}-parameter>

Appendix B.3.2 - Template instantiations

   Approved-content = From-content
   Approved-name = "Approved"
   Archive-content = [CFWS] ("no" / "yes" ) [CFWS]
   Archive-name = "Archive"
   Archive-parameter = Filename-token "=" value
   Cancel-arguments = CFWS msg-id
   Cancel-verb = "cancel"
   Checkgroup-arguments = [ chkscope ] [ chksernr ]
   Checkgroup-verb = "checkgroups"
   Complaints-To-content= address-list
   Complaints-To-name = "Complaints-To"
   Control-content = [CFWS] {CONTROL}-verb {CONTROL}-arguments [CFWS]
   Control-name = "Control"
   Date-content = date-time
   Date-name = "Date"
   Distribution-content = distribution *( dist-delim distribution )
   Distribution-name = "Distribution"
   Expires-content = date-time
   Expires-name = "Expires"
   Filename-token = [CFWS] "filename" [CFWS]
   Followup-To-content = Newsgroups-content / [FWS] "poster" [FWS]
   Followup-To-name = "Followup-To"
   From-content = mailbox-list
   From-name = "From"
   Ihave-arguments = *( msg-id SP ) relayer-name
   Ihave-verb = "ihave"
   Injector-Info-content= [CFWS] path-identity [CFWS]
   Injector-Info-name = "Injector-Info"
   Injector-Info-parameter
                        = posting-host-parameter /
                          posting-account-parameter /
                          posting-sender-parameter /
                          posting-logging-parameter /
                          posting-date-parameter
   Keywords-content = phrase *( "," phrase )
   Keywords-name = "Keywords"
   Lines-content = [CFWS] 1*DIGIT [CFWS]
   Lines-name = "Lines"
   Mail-Copies-To-content
                        = copy-addr / [CFWS] ( "nobody" / "poster" ) [CFWS]
   Mail-Copies-To-name = "Mail-Copies-To"
   Message-ID-content = msg-id
   Message-ID-name = "Message-ID"
   Mvgroup-arguments = CFWS newsgroup-name CFWS newsgroup-name
                             [ CFWS newgroup-flag ]
   Mvgroup-verb = "mvgroup"
   Newgroup-verb = "newgroup"
   Newgroup-arguments = CFWS newsgroup-name [ CFWS newgroup-flag ]
   Newsgroups-content = [FWS] newsgroup-name
                             *( [FWS] ng-delim [FWS] newsgroup-name )
                             [FWS]
   Newsgroups-name = "Newsgroups"
   Organization-content
                        = 1*( [FWS] utext )
   Organization-name = "Organization"
   Path-content = [FWS] *( path-identity [FWS] path-delimiter [FWS] )
                             tail-entry [FWS]
   Path-name = "Path"
   Posted-And-Mailed-content
                        = [CFWS] ( "yes" / "no" ) [CFWS]
   Posted-And-Mailed-name
                        = "Posted-And-Mailed"
   Posting-Account-token= "posting-account"
   Posting-Date-token = "posting-date"
   Posting-Host-token = "posting-host"
   Posting-Logging-token= "logging-data"
   Posting-Sender-token = "sender"
   References-content = msg-id *( CFWS msg-id )
   References-name = "References"
   Reply-To-content = address-list
   Reply-To-name = "Reply-To"
   Rmgroup-arguments = CFWS newsgroup-name
   Rmgroup-verb = "rmgroup"
   Sender-content = mailbox
   Sender-name = "Sender"
   Sendme-arguments = Ihave-arguments
   Sendme-verb = "sendme"
   Subject-content = [ [FWS] back-reference ] pure-subject
   Subject-name = "Subject"
   Summary-content = 1*( [FWS] utext )
   Summary-name = "Summary"
   Supersedes-content = msg-id
   Supersedes-name = "Supersedes"
   User-Agent-content = product-token *( CFWS product-token )
   User-Agent-name = "User-Agent"
   Xref-content = [CFWS] server-name 1*( CFWS location ) [CFWS]
   Xref-name = "Xref"

Appendix B.3.3 - Other header rules
 
   arguments = *( CFWS value )
   article-locator = 1*( %x21-7E ) ; US-ASCII printable characters
   article-size = 1*DIGIT
   back-reference = %x52.65.3A.20
                                  ; which is a case-sensitive "Re: "
   batch = 1*( batch-header article )
   batch-header = "#!" SP rnews SP article-size CRLF
   checkgroups-body = *( valid-group CRLF )
   chkscope = 1*( CFWS ["!"] newsgroup-name )
   chksernr = CFWS "#" 1*DIGIT
   combiner-ASCII = DIGIT / ALPHA / "+" / "-" / "_"
   combiner-base = combiner-ASCII / combiner-extended
   combiner-extended = <any character with a Unicode code value of
                           0080 or greater and a combining class of 0,
                           but excluding any character in Unicode
                           categories Cc, Cf, Cs, Zs, Zl, and Zp>
   combiner-mark = <any character with a Unicode code value of
                           0080 or greater and a combining class other
                           than 0>
   component = 1*component-glyph
   component-glyph = combiner-base *combiner-mark
   copy-addr = address-list
   date-value = 1*DIGIT [ ":" date-time ]
   dist-delim = ","
   distribution = positive-distribution /
                             negative-distribution
   distribution-name = ALPHA 1*distribution-rest
   distribution-rest = ALPHA / "+" / "-" / "_"
   groupinfo-body = [ newsgroups-tag CRLF ]
                             newsgroups-line CRLF
   host-value = dot-atom /
                          [ dot-atom ":" ]
                            ( dotted-quad / ; see
                              ipv6-numeric ) ; see
2 id-left = dot-atom-text / no-fold-quote
2 id-right = dot-atom-text / no-fold-literal
   ihave-body = *( msg-id CRLF )
   location = newsgroup-name ":" article-locator
   moderation-flag = %x28.4D.6F.64.65.72.61.74.65.64.29
                             ; case sensitive "(Moderated)"
2 msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]
   negative-distribution
                        = [FWS] "!" distribution-name [FWS]
   newgroup-flag = "moderated"
   newsgroup-description
                        = 1*( [WSP] utext)
   newsgroup-name = component *( "." component )
   newsgroups-line = newsgroup-name
                             [ 1*HTAB newsgroup-description ]
                             [ 1*WSP moderation-flag ]
   newsgroups-tag = %x46.6F.72 SP %x79.6F.75.72 SP
                             %x6E.65.77.73.67.72.6F.75.70.73 SP
                             %x66.69.6C.65.3A
                             ; case sensitive
                             ; "For your newsgroups file:"
   ng-delim = ","
2* no-fold-literal = "[" *( dtext / strict-quoted-pair ) "]"
2* no-fold-quote = DQUOTE *( strict-qtext / strict-quoted-pair ) DQUOTE
   path-delimiter = "/" / "?" / "%" / "," / "!"
   path-identity = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
   positive-distribution
                        = [FWS] distribution-name [FWS]
   posting-account-parameter
                        = [CFWS] Posting-Account-token [CFWS] "=" value
   posting-date-parameter
                        = [CFWS] Posting-Date-token [CFWS] "=" [CFWS]
                            ( date-value /
                              DQUOTE date-value DQUOTE ) [CFWS]
   posting-host-parameter
                        = [CFWS] Posting-Host-token [CFWS] "=" [CFWS]
                            ( host-value /
                              DQUOTE host-value DQUOTE ) [CFWS]
   posting-logging-parameter
                        = [CFWS] Posting-Logging-token [CFWS] "=" value
   posting-sender-parameter
                        = [CFWS] Posting-Sender-token [CFWS] "=" [CFWS]
                            ( sender-value /
                              DQUOTE sender-value DQUOTE ) [CFWS]
   product-token = value [ "/" product-version ]
   product-version = value
   pure-subject = 1*( [FWS] utext )
   relayer-name = path-identity
   rnews = %x72.6E.65.77.73 ; case sensitive "rnews"
   sender-value = ( mailbox / "verified" )
   sendme-body = ihave-body
   server-name = path-identity
   tail-entry = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
   valid-group = newsgroups-line
   verb = token
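
As one small worked instance of the rules above, the batch-header rule
("#!" SP rnews SP article-size CRLF) is simple enough to check with a
single pattern. The Python sketch below is illustrative only; the names
BATCH_HEADER and parse_batch_header are invented for this example, and it
deliberately says nothing about how article-size is to be interpreted.

```python
import re

# batch-header = "#!" SP rnews SP article-size CRLF
# rnews is case sensitive; article-size = 1*DIGIT
BATCH_HEADER = re.compile(rb"#! rnews ([0-9]+)\r\n")

def parse_batch_header(line: bytes) -> int:
    """Return the article-size from one batch-header, or raise ValueError."""
    m = BATCH_HEADER.fullmatch(line)
    if m is None:
        raise ValueError(f"not a batch-header: {line!r}")
    return int(m.group(1))
```

Because no IGNORECASE flag is given, "#! RNEWS 12" is rejected, matching
the case-sensitive definition of rnews.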

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clw.cs.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5



