Re: Some syntactic bits

From: Bruce Lilly (blilly@erols.com)
Date: Mon May 12 2003 - 09:57:31 CDT


Charles Lindsey wrote:

> I allow "From: Joe Q. Public <joe@public.example>" on a MUST accept
> SHOULD NOT generate yet basis, to bring us in line with RFC 2822

That is not consistent with RFC 2822, which states that an unquoted
dot in a phrase MUST NOT (not merely "SHOULD NOT") be generated [2822
sections 3.1 and 4].

> unstructured = 1*( [FWS] ( utext / encoded-word ) ) [FWS]

> phrase = 1*( [CFWS] encoded-word [CFWS] / word ) /
> extended-phrase
> extended-phrase = ( [CFWS] encoded-word [CFWS] / word )
> *( [CFWS] encoded-word [CFWS] / word /
> [CFWS] "." [CFWS] )

Unfortunately the syntax for encoded-words is not nearly that simple.
Aside from the issue of the allowable characters in encoded-words
(which differ for phrases, comments, and unstructured text), there
must be linear whitespace on both sides of each encoded-word except
in three cases, viz. an encoded-word immediately following the '(' in
a comment (which obviously applies only to a subset of structured
fields), an encoded-word immediately followed by ')' in a comment,
and an encoded-word at the end of a field (either unstructured or as
the last word of a phrase at the end of a field body). I have confirmed
this in correspondence with 2047's author, Keith Moore. The ABNF
above does not say that; it permits all of the following, which are
illegal per the intent of RFC 2047:

Comments: foo=?us-ascii?q?bar?=baz
   (where =?us-ascii?q?bar?= is intended to be an encoded-word; utext
   and encoded-words must be separated by linear whitespace)

Subject: =?us-ascii?q?encoded-word1?==?us-ascii?q?encoded-word2?=
   (encoded-words (in all contexts) must be separated from other
    encoded-words by linear whitespace)

Keywords: =?us-ascii?q?foo?=,=?us-ascii?q?bar?=
   (encoded-words need to be separated from specials, including the
   list-element separator ',', by linear whitespace)

From: Bullwinkle =?us-ascii?q?J?=. Moose <b.j.moose@frostbite-falls.mn.us>
   ('.' is a special, and must be separated from encoded-words by
    linear whitespace)

From: Bullwinkle =?us-ascii?q?J?="." Moose <b.j.moose@frostbite-falls.mn.us>
   (DQUOTE is also a special)

From: Bullwinkle=?us-ascii?q?J?= "." Moose <b.j.moose@frostbite-falls.mn.us>
   (encoded-words in a phrase must be separated from words by linear
   whitespace)

Keywords: (a comment)=?us-ascii?q?foo?=(another comment)
   (encoded-words require linear-whitespace on both sides except for
   the specific cases noted above)
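
The whitespace rule described above can be approximated mechanically.
What follows is only a rough sketch under stated assumptions, not a
parser for RFC 2047 or RFC 2822: the function name
undelimited_encoded_words and the regular expression are made up for
illustration, the input is assumed to be a field body with the field
name and the whitespace after the colon already stripped (so the start
of the body counts as delimited), and no attempt is made to check that
a '(' or ')' really belongs to a comment or that the text is outside a
quoted-string.

```python
import re

# Hypothetical, simplified pattern for something that looks like an
# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]+\?=")

WHITESPACE = " \t\r\n"


def undelimited_encoded_words(body):
    """Return the encoded-word look-alikes in a field body that are not
    set off by linear whitespace on both sides.

    Exceptions allowed, following the three cases in the text: directly
    after '(', directly before ')', and at the end of the body.
    Assumption: the start of the body is also treated as delimited.
    """
    problems = []
    for match in ENCODED_WORD.finditer(body):
        start, end = match.start(), match.end()
        before = body[start - 1] if start > 0 else None
        after = body[end] if end < len(body) else None
        left_ok = before is None or before in WHITESPACE or before == "("
        right_ok = after is None or after in WHITESPACE or after == ")"
        if not (left_ok and right_ok):
            problems.append(match.group(0))
    return problems
```

Applied to the bodies of the examples above, each one is flagged;
"Bullwinkle =?us-ascii?q?J?= Moose <b.j.moose@frostbite-falls.mn.us>"
and "(=?us-ascii?q?foo?=)" are not.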

> Msg-ids are redefined [...]

> msg-id = "<" id-left "@" id-right ">"
> id-left = dot-atom-text / no-fold-quote
> id-right = dot-atom-text / no-fold-literal
> no-fold-quote = DQUOTE
> *( mqtext / "\\" / "\" DQUOTE )
> mqspecial
> *( mqtext / "\\" / "\" DQUOTE )
> DQUOTE
> mqtext = NO-WS-CTL / ; all of <text> except
> %d33 / ; SP, HTAB, "\", ">"
> %d35-61 / ; and DQUOTE
> %d63-91 /
> %d93-126
> mqspecial = "(" / ")" / ; same as specials except
> "<" / ; "\" and DQUOTE quoted
> "[" / "]" / ; and ">" omitted
> ":" / ";" /
> "@" / "\\" /
> "," / "." /
> "\" DQUOTE
> no-fold-literal = "[" *( mdtext / "\[" / "\]" / "\\" ) "]"
> mdtext = NO-WS-CTL / ; Non white space controls
> %d33-61 / ; The rest of the US-ASCII
> %d63-90 / ; characters not including
> %d94-126 ; ">", "[", "]", or "\"

Actually what is redefined is *not* the msg-id production but
no-fold-quote and no-fold-literal. In any event, it would be better
to use different terms for different entities to avoid confusion, e.g.
usefor-msg-id, usefor-id-left, usefor-id-right, usefor-no-fold-quote,
usefor-no-fold-literal.



