From: Bruce Lilly (blilly@erols.com)
Date: Mon May 12 2003 - 09:57:31 CDT
Charles Lindsey wrote:
> I allow "From: Joe Q. Public <joe@public.example>" on a MUST accept
> SHOULD NOT generate yet basis, to bring us in line with RFC 2822
That is not consistent with RFC 2822, which states that an unquoted
dot in a phrase MUST NOT (not merely "SHOULD NOT") be generated
[RFC 2822 sections 3.1 and 4].
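As a rough illustration of the generating side only, here is how one
existing library avoids emitting the unquoted dot. It uses Python's
standard email.utils.formataddr; the choice of library is mine and is
not taken from either draft.

```python
from email.utils import formataddr

# formataddr() wraps the display name in DQUOTEs when it contains a
# special such as ".", so the dot is never emitted unquoted in the phrase.
header_value = formataddr(("Joe Q. Public", "joe@public.example"))
print("From: " + header_value)
```

A parser may still choose to accept the unquoted form; the sketch says
nothing about that side.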
> unstructured = 1*( [FWS] ( utext / encoded-word ) ) [FWS]
> phrase = 1*( [CFWS] encoded-word [CFWS] / word ) /
> extended-phrase
> extended-phrase = ( [CFWS] encoded-word [CFWS] / word )
> *( [CFWS] encoded-word [CFWS] / word /
> [CFWS] "." [CFWS] )
Unfortunately the syntax for encoded words is not nearly that simple.
Aside from the issue of the allowable characters in the encoded-words
(which differ for phrases, comments, and unstructured text), there
must be linear whitespace on both sides of each encoded-word except
for three cases, viz. an encoded-word immediately following the '(' in
a comment (which obviously only applies to a subset of structured
fields), an encoded-word immediately followed by ')' in a comment,
and an encoded-word at the end of a field (either unstructured or as
the last word of a phrase at the end of a field body). I have confirmed
that in correspondence with RFC 2047's author, Keith Moore. The ABNF
above does not say that; it permits all of the following, which are
illegal per the intent of RFC 2047:
Comments: foo=?us-ascii?q?bar?=baz
(where =?us-ascii?q?bar?= is intended to be an encoded-word; utext
and encoded-words must be separated by linear whitespace)
Subject: =?us-ascii?q?encoded-word1?==?us-ascii?q?encoded-word2?=
(encoded-words (in all contexts) must be separated from other
encoded-words by linear whitespace)
Keywords: =?us-ascii?q?foo?=,=?us-ascii?q?bar?=
(encoded-words need to be separated from specials, including the
list-element separator ',', by linear whitespace)
From: Bullwinkle =?us-ascii?q?J?=. Moose <b.j.moose@frostbite-falls.mn.us>
('.' is a special, and must be separated from encoded-words by
linear whitespace)
From: Bullwinkle =?us-ascii?q?J?="." Moose <b.j.moose@frostbite-falls.mn.us>
(DQUOTE is also a special)
From: Bullwinkle=?us-ascii?q?J?= "." Moose <b.j.moose@frostbite-falls.mn.us>
(encoded-words in a phrase must be separated from words by linear
whitespace)
Keywords: (a comment)=?us-ascii?q?foo?=(another comment)
(encoded-words require linear whitespace on both sides except for
the specific cases noted above)
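The separation rule described above can be sketched as a small check.
This is a hypothetical helper, not part of any draft: the function name
and the loose regular expression are my own, it expects a complete
unfolded field line (so the space after the colon counts as the leading
whitespace), and it does not track whether a "(" or ")" really belongs
to a comment or which characters are allowed in each context.

```python
import re

# Loose pattern for anything shaped like an RFC 2047 encoded-word:
# "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[A-Za-z]\?[^?\s]+\?=")

def misplaced_encoded_words(field):
    """Return the encoded-word-like tokens in an unfolded field line
    that are not delimited as the rule above requires."""
    bad = []
    for m in ENCODED_WORD.finditer(field):
        before = field[m.start() - 1] if m.start() > 0 else ""
        after = field[m.end()] if m.end() < len(field) else ""
        # Left side: linear whitespace, or "(" (assumed to open a comment).
        left_ok = before in (" ", "\t", "(")
        # Right side: linear whitespace, ")" (assumed to close a comment),
        # or the end of the field (after == "").
        right_ok = after in ("", " ", "\t", ")")
        if not (left_ok and right_ok):
            bad.append(m.group(0))
    return bad

examples = [
    "Comments: foo=?us-ascii?q?bar?=baz",
    "Subject: =?us-ascii?q?encoded-word1?==?us-ascii?q?encoded-word2?=",
    "Keywords: =?us-ascii?q?foo?=,=?us-ascii?q?bar?=",
    "Keywords: (a comment)=?us-ascii?q?foo?=(another comment)",
    "Subject: =?us-ascii?q?foo?= =?us-ascii?q?bar?=",
]
for line in examples:
    print(line, "->", misplaced_encoded_words(line))
```

Only the last line, where whitespace separates the two encoded-words
and the second one ends the field, comes back with an empty list.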
> Msg-ids are redefined [...]
> msg-id = "<" id-left "@" id-right ">"
> id-left = dot-atom-text / no-fold-quote
> id-right = dot-atom-text / no-fold-literal
> no-fold-quote = DQUOTE
> *( mqtext / "\\" / "\" DQUOTE )
> mqspecial
> *( mqtext / "\\" / "\" DQUOTE )
> DQUOTE
> mqtext = NO-WS-CTL / ; all of <text> except
> %d33 / ; SP, HTAB, "\", ">"
> %d35-61 / ; and DQUOTE
> %d63-91 /
> %d93-126
> mqspecial = "(" / ")" / ; same as specials except
> "<" / ; "\" and DQUOTE quoted
> "[" / "]" / ; and ">" omitted
> ":" / ";" /
> "@" / "\\" /
> "," / "." /
> "\" DQUOTE
> no-fold-literal = "[" *( mdtext / "\[" / "\]" / "\\" ) "]"
> mdtext = NO-WS-CTL / ; Non white space controls
> %d33-61 / ; The rest of the US-ASCII
> %d63-90 / ; characters not including
> %d94-126 ; ">", "[", "]", or "\"
Actually what is redefined is *not* the msg-id production but
no-fold-quote and no-fold-literal. In any event, it would be better
to use different terms for different entities to avoid confusion, e.g.
usefor-msg-id, usefor-id-left, usefor-id-right, usefor-no-fold-quote,
usefor-no-fold-literal.