From: Bruce Lilly (blilly@erols.com)
Date: Wed May 21 2003 - 09:15:05 CDT
Pete Resnick wrote:
> On 5/20/03 at 12:43 AM -0400, Bruce Lilly wrote:
>
>> It's not at all contrived -- it illustrates a very real issue, using a
>> registered charset and registered language tag.
>
>
> Have we ever seen this charset with that language tag in a
> 2046-formatted field?

Perhaps not -- have we ever seen a message exactly like the
example in RFC 2822 A.5? Examples are just that; they illustrate
what might happen and the issues that arise. Engineering (in
this case of a protocol or message format) should take into account
the worst-case scenarios, not merely what has in fact already
happened -- the latch on the stable door should be properly
designed *before* the horses all run away. Without proper attention
to these details, one does not have engineering; one has tinkering.

As has been explained regarding this topic, the issue is the
conflict between the must-have-non-whitespace-content-on-first-line
restriction and other requirements, including the maximum line
length when an encoded-word is used. The conflict arises when the
combined length of the field name, the colon, any CFWS (excluding the CRLF
that starts a continuation line), and the encoded-word exceeds
76 octets. The encoded-word length is the sum of the lengths
of the lead-in, tail-out, and internal delimiters (a fixed total of
6 octets), the charset, the asterisk and language tag if present,
the encoding tag, and the encoded text. The problem can arise with shorter
charset names when any of the other items is longer: the field name
(From isn't exactly a long one...), the encoded-text
(the minimum possible text -- one character -- was used in the example),
the language-tag (which has no maximum length), etc.
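
To make the arithmetic concrete, here is a minimal Python sketch of the
length accounting described above. The function names and the sample
values are illustrative assumptions only; they are not taken from any RFC
or from the example earlier in this thread, and a single space is assumed
as the CFWS after the colon.

```python
# Illustrative only: names and sample values are assumptions, not from any RFC.
LINE_LIMIT = 76        # octet limit cited above for a line with an encoded-word
DELIMITER_OCTETS = 6   # "=?" + "?" + "?" + "?="


def encoded_word_length(charset, encoding, encoded_text, language=None):
    """Sum the component lengths of =?charset[*language]?encoding?text?=."""
    # The asterisk counts only when a language tag is present.
    language_octets = 1 + len(language) if language else 0
    return (DELIMITER_OCTETS + len(charset) + language_octets
            + len(encoding) + len(encoded_text))


def first_line_length(field_name, word_octets, cfws_octets=1):
    """Field name, colon, CFWS before the encoded-word, then the word itself."""
    return len(field_name) + 1 + cfws_octets + word_octets


def fits_on_first_line(field_name, charset, encoding, encoded_text,
                       language=None, cfws_octets=1):
    """True if the field name and encoded-word fit on one line within the limit."""
    word = encoded_word_length(charset, encoding, encoded_text, language)
    return first_line_length(field_name, word, cfws_octets) <= LINE_LIMIT


# Short components: 4 + 1 + 1 + 23 = 29 octets, well within the limit.
print(fits_on_first_line("From", "ISO-8859-1", "Q", "=E9", "en"))

# Hypothetical 40-octet charset name and 30-octet language tag:
# 4 + 1 + 1 + 81 = 87 octets, so the first line cannot hold the encoded-word.
print(fits_on_first_line("From", "X" * 40, "Q", "=E9", "x" * 30))
```

Even with the shortest common field name and a one-character encoded-text,
the second case overflows; a longer field name only makes it worse.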