From owner-ietf-822@dimacs.rutgers.edu Sat Apr 6 14:57:48 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA13224; Sat, 6 Apr 91 14:12:51 EST Received: from porthos.rutgers.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA13220; Sat, 6 Apr 91 14:12:46 EST Received: by porthos.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA27039; Sat, 6 Apr 91 14:12:41 EST Date: Sat, 6 Apr 91 14:12:38 EST From: David Paul Zimmerman To: ietf-smtp@dimacs.rutgers.edu, ietf-822@dimacs.rutgers.edu Reply-To: "Reply to ietf-smtp-request or ietf-822-request -- not dpz"@cs.rutgers.edu Subject: new 822 list is alive! Message-Id: The IETF SMTP list's address is ietf-smtp@dimacs.rutgers.edu. The list's maintainer is ietf-smtp-request@dimacs.rutgers.edu. An archive of the mailing list is available via anonymous FTP from dimacs.rutgers.edu in pub/ietf-smtp-archive. The IETF 822 list's address is ietf-822@dimacs.rutgers.edu. The list's maintainer is ietf-822-request@dimacs.rutgers.edu. An archive of the mailing list is available via anonymous FTP from dimacs.rutgers.edu in pub/ietf-822-archive. Except for two specific requests (Mark Needleman and Don Jackson), everyone (including the redists) is currently on both of these lists. 
David From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 10:51:51 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03480; Mon, 8 Apr 91 10:38:46 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03476; Mon, 8 Apr 91 10:38:44 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 8 Apr 91 10:38:42 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 8 Apr 91 10:41:58 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Mon, 8 Apr 1991 10:41:51 -0400 (EDT) Message-Id: Date: Mon, 8 Apr 1991 10:41:51 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: text --> IA5 ? I've started working on yet another draft RFC-XXXX, which I hope will, as a result of our latest discussions, bring us much closer to a consensus. One of the things I'd like to do is get rid of "Content-type: text" which, as Stef, has pointed out, is kind of ambiguous. Neither Stef nor I, however, are sure what the right replacement would be. Here are some possibilities: Content-type: IA5 Content-type: USASCII Content-type: NVT-ASCII Does anyone have a strong feeling about the "right" name for this content-type, which is to be used as the formal designator for "the established default"? At this point, anyone with a strong opinion has a very good chance of winning by default.... 
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 11:58:18 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04283; Mon, 8 Apr 91 11:17:33 EDT Received: from qualcom.qualcomm.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04279; Mon, 8 Apr 91 11:17:29 EDT Received: from [129.46.4.152] with SMTP by QUALCOMM.COM (5.64+/QUALCOMM/V1.0) id AA20373 for ietf-822@dimacs.rutgers.edu; Mon, 8 Apr 91 08:17:23 -0700 Date: Mon, 8 Apr 91 08:17:23 -0700 Message-Id: <9104081517.AA20373@QUALCOMM.COM> To: ietf-822@dimacs.rutgers.edu, Nathaniel Borenstein From: jwn2@qualcom.qualcomm.com Subject: Re: text --> IA5 ? > >Content-type: IA5 Too cryptic. >Content-type: USASCII Non-Americans will complain we're being chauvinistic :-) >Content-type: NVT-ASCII Too long. If we're voting, I vote for USASCII. Actually, the US is redundant...but then we still have to distinguish 7-bit ASCII from 8-bit, eh? (No offense intended to citizens of other "American" states...) -jwn2 From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 12:21:53 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04371; Mon, 8 Apr 91 11:38:58 EDT Received: from [129.34.139.4] by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04367; Mon, 8 Apr 91 11:38:55 EDT Received: from YKTVMV by watson.ibm.com (IBM VM SMTP V2R1) with BSMTP id 4164; Mon, 08 Apr 91 11:38:50 EDT Date: 08 Apr 1991 11:37:51 EDT From: dan@watson.ibm.com (Walt Daniels) Phone: 914-784/863-6736 To: ietf-822@dimacs.rutgers.edu Message-Id: <040891.113751.dan@watson.ibm.com> Subject: text --> IA5 ? >Content-type: IA5 >Content-type: USASCII >Content-type: NVT-ASCII IA5 is a character set. USASCII is a codeset. I don't know what NVT-ASCII is. I would prefer if it referred to an ISO standard. I think IA5 is ISO 6429. In any case I think we are trying to specify the character set not the codeset. This mail is being sent from an EBCDIC system and is in IA5 but not in USASCII (at least until it leaves here :-).
From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 12:51:51 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05032; Mon, 8 Apr 91 11:51:48 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05028; Mon, 8 Apr 91 11:51:46 EDT Date: Mon 8 Apr 91 11:51:32-EDT From: John C Klensin Subject: Re: Inline excerpts (was: Re: Another look at 8 bit transport for...) To: kankkune@cs.helsinki.fi Cc: ietf-822@dimacs.rutgers.edu Message-Id: <671125892.390521.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: <9104081458.AA10916@hydra.Helsinki.FI> Mail-System-Version: >However, would it be reasonable to have inline parts as >part of multipart messages? You could use some flag to indicate that the >part should be positioned right after the previous EOL, not starting a >new line. I suppose that if one wanted to do something with "content order" (I'm not sure what happened with that thread) it would be quite natural to attach this sort of thing to it, along with very precise language about where the embedded part actually started and ended, possibly by adding some special bracketing characters that would not be EOL sensitive. E.g., Content-order: 3,delimiter=/ Content-type: ASCII /The ancient symbol / Content-order: 4, delimiter=/ Content-type: GIF Content-encoding: HEX /010.../ Whether it is worth the trouble and complexity is another question. Clearly a lot of overhead to put in a single "character".
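The bracketing idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Content-order header and its delimiter parameter are the hypothetical syntax floated in this message, not part of any adopted standard, and the function name parse_inline_part is invented here. The sketch reads the header lines of one part, then takes everything between the first and last delimiter character as the embedded content, so that line breaks around the part do not matter.

```python
import re

def parse_inline_part(text):
    """Parse one hypothetical inline part: header lines, then a delimiter-bracketed body."""
    headers = {}
    lines = text.strip().splitlines()
    i = 0
    # Header lines look like "Name: value"; stop at the first line that does not.
    while i < len(lines):
        m = re.match(r"\s*([A-Za-z-]+)\s*:\s*(.*)$", lines[i])
        if not m:
            break
        headers[m.group(1).lower()] = m.group(2).strip()
        i += 1

    # Assumed syntax from the message: "Content-order: <n>,delimiter=<char>".
    order_value = headers["content-order"]
    order = int(order_value.split(",")[0])
    delimiter = re.search(r"delimiter\s*=\s*(\S)", order_value).group(1)

    # The body is whatever sits between the first and last delimiter,
    # regardless of where the surrounding line breaks fall.
    rest = "\n".join(lines[i:])
    start = rest.index(delimiter)
    end = rest.rindex(delimiter)
    if start == end:
        raise ValueError("body is not closed by a second delimiter")
    return order, headers, rest[start + 1:end]


part = "Content-order: 3,delimiter=/\nContent-type: ASCII\n\n/The ancient symbol /\n"
order, headers, body = parse_inline_part(part)
print(order, headers["content-type"], repr(body))
```

Note that the trailing space inside the brackets survives, which is the property the bracketing characters were meant to provide.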
john ------- From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 15:21:52 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11828; Mon, 8 Apr 91 14:55:09 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11824; Mon, 8 Apr 91 14:55:07 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 8 Apr 91 14:55:03 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 8 Apr 91 14:58:20 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Mon, 8 Apr 1991 14:58:16 -0400 (EDT) Message-Id: Date: Mon, 8 Apr 1991 14:58:16 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? In-Reply-To: <9104081517.AA20373@QUALCOMM.COM> References: <9104081517.AA20373@QUALCOMM.COM> Between the answers posted to this list and the answers sent to me as mail, it's obvious that none of my suggested alternatives was good enough. The real problem, I think, is that the current default content-type for mail has never been well-enough defined for any of the more specific terms to properly apply. Thus, for example, calling it "NVT-ASCII" or "IA5" might actually be strengthening the constraints on it. While that might be desirable, it opens a major can of worms, and was not the intent. What I'm seeking is just a descriptive term for "what mail bodies are now assumed to be." How about simply Content-type: US-7-bit which is, at least, a non-technical term that somewhat describes the status quo.... 
From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 17:22:01 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14517; Mon, 8 Apr 91 16:22:08 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14510; Mon, 8 Apr 91 16:22:01 EDT Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA25266; Mon, 8 Apr 91 13:21:58 PDT Received: from vision.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA09515; Mon, 8 Apr 91 13:21:57 PDT Received: by vision.Eng.Sun.COM (4.1/SMI-4.1) id AA11991; Mon, 8 Apr 91 13:22:40 PDT Date: Mon, 8 Apr 91 13:22:40 PDT From: lau@eng.sun.com (Vincent Lau) Message-Id: <9104082022.AA11991@vision.Eng.Sun.COM> To: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? > One of the things I'd like to do is get rid of > "Content-type: text" which, as Stef, has pointed out, is kind of > ambiguous. Neither Stef nor I, however, are sure what the right > replacement would be. Here are some possibilities: > Although "text" may sound ambiguous, the contents should be human readable. I would like to suggest a slightly different approach. Warning: it may be controversial. That is to create a new header (e.g. Codeset: ) to identify the codeset being used in the contents. Why? I have an NROFF document which uses tbl and mm macros, but it contains ISO-8859-1 characters. According to RFC 1148, the content-type becomes: Content-Type: nroff; null; tbl, ms There is no place to identify the character set/codeset. If a new header is created, I can specify it as: Content-Type: nroff; null; tbl, ms Codeset: ISO-8859-1 (or whatever the convention we define) Note, it is merely a suggestion. It is controversial because some mailers don't care if nroff document contains non-7-bit-ASCII as long as Content-Encoding does the right thing. If you feel that it is too controversial, I will not mention it again. Some people may feel that Content-Type should contain the semantic meaning, but not the actual implementation. 
I am not saying that it is a right way, but it brings up an interesting question. Should we recommend the type names format: should it be in hierarchy format (such as "company"-"type") or just a flat type space? For example, in a flat type space, if company A has implemented a voice data-type and registered it as "voice", then when company B implements its own voice data-type, it must register the type as anything other than "voice". In hierarchy format, the name will be "A-voice" and "B-voice". In a drastic approach, there will be only *one* type "voice" registered and it uses a different field to identify the implementation. Any comment on this? -Vincent From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 18:22:03 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17384; Mon, 8 Apr 91 17:39:17 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17379; Mon, 8 Apr 91 17:39:13 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02030; Mon, 8 Apr 91 17:39:00 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ab02524; 8 Apr 91 13:34 PST Received: from odin.nma.com by nma.com id aa02594; 8 Apr 91 12:11 PDT To: Nathaniel Borenstein Cc: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? In-Reply-To: Your message of Mon, 08 Apr 91 10:41:51 -0500. Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Mon, 08 Apr 91 13:09:54 MDT Message-Id: <9318.671141394@nma.com> Sender: stef@nma.com I would hope that the winner (see below) would be someone with a strong logical case for the choice, like "Here is the formal, unique, adopted International Standard, designation for exactly what RFC822 intended all along!" Best...\Stef > Here are some possibilities: > Content-type: IA5 > Content-type: USASCII > Content-type: NVT-ASCII > Does anyone have a strong feeling about the "right" name for this > content-type, which is to be used as the formal designator for "the > established default"?
At this point, anyone with a strong opinion has a > very good chance of winning by default.... -- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 20:22:02 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20510; Mon, 8 Apr 91 19:57:02 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20506; Mon, 8 Apr 91 19:57:00 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10211; Mon, 8 Apr 91 19:56:57 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ac03614; 8 Apr 91 15:56 PST Received: from odin.nma.com by nma.com id aa00187; 8 Apr 91 15:16 PDT To: Nathaniel Borenstein Cc: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? In-Reply-To: Your message of Mon, 08 Apr 91 14:58:16 -0500. Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Mon, 08 Apr 91 16:15:20 MDT Message-Id: <9445.671152520@nma.com> Sender: stef@nma.com Well;-) I guess I now feel that my instincts were right on the mark. We don't know for sure what "text" means, do we! Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 20:52:03 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20519; Mon, 8 Apr 91 19:57:55 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20514; Mon, 8 Apr 91 19:57:52 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10251; Mon, 8 Apr 91 19:57:47 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ad03614; 8 Apr 91 15:56 PST Received: from odin.nma.com by nma.com id aa00201; 8 Apr 91 15:23 PDT To: John C Klensin Cc: kankkune@cs.helsinki.fi, ietf-822@dimacs.rutgers.edu Subject: Re: Inline excerpts (was: Re: Another look at 8 bit transport for...) In-Reply-To: Your message of Mon, 08 Apr 91 11:51:32 -0500. 
<671125892.390521.KLENSIN@INFOODS.MIT.EDU> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Mon, 08 Apr 91 16:22:14 MDT Message-Id: <9454.671152934@nma.com> Sender: stef@nma.com I am bothered by the comments shown below (with > prefixes). It seems to me that multi-part and multi-media parts are very different beasts, which we should understand. We should avoid mixing them together too much. I don't see any point in trying to make our multi-part mechanism into ODA where I can mix all sorts of things together in a single body part. What I want is to be able to carry an ODA body part when I need such a thing, and to carry other body-parts when they suit my needs, but I do not want to convert all of RFC822bis into a new form of ODA, or even a partial form of ODA. A half baked RFC822oda is not of much value as I see it. Let's abandon trying to go too far beyond hierarchical multi-part structures. Best...\Stef >>However, would it be reasonable to have inline parts as >>part of multipart messages? You could use some flag to indicate that the >>part should be positioned right after the previous EOL, not starting a >>new line. > >I suppose that if one wanted to do something with "content order" (I'm >not sure what happened with that thread) it would be quite natural to >attach this sort of thing to it, along with very precise language about >where the embedded part actually started and ended, possibly by adding >some special bracketing characters that would not be EOL sensitive. >E.g., > Content-order: 3,delimiter=/ > Content-type: ASCII > /The ancient symbol / > > Content-order: 4, delimiter=/ > Content-type: GIF > Content-encoding: HEX > > /010.../ > >Whether it is worth the trouble and complexity is another question. >Clearly a lot of overhead to put in a single "character".
> john From owner-ietf-822@dimacs.rutgers.edu Mon Apr 8 21:22:03 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22893; Mon, 8 Apr 91 21:16:45 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22889; Mon, 8 Apr 91 21:16:42 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA15012; Mon, 8 Apr 91 21:16:32 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ab03955; 8 Apr 91 17:15 PST Received: from odin.nma.com by nma.com id aa00341; 8 Apr 91 16:41 PDT To: jwn2@qualcom.qualcomm.com Cc: ietf-822@dimacs.rutgers.edu, Nathaniel Borenstein Subject: Re: text --> IA5 ? In-Reply-To: Your message of Mon, 08 Apr 91 08:17:23 -0800. <9104081517.AA20373@QUALCOMM.COM> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Mon, 08 Apr 91 17:40:33 MDT Message-Id: <9567.671157633@nma.com> Sender: stef@nma.com I note that you did not suggest that IA5 is too unambiguous! >>Content-type: IA5 >Too cryptic. >>Content-type: USASCII >Non-Americans will complain we're being chauvinistic :-) >>Content-type: NVT-ASCII >Too long. > >If we're voting, I vote for USASCII. Actually, the US is redundant...but >then we still have to distinguish 7-bit ASCII from 8-bit, eh? (No offense >intended to citizens of other "American" states...) -jwn2 How is it that something can be redundant, and its absence can then cause a failure to distinguish. )-:-) I vote for the one that is most unambiguous, believing that accepting ambiguity in the interests of avoiding things like "too cryptic" "chauvinistic" is just not useful in terms of our objectives. I am becoming very baffled by this discussion. Is this a standards meeting, or am I really at the IMPROV in Hollywood? I always did want to be a stand-up comic, but I never thought to do it here.
Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 03:50:35 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07630; Tue, 9 Apr 91 03:33:22 EDT Received: from citi.umich.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07625; Tue, 9 Apr 91 03:33:16 EDT Message-Id: <9104090733.AA07625@dimacs.rutgers.edu> From: Bruce Howard To: jwn2@qualcom.qualcomm.com, Stef@ics.uci.edu Cc: nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu Date: Tue, 9 Apr 91 03:32 EDT Subject: Re: text --> IA5 ? From: Einar Stefferud To: jwn2@qualcom.qualcomm.com Cc: ietf-822@dimacs.rutgers.edu, Nathaniel Borenstein Subject: Re: text --> IA5 ? I note that you did not suggest that IA5 is too unambiguous! >>Content-type: IA5 >Too cryptic. >>Content-type: USASCII >Non-Americans will complain we're being chauvinistic :-) [ rest deleted ] hmm. does spanish have any funky characters? i seem to remember reading that puerto rico voted spanish the "state" language recently...perhaps united states americans will complain as well...or not if there are no special characters. unfortunately, ignorance leaves me uncertain on this point... bruce From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 10:19:09 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16803; Tue, 9 Apr 91 10:17:46 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16799; Tue, 9 Apr 91 10:17:44 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 9 Apr 91 10:17:40 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 9 Apr 91 10:21:01 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Tue, 9 Apr 1991 10:20:56 -0400 (EDT) Message-Id: Date: Tue, 9 Apr 1991 10:20:56 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? 
In-Reply-To: <9104082022.AA11991@vision.Eng.Sun.COM> References: <9104082022.AA11991@vision.Eng.Sun.COM> Vincent's latest message was, as usual, thoughtful and reasonable. It also worries me. The problem appears to be what I would characterize as the attempt to *combine* two different content-types. In this case, we have the apparently reasonable request for nroff source with an international character set. This seems to violate a very fundamental assumption that we've been making all along, which is that mail messages (or parts of messages) are uniquely typed. Here we have a sort of "multiple inheritance" problem, and at least two radical solutions might spring to mind: -- make character sets & content types independent (Vincent mentioned this one, as did Timo earlier) -- allow somehow for general-case multiple or cascaded content-types I'm very reluctant to go down either of these paths, because I think that there is a LOT of potential complexity here to support what I really think are likely to be "pathological" cases. In this particular example, nroff is an ancient and poorly-defined "rich text" representation. Unlike most such representations, it doesn't really explicitly address character set issues, and so it is almost necessary to think of nroff and character sets as independent. However, this is NOT the case for most modern text representations. For example, when Andrew (which is by no means state-of-the-art in its dealing with multiple charactersets) sends messages in non-standard character sets, it uses a representation for them that means that the whole overall message is in US ASCII. In almost all other cases that I know of, rich text formats explicitly deal with character set issues, as well they should. That is, there is a standard file format, for each of these representations, that encodes character sets among other things. This is NOT the case for nroff, of course, for which charsets are "out-of-band" information. 
The question that remains, I think, is a simple one: how many troublesome cases like Vincent's example will there really be, and how important are they? Without much evidence to support it, I have a gut feeling that the answer is "not too many and not too important." If this is the case, we can handle them much more simply with a small proliferation of content-types, e.g.: Content-Type: nroff/iso-8859-1; null; tbl, ms In other words, if there aren't too many such cases, we can handle them by defining different content-types to handle the charset-variations within a type such as nroff. Anyway, the argument above is my gut response, which is strongly motivated by a desire not to open new cans of worms at this stage. I fully realize that there are some cans that just have to be opened, and I suppose I might yet be convinced that this is one of them, but I'm not convinced just yet... -- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 11:19:10 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA18323; Tue, 9 Apr 91 11:07:43 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA18319; Tue, 9 Apr 91 11:07:38 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa18412; 9 Apr 91 10:58 EDT To: Nathaniel Borenstein Cc: ietf-822@dimacs.rutgers.edu, gvaudre@nri.reston.va.us Subject: Compression, and nested encodings In-Reply-To: Your message of "Tue, 09 Apr 91 10:20:56 EDT." Date: Tue, 09 Apr 91 10:58:50 -0400 From: Greg Vaudreuil Message-Id: <9104091058.aa18412@NRI.NRI.Reston.VA.US> > The question that remains, I think, is a simple one: how many > troublesome cases like Vincent's example will there really be, and how > important are they? There is one more significant case we decided not to look at at St. Louis for fear of getting a migraine. Many people believe that compression should be included in the message format. I'm not sure where the best place to put it is but both of the current headers are good ideas. 
Compression can be viewed as another level of encoding, which is independent of the base encoding of a part. This is a tar|compress|uuencode case. These are all ways of altering the data representation and are not mutually exclusive. I have heard people suggest that this concept of nested encodings is a good thing. Another way to view compression is as a separate data type. This is: type: fax|compressed encoding: uuencode This argues for nested content-type fields. This idea stems from Nathaniel's question about other strange encoding cases. I'm not sure the need for nested content-types is justified by the compression case, but it is another possibility for such a facility. In the case of 8 bit nroff, or LaTeX or any of our favorite mark-up, document enhancement languages, we should not try to identify what is contained in the part. What is a LaTeX document with boxes and charts called? Text? Think of this as a parallel of ODA. We just name it and send it. If the application cannot differentiate the data, it is not up to the mail message format protocols to define it for the application. Think of Word Perfect. It can represent many foreign characters, but I do not think that we need to declare in the content-type header what characters are used in the document. Mail is just providing transport to these applications. If the nroff has a 8 bit character in it, then it obviously needs to be encoded, so specify that. This is weird in that nroff may or may not need encoding, but I do not think we need to define a character set for it. We may have to be explicit in the definition of the Type: nroff whether it needs to be encoded or not when sending over a normal 7 bit datapath, and that will largely be determined by whether or not the program itself is capable of dealing with 8 bit characters. If it is.... it must be encoded! Quoted Printable was defined for just such cases. Thoughts? Greg V.
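The "nested encodings" view above (the tar|compress|uuencode case) can be sketched as a stack of reversible layers that is applied in order when sending and undone in reverse order when receiving. This is a minimal Python sketch under stated assumptions, not a proposal for header syntax: zlib and base64 stand in for compress and uuencode, the layer names and the LAYERS table are invented for illustration, and quoted-printable is included only to show the 8-bit nroff case mentioned above.

```python
import base64
import quopri
import zlib

# Each layer is an (encode, decode) pair over bytes.
# zlib and base64 are stand-ins for compress and uuencode.
LAYERS = {
    "compress": (zlib.compress, zlib.decompress),
    "base64": (base64.b64encode, base64.b64decode),
    "quoted-printable": (quopri.encodestring, quopri.decodestring),
}

def encode(data, layers):
    """Apply the named layers in the order given (innermost first)."""
    for name in layers:
        data = LAYERS[name][0](data)
    return data

def decode(data, layers):
    """Undo the same layers in reverse order."""
    for name in reversed(layers):
        data = LAYERS[name][1](data)
    return data

def is_seven_bit(data):
    """True if every byte would survive a 7-bit mail path."""
    return all(byte < 128 for byte in data)


payload = b"fax data " * 50
wire = encode(payload, ["compress", "base64"])
print(len(payload), len(wire), is_seven_bit(wire))

# An nroff line with one ISO-8859-1 character (0xE9) needs only the lighter layer.
nroff = b".B caf\xe9\n"
print(is_seven_bit(nroff), is_seven_bit(encode(nroff, ["quoted-printable"])))
```

The receiver needs to know the layer list and its order, which is exactly the information a nested encoding header would have to carry.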
From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 11:52:56 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA19008; Tue, 9 Apr 91 11:30:08 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA18999; Tue, 9 Apr 91 11:30:05 EDT Date: Tue 9 Apr 91 11:29:58-EDT From: John C Klensin Subject: Re: text --> IA5 ? To: nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu Message-Id: <671210998.413521.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: Mail-System-Version: Nathaniel writes... >Vincent's latest message was, as usual, thoughtful and reasonable. It >also worries me. Me, too. >The question that remains, I think, is a simple one: how many >troublesome cases like Vincent's example will there really be, and how >important are they? Without much evidence to support it, I have a gut >feeling that the answer is "not too many and not too important." My gut feeling is that, if this is carried to its logical extreme, the answer would be "perhaps very important, and a great many". Consider the multipliers involved: if you accept "nroff" as motivating a content type, recall that nroff is the product of one operating system and a relatively small number of things that have cloned its (nroff's) code. But fairly low-level formatting languages exist at roughly a one-one ratio to operating systems, plus or minus a few, so... Content-type: dsr/... Content-type: script/... Content-type: scribe/... Content-type: compose/... and so on and so forth. This makes a very good place to consider how far one wants to go, and I'd refer people back to Stef's recent message for what I consider a persuasive case. There is a critter called ODA. It is designed for the handling and representation of structured, rich text format, documents that contain combinations of variant character sets and/or, e.g., special format codings. IETF management, in its infinite wisdom, has created a WG on ODA over Internet mail.
I would suggest that this is a reasonable place to draw the line and that, if the ODA types are not rich enough to handle nroff and 8859-1, we send people off to the ISO WGs and make that happen. In other words, for both this case, and for other cases where carefully-structured text is needed (maybe imbedded in-line funny glyphs, per Risto's example, maybe even ordered content sections), we introduce one additional content type, ODA, and then let those folks do their thing. The question shouldn't be whether we can diddle Content-type to handle every imaginable case (we are collectively certainly smart enough to do that), but how to figure out where to stop. I think Vincent's example should be used to open debate on the question. Stef thinks Risto's example (and my comments on it) should be used to open debate on the question. We are probably both right, not necessarily about the conclusion, but that a little meta-level consideration of whether it is desirable and necessary to make this sort of thing work as part of Content-type is in order before working on the engineering details of how to do it. Curiously, being able to say "that is an ODA problem", when appropriate, meets Nathaniel's criterion about not opening cans of worms: it involves taking the closed can, worms and all, and passing it to someone else with instructions that it is their problem. If ODA is adequate today (it may not be), "they" will have very docile worms when they do open the can.
---john ------- From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 13:19:12 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23068; Tue, 9 Apr 91 13:00:20 EDT Received: from relay.cdnnet.ca by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23064; Tue, 9 Apr 91 13:00:13 EDT Received: by relay.CDNnet.CA (4.1/1.14) id AA08437; Tue, 9 Apr 91 09:59:51 PDT Date: 9 Apr 91 9:58 -0700 From: Brian Wideen To: John C Klensin Cc: nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu In-Reply-To: <671210998.413521.KLENSIN@INFOODS.MIT.EDU> Message-Id: <9104092791*Brian.Wideen@Vancouver.osiware.bc.ca> Subject: Re: text --> IA5 ? John argues that ODA is the answer that meets key requirements regarding flexibility (to carry documents from a variety of sources) and simplicity (in terms of content-type handling). Further, details regarding format can be punted to the IETF & ISO WGs responsible. I like this general solution to a general problem, being superior to the specific solutions outlined so far. To mandate nroff is too limiting and too parochial (Unix-oriented). It does not facilitate general exchange, only allowing some members of the Unix-community to exchange documents. From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 14:19:10 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26083; Tue, 9 Apr 91 14:16:44 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26076; Tue, 9 Apr 91 14:16:37 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00602; Tue, 9 Apr 91 14:16:17 EDT Received: from nma.com by nrtc.nrtc.northrop.com id aa06980; 9 Apr 91 10:15 PST Received: from odin.nma.com by nma.com id aa01352; 9 Apr 91 10:02 PDT To: Brian Wideen Cc: John C Klensin , nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? In-Reply-To: Your message of 09 Apr 91 09:58:00 -0800. 
<9104092791*Brian.Wideen@Vancouver.osiware.bc.ca> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Tue, 09 Apr 91 11:00:51 MDT Message-Id: <9995.671220051@nma.com> Sender: stef@nma.com Hello Brian -- I believe we agree. Just want to be sure. I am not advocating ODA per se, but noting that where users want to include different fonts and different character sets in single body parts, they should do so with tools designed for that purpose, rather than imposing a general requirement on our RFC822bis design to be all things to all people, and in essence reinvent a combination of ODA/et-al. Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 14:50:31 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26911; Tue, 9 Apr 91 14:41:09 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26906; Tue, 9 Apr 91 14:41:06 EDT Date: Tue 9 Apr 91 14:40:55-EDT From: John C Klensin Subject: Re: text --> IA5 ? To: Stef@ics.uci.edu Cc: ietf-822@dimacs.rutgers.edu Message-Id: <671222455.282521.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: <9995.671220051@nma.com> Mail-System-Version: Stef writes... >I am not advocating ODA per se, but noting that where users want to >include different fonts and different character sets in single body >parts, they should do so with tools designed for that purpose, rather >than imposing a general requirement on our RFC822bis design to be all >things to all people, and in essence reinvent a combination of >ODA/et-al. Best...\Stef Me neither. ODA --probably in some combination with SGML-- just seems to be an appropriate framework for doing this sort of thing right and, perhaps more important, *once*.
It is really Brian's comment that would characterize my hope: that passing this problem off to someone else who is at an appropriate stage in *their* development to do something with it, and then "encouraging" them to make sure that cases relevant to us are covered, will result in better solutions in their area, a better RFC822 extension, and complete compatibility between X.400 and SMTP/RFC822bis for the most complex documents around. One could ask for little more; the only question is just where to draw the lines. john ------- From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 15:19:10 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA27835; Tue, 9 Apr 91 14:58:39 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA27828; Tue, 9 Apr 91 14:58:32 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03344; Tue, 9 Apr 91 14:58:14 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ad07181; 9 Apr 91 10:56 PST Received: from odin.nma.com by nma.com id aa01418; 9 Apr 91 10:53 PDT To: Bruce Howard Cc: jwn2@qualcom.qualcomm.com, nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? In-Reply-To: Your message of Tue, 09 Apr 91 03:32:00 -0500. <9104090733.AA07625@dimacs.rutgers.edu> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Tue, 09 Apr 91 11:52:01 MDT Message-Id: <10091.671223121@nma.com> Sender: stef@nma.com Oh, come off it! What is all this American-baiting about? I am only talking about putting an unambiguous label on whatever it is that RFC822 defined to be the character set/encoding for RFC822 mail. This has nothing whatever to do with why it was chosen or whether it is good, or bad, or whatever. Whatever it is, it is! Quite frankly, I am not even a little bit amused by these non-sequiturs! Over and out...\Stef > hmm. does spanish have any funky characters? 
i seem to remember > reading that puerto rico voted spanish the "state" language > recently...perhaps united states americans will complain as well...or > not if there are no special characters. unfortunately, ignorance > leaves me uncertain on this point...bruce From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 15:49:10 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28935; Tue, 9 Apr 91 15:26:24 EDT Received: from relay.cdnnet.ca by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28931; Tue, 9 Apr 91 15:26:10 EDT Received: by relay.CDNnet.CA (4.1/1.14) id AA11394; Tue, 9 Apr 91 12:25:30 PDT Date: 9 Apr 91 12:24 -0700 From: Einar Stefferud Sender: Brian Wideen To: Brian Wideen Cc: John C Klensin , nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu Reply-To: Stef@ics.uci.edu In-Reply-To: Message-Id: <9104092798*Brian.Wideen@Vancouver.osiware.bc.ca> Subject: Re: text --> IA5 ? Hello Brian -- I believe we agree. Just want to be sure. I am not advocating ODA per se, but noting that where users want to include different fonts and different character sets in single body parts, they should do so with tools designed for that purpose, rather than imposing a general requirement on our RFC822bis design to be all things to all people, and in essence reinvent a combination of ODA/et-al. 
Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 16:19:10 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29413; Tue, 9 Apr 91 15:36:41 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29404; Tue, 9 Apr 91 15:36:35 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05667; Tue, 9 Apr 91 15:36:11 EDT Date: Tue, 9 Apr 91 15:36:11 EDT From: @odin.nma.com:stef@nma.com Message-Id: <9104091936.AA05667@rutgers.edu> Received: from nma.com by nrtc.nrtc.northrop.com id ab07370; 9 Apr 91 11:34 PST Received: from odin.nma.com by nma.com id aa01483; 9 Apr 91 11:35 PDT To: John C Klensin Cc: ietf-822@dimacs.rutgers.edu ietf-822@dimacs.rutgers.EDU Subject: Re: text --> IA5 ? In-reply-to: Your message of Tue, 09 Apr 91 14: 40:55 -0500. <671222455.282521.KLENSIN@INFOODS.MIT.EDU> Reply-to: Stef@ics.uci.edu From: Einar Stefferud Date: Tue, 09 Apr 91 12:33:44 MDT Message-ID: <10179.671225624@nma.com> Sender: stef@nma.com Thanks John -- I think the right place to draw the line is at the concept of identifiable whole body parts, with perhaps a little extension to indicate the existence of some potential concurrency among some body-parts, but no requirement that the indication of concurrency must result in concurrent action on the part of any UA. (Per some comments from Nathaniel a while back.) I expect that we should not exceed the capabilities of X.400 in this regard, for any linkage among the body-parts carried in an IPM envelope. Let's not exceed X.400 in this regard. I believe that anything more complex than this should be part of a separate standard for structured multi-media objects that are entirely independent of mail, so they may be transported in any way that is interesting and available. FTP, TAPE, DISKETTES, BAR-CODES, whatever.. To me, multi-media mail is just a tool for carrying multiple objects of arbitrary kinds. 
I do not want to limit the kinds in any way, though I understand fully that we need standards so that composers can expect recipients to be able to understand what composers compose. So, I want to shunt this complexity to the multi-media object development track. Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 17:19:13 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03406; Tue, 9 Apr 91 16:59:23 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03400; Tue, 9 Apr 91 16:59:17 EDT Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA19233; Tue, 9 Apr 91 13:59:14 PDT Received: from skylark.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA15605; Tue, 9 Apr 91 13:59:12 PDT Received: by skylark.Eng.Sun.COM (4.1/SMI-4.1) id AA06062; Tue, 9 Apr 91 13:53:46 PDT Date: Tue, 9 Apr 91 13:53:46 PDT From: katin@eng.sun.com (Neil Katin) Message-Id: <9104092053.AA06062@skylark.Eng.Sun.COM> To: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? Wait. I think some people have missed Vincent's point, and have gotten too wrapped up in the particular example. The example of nroff was *not* proposing nroff as the "net standard formatting language". It was only trying to be an example of an older program which does not internally identify the character set that it uses; therefore the character set needs to be represented on the outside of the body part. There have been two proposed solutions: concatenate the character set identifier along with the type name, or represent the character set in a separate header field of its own. These solutions are clearly equivalent in the information transmission sense -- anything that one can represent the other can too. The choice between them will be made because of differences in style and religion rather than because "something couldn't be done" one way or the other. 
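The equivalence of the two proposals can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not text from any draft: the separate header name `Content-charset`, the default labels `text` and `us-ascii`, and the helper names are all hypothetical. The point is that both spellings parse into the same (type, charset) pair.

```python
DEFAULT_TYPE = "text"         # placeholder; the thread has not settled on a name
DEFAULT_CHARSET = "us-ascii"  # assumed label for the established default

def parse_headers(lines):
    """Turn 'Name: value' lines into a dict keyed by lower-cased name."""
    headers = {}
    for line in lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def body_part_info(lines):
    """Return (type, charset) from either the combined or the separate spelling."""
    headers = parse_headers(lines)
    # Combined form: "type/charset" packed into the Content-type value.
    ctype, _, inline_charset = headers.get("content-type", DEFAULT_TYPE).partition("/")
    # Separate form: a header of its own (hypothetical name) takes precedence.
    charset = headers.get("content-charset") or inline_charset or DEFAULT_CHARSET
    return ctype.strip().lower(), charset.strip().lower()

combined = ["Content-type: nroff/iso-8859-1"]
separate = ["Content-type: nroff", "Content-charset: iso-8859-1"]
print(body_part_info(combined) == body_part_info(separate))
```

Neither form carries more information than the other; what differs is whether a new attribute (compression, say) needs a new position in the type value or a new header line.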
With that said, I'd like to state my opinion: I think that the "character set encoding" information should be split to a different field. Why? I have a model where there are pieces of information that the "mail system" wants to understand, independent of the data type -- things like encoding method(s), compression method(s), character encodings, etc. This is all information on the body part "envelope"; things that are carried around in addition to the actual data. Just today we've already heard two different positional combinations of the type field: type/character_set and type/compression. One can easily believe that new things will need to be standardized. The easiest way to do this is to give these attributes their own header field in the body part, rather than trying to positionally represent them in the type field. About Nathaniel's point with respect to nroff being "old technology" that doesn't internally identify a character set (whereas things like postscript, Andrew, etc, internally represent the character set) -- like it or not these things abound in today's world, and we should make sure that we pass enough information to properly display the document on the user's screen. Neil From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 18:11:00 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04975; Tue, 9 Apr 91 17:48:28 EDT Received: from qualcom.qualcomm.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04969; Tue, 9 Apr 91 17:48:19 EDT Received: by QUALCOMM.COM (5.64+/QUALCOMM/V1.0) id AA06091 for ietf-822@dimacs.rutgers.edu; Tue, 9 Apr 91 14:48:07 -0700 Date: Tue, 9 Apr 91 14:48:07 -0700 From: jwn2@qualcomm.com (John Noerenberg) Message-Id: <9104092148.AA06091@QUALCOMM.COM> To: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com Subject: Re: text --> IA5 ? > The question that remains, I think, is a simple one: how many > troublesome cases like Vincent's example will there really be, and how > important are they? 
Without much evidence to support it, I have a gut > feeling that the answer is "not too many and not too important." If > this is the case, we can handle them much more simply with a small > proliferation of content-types, e.g.: > > Content-Type: nroff/iso-8859-1; null; tbl, ms > > In other words, if there aren't too many such cases, we can handle them > by defining different content-types to handle the charset-variations > within a type such as nroff. > I disagree, Nathaniel. Vincent's example demonstrates that the central question remains "What is meant by content-type?". Is a content-type a symbol-mapping, or is it a language-mapping? Your original choices -- IA5, USASCII, or NVT-ASCII specify a symbol map. But Vincent suggests the latter. It seems to me that part of what you are trying to do is specify the lingua franca of messages, besides providing a method to negotiate some other encoding. I believe this is the proper thing to do. But trying to enumerate the universe of language maps is probably a fruitless task. Any enumeration will restrict the set of content-types, which is definitely the wrong thing. Let us specify that the symbol set for describing message parts shall be USASCII. Further, I suggest that the interpretation of the content-type argument be left to the UAs. In the absence of any content-type specification, the bytes of data should be interpreted as USASCII symbols. This provides a framework for communication, and provides a means to negotiate more complex encodings without restricting the set of content-types available. (It also avoids having to decide if content-type is a symbol map or a language :-) But I believe this _should_ be left to the discretion of the UAs.) I think it would be wise to provide a list of well-known content types, and the rules for their interpretation as examples. But I don't believe (nor do I think it is your intent that) the RFC should restrict the range of content-types included in messages. 
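As a rough sketch of the rule proposed above (assumptions only: the table `KNOWN_TYPES`, the label `usascii`, and the function names are made up for illustration), a UA could treat a missing content-type as USASCII, consult an open-ended table of well-known types, and fall back to its own discretion for anything else:

```python
def show_usascii(data: bytes) -> str:
    # errors="replace" marks any byte outside 0-127 instead of raising
    return data.decode("ascii", errors="replace")

# Hypothetical table of well-known content-types; a UA may extend it freely,
# so the table is a list of examples rather than a restriction.
KNOWN_TYPES = {"usascii": show_usascii}

def interpret(data: bytes, content_type=None):
    """Interpret a body part the way the proposal above describes."""
    if content_type is None:
        # No content-type at all: the bytes are USASCII symbols.
        return show_usascii(data)
    handler = KNOWN_TYPES.get(content_type.strip().lower())
    if handler is None:
        # Unknown type: left to the UA; this sketch hands back the raw bytes.
        return data
    return handler(data)

print(interpret(b"plain old mail"))
```

Nothing in the lookup rejects an unlisted content-type, which is the property the message argues for.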
-jwn2 From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 20:49:12 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09767; Tue, 9 Apr 91 20:16:37 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09761; Tue, 9 Apr 91 20:16:33 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22014; Tue, 9 Apr 91 20:16:16 EDT Received: from nma.com by nrtc.nrtc.northrop.com id aa08827; 9 Apr 91 16:15 PST Received: from odin.nma.com by nma.com id aa01829; 9 Apr 91 16:08 PDT To: John Noerenberg Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com Subject: Re: text --> IA5 ? In-Reply-To: Your message of Tue, 09 Apr 91 14:48:07 -0800. <9104092148.AA06091@QUALCOMM.COM> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Tue, 09 Apr 91 17:06:07 MDT Message-Id: <10507.671241967@nma.com> Sender: stef@nma.com I am becoming more and more confused. I thought we were (at this point) only trying to decide exactly how to identify (what label to use to identify) the character set that is (was) specified in the original RFC822. How about calling it "822ASCII" since that is exactly what we mean. Now, did RFC822 really nail down what character set it meant to be used? I have heard of something called "NVT-ASCII" whatever that is. I am beginning to wonder how we have managed for all these years to not get messed up on this question of what the original character set was (is)! Beyond this, I do agree that we should decide what is the exact character set to be allowed in the body-part headers, but that was not the original question. Fine by me if it is chosen to be "822ASCII". 
Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Tue Apr 9 21:50:13 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11830; Tue, 9 Apr 91 21:18:49 EDT Received: from qualcom.qualcomm.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11826; Tue, 9 Apr 91 21:18:44 EDT Received: by QUALCOMM.COM (5.64+/QUALCOMM/V1.0) id AA21210 for ietf-822@dimacs.rutgers.edu; Tue, 9 Apr 91 18:18:33 -0700 Date: Tue, 9 Apr 91 18:18:33 -0700 From: jwn2@qualcomm.com (John Noerenberg) Message-Id: <9104100118.AA21210@QUALCOMM.COM> To: Stef@ics.uci.edu, jwn2@qualcomm.com Subject: Re: text --> IA5 ? Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com > I am becoming more and more confused. I thought we were (at this > point) only trying to decide exactly how to identify (what label to > use to identify) the character set that is (was) specified in the > original RFC822. OK, we kinda got off on the content-type tangent. > > How about calling it "822ASCII" since that is exactly what we mean. > > Now, did RFC822 really nail down what character set it meant to be > used? I have heard of something called "NVT-ASCII" whatever that is. Yes it did! Here are the relevant rules:
    CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
    text        =  <any CHAR, including bare    ; => atoms, specials,
                    CR & bare LF, but NOT       ;  comments and
                    including CRLF>             ;  quoted-strings are
                                                ;  NOT recognized.
    message     =  fields *( CRLF *text )       ; Everything after
> > I am beginning to wonder how we have managed for all these > years to not get messed up on this question of what the original > character set was (is)! Also from RFC822, this is Crocker, et al.'s reference for ASCII. ANSI. "USA Standard Code for Information Interchange," X3.4. American National Standards Institute: New York (1968). Also in: Feinler, E. and J. Postel, eds., "ARPANET Protocol Handbook", NIC 7104. When I say "ASCII", this is the ASCII I mean. But, I'm with Stef. This discussion is getting silly. I really don't care what the label is. 
Nathaniel, this is your RFC, you name the label, just pick something! :-) I promise not to say another word...at least for a while :-) -jwn2 From owner-ietf-822@dimacs.rutgers.edu Wed Apr 10 02:19:12 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20371; Wed, 10 Apr 91 01:55:24 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20367; Wed, 10 Apr 91 01:55:21 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA15514; Wed, 10 Apr 91 01:55:01 EDT Received: from nma.com by nrtc.nrtc.northrop.com id aa09697; 9 Apr 91 21:54 PST Received: from odin.nma.com by nma.com id aa02124; 9 Apr 91 21:29 PDT To: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com Cc: John Noerenberg Subject: Re: text --> IA5 ? In-Reply-To: Noerenberg message of Tue, 09 Apr 91 18:18:33 -0800. <9104100118.AA21210@QUALCOMM.COM> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Tue, 09 Apr 91 22:26:41 MDT Message-Id: <10784.671261201@nma.com> Sender: stef@nma.com OK -- From the last few messages, plus some off-list messages, we can see that the original definition was indeed ANSI X3.4, per the excerpt from RFC822. ANSI. "USA Standard Code for Information Interchange," X3.4. American National Standards Institute: New York (1968). Also in: Feinler, E. and J. Postel, eds., "ARPANET Protocol Handbook", NIC 7104. Now then, RFC822 will never be changed from what it says, so we could easily refer to 822ascii and be entirely precise in RFC822bis by reference to this excerpted section of RFC822. It is also easy to see how it got the informal code of USascii. So, I will vote for either USascii or 822ascii, in the interests of being totally precise about what we mean. We should also cite it the same way in RFC822bis as in RFC822. That should not leave any wiggle room for anyone to miss what is intended. 
I will object to just using "ascii" because there seems to be a tendency for "ascii" to take on various shades of informal meaning among the populace. e.g., "8-bit ascii" has been mentioned here, along with other sorts of ascii in Europe, etc. Are there other versions of "ascii" in fact, or are these all just some sort of techno-mythology? I expect USascii to win the clarity of expression award. Fine by me...\Stef From owner-ietf-822@dimacs.rutgers.edu Wed Apr 10 11:49:14 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03193; Wed, 10 Apr 91 11:35:12 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03188; Wed, 10 Apr 91 11:35:09 EDT Date: Wed 10 Apr 91 11:34:54-EDT From: John C Klensin Subject: Re: text --> IA5 ? To: Stef@ics.uci.edu Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com, jwn2@qualcomm.com Message-Id: <671297694.734521.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: <10784.671261201@nma.com> Mail-System-Version: >Now then, RFC822 will never be changed from what it says, so we could >easily refer to 822ascii and be entirely precise in RFC822bis by >reference to this excerpted section of RFC822. Welcome to the world of ANSI and ISO standards :-( The document referenced by RFC822, more correctly and completely cited as X3.4-1968, is no longer available. It has been superseded by later editions, and ANSI does not stock obsolete versions of Standards, nor do most libraries. I say "ANSI *and ISO*" here, because ISO behaves the same way, and one needs to be very careful about what one is referencing. I assume that most ISO Member Bodies do too, but there are probably exceptions. >It is also easy to see how it got the informal code of USascii. The organization that is now called ANSI went through an identity crisis a number of years ago in which it had trouble standardizing its own name. 
It started that period as "ASA" (the "American Standards Association"), went through "USASA" or something quite similar, and finally ended up as "ANSI", with maybe an additional version in between. This was over a period of two or three years (aside to anyone who happens to be involved in advising registration authorities: reminding them about this period is probably not considered funny by senior ANSI staff, but should be considered if rules are proposed that would constrain name bindings for all time :-) ). Anyway, these name changes caused the parallel designations of national standards to go from "American Standard..." to "United States of America Standard..." to "American National Standard..." One could have redesignated the acronyms along the way, but many (probably most) weren't, so while ASA->USASA->...->ANSI occurred, "ASCII" (which started under the original regime and name) stayed "ASCII". But *that* is how the "informal code of USascii" happened. "ASCII" was actually part of the title. Then, with the most recent revision, someone, in my opinion, screwed up and changed the name of the standard without changing the numeric designation. This was done to make symmetry with "8-bit ASCII", a separate standard which corresponds to ISO8859-1 (X3.134.1 sticks in my mind, but that might easily be wrong). That the name has been changed is not debatable; current ANSI catalogue lists it as "7-bit ASCII". >We should also cite it the >same way in RFC822bis as in RFC822. That should not leave any wiggle >room for anyone to miss what is intended. If you really want to do this -- there are a few characteristics of the 1968 version that don't precisely correspond to today's common usage, then it might be best for someone to go check the Protocol Handbook and be sure that it contains text identical to X3.4-1968, then cite that, forget about citing "X3.4" and use "822ASCII" or, better yet, "822CII" as the code. 
If I were IAB or the RFC editor, I might take exception to citing the old Protocol Handbook as the reference source for some important characteristic of a proposed new standard in 1991. But I'm not, obviously. Someone might ask for an advisory opinion on that, however. I recommend against USASCII--other than looking strange, it has all of the disadvantages attributed to "ASCII", plus just being wrong. >Are there other versions of "ascii" in fact, or >are these all just some sort of techno-mythology? Mythology and informal shorthand ways of expressing the same sorts of concepts as might have been expressed as ASCIIbis (ISO Latin-1 ?) in another community. ISCII (ISO Code...) has been used, but no one seems to know whether it refers to 646, 10646, 8859-1, 8859-2,..., or a variety of others. And then one hears about things like UK-ASCII, an oxymoron if there ever was one. --john ------- From owner-ietf-822@dimacs.rutgers.edu Wed Apr 10 19:19:17 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20424; Wed, 10 Apr 91 19:02:40 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20420; Wed, 10 Apr 91 19:02:31 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22820; Wed, 10 Apr 91 19:02:20 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ab13443; 10 Apr 91 15:01 PST Received: from odin.nma.com by nma.com id aa02674; 10 Apr 91 10:27 PDT To: John C Klensin Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com, jwn2@qualcomm.com Subject: SIGH! Re: text --> IA5 ? In-Reply-To: Your message of Wed, 10 Apr 91 11:34:54 -0500. <671297694.734521.KLENSIN@INFOODS.MIT.EDU> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Wed, 10 Apr 91 11:25:32 MDT Message-Id: <11384.671307932@nma.com> Sender: stef@nma.com OK John -- So USascii is not a good label. What are the differences between X3.4-1968 and x3.4-? 
Are they worth worrying about, in the case where we might cite X3.4- instead of RFC822 (or citing what RFC822 cited). I am beginning to favor asking the IAB, or the IETF/IESG, but first, let's ask Dave Crocker who represents them to us for this kind of question. Dave! Please come over here and put us out of our misery! Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Wed Apr 10 23:19:19 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26135; Wed, 10 Apr 91 22:21:00 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26131; Wed, 10 Apr 91 22:20:57 EDT Received: from Relay.Prime.COM by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04058; Wed, 10 Apr 91 22:20:45 EDT Message-Id: <9104110220.AA04058@rutgers.edu> Received: (from user DRB) by Relay.Prime.COM; 10 Apr 91 22:20:23 EDT To: IETF SMTP list , IETF RFC-822 list From: David Robinson Organization: Prime Computer, Inc. Systems Integration Subject: Why We Disagree So Much... A Thought Date: 10 Apr 91 22:20:24 EDT I think a source of much disagreement on these lists has to do with the differences in assumptions about the lifetime of SMTP- and 822-based mail. If you think that SMTP and 822 are short-lived, then only minor adjustments, if any, are going to be practical. Furthermore, it will be impractical to hunt down and get fixed all the implementations that are misimplementing the protocol, to say nothing of getting any enhancements (no matter how slight) implemented on all the mail relays. On the other hand, if you think that SMTP and 822 may be slightly longer lived, then both extensions to SMTP and 822 as well as hunting down and fixing mailers which are unenhanced or misimplemented are just part of normal network operation. 
-David From owner-ietf-822@dimacs.rutgers.edu Wed Apr 10 23:49:18 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA27615; Wed, 10 Apr 91 23:13:28 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA27611; Wed, 10 Apr 91 23:13:25 EDT Date: Wed 10 Apr 91 23:13:10-EDT From: John C Klensin Subject: Re: SIGH! Re: text --> IA5 ? To: Stef@ics.uci.edu Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com, jwn2@qualcomm.com Message-Id: <671339591.4521.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: <11384.671307932@nma.com> Mail-System-Version: >What are the differences between X3.4-1968 and x3.4-? I've probably got a copy of the '68 version around somewhere, but it could take a year to find it. So this is from memory (and my memory is not good about these things). Let me try a little history.... There is a tension in the char set standards as to whether a code sequence is mapped onto a concept or onto a specific glyph or symbol and, if the latter, how specific? It explodes into debates about whether the EBCDIC "solid vertical line" is a different concrete character from ASCII 7/12, |, which is usually stylized as "broken vertical line". Before anyone says "who cares", recall that there is an abstraction called "or-symbol" in several programming languages and that ISO WGs and then-TC92/SC6 decided in several cases to map it onto ASCII (and ISO646) 2/1, !, on the theory that ! looked more like or-symbol (solid vertical bar) than | did. Anyone reading this on a European national 646-variant terminal will immediately see the other reason behind the SC6 reasoning :-). The earliest character standards and, specifically, the earliest versions of ASCII, were very much in the "map to concept" camp, and filled with weasel words and "alternate stylizations". So, for example, if you received 5/14, ^, you could display it as "hat" or "caret", or as "up arrow". 
The newest character standards, e.g., the ISO8859-n set, 10646, and UNICODE, have tended to map codes onto glyphs or onto abstractions that are sufficiently precise as to not make much difference. The trend in the control characters is precisely the opposite of this trend in the graphics. The earliest standards were quite precise about what the controls were expected to do, the recent ones tend to leave those specifications for separate standards that don't contain code->graphic bindings at all. There were also a few other things. For example, in the first version of ASCII (I think the '68 version was the second, but it might have been the first, and I don't recall when this disappeared), the preferred interpretation of 0/10 was "NL" ("first character on next line") not "LF" (same character position on next line), a similar "first character on applicable line" interpretation was applied to 0/11 (VT) and 0/12 (NP or FF), and there was very clear language about what 0/13 (CR) meant. >Are they worth worrying about, in the case where we might cite >X3.4- instead of RFC822 (or citing what RFC822 cited). In my personal opinion, they are not worth worrying about. With the exception of the vertical carriage motion controls (which RFC821/822 specify themselves anyway--the elegance of requiring CR-LF combinations is that they work equally well in "old" LF==NL systems, where the CR is just noise and "new" LF==vertical_index systems, where both are needed), anything that conforms to "today" also conforms to "1968" and should produce the same page. This does impose a slight incompatibility in the other direction, but I think it is insignificant and, moreover, people have had years to get used to it. In the original ASCII, someone could print solid-vertical-bar in response to receipt of 7/12. If one had an ASCII OCR device, both solid-vertical-bar and broken-vertical-bar would produce 7/12. 
Today, if one took that text and applied, say, an ISO8859-1 OCR device, they would produce two distinct codes and, if an ASCII OCR device were used, ???. I don't consider that a big deal and strongly prefer reference to current versions of Standards when possible. But the history is above, and people should make their own decisions about how sensitive they feel. --john ------- From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 00:19:18 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29337; Thu, 11 Apr 91 00:12:09 EDT Received: from ics.uci.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29333; Thu, 11 Apr 91 00:12:03 EDT Received: from ics.uci.edu by ICS.UCI.EDU id ab24293; 10 Apr 91 21:10 PDT To: John C Klensin Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com, jwn2@qualcomm.com Subject: Re: SIGH! Re: text --> IA5 ? In-Reply-To: Wed, 10 Apr 91 23:13:10 EDT. <671339591.4521.KLENSIN@INFOODS.MIT.EDU> Cc: Einar Stefferud Date: Wed, 10 Apr 91 21:10:55 -0700 Message-Id: <24290.671343055@ics.uci.edu> From: Einar Stefferud I fear that this insignificant little question is going to take more of Nathaniel's time than we deserve to request. Let's see if we can set up tactics and strategies for getting around it all. Let's cite the new standard (someone needs to get a copy for Nathaniel, and for NIC.DDN.MIL or for Jon Postel's files), and also be explicit about the business to be sure we do not confuse anyone. What else should we note in RFC822bis to avoid confusion, or trouble? 
Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 00:49:19 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00263; Thu, 11 Apr 91 00:33:46 EDT Received: from enet-gw.pa.dec.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00244; Thu, 11 Apr 91 00:33:35 EDT Received: by enet-gw.pa.dec.com; id AA29478; Wed, 10 Apr 91 21:32:18 -0700 Received: by palo-alto.pa.dec.com; id AA05217; Wed, 10 Apr 91 21:32:08 PDT Message-Id: <9104110432.AA05217@palo-alto.pa.dec.com> To: jwn2@qualcomm.com (John Noerenberg) Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com Subject: Re: text --> IA5 ? In-Reply-To: Your message of Tue, 09 Apr 91 14:48:07 -0700. <9104092148.AA06091@QUALCOMM.COM> Date: Wed, 10 Apr 91 21:32:06 PDT From: Dave Crocker Let's try this on for size (pun?): A body part has several levels of context, or interpretation. In theory, the number of levels might get quite large, and our attempting to handle the theoretical maximum, or the possibility of infinite nesting, will bog us down. But just for the heck of it, let's try the following: 1. How I, the sender, want it interpreted; e.g., run the BP through a program that is located in file foo/bar 2. BP consists of data generically characterized in a standard category, and precisely defined by a simple citation 3. Data are being sent according to a cited encoding standard. Hence, there is the sender's view, the 'byte' (or thereabouts) view of the interpretable data, and finally the transmission convention. So, I might have 1. nroff 2. USASCII 3. Quoted Text Thoughts? 
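The three levels above can be written down as a small record, sketched here in Python. Everything named in it is an assumption made for illustration: the class, its field names, and the `X-` header names are hypothetical, and nothing here is proposed syntax.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BodyPartContext:
    sender_view: str  # level 1: how the sender wants the part interpreted
    data_form: str    # level 2: the 'byte' view, named by a simple citation
    transfer: str     # level 3: the cited encoding used for transmission

    def header_lines(self):
        # Hypothetical header names, one per level, outermost view first.
        return [
            f"X-Sender-View: {self.sender_view}",
            f"X-Data-Form: {self.data_form}",
            f"X-Transfer: {self.transfer}",
        ]

example = BodyPartContext("nroff", "USASCII", "Quoted Text")
print("\n".join(example.header_lines()))
```

Keeping the levels as separate fields means a change at one level (a different transmission convention, say) leaves the other two untouched.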
Dave From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 01:19:19 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00502; Thu, 11 Apr 91 00:41:24 EDT Received: from enet-gw.pa.dec.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00492; Thu, 11 Apr 91 00:40:48 EDT Received: by enet-gw.pa.dec.com; id AA29775; Wed, 10 Apr 91 21:38:04 -0700 Received: by palo-alto.pa.dec.com; id AA05266; Wed, 10 Apr 91 21:37:54 PDT Message-Id: <9104110437.AA05266@palo-alto.pa.dec.com> To: Stef@ics.uci.edu Cc: John C Klensin , ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com, jwn2@qualcomm.com Subject: Re: SIGH! Re: text --> IA5 ? In-Reply-To: Your message of Wed, 10 Apr 91 11:25:32 -0600. <11384.671307932@nma.com> Date: Wed, 10 Apr 91 21:37:51 PDT From: Dave Crocker This is great. I get to watch a long series of infinitely detailed messages, debating the number of ASCII's on the head of a byte and then Stef calls for me to dig you guys out. Well, precise citations and compare/contrast exercises between two international standards is not exactly my forte. Sorry. Besides, I can't believe that this is the right discussion to be having, at this stage. Yesssss, there will need to be a precise and agreed-upon citation in the final spec, but until you've got a document that is converging on finality, it probably isn't the issue to consume ourselves with. Dave From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 02:32:28 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA01584; Thu, 11 Apr 91 01:33:02 EDT Received: from ics.uci.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA01580; Thu, 11 Apr 91 01:32:57 EDT Received: from ics.uci.edu by ICS.UCI.EDU id aa29999; 10 Apr 91 22:04 PDT Date: Wed, 10 Apr 91 22:04:25 -0700 Message-Id: <29993.671346265@ics.uci.edu> From: Einar Stefferud Subject: Re: Why We Disagree So Much... 
A Thought Apparently-To: ietf-822-list ------- Blind-Carbon-Copy To: David Robinson cc: IETF SMTP list Subject: Re: Why We Disagree So Much... A Thought In-reply-to: 10 Apr 91 22:20:24 EDT. <9104110220.AA04058@rutgers.edu> Cc: "Einar Stefferud" Date: Wed, 10 Apr 91 22:04:25 -0700 Message-ID: <29993.671346265@ics.uci.edu> From: "Einar Stefferud" I am only sending this to the ietf-smtp list, because us ietf-822 people do not want to be bothered any more by this damn argument! I suggest that you keep your argument about it to your own side of the house! We don't want to hear about it! Until now, life was actually becoming sort of pleasant on the ietf-822 side of the house! Mr Chairman, will you please rule this discussion out of order on the ietf-822 side of the house! Best...\Stef ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ David -- NO! I think the cause is orthogonal to your continuum. It is precisely because of the large immobile installed base that is perceived as very likely to never be upgraded under any conditions, that causes me, and I believe many others, to want to do nothing to SMTP (RFC821) that will break old systems. The whole idea of just declaring older installed systems broken in the effort to move forward is just a very very bad idea, and we should put it away forever in the context of SMTP(RFC821). You can carry 8-bit stuff in 7-bit SMTP if you want to, so you are only making a very small gain in bandwidth efficiency anyway. (But we have been over this endlessly, so I will not repeat any more of it.) If you want to go ahead anyway, then declare that you are going to define a new internet mail standard (to do what X.400 does, and more if you can) and run it as a 3rd alternative MTA system, with gateways between it and RFC821/822 systems to protect RFC821/822 systems from any and all ill effects. All you need to do is define gateways between your RFC-newmta systems and RFC821/822 systems and X.400 MTA systems. 
Let's get the real issue out on the table here. I always say that it is much easier to hold an autopsy if you have a corpse on the table! Some want a new INTERNET RFC STANDARD blessed MTA system to do 8-bit mail (at minimum), and they don't want to use X.400 to do it, so they want to convert an existing installed base (7bit RFC821) into something new. They don't mind that it breaks lots of mail systems that lots of people are using in production mode, with no desire to fix it, cause it ain't broke. They just want to blatantly declare all old stuff "broken, when it is not" and go their merry way. Well, I wonder how they would react if we were on opposite sides of this same question about just declaring my tools broken when they are not. My response, and yours to this challenge is "Just Say NO!". I don't know why they (the "we want a new MTA Protocol" people) don't want to just do something entirely new that only works among consenting parties that implement their "something new", with firewalls around it to protect the non-users. For myself, having already seen what happens when we have gateways among three mail system realms (RFC822/X.400/PROFS), I don't want to promote this "3 body problem" as a solution, but I will not stand in the way of folk who are really convinced that they have a valuable idea for how to really do mail right (e.g., do the MTA protocol right, for the last time, of course). I just don't want to contribute to it, and I don't want it to damage my currently working installed base. And I don't want it to get in the way of the other work that we need to get done. This is what the problem is all about! 
Best...\Stef ------- End of Blind-Carbon-Copy From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 02:19:24 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA01641; Thu, 11 Apr 91 01:37:30 EDT Received: from CBROWN.CLAREMONT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA01637; Thu, 11 Apr 91 01:37:26 EDT Date: Wed, 10 Apr 1991 22:37 PDT From: "Ned Freed, Postmaster" Subject: Re: Why We Disagree So Much... A Thought To: DRB@relay.prime.com Cc: ietf-822@dimacs.rutgers.edu Message-Id: <0D05B4437E60092A@HMCVAX.CLAREMONT.EDU> X-Envelope-To: ietf-822@dimacs.rutgers.edu X-Vms-To: IN%"DRB@relay.prime.com" X-Vms-Cc: IN%"ietf-822@dimacs.rutgers.edu" Well, it is an interesting idea, but speaking for myself, I don't think I fit the categories you present. (1) I believe that RFC 821 and RFC 822 will be around for a very, very, very long time. (2) I believe that enhancements at both levels are warranted. Just because I'm against most of the SMTP enhancements proposed thus far doesn't mean there aren't enhancements to be made to SMTP. I haven't bothered to present what I think SMTP needs since it is unrelated to the issues we're debating at present. (3) I believe that if we come up with sound enhancements, they will be widely deployed and used. (4) I believe that just because we come up with enhancements, and they may see wide use, that many MTAs and UAs will be extremely slow to upgrade. Some will never upgrade, for all practical purposes. This necessitates backwards compatibility in order to insure operational integrity. (5) I also believe that X.400 will see increasing use, but it will be a long time before it overtakes 821 and 822 use. Given the fact that the most explosive e-mail growth today is in PC e-mail, and these systems are for the most part 822-derived, it may be that X.400 will _never_ overtake 822, certainly not within the useful life of what we're working on here. My opinions only. 
I offer them simply to show how they fit, or don't fit, your analysis. Ned From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 10:06:59 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09306; Thu, 11 Apr 91 08:53:08 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09302; Thu, 11 Apr 91 08:53:07 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.EDU; Thu, 11 Apr 91 08:52:08 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for David_Crocker@Pa.dec.com; Thu, 11 Apr 91 08:55:27 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 11 Apr 1991 08:55:22 -0400 (EDT) Message-Id: Date: Thu, 11 Apr 1991 08:55:22 -0400 (EDT) From: Nathaniel Borenstein To: John C Klensin , Einar Stefferud Subject: Re: SIGH! Re: text --> IA5 ? Cc: ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com, Dave Crocker In-Reply-To: <24290.671343055@ics.uci.edu> References: <24290.671343055@ics.uci.edu> Personally, I think we ARE converging on finality, but then I'm always an optimist. I'm nearly done with another pass through the draft RFC. This draft, as it turns out, is the first one that is "complete-looking" -- that is, if everyone agreed, it would need virtually no further work before publication. That doesn't, of course, mean that everyone will agree, but from where I sit it feels like enormous progress -- at least, for the first time, I'm happy with it, and it helps to have a document that the author is happy with. And, to answer the burning question you're all surely asking right now: the draft document calls the default content-type -- ta-ta -- "822ASCII". It strikes me as the label that best summarizes what we're trying to capture, i.e. "ASCII as specified by RFC 822." 
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 10:33:37 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09316; Thu, 11 Apr 91 08:56:38 EDT Received: from hydra.Helsinki.FI by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09312; Thu, 11 Apr 91 08:56:18 EDT Received: from poros (poros.Helsinki.FI) by hydra.Helsinki.FI (4.1/SMI-4.1/32) id AA01559; Thu, 11 Apr 91 15:55:56 +0300 Date: Thu, 11 Apr 91 15:55:56 +0300 From: kankkune@cs.helsinki.fi (Risto Kankkunen) Message-Id: <9104111255.AA01559@hydra.Helsinki.FI> In-Reply-To: Dave Crocker's message as of Apr 10, 21:32 X-Mailer: Mail User's Shell (7.2.0 10/31/90) To: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? > Hence, there is the sender's view, the 'byte' (or thereabouts) view of > the interpretable data, and finally the transmission convention. > > So, I might have > > 1. nroff > 2. USASCII > 3. Quoted Text One thing that should be considered, is the functionality of these headers. With the content-type/content-encoding pair the matter is quite simple: first the UA unpacks the body by feeding it through all the decoders mentioned in the content-encoding header. Then it chooses the right viewer for this part according to the content-type header. How does a content-charset or something like that fit into this? Is the charset passed as a parameter to the viewer? Maybe it would then be better to place the charset indicator as a parameter to the content type. Or maybe there is a different viewer for each pair, that pair could be used as the content-type name, like X-nroff-USASCII. There is probably some lookup table for the different contents and their viewers. 
Even if there is a single viewer for the different character sets, that could be handled with this table, if this latter style is used:
X-nroff /usr/bin/nroff -c ascii
X-nroff-USASCII /usr/bin/nroff -c ascii
X-nroff-Latin1 /usr/bin/nroff -c ISO-8859-1
Also, a separate character set header would be needed only for these kinds of special cases. Most text formats handle the charset issue themselves (like WP documents) and graphics or voice parts don't have this at all. So, I'd prefer a more general method instead of a specific header. I think we could manage with using X-nroff-USASCII style content names, or we could use the resource-ref part for this (e.g. Content-type: X-nroff;;me,Latin1). I don't know if the RFC822bis should decide this in general. Btw. RFC1049 allows one to leave out the ver-num part, if there are no resource-refs, but requires it when they are present. Is there a reason for this? -- Risto Kankkunen kankkune@cs.Helsinki.FI (Internet) Department of Computer Science kankkunen@finuh (Bitnet) University of Helsinki, Finland ..!mcsun!uhecs!kankkune (UUCP) From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 10:41:08 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09705; Thu, 11 Apr 91 09:10:48 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09684; Thu, 11 Apr 91 09:10:28 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for IETF-SMTP@dimacs.rutgers.edu; Thu, 11 Apr 91 09:10:26 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for IETF-822@dimacs.rutgers.edu; Thu, 11 Apr 91 09:13:44 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 11 Apr 1991 09:13:38 -0400 (EDT) Message-Id: Date: Thu, 11 Apr 1991 09:13:38 -0400 (EDT) From: Nathaniel Borenstein To: IETF SMTP list , IETF RFC-822 list Subject: Re: Why We Disagree So Much... 
A Thought In-Reply-To: <9104110220.AA04058@rutgers.edu> References: <9104110220.AA04058@rutgers.edu> Actually, David, it's too bad you weren't in St. Louis. My observation there is that the difference is not between people who expect 822/SMTP to be transient and those who expect it to last a long time. Rather, I was amused to discover, the difference seems to be between two groups that disagree about the right way to prolong and promote the longevity of 822/SMTP: -- The "conservative" camp believes that simplicity and stability is one of 822/SMTP's biggest advantages over X.400. If 822/SMTP were to become as complex as X.400, for example, it would lose most of its staying power. -- The "liberal" camp believes that upgrading 822/SMTP is the best way to stave off future desertions from the 822/SMTP camp. In other words, I think that many people on both sides of this debate hope & expect that 822/SMTP will be long-lived, and that the disagreements center on the right way to promote and guide that long life. -- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 11:10:15 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09422; Thu, 11 Apr 91 09:04:14 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09414; Thu, 11 Apr 91 09:04:07 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 11 Apr 91 09:04:05 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 11 Apr 91 09:07:24 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 11 Apr 1991 09:07:20 -0400 (EDT) Message-Id: Date: Thu, 11 Apr 1991 09:07:20 -0400 (EDT) From: Nathaniel Borenstein To: jwn2@qualcomm.com (John Noerenberg), Dave Crocker Subject: Re: text --> IA5 ? 
Cc: ietf-822@dimacs.rutgers.edu In-Reply-To: <9104110432.AA05217@palo-alto.pa.dec.com> References: <9104110432.AA05217@palo-alto.pa.dec.com> Excerpts from mail: 10-Apr-91 Re: text --> IA5 ? Dave Crocker@Pa.dec.com (828) > A body part has several levels of context, or interpretation. In > theory, the number of levels might get quite large, and our attempting > to handle the theoretical maximum, or the possibility of infinite > nesting, will bog us down. But just for the heck of it, let's try the > following: I've tried it on, and I don't think it fits. > 1. How I, the sender, want it interpreted; e.g., run the BP through a > program that is located in file foo/bar Like, maybe, a program that just knows how to display 8 bit text? 7 bit? > 2. BP consists of data generically characterized in a standard > category, and precisely defined by a simple citation Like, maybe, nroff or TeX sources, as cited by RFC 1049? > 3. Data are being sent according to a cited encoding standard. Like, maybe, "some kind of data" encoded in hex? The line between #1 and #2 is totally fuzzy. #3 is clearly a different concept, but doesn't stand alone -- you have to know what, precisely, you have encoded. #1 and #2 are both the kinds of things that Content-type, as defined by RFC 1049, was designed to handle. An important point: THAT MECHANISM WORKS VERY WELL. There are lots of different people using the Content-type header, and have been for several years now, and the ONLY problems people have had with it, to my knowledge, are the lack of a standard mechanism for multipart bodies and the lack of a standard for your case #3, encoding data that can't be passed in 7 bit mail bodies. These problems were, indisputably, the motivation for the current effort. The basic notion of "Content-type" is, however, not broken, so I don't think we should fix it. It seems to me that we're solving enough issues in this RFC without trying to change something that works fine as it is. 
I would strongly urge us not to tamper with the basic notion of Content-type, which is the direction I think I sense Dave heading. -- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 11:26:53 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10900; Thu, 11 Apr 91 09:41:57 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10892; Thu, 11 Apr 91 09:41:53 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa08850; 11 Apr 91 9:35 EDT Org: Corp. for National Research Initiatives Phone: (703) 620-8990 ; Fax: (703) 620-0913 To: ietf-822@dimacs.rutgers.edu Cc: gvaudre@nri.reston.va.us Subject: ietf-822 charter Date: Thu, 11 Apr 91 09:35:55 -0400 From: Greg Vaudreuil Message-Id: <9104110935.aa08850@NRI.NRI.Reston.VA.US> Internet Message Extensions (822ext) Charter Chair(s): Gregory Vaudreuil, gvaudre@nri.reston.va.us Mailing Lists: General Discussion: ietf-822@dimacs.rutgers.edu To Subscribe: ietf-822-request@dimacs.rutgers.edu Description of Working Group: This working group is chartered to extend the RFC 822 Message format to facilitate multi-media mail and alternate character sets. The group is expected to formulate a standard message format, roughly based on either RFC1154 or RFC 1049. The immediate goals of this group are to define a mechanism for the standard interchange and interoperation of international character sets. Goals and Milestones: Mar 1991 Review the charter, and refine the group's focus. Decide whether this is a worthwhile effort. Mar 1991 Discuss, debate, and choose a framework for the solution. Assign writing assignments, and identify issues to be resolved. Jul 1991 Review existing writing, resolve outstanding issues, identify new work, and work toward a complete document Nov 1991 Post a first Internet Draft. Dec 1991 Review and finalize the draft document. Jan 1991 Submit the document as a Proposed Standard. 
From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 12:23:43 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14891; Thu, 11 Apr 91 11:52:33 EDT Received: from enet-gw.pa.dec.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14867; Thu, 11 Apr 91 11:52:15 EDT Received: by enet-gw.pa.dec.com; id AA12644; Thu, 11 Apr 91 08:50:38 -0700 Received: by palo-alto.pa.dec.com; id AA14461; Thu, 11 Apr 91 08:50:21 PDT Message-Id: <9104111550.AA14461@palo-alto.pa.dec.com> To: Greg Vaudreuil Cc: ietf-822@dimacs.rutgers.edu Subject: Re: ietf-822 charter In-Reply-To: Your message of Thu, 11 Apr 91 09:35:55 -0400. <9104110935.aa08850@NRI.NRI.Reston.VA.US> Date: Thu, 11 Apr 91 08:50:18 PDT From: Dave Crocker Just a thought: What about shooting for an ID by July and attempting Proposed by September? Dave From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 13:19:22 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17990; Thu, 11 Apr 91 13:15:31 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17986; Thu, 11 Apr 91 13:15:27 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 11 Apr 91 13:15:23 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 11 Apr 91 13:18:42 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 11 Apr 1991 13:18:38 -0400 (EDT) Message-Id: Date: Thu, 11 Apr 1991 13:18:38 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: text --> IA5 ? In-Reply-To: <9104111255.AA01559@hydra.Helsinki.FI> References: <9104111255.AA01559@hydra.Helsinki.FI> Excerpts from internet.ietf-822: 11-Apr-91 Re: text --> IA5 ? Risto Kankkunen@cs.helsi (2135) > Btw. RFC1049 allows one to leave out the ver-num part, if there are no > resource-refs, but requires it when they are present. 
Is there a reason > for this? I think that's just a syntactic oddity. I've seen people use "null", which is legal, or just leave the version number blank, which is not really legal since it has to be a local-part, which (by my reading) requires at least one non-whitespace character. I suppose we could consider changing the definition to allow it to be blank, e.g. local-part / "" Is there a strong reason for wanting to make this change? From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 14:49:24 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20754; Thu, 11 Apr 91 14:42:09 EDT Received: from ics.uci.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20750; Thu, 11 Apr 91 14:42:03 EDT Received: from ics.uci.edu by ICS.UCI.EDU id aa01753; 11 Apr 91 7:58 PDT To: Nathaniel Borenstein Cc: John C Klensin , ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com, Dave Crocker Subject: Re: SIGH! Re: text --> IA5 ? In-Reply-To: Thu, 11 Apr 91 08:55:22 EDT. Cc: Einar Stefferud Date: Thu, 11 Apr 91 07:58:27 -0700 Message-Id: <1751.671381907@ics.uci.edu> From: Einar Stefferud I expect that we might want to change that to be RFC822bisASCII, when we know what the new RFC number is. Question: Do we or do we not want to cite a 1968 obsolete SI X3.4 standard in a 1991 RFC? Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 15:28:13 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21644; Thu, 11 Apr 91 15:19:02 EDT Received: from qualcom.qualcomm.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21640; Thu, 11 Apr 91 15:18:57 EDT Received: from [129.46.4.152] with SMTP by QUALCOMM.COM (5.64+/QUALCOMM/V1.0) id AA09217 for ietf-822@dimacs.rutgers.EDU; Thu, 11 Apr 91 12:18:30 -0700 Date: Thu, 11 Apr 91 12:18:30 -0700 Message-Id: <9104111918.AA09217@QUALCOMM.COM> To: Nathaniel Borenstein , Einar Stefferud From: jwn2@qualcom.qualcomm.com Subject: Re: SIGH! Re: text --> IA5 ? 
Cc: John C Klensin , ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com, Dave Crocker Question: Do we or do we not want to >cite a 1968 obsolete SI X3.4 standard in a 1991 RFC? > I thought about that. If the symbol map for the left half of the table (the first 128 codes) match between the 68 revision and the current revision, then the current revision is probably the right one to cite. I'm still trying to scare up a copy of the current revision. Btw, yesterday while pondering this whole business, I went looking for all the different ascii charts I could find. To my horror, I discovered that Microsoft had inserted some glyph of their own choosing in the DEL position (7/f), explaining that DEL and BS are the same under DOS. Now that I think about it, Microsoft invented glyphs for all of the (traditionally) non-printable codes for DOS. All this proves, of course, is setting a standard is one thing. _Adhering_ to it is quite another. I think there's a lesson here... From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 15:49:22 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21695; Thu, 11 Apr 91 15:20:34 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21689; Thu, 11 Apr 91 15:20:32 EDT Date: Thu 11 Apr 91 15:20:15-EDT From: John C Klensin Subject: Re: SIGH! Re: text --> IA5 ? To: stef@ics.uci.edu Cc: nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com, David_Crocker@pa.dec.com Message-Id: <671397615.890521.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: <1751.671381907@ics.uci.edu> Mail-System-Version: >I expect that we might want to change that to be RFC822bisASCII, when we >know what the new RFC number is. Question: Do we or do we not want to >cite a 1968 obsolete SI X3.4 standard in a 1991 RFC? 
If one wants to cite the old one, and be perfectly precise, and expect that RFC-XXXX is going to supplement 822 and not replace it, then the easiest thing to do is to cite "ASCII as defined in RFC822, reference [n]" and be done with it. The second question is more interesting, but I've got nothing else to contribute to it. Stef, before serious confusion sets in, I'd suggest dropping the "RFC822bis" terminology and using either Nathaniel's RFC-XXXX or something else. The reason is that there is *already* something that might be construed as RFC822bis and is an Internet Standard: RFC822 read through the filters and interpretations of RFC1123. Moreover, I infer from Dave's comments/questions of a week ago about weak places in 822 itself that he is contemplating a revision and tuning of 822 that, presumably, would supersede it, thereby creating, perhaps, 822bis-bis. Finally, in deference to those who are thoroughly sick of this discussion, and to Dave's earlier comment (which I construed as "let's make sure we agree on the conceptual problems, then quibble"), let's take this off the list. I, for one, would like to read what Nathaniel comes up with in RFC form and then comment/quibble, possibly offline relative to the list. In the interim, I'd categorize myself as part of the "thoroughly sick of this discussion" group. --john ------- From owner-ietf-822@dimacs.rutgers.edu Thu Apr 11 16:49:23 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25369; Thu, 11 Apr 91 16:33:24 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25364; Thu, 11 Apr 91 16:33:15 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa01343; 11 Apr 91 16:26 EDT To: John C Klensin Cc: stef@ics.uci.edu, nsb@thumper.bellcore.com, ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com, David_Crocker@pa.dec.com, gvaudre@nri.reston.va.us Subject: Re: SIGH! Re: text --> IA5 ? In-Reply-To: Your message of "Thu, 11 Apr 91 15:20:15 EDT." 
<671397615.890521.KLENSIN@INFOODS.MIT.EDU> Date: Thu, 11 Apr 91 16:26:35 -0400 From: Greg Vaudreuil Message-Id: <9104111626.aa01343@NRI.NRI.Reston.VA.US> Let's take the opportunity to define the character set we mean when we say USasciiblitherblap. It is not unreasonable to include a chart of accepted characters and their character values. In an "ascii" rfc, that should be possible :-) Greg V. From owner-ietf-822@dimacs.rutgers.edu Fri Apr 12 09:19:27 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20423; Fri, 12 Apr 91 09:04:36 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20419; Fri, 12 Apr 91 09:04:34 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 12 Apr 91 09:02:12 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for David_Crocker@pa.dec.com; Fri, 12 Apr 91 09:05:33 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 12 Apr 1991 09:05:28 -0400 (EDT) Message-Id: Date: Fri, 12 Apr 1991 09:05:28 -0400 (EDT) From: Nathaniel Borenstein To: John C Klensin , Greg Vaudreuil Subject: Re: SIGH! Re: text --> IA5 ? Cc: stef@ics.uci.edu, ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com, David_Crocker@pa.dec.com, gvaudre@nri.reston.va.us In-Reply-To: <9104111626.aa01343@NRI.NRI.Reston.VA.US> References: <9104111626.aa01343@NRI.NRI.Reston.VA.US> I used to think I knew what ASCII was. I don't any more. I'd be happy to include a reference table, but I am not going to write it. Do any of you folks who actually know what you're talking about want to provide me with such a table? Barring that, my first pass would be "insert-file /usr/pub/ascii" -- something tells me that this would not be universally acceptable.... 
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Mon Apr 15 09:26:51 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20730; Mon, 15 Apr 91 09:14:57 EDT Received: from e.ms.uky.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20726; Mon, 15 Apr 91 09:14:53 EDT Received: from s.ms.uky.edu by g.ms.uky.edu id ab19625; 15 Apr 91 9:08 EDT Date: Mon, 15 Apr 91 9:07:33 EDT From: Daniel Chaney To: ietf-822@dimacs.rutgers.edu Subject: Please add me Message-Id: <9104150907.aa15134@s.s.ms.uky.edu> Thank you. -dan -- -- Daniel Chaney -- -- postmaster, newsguy, main archiver for ms.uky.edu (Univ of KY Math Sci) -- -- {uunet and the like}!ukma!chaney chaney@ms.uky.edu chaney@ukma.BITNET -- -- "I'll have time enough for sleep when I'm dead and in the ground" -- From owner-ietf-822@dimacs.rutgers.edu Thu Apr 18 15:57:25 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07748; Thu, 18 Apr 91 15:46:00 EDT Received: from dkuug.dk by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07736; Thu, 18 Apr 91 15:45:52 EDT Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8) id AA07166; Thu, 18 Apr 91 21:44:17 +0200 Date: Thu, 18 Apr 91 21:44:17 +0200 From: Keld J|rn Simonsen Message-Id: <9104181944.AA07166@dkuug.dk> To: gvaudre@nri.reston.va.us, nsb@thumper.bellcore.com Subject: Re: SIGH! Re: text --> IA5 ? Cc: David_Crocker@pa.dec.com, ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com, stef@ics.uci.edu X-Charset: ASCII X-Char-Esc: 29 About the name of ASCII: A Project Team of the European Workshop for Open Systems (EWOS PT 001, paid by the Commission of the European Communities) has produced a report on character sets. This was intended for use in "Open Systems" like OSI and POSIX, and they also addressed the issue of good unique naming of character sets. Their recommendation was to use the registration number of the ISO 2375 registry administered by ECMA. 
A character set could then be referenced as "ISO IR xxx" meaning ISO International Registration number xxx. The registration numbers of ISO 2375 are actually also the way character sets are referenced in OSI standards and profiles, and in European character set profile standards such as the ENV 41 5xx series made by CEN/CENELEC (the Joint European Standards Institution). This information is also included in the character set work that I have distributed earlier to this list. ASCII (the 7-bit critter we know so well) has registration number 6. My naming takes the effort to make the name a token, so I call it "ISO-IR-6". Just a suggestion to make the new RFC in line with much other work in the communications world... Keld From owner-ietf-822@dimacs.rutgers.edu Thu Apr 18 19:57:25 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16293; Thu, 18 Apr 91 19:16:05 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16285; Thu, 18 Apr 91 19:16:02 EDT Date: Thu 18 Apr 91 19:15:35-EDT From: John C Klensin Subject: Spelling "ASCII" in ECMA or ISO To: ietf-822@dimacs.rutgers.edu Message-Id: <672016535.912671.KLENSIN@INFOODS.MIT.EDU> Mail-System-Version: Relative to using the registration number strategy, I'm hesitant for two reasons. The second may be ok, but someone should check, very carefully. (1) This is really an ANSI Standard, whose first version is of hoary vintage, and should be referenced as such. Given its special status in 822 and 821 (and, for that matter, in Telnet), I think it would be better to see the system use content identifiers that reflect "ASCII" in some form, even if every other content identifier that refers to a character set uses the ISO-RN-nn form (which I basically like, subject to the qualification below). (2) I don't have my ECMA registration file in front of me, and it isn't complete anyway, but my recollection is that registration tables for 94 character sets are registrations for GL only. 
That is, they have little grey areas in columns 0 and 1 and positions 2/0 and maybe 7/15. "ASCII" (as in X3.4) is a complete table, reflecting columns 0 through 7 (C0 as well as something suitable for mapping to GL). If so, you can't say "registration NNN" and then talk about CR or LF, because that registration does not define such creatures. I applaud the trend toward making everything international and elegant, but it is lots safer to apply it to "new" things than to try to retroactively apply it to things that have been in use--and well-defined-- for a long time. That said, if one followed and extended the referencing model of RFC821 and RFC822 and said, e.g., American National Standard... X3.4-1976,... and then went on to say... "for all practical intents and purposes, ISO646 (name, date) International Reference Version; ISO Registration nn mapped to GL with ISO Registration mm mapped to C0; columns 0 through 7 of ISO8859-1,2,3,...; plane 0 (i.e., 32-bit values in the range 032 000 through 032 127) of DIS10646; and UNICODE characters in columns 000 through 007; are all equivalent to this Standard. However, in the event of any ambiguity, the definitions in X3.4 apply." ... I think it would be a useful public service for readers for whom ANSI X3.4 may be the least accessible of these authoritative documents. 
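The equivalence claimed above can be spot-checked mechanically. The following is a minimal Python sketch, not part of the original message: it compares the codec tables shipped with a present-day Python, which is an assumption about how faithfully those tables mirror the standards, and it says nothing about differences between revisions of X3.4 or about regionally varying glyphs. The function names and the list of codecs compared are illustrative choices.

```python
# Sketch only: compares modern codec tables, not the texts of the standards.
SEVEN_BIT = bytes(range(128))  # columns 0 through 7, i.e. code positions 0..127

# Codec names are Python's; which codecs to compare is an illustrative choice.
CANDIDATES = ["ascii", "iso8859-1", "iso8859-2", "iso8859-3", "utf-8"]


def decode_all(data: bytes) -> dict:
    """Decode the same bytes under every candidate codec."""
    return {name: data.decode(name) for name in CANDIDATES}


def agree_on_seven_bit() -> bool:
    """True if every candidate maps bytes 0..127 to the same code points."""
    reference = "".join(chr(i) for i in range(128))  # Unicode code points 0..127
    return all(text == reference for text in decode_all(SEVEN_BIT).values())


if __name__ == "__main__":
    print(agree_on_seven_bit())
```

A check like this only covers code positions; the proposed fallback clause ("in the event of any ambiguity, the definitions in X3.4 apply") would still be needed for anything the tables do not capture.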
--john ------- From owner-ietf-822@dimacs.rutgers.edu Thu Apr 18 21:57:29 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21115; Thu, 18 Apr 91 21:40:52 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21098; Thu, 18 Apr 91 21:40:38 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10084; Thu, 18 Apr 91 21:40:16 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ab18971; 18 Apr 91 17:36 PST Received: from odin.nma.com by nma.com id aa12515; 18 Apr 91 17:04 PDT To: Keld J|rn Simonsen Cc: gvaudre@nri.reston.va.us, nsb@thumper.bellcore.com, David_Crocker@pa.dec.com, ietf-822@dimacs.rutgers.edu, jwn2@qualcomm.com Subject: Re: SIGH! Re: text --> IA5 ? In-Reply-To: Your message of Thu, 18 Apr 91 21:44:17 +0100. <9104181944.AA07166@dkuug.dk> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Thu, 18 Apr 91 18:01:34 MDT Message-Id: <16386.672022894@nma.com> Sender: stef@nma.com Thanks Keld -- This is the kind of unambiguous token I was looking for, exactly! > ASCII (the 7-bit critter we know so well) has registration number 6. > My naming takes the effort to make the name a token, so I call > it "ISO-IR-6". Just a suggestion to make the new RFC in line with > much other work in the communications world... I suggest that we use it, after we make sure that "ISO-IR-6" means exactly what we mean in RFC-XXXX. I also suggest that we consider using the other tokens to identify other character sets or encodings, or whatever it is that we call this stuff. 
Cheers...\Stef From owner-ietf-822@dimacs.rutgers.edu Thu Apr 18 23:57:27 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA24187; Thu, 18 Apr 91 23:26:43 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA24178; Thu, 18 Apr 91 23:26:38 EDT Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA01619; Thu, 18 Apr 91 20:26:34 PDT Received: from polya.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA18375; Thu, 18 Apr 91 20:26:26 PDT Received: by polya.Eng.Sun.COM (4.1/SMI-4.0-MHS-6.0) id AA20160; Thu, 18 Apr 91 20:26:19 PDT Date: Thu, 18 Apr 91 20:26:19 PDT From: Peter.Vanderbilt@eng.sun.com (Peter Vanderbilt) Message-Id: <9104190326.AA20160@polya.Eng.Sun.COM> To: keld@dkuug.dk Subject: Re: SIGH! Re: text --> IA5 ? Cc: ietf-822@dimacs.rutgers.edu > About the name of ASCII: [...] Their recommendation was to use > the registration number of the ISO 2375 registry administered by ECMA. > A character set could then be referenced as "ISO IR xxx" meaning ISO > International Registration number xxx. > ASCII (the 7-bit critter we know so well) has registration number 6. > My naming takes the effort to make the name a token, so I call > it "ISO-IR-6". It doesn't really matter what token we use as long as we agree what it means. But two warnings about "ISO-IR-6": First, IA5, what X.400 uses, is based on ISO-IR-2. Second: As I understand it, each registration numbers refers to only a part of a typical character set. In particular, registration numbers refer to coded character sets of size 94 or 96 or, for multibyte character sets, 94^n or 96^n. Typical character sets are larger, like 94+96 (for 8-bit) or 94+94^2+96. For example, ISO 8859/1 (Latin-1) has characters from registrations 6 and 100 -- the 6 refers to ASCII and the 100 to the right hand part of 8859/1. There are additional registration numbers for the control characters. Calling 8859/1 "ISO-IR-100" would be inexact, at best. 
Also I believe a commonly used character set in Japan uses ISO 2022 with characters from 14 in G0, 81 in G1 and 13 in G2. What would you use for this? Even ASCII is really composed of 6 with control characters (probably registration #1). To name the existing character set used by 822 systems, I prefer something simple like "ASCII" or "US-ASCII". The differences between versions and vintages are fairly minor and are probably not adhered to by real systems anyway. Does everybody reading this mail see "$@[]\^`{}|~" as dollar sign, at sign, square brackets, back slash, caret, back quote, curly brackets, vertical bar and tilde (hope I've got the names right!)? For the future we should nail down the character set as exactly as possible, including whether regionally varying renditions are allowed. Pete From owner-ietf-822@dimacs.rutgers.edu Fri Apr 19 02:27:30 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28158; Fri, 19 Apr 91 02:07:49 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28154; Fri, 19 Apr 91 02:07:45 EDT Date: Fri 19 Apr 91 02:02:50-EDT From: John C Klensin Subject: Re: SIGH! Re: text --> IA5 ? To: Peter.Vanderbilt@eng.sun.com Cc: keld@dkuug.dk, ietf-822@dimacs.rutgers.edu Message-Id: <672040970.24671.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: <9104190326.AA20160@polya.Eng.Sun.COM> Mail-System-Version: Peter Vanderbilt writes, in part... >First, IA5, what X.400 uses, is based on ISO-IR-2. "Based on"? No. No more than ASCII is "based on" 6. They are, by no coincidence, identical in the graphics characters but certainly ASCII, and, if I recall, IA5, are "character set first, registration of graphical repertoire later". >Second: As I understand it, each registration numbers refers to only a >part of a typical character set. In general, for what we mean by "character set", one needs at least two--one for graphics and one for controls. For many purposes, of course, one does not care about the controls. 
But, in a document that references things like CR and LF, one must be careful. >For example, ISO 8859/1 (Latin-1) has characters from registrations 6 >and 100 -- the 6 refers to ASCII and the 100 to the right hand part of >8859/1. There are additional registration numbers for the control >characters. Calling 8859/1 "ISO-IR-100" would be inexact, at best. Yes. And for those who believe that "ASCII" is the only source of terrible confusion around here, note ISO8859-1 (both graphic sets and both control sets) is called "Latin-1" and that registration 100 is called--you guessed it--"Latin-1". >The differences between >versions and vintages are fairly minor and are probably not adhered to >by real systems anyway. Before our European colleagues wake up and have to generate flames early in the morning... There is an ISO Standard, 646 (note low number), which started with ASCII as a departure point. Traditionally, 646 has specified two "versions". One of those, the "international reference version" is identical to ASCII with the substitution of "universal currency symbol" for "dollar sign". The other, however, is something called the "basic version". It reserves about a half-dozen character positions that ASCII uses for special characters for "national use" characters, leading to roughly one national variation per country. And "real systems" pay attention: if nothing else, these national characters show up on keyboard keytops, printers, and usually screens. >Does everybody reading this mail see >"$@[]\^`{}|~" as dollar sign, at sign, square brackets, back slash, >caret, back quote, curly brackets, vertical bar and tilde (hope I've >got the names right!)? In a word, no. Letters with umlauts, and cedillas, and grave and acute accents, and slashes, and circles over letters, and question marks with the little curvy part at the bottom and the dot at the top, and... 
> For the future we should nail down the >character set as exactly as possible, including whether regionally >varying renditions are allowed. Yeah. And we need to be clear about whether we are specifying (nailing down) graphic coding only (e.g., ISO-RN-6) or both graphics and controls (e.g., ASCII). Sorry, Stef, it really isn't going to be easy :-) --john ------- From owner-ietf-822@dimacs.rutgers.edu Fri Apr 19 10:48:46 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07697; Fri, 19 Apr 91 09:28:33 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07693; Fri, 19 Apr 91 09:28:31 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 19 Apr 91 09:28:28 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 19 Apr 91 09:31:55 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 19 Apr 1991 09:31:51 -0400 (EDT) Message-Id: Date: Fri, 19 Apr 1991 09:31:51 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: Spelling "ASCII" in ECMA or ISO In-Reply-To: <672016535.912671.KLENSIN@INFOODS.MIT.EDU> References: <672016535.912671.KLENSIN@INFOODS.MIT.EDU> I think that John's probably right. My take on it is that we should allow all the ISO character sets to be specified in the form ISO-IR-xxx, but that the "default" content-type should be handled specially, because it probably isn't, from all I've heard, exactly the same as ISO-IR-6. We can't change the default RFC 822 body part retroactively. 
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Fri Apr 19 11:42:36 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09684; Fri, 19 Apr 91 10:26:59 EDT Received: from uvaarpa.Virginia.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09680; Fri, 19 Apr 91 10:26:56 EDT Received: from uvacs.cs.Virginia.EDU by uvaarpa.Virginia.EDU id aa26539; 19 Apr 91 10:26 EDT Received: from wilbury.cs.Virginia.EDU by uvacs.cs.Virginia.EDU (4.1/5.1.UVA) id AA13515; Fri, 19 Apr 91 10:26:04 EDT Posted-Date: Fri, 19 Apr 91 10:27:23 EDT Return-Path: Received: by wilbury.cs.Virginia.EDU (4.1/SMI-2.0) id AA18763; Fri, 19 Apr 91 10:27:23 EDT Date: Fri, 19 Apr 91 10:27:23 EDT From: rja7m@wilbury.cs.virginia.edu Message-Id: <9104191427.AA18763@wilbury.cs.Virginia.EDU> X-Mailer: Mail User's Shell (7.2.0 10/31/90) To: ietf-822@dimacs.rutgers.edu Subject: Character set identification I think that disallowing the use of the term "ASCII" is a mistake because it is clearly understood by more people than any other term. Allowing the ECMA registrations as permitted extensions does seem reasonable, but frankly the use of these 7-bit variants is fading quickly and I'm a lot more concerned with proper handling of the 8-bit standards that all are moving towards and that will be with us a long time in the future (namely the ISO 8859/x series). The ECMA registration is "international" in the European sense but is not widely used outside of Europe. I think that omitting "ASCII" in favor of the ECMA name is a mistake. Allowing both seems a reasonable compromise. The previous note about deficiencies in ISO 646 should be given careful thought with regard to this effort. 
Randall Atkinson randall@Virginia.EDU From owner-ietf-822@dimacs.rutgers.edu Sat Apr 20 14:09:53 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14450; Sat, 20 Apr 91 13:58:14 EDT Received: from dkuug.dk by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14429; Sat, 20 Apr 91 13:57:54 EDT Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8) id AA27941; Sat, 20 Apr 91 19:57:24 +0200 Date: Sat, 20 Apr 91 19:57:24 +0200 From: Keld J|rn Simonsen Message-Id: <9104201757.AA27941@dkuug.dk> To: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com Subject: Re: Spelling "ASCII" in ECMA or ISO X-Charset: ASCII X-Char-Esc: 29 > I think that John's probably right. My take on it is that we should > allow all the ISO character sets to be specified in the form ISO-IR-xxx, > but that the "default" content-type should be handled specially, because > it probably isn't, from all I've heard, exactly the same as ISO-IR-6. > We can't change the default RFC 822 body part retroactively. -- ISO-IR-6 is actually the same as ASCII, except that the control codes are not defined in ISO-IR-6. I would go with allowing them both, the ASCII name is especially convenient if the user is going to type this by hand, as ASCII is very well known, and people may have problems in remembering the relevant ISO registration number. Keld From owner-ietf-822@dimacs.rutgers.edu Sat Apr 20 16:09:57 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17168; Sat, 20 Apr 91 16:07:41 EDT Received: from TWG.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17164; Sat, 20 Apr 91 16:07:30 EDT Received: from Obelix.twg.com by twg.com with SMTP ; Sat, 20 Apr 91 13:05:40 PST Received: from obelix.twg.com by Obelix.TWG.COM id aa27800; 20 Apr 91 13:05 PDT To: John C Klensin Cc: ietf-822@dimacs.rutgers.edu Subject: Re: RFC-87gtwys (was: smtp charter (revised) ) In-Reply-To: Your message of Fri, 19 Apr 91 15:03:55 -0400. 
<672087835.816671.KLENSIN@INFOODS.MIT.EDU> Date: Sat, 20 Apr 91 13:05:11 -0700 From: David Herron Message-Id: <9104201305.aa27800@Obelix.TWG.COM> [Message redirected to ietf-822 because it's not talking about SMTP++] > >>Basically some intermediate node must try > >> to translate a message without knowledge of either the sender > >> or the receiver. > > > >Assumably the translations will be well enough defined that it > >can be performed "on the fly" by software. > You mean like from Chinese characters into English? Or from music and > pictures into ASCII? Or from an ISO ("ECMA") registration recorded last > Monday into Latin-1? :-( No, doing so requires semantic knowledge. (I have a moderately strong background in Linguistics, and know that language translation is a Hard Problem). The kind of translations which can be done "on the fly" are encoding funny bytes into printable ascii (eg UUENCODE, TEX-HEX, etc) or (perhaps, if it is known) translating one picture (or sound) format into another. The last is something to be nervous over, especially over "conversions with loss" (As X.400 puts it). Fortunately there are a couple of headers available in X.400 (and codified into RFC parlance in Steve Kille's string of RFCs ending in 1148) Conversion: (allowed|prohibited) Conversion-With-Loss: (allowed|prohibited) Assuming that each piece of a message identifies what it is then the sorts of conversions I'm talking about are doable. What I expect is that as a message leaves an Enclave (as Stef described) that anything funny about the body be translated into uuencode (or something). Some sorts of markers need to be placed in the message describing what each piece is, and the encoding be used. etc. 
> I'd be delighted to provide an MHS > with limit values on how much transformation I'm willing to let it do > (if you can define that scale), but the default ought to be that, if the > message can't be delivered as sent, it gets returned, not opened and > rewritten by some hidden daemon. Are the Conversion: header lines enough for you? Suppose you had a bitmap of some Known and Defined format in your message and the destination was delivering messages to a fax machine. (Because the person is away from the office, and messages are going to the hotel fax where s/he is staying). Translating text into a FAX-bitmap is pretty easy, supporting many character sets makes it a little bit harder. Translating the bitmap is doable, but since G3-FAX is "only" 100x200 dpi then that translation might entail some loss in quality. David From owner-ietf-822@dimacs.rutgers.edu Sat Apr 20 20:09:59 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21957; Sat, 20 Apr 91 19:50:56 EDT Received: from INFOODS.MIT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21953; Sat, 20 Apr 91 19:50:53 EDT Date: Sat 20 Apr 91 19:49:35-EDT From: John C Klensin Subject: Re: RFC-87gtwys (was: smtp charter (revised) ) To: david@twg.com Cc: ietf-822@dimacs.rutgers.edu Message-Id: <672191375.700671.KLENSIN@INFOODS.MIT.EDU> In-Reply-To: <9104201305.aa27800@Obelix.TWG.COM> Mail-System-Version: >Are the Conversion: header lines enough for you? No. The binary (with/without) granularity isn't fine enough. And what I think I learned back when I took my last course in communications and information theory over a quarter century ago, "without loss" is a really interesting philosophical concept, but impractical in practice: something about the assurance of shared semantics... :-) >Suppose you had a >bitmap of some Known and Defined format in your message and the >destination was delivering messages to a fax machine. >... This is actually a perfect example. 
In the general case, I don't know about the delivery device. If I start with a character stream message, one can talk about probable percentage distortion loss models which are, in general, pretty complicated--in part because, if the delivery translator prints VERY LARGE on the fax machine, 100x200 may be plenty, but, if it decides to print in 6 or 8 point, there may be big trouble. Moreover, in deciding the minimum acceptable size to print, a system that "knows" it is transmitting only upper-case ASCII can print lots smaller than the same system printing from a repertoire of a few thousand Kanji with equivalent levels of information loss (discuss this problem with almost any producer of OCR software). Even if I start with a bitmapped image at, say, 400x400 dpi and a simple graphic case, I may want/need to say "sampling and smoothing this down to 200x200 is ok, but don't go below that, and if you are going to sample and not smooth, the limit is 300x300". Those types of statements are reasonable, but a bit more complex than "allowed/prohibited". That said, this is, again, an argument for keeping the MTAs out of the business, not an argument for giving up entirely. If the dialogue between the final delivery MTA and the receiver UA can logically contain "hey, I've got this 400x400 dpi image for you, what would you like to do with it?", then answers like "wait until I find a high-resolution workstation", "go ahead and display it on this fuzzy fax and I'd see if I can read it, but save the 400x400 form in case I need to make a better plan", or "print it on the fax machine, but do every page as four so none of the dots get lost" all become options. If the sending UA (or the user driving it) wants to include headers that say "hey recipient, don't even think about reading this until you can do so at 400x400 or above", I think it is a dandy idea to have a good mechanism for passing that along, but I don't want any intermediate MTA making decisions on that basis. 
--john ------- From owner-ietf-822@dimacs.rutgers.edu Sun Apr 21 01:39:54 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28887; Sun, 21 Apr 91 01:16:45 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28880; Sun, 21 Apr 91 01:16:39 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21218; Sun, 21 Apr 91 01:16:26 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ad27531; 20 Apr 91 21:15 PST Received: from odin.nma.com by nma.com id aa15179; 20 Apr 91 20:51 PDT To: Keld J|rn Simonsen Cc: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com Subject: Re: Spelling "ASCII" in ECMA or ISO In-Reply-To: Your message of Sat, 20 Apr 91 19:57:24 +0100. <9104201757.AA27941@dkuug.dk> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Sat, 20 Apr 91 21:48:57 MDT Message-Id: <1267.672209337@nma.com> Sender: stef@nma.com Well, we seem to have come back around full circle. I hope that what we define the default identifier to be, for what we all know and love as RFC822 ASCII, will be totally unambiguous to the casual user. With 8859-1 coming to be commonly known as 8-bit ASCII, I would hope that we make sure that casual users do not think that ASCII means 8-bit ASCII! So, I think we are back to USASCII, or maybe RFC-XXXX-ASCII, just to be really clear about exactly what the default really is, and then define it with tables included in an RFC-XXXX ANNEX to support resolution of bar bets and other pointless arguments about what it is supposed to mean. 
Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Sun Apr 21 23:10:02 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22533; Sun, 21 Apr 91 22:35:49 EDT Received: from manta.mel.dit.CSIRO.AU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22526; Sun, 21 Apr 91 22:35:41 EDT Received: from manta.mel.dit.CSIRO.AU by manta.mel.dit.csiro.au with SMTP id AA09999 (5.65b/IDA-1.4.3/DIT-1.2 for ietf-822@dimacs.rutgers.edu); Mon, 22 Apr 91 12:34:55 +1000 Message-Id: <9104220234.AA09999@manta.mel.dit.csiro.au> To: erik@sra.co.jp Cc: ietf-822@dimacs.rutgers.edu Subject: Re: Strawman proposal In-Reply-To: Your message of "Mon, 22 Apr 91 10:35:03 +0900." <9104220135.AA08456@sran8.sra.co.jp> Date: Mon, 22 Apr 91 12:34:54 +1000 From: Bob Well this is definitely an 822 msg so I've changed the list. >After writing my previous messages, I realized that TEXT-HEX would not >work very well for Japanese. The Japanese use a form of ISO 2022 that >uses only 7 bits, but each Japanese character uses 2 of the 94 >printable ASCII characters, so changing & to && or &26 would actually >change 2 or more characters to something else (garbage). In many >cases, the user would be able to see that something happened to the >message, but sometimes the change would be subtle. In any case, this >is not very good for the Japanese users. If you use 7 bits you are probably RFC821-clean already. So I'm not sure why you'd need an escape. If you do then define a specific pair of characters to be the escape instead of a single character. The key point is that each Content-type should have its own set of Content-encodings. So you don't have to use a single character for an escape character in an encoding if that won't work well. All I ask is that you put Content-type: Japanese Content-encoding: ISO2022-x in the header of the message so that my mail UA knows how to display the message (even if I can't then read it!). 
> If it is rejected, we send the 8-bit data using DATA Definitely not acceptable -- but I'm not sure why you want to send 8 bit data anyway. Bob Smart From owner-ietf-822@dimacs.rutgers.edu Mon Apr 22 14:10:05 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12653; Mon, 22 Apr 91 13:18:56 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12649; Mon, 22 Apr 91 13:18:53 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 22 Apr 91 13:18:49 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 22 Apr 91 13:22:17 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Mon, 22 Apr 1991 13:22:13 -0400 (EDT) Message-Id: <4c4lj560M2YtQKUHpL@thumper.bellcore.com> Date: Mon, 22 Apr 1991 13:22:13 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: A Kinder, Gentler New Draft Ned and I have completed work on a major revision of the draft RFC. The result is an RFC that is much more polished and much closer to being "finished" (whatever that means). In my next two messages, I will be posting it to the list. I will post it in two forms, a plain-text version and a PostScript version. I encourage you to print it on paper, as the PostScript version is eminently more readable. In this draft, we have taken the attitude of pretending, at least, to give definitive answers to all questions. Gone is the hemming and hawing of previous versions. In its place is a document that pretends, at least, to know what it is talking about. In some cases, it probably doesn't. Please don't interpret the firmer tone of this document as an implication that anything is set in stone; it is intended only to reflect our hope that most of the document, at least, is firmer than it used to be. 
Finally, I'd like to suggest a convention for structuring our comments about this document. Instead of just sending all your mail as "Re: The New Draft" or something like that, how about trying to put the section number into your Subject header? Thus, a comment on section 3.1 might have Subject: 3.1: Style #2 stinks or Subject: Appendix A: You must be kidding This might make it easier for casual readers of this list to focus on topics of interest to them. If you have comments on several sections, and it isn't too much trouble, you might think about breaking them up into several messages. (I'm inevitably going to read them all, so it really doesn't matter to me, but I know that some readers of this list have considerably more specialized interests, and this convention would be kinder to them.) I'd like to particularly draw peoples' attention to Section 2, which defines a LOT of new Content-types that have never been defined before. Some of these were defined with only minimal help from experts in the field, and could doubtless benefit enormously from further scrutiny. Also, please note that five of the references are not yet fleshed out. I think I'll be able to handle them, but didn't want to delay the draft just to complete a few references. The draft RFC will follow shortly. Fire away! 
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Mon Apr 22 14:39:54 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12765; Mon, 22 Apr 91 13:20:50 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12738; Mon, 22 Apr 91 13:20:35 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 22 Apr 91 13:20:29 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Mon, 22 Apr 91 13:23:57 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Mon, 22 Apr 1991 13:23:51 -0400 (EDT) Message-Id: Date: Mon, 22 Apr 1991 13:23:51 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: TEXT version of Draft RFC Network Working Group -- Request for Comments: XXXX A Multipart Content-Type and Content-Encoding Mechanism for RFC 822 Messages Nathaniel Borenstein, Bellcore Ned Freed, Innosoft April 1991 Status of This Memo This RFC suggests extensions to the RFC 822 message representation protocol to allow multi-part textual and non-textual messages to be represented and exchanged without loss of information. Discussion and suggestions for improvements are welcome. This memo does not specify an Internet standard. Distribution of this memo is unlimited. 
If this RFC becomes a standard, it would affect the following other RFC's: Would Obsolete: RFC 934, RFC 1049, RFC 1154 Would Update: RFC 822 Would Affect: RFC 1148 Table of Contents Introduction The Content-Type Header Field The Content-Encoding Header Field Quoted-Printable Content-Encoding Quoted-Printable Content-Encoding Base64 Content-Encoding The "Multipart" Content-Type A Complex Multipart Example The Encoded-Variable Header Field Cross-References Between Encapsulated Parts Optional Content-Size Header Field Summary Acknowledgements References Appendix A: The Character Set for the MAILASCII Content-Type 1 Introduction One of the limitations of RFC 821/822 based mail systems is the fact that they limit the contents of electronic mail messages to relatively short lines of seven-bit ASCII. This forces a user to convert any non-textual data that she may wish to send into a seven-bit ASCII representation before invoking her local mail UA (User Agent program). Examples of encodings currently used in the Internet include pure hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in RFC 1113, the Andrew Toolkit Representation [REF-ATK], and many others. This limitation becomes even more apparent as gateways are designed to allow for the exchange of mail messages between RFC 822 hosts and X.400 hosts. X.400 [REF-X400] specifies mechanisms for the inclusion of non-textual body parts within electronic mail messages. The current standards for the mapping of X.400 messages to RFC 822 messages specify that either X.400 non-textual body parts should be converted to (not encoded in) an ASCII format, or that they should be discarded, notifying the RFC 822 user that discarding has occurred. This is clearly undesirable, as information that a user may wish to receive is lost. 
Even though a user's UA may not have the capability of dealing with the non-textual body part, the user might have some mechanism external to the UA that can extract useful information from the body part. Moreover, it does not allow for the fact that the message may eventually be gatewayed back into an X.400 MHS, where the non-textual information would definitely become useful again. In devising an encapsulation scheme, two things must be considered: how to convert the non-textual data to a representation which may be transmitted over a seven-bit SMTP connection without loss of data, and how to preserve information about the structure of the data itself. This "structural" information must include, at a minimum, the type of data involved. This type information may be something recognized by many systems or it may be some type of data specific to a single operating system. This memo describes several mechanisms that combine to solve these problems. In particular, it describes an encapsulation mechanism that may be used to describe multiple part ("multipart") messages. The parts themselves may contain textual or nontextual data; non-textual data is encoded in a form that can survive mailers unaware of this specification. This memo also defines two RFC 822 header fields to be used to indicate the inclusion of non-textual information in a mail message: Content-Type and Content-Encoding. Additionally, this memo proposes an Encoded-Variable header field for including non-textual or international text information in certain parts of the message header area. Finally, this memo defines an optional header field, Content-Size, which may be used within multipart messages. 2 The Content-Type Header Field The Content-Type header field was previously defined in RFC 1049, and is reaffirmed here. The remainder of this section is derived from RFC 1049, and, where different, is intended to supersede it. The Content-type: header field consists of up to four parameter values. 
The first, or type parameter names the type, format, or structuring technique; the second, optional, parameter is a version number, ver-num, which indicates a particular version or revision of the standardized format. The third parameter is a resource reference, resource-ref, which may indicate a standard database of information to be used in interpreting the information. The last parameter is a comment. In the Extended BNF notation of RFC-822, we have:

    Content-Type:= type [";" ver-num [";" 1#resource-ref]] [comment]

    ver-num:= local-part

    resource-ref:= local-part

    type := "POSTSCRIPT" / "SCRIBE" / "SGML" / "TeX" / "TROFF" /
            "DVI" / "ODA" / "MULTIPART" / "MAILASCII" /
            iso-charset-type / "U-LAW" / "A-LAW" /
            "PBM" / "PGM" / "PPM" / "DES-MESSAGE" /
            x400-type / x400-1984-type / x400-1988-type /
            "X-"atom

    iso-charset-type := "ISO-IR-" 1*DIGIT

    x400-type := "IA5-Text" /            ; [0] IA5Text, IA5TextBodyPart
                 "Voice" /               ; [2] Voice, VoiceBodyPart
                 "G3-Fax" /              ; [3] G3Fax, G3FacsimileBodyPart
                 "Teletex" /             ; [5] TTX, TeletexBodyPart
                 "Videotex" /            ; [6] Videotex, VideotexBodyPart
                 "Nationally-Defined" /  ; [7] NationallyDefined,
                                         ;     NationallyDefinedBodyPart
                 "Encrypted" /           ; [8] Encrypted, EncryptedBodypart
                 "Message"               ; [9] ForwardedIPMessageMessage,
                                         ;     MessageBodyPart

    x400-1984-type := "Telex" /          ; [1] TLX
                      "TIF0" /           ; [4] TIF0
                      "SFD" /            ; [10] SFD
                      "TIF1"             ; [11] TIF1

    x400-1988-type := "G4-Class1" /            ; [4] G4Class1BodyPart
                      "Mixed-Mode" /           ; [11] MixedMode
                      "Bilaterally-Defined" /  ; [14] BilaterallyDefined
                      "Externally-Defined"     ; [15] ExternallyDefined

These values are not case sensitive. POSTSCRIPT, Postscript, and POStscriPT are all equivalent. Additional "standard" Content-type values may be registered with Internet Assigned Numbers Coordinator at USC-ISI. Those wishing to register such values should contact:

Joyce K.
Reynolds USC Information Sciences Institute 4676 Admiralty Way Marina del Rey, CA 90292-6695 213-822-1511 JKReynolds@ISI.EDU The specific predefined "type" fields are explained below: "X-"atom -- Any type value beginning with the characters "X-" is a private value, to be used by consenting mail systems by mutual agreement. Any format without a rigorous and public definition should be named with an "X-" prefix. POSTSCRIPT -- Indicates the enclosed document consists of information encoded using the Postscript Page Definition Language developed by Adobe Systems, Inc. [REF-PS]. For type "postscript" the valid ver-num fields are "1.0", "2.0", and "null", and the valid resource-ref fields include, but are not limited to, "laserprep2.9", "laserprep3.0", "laserprep3.1", and "laserprep4.0". SCRIBE -- Indicates the document contains embedded formatting information according to the syntax used by the Scribe document formatting language distributed by the Unilogic Corporation. [REF-SCRIBE]. For type "scribe" the valid ver-num fields are "null", "3", "4", "5", etc. SGML -- Indicates the document contains structuring information according to the rules specified for the Standard Generalized Markup Language, IS 8879, as published by the International Organization for Standardization. [REF-SGML] Documents structured according to the ISO DIS 8613--Office Document Architecture and Interchange Format--may also be encoded using SGML syntax. For type "sgml" the valid ver-num fields are "IS.8879.1986" and "null" TeX -- Indicates the document contains embedded formatting information according to the syntax of the TeX document production language. [REF-TEX] TROFF -- Indicates the document contains embedded formatting information according to the syntax specified for the TROFF formatting package developed by AT&T Bell Laboratories. [REF-TROFF]. For type "troff" the valid resource-ref fields include, but are not limited to, "eqn", "tbl", "me", and the names of other troff macro packages. 

ODA -- Indicates that the body is an ODA document, containing formatted information encoded according to the Office Document Architecture [REF-ODA]. If needed, a document application profile is to be included as part of the message body.

DVI -- Indicates the document contains information according to the device independent file format produced by TROFF or TeX.

MULTIPART -- Indicates the document contains multiple encapsulated messages, each of which may be of a different content-type. The precise syntax of a "multipart" message is defined later in this RFC, as are the possible values for its ver-num and resource-ref fields.

U-LAW or A-LAW -- Indicates that the document contains audio data in U-law [REF-ULAW] or A-law [REF-ALAW], respectively. U-law and A-law are the American and European audio telephony standards. If one of these content-types is used, the ver-num field can be used to give a sampling rate in Hertz, optionally followed by the letters "HZ". Although audio header formats are not yet standardized, the resource-ref field can be used to specify an audio header format. Thus an appropriate content-type header for audio might be something like "Content-type: u-law; 8000 HZ; X-Next".

PBM or PGM or PPM -- Indicates the document contains image data encoded in the Portable Bitmap format [REF-PBM] for black and white, grey scale, or color images.

DES-MESSAGE -- Indicates that the body is an encapsulated message encrypted with DES encryption [REF-DES]. An encrypted message is specified, rather than simply encrypted text, because this permits the encrypted object to contain a Content-type header and thus to contain encrypted data of any type. If all that is desired is encrypted text, the header area of the encapsulated message can be blank (i.e. once decrypted, it begins with CRLF).

ISO-CHARSET-TYPE -- Indicates the document contains text in an ISO standard character set, identified by its International Registration number.
Each ISO character set defines a new standard mail content type, given by the string "ISO-IR-" followed by the numeric value of the character set. Thus, for example, a content-type of "ISO-IR-6" specifies a character set that is extremely similar, and perhaps identical, to MAILASCII. However, it should be noted that even when the Content-type is an ISO-IR- character set type, certain control characters will always be construed according to the guidelines of RFC 821 and RFC 822. In particular, character positions 13, 10, and 32 will at all times be interpreted as CR, LF, and SPACE, respectively.

X400-TYPE -- Indicates the document contains an ASN.1 representation of an X.400 bodypart. The ver-num field may be either "1984", indicating that the representation is defined in [REF-CCITT84c], or "1988", indicating that the encoding is defined in [REF-CCITT/ISO88b].

X400-1984-TYPE -- Indicates that the document contains an ASN.1 representation of an X.400 bodypart specific to the 1984 version of the standard [REF-CCITT84c]. The ver-num field must be "1984" if specified.

X400-1988-TYPE -- Indicates that the document contains an ASN.1 representation of an X.400 bodypart specific to the 1988 version of the standard [REF-CCITT/ISO88b]. The ver-num field must be "1988" if specified.

MAILASCII -- Indicates the document contains only unencoded 7 bit US ASCII text, the default content-type for RFC 822 mail. This content-type has been the subject of some confusion and ambiguity in the past. Its definition is spelled out in Appendix A. If no Content-type header field is present, "MAILASCII" is assumed. That is, the name "MAILASCII" is intended to refer to the default message body type as defined by RFC 822.

It should be noted that the list of Content-type values given above is expected to be augmented in time, and that such additions will be registered at the address given above.
We have simply attempted, in this RFC, to give as many standard Content-type definitions as was possible given the current state of our knowledge.

The Content-type values defined above are a superset of the values defined by RFC 1049.

Those wishing to transmit FAX by Internet mail should note that G3-FAX is one of the Content-types defined for X.400 support. It is thus appropriate to use "Content-type: G3-FAX" for such data.

3 The Content-Encoding Header Field

Many content-types are represented, in their natural format, as 8-bit or binary data. Such data cannot be transmitted over existing Internet mail mechanisms because both RFC 821 and RFC 822 restrict mail messages to 7 bit data with reasonably short lines. It is necessary, therefore, to define a standard mechanism for encoding such data in an acceptable manner. This RFC specifies that this encoding will be done by a new "Content-Encoding" header field. The Content-Encoding field is used to indicate the type of transformation that has been used to represent the message body in an acceptable manner.

Unlike Content-types, which are expected to proliferate, it is expected that there will never be more than a few different Content-Encoding values, both because there is less need for variation and because the effect of variation in Content-Encoding would be more problematic. However, establishing only a single Content-Encoding mechanism does not seem possible. In particular, there is a tradeoff between the desire for a compact and efficient encoding of binary data and the desire for a readable encoding of data that is mostly, but not entirely, MAILASCII text. For this reason, at least two encoding mechanisms are necessary, a "readable" encoding and a "dense" encoding. This RFC also specifies a third encoding which is neither readable nor dense, but is the simplest to encode and decode. A fourth encoding, for compressed ("super-dense") data, might reasonably be defined at a later date.
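
The size side of this tradeoff can be made concrete with a small sketch. It is illustrative only: it assumes Python's standard binascii and base64 modules as stand-ins for a simple hexadecimal encoding and a dense 64-character encoding, ignores the line breaks a real encoder would insert, and the helper name encoded_sizes is hypothetical.

```python
import base64
import binascii


def encoded_sizes(data: bytes) -> dict:
    """Return the length of the data in raw, hexadecimal, and base64 form.

    Line breaks that a mail encoder would add are deliberately not counted.
    """
    hex_text = binascii.hexlify(data).upper()   # two characters per byte
    b64_text = base64.b64encode(data)           # four characters per three bytes
    return {
        "raw": len(data),
        "hexadecimal": len(hex_text),
        "base64": len(b64_text),
    }


# 768 bytes covering every possible 8-bit value three times over.
sizes = encoded_sizes(bytes(range(256)) * 3)
print(sizes)
```

For this input the hexadecimal form is twice the size of the raw data, while the 64-character form is one third larger.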

The Content-Encoding field is designed to specify a two-way mapping between the "native" representation of a type of data and a representation that can be readily exchanged using 7 bit mail transport protocols as defined by RFC 821 (SMTP). This field has not been defined by any previous RFC. The field's value is a single atom specifying the type of encoding, as enumerated below. Formally:

   Content-Encoding := "BASE64" / "HEXADECIMAL" / "QUOTED-PRINTABLE" /
                       "8BIT" / "BINARY" / "7BIT" / "X-"atom

These values are not case sensitive. That is, Hexadecimal and HEXADECIMAL and heXadeCimAl are all equivalent. An encoding type of 7BIT implies that the message is already in a seven-bit ASCII representation. This value is assumed if the Content-Encoding header field is not present.

If the message is stored or transported via a mechanism that permits 8-bit data, a Content-Encoding of "8bit" should nonetheless be used. If the message is stored or transported via a mechanism that permits arbitrary binary data, a Content-Encoding of "binary" should nonetheless be used.

(DISCUSSION: The distinction between the Content-Encoding values of "binary," "8bit," and "7bit" may seem unimportant in an 8-bit binary environment, but clear labeling will be of enormous value to gateways between 8-bit and 7-bit systems. The difference between "8bit" and "binary" is that "8bit" implies adherence to SMTP limits on line length and CR/LF semantics, whereas "binary" does not.)

Implementors may define new content encoding values, but should prefix them with "x-" to indicate their non-standard status, e.g. "Content-Encoding: x-my-new-encoding". However, unlike Content-types, the creation of new Content-Encoding values is explicitly discouraged, as it seems likely to hinder inter-operability with little potential benefit.

If a Content-Encoding header field appears as part of a message header, it applies to the entire message body, whether or not that body is of type "multipart."
If it is of type multipart, the encoding applies recursively to all of the encapsulated parts, including their encapsulated headers. If a Content-Encoding header field appears as part of an encapsulation's headers, it applies only to the body of the encapsulated part. If the encapsulated part is itself of type "multipart", the encoding applies recursively to all of the encapsulated parts within that encapsulated part.

The following sections will define the standard encoding mechanisms.

3.1 Quoted-Printable Content-Encoding

The Quoted-Printable encoding is intended to represent data that is largely, but not entirely, 7 bit ASCII. Printable ASCII portions of body parts encoded in this way should be recognizable by humans, if necessary, without translation. In this encoding, ASCII characters 9 (tab), 10 (nl), 13 (cr), 32 through 37, inclusive, 39 through 91, inclusive, and 93 through 127, inclusive, are unchanged. All other characters, including characters 38 and 92, are to be represented in either of the following quotation styles and special cases:

Style #1: Any 8 bit value may be represented by a "\" followed by a two digit hexadecimal representation of the character's ASCII value. Thus, for example, character 12 (control-L, or formfeed) can be represented by "\0C", the ampersand character (38) can be represented by "\26", and the backslash character (92) itself can be represented by "\5C".

Style #2: An 8 bit value from 160 through 255 may, alternately, be represented by an ampersand character followed by the character obtained by the removal of the high order bit, i.e. by subtracting 128 from the value. Thus the 8 bit value 193 may be represented as "&A".

Note that these two styles may be freely intermixed. Style #1 is preferred for characters 128 through 159, because style #2 might include control characters (e.g. TAB) that are altered by some MTA (see NOTES TO IMPLEMENTERS, below).
Style #2 is provided for improved readability of some 8-bit character sets in which turning on the 8th bit produces a character similar to the corresponding 7 bit character, e.g. the 8th bit simply adds an umlaut. In such cases, style #2 is somewhat more readable, but should be used carefully, as explained in the NOTES TO IMPLEMENTERS.

Additionally, there are two special cases that may be represented otherwise:

Special case #1: The literal ampersand and backslash characters may themselves be quoted by backslashes. Thus, the backslash may be represented as "\\" and the ampersand as "\&". Note that this is not ambiguous with regard to the first clause, because neither "\" nor "&" is part of the hexadecimal alphabet.

Special case #2: A backslash at the end of a line may be used to indicate a non-significant line break. That is, if one needs to include a long line without line breaks, but is concerned that MTA's will break the line into multiple lines, a message encoded with the quoted-printable encoding may include "soft" line breaks by preceding the line break with a backslash. Thus if the "raw" form of the line is a single line that says:

   Now's the time for all men to come to the aid of their country. Now's the time for all men to come to the aid of their country. Now's the time for all men to come to the aid of their country.

This could be represented, in the quoted-printable encoding, as

   Now's the time for all men to come to the aid of their country. \
   Now's the time for all men to come to the aid of their country. \
   Now's the time for all men to come to the aid of their country.

This provides a mechanism with which long lines can be encoded in such a way as to be restored by the user agent.

NOTES TO IMPLEMENTERS of encoding agents: for maximum portability across MTA's, it is recommended that any long lines be represented using "soft" line breaks which are inserted before any line reaches the 80th character.
It is also recommended that trailing white space (white space at the end of a line) not be relied upon, as some MTA's freely delete such trailing white space. (Such a line may be represented, if necessary, using the above rules, by appending a backslash to the end of the line, and following it with a blank line.) It is also recommended that the persistence of character codes less than 32 should not be relied on, particularly the TAB, CR, and LF characters. Where such characters would be required for representation in style #2, it is recommended that style #1 be used.

NOTE ABOUT CR AND LF in encoded messages: The use of CR or LF characters that are not part of a CR/LF sequence is NOT PERMITTED in messages that use the Quoted-Printable encoding. (Their presence is not an issue for the other encodings.) Sequences such as CR LF LF are also invalid; the correct sequence is CR LF CR LF. The effect in an encoded message of a CR without a following LF, or an LF without a preceding CR, is undefined. Although RFC-822 defines these as ordinary characters when used outside of the CR/LF sequence, some implementations treat one (or both) as equivalent to newline or as error characters that are discarded. Messages which contain embedded bare CR or LF characters should use encoding style #1 to encode these characters "safely".

(Discussion: Some environments use a bare CR or bare LF as the local newline convention. If a message contains embedded bare CR or LF characters, it is impossible to transform it from Internet to local conventions without interfering with this local convention.)

Since the hyphen character ("-") is represented as itself in the Quoted-Printable encoding, care must be taken, when encapsulating a quoted-printable encoded message in a multipart message, to ensure that the encapsulation boundary does not appear anywhere in the message. See the definition of multipart messages, later in this document.
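
The quoting rules of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names qp_encode and qp_decode are hypothetical, the encoder uses style #1 only and inserts no soft line breaks, and the decoder raises an error on malformed input rather than attempting recovery.

```python
def _is_literal(value: int) -> bool:
    """True for the character codes the text lists as passing through unchanged."""
    return (value in (9, 10, 13) or 32 <= value <= 37
            or 39 <= value <= 91 or 93 <= value <= 127)


def qp_encode(data: bytes) -> str:
    """Encode with style #1 only: a backslash plus two upper-case hex digits."""
    return "".join(chr(b) if _is_literal(b) else "\\%02X" % b for b in data)


def qp_decode(text: str) -> bytes:
    """Decode style #1, style #2, and both special cases."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            nxt = text[i + 1:i + 2]
            if nxt in ("\\", "&"):                 # special case #1
                out.append(ord(nxt))
                i += 2
            elif text.startswith("\r\n", i + 1):   # special case #2, CRLF line end
                i += 3
            elif nxt == "\n":                      # special case #2, local newline
                i += 2
            else:                                  # style #1: two hex digits
                out.append(int(text[i + 1:i + 3], 16))
                i += 3
        elif ch == "&":                            # style #2: restore the high bit
            out.append(ord(text[i + 1]) + 128)
            i += 2
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)


print(qp_encode(b"\x0c&\\"))
print(qp_decode("&A"))
```

An encoder that also wanted the readability benefit of style #2 would emit "&" forms for values 160 through 255 and would add soft line breaks before the 80th character, as recommended above.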

3.2 Hexadecimal Content-Encoding

The Hexadecimal Content-Encoding is intended to represent arbitrary data that is not humanly-readable in a printable 7-bit form that can be passed through 7 bit mail transport agents. It transforms a byte stream into a series of two-digit hexadecimal values. Thus, the sequence of the five 8-bit values "ABC control-L newline" would be represented by "4142430C0A". Since newlines are themselves encoded as 0A, non-data newlines may be scattered freely to break the stream into multiple lines. In fact, it is recommended that newlines be included at least every 60 characters (30 encoded bytes). Such newlines will be discarded by the decoder.

The hexadecimal encoding is a simple way to represent arbitrary 8 bit data in 7 bit mail, but not a very efficient one, as it doubles the size of the data. The Base64 encoding, to be described below, is a reasonably simple alternative that only increases the size of the data by 33 percent. The hexadecimal encoding is permitted explicitly because there are widespread utilities for converting binary files to hexadecimal.

Since the hyphen character ("-") is not used in hexadecimal encodings, there is no need to worry about quoting apparent encapsulation boundaries within hexadecimal-encoded body parts.

When encoding a bit stream via the hexadecimal encoding, the bit stream should be presumed to be ordered with the most-significant-bit first. That is, the first bit in the stream will be the high-order bit in the first byte, and the eighth bit will be the low-order bit in the first byte, and so on.

The Hexadecimal alphabet is defined as "0123456789ABCDEF". Upper case letters A-F should be used by encoders, though it is acceptable if a decoder ignores case.

3.3 Base64 Content-Encoding

The Base64 Content-Encoding is designed to represent arbitrary 8 bit data in a form that is not humanly readable.
The encoding and decoding algorithms are simple, but the encoded data is only about 33 percent larger than the unencoded data. This encoding is also used in Privacy Enhanced Mail applications; it is described in RFC 1113. The ability in RFC 1113 to embed clear text within such an encoding is not allowed in this context, however.

The following description of the encoding is adapted from RFC 1113; apart from the exclusion of the "*" mechanism for embedded clear text there are no significant technical changes.

A 64-character subset of International Alphabet IA5 is used, enabling 6 bits to be represented per printable character. (The proposed subset of characters is represented identically in IA5 and ASCII.) One additional character, "=", is used to signify special processing functions. The character "=" is used for padding within the printable encoding procedure.

The encoding function's output is delimited into text lines (using local conventions), with each line except the last containing exactly 64 printable characters and the final line containing 64 or fewer printable characters. (This line length is easily printable and is guaranteed to satisfy SMTP's 1000 character transmitted line length limit.)

The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters. Proceeding from left to right, a 24-bit input group is formed by concatenating 3 8-bit input groups; this is then treated as 4 concatenated 6-bit groups. When encoding a bit stream via the base64 encoding, the bit stream should be presumed to be ordered with the most-significant-bit first. That is, the first bit in the stream will be the high-order bit in the first byte, and the eighth bit will be the low-order bit in the first byte, and so on.

Each 6-bit group is used as an index into an array of 64 printable characters. The character referenced by the index is placed in the output string.
These characters, identified in Table 1 below, are selected so as to be universally representable, and the set excludes characters with particular significance to SMTP (e.g., ".", "<CR>", "<LF>").

                                Table 1

   Value Encoding  Value Encoding  Value Encoding  Value Encoding
       0 A            17 R            34 i            51 z
       1 B            18 S            35 j            52 0
       2 C            19 T            36 k            53 1
       3 D            20 U            37 l            54 2
       4 E            21 V            38 m            55 3
       5 F            22 W            39 n            56 4
       6 G            23 X            40 o            57 5
       7 H            24 Y            41 p            58 6
       8 I            25 Z            42 q            59 7
       9 J            26 a            43 r            60 8
      10 K            27 b            44 s            61 9
      11 L            28 c            45 t            62 +
      12 M            29 d            46 u            63 /
      13 N            30 e            47 v
      14 O            31 f            48 w         (pad) =
      15 P            32 g            49 x
      16 Q            33 h            50 y

Special processing is performed if fewer than 24 bits are available in an input group at the end of a message or encapsulated part of a message. A full encoding quantum is always completed at the end of a message. When fewer than 24 input bits are available in an input group, zero bits are added (on the right) to form an integral number of 6-bit groups. Output character positions which are not required to represent actual input data are set to the character "=". Since all canonically encoded output is an integral number of octets, only the following cases can arise:

(1) the final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded output will be an integral multiple of 4 characters with no "=" padding,

(2) the final quantum of encoding input is exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters, or

(3) the final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three characters followed by one "=" padding character.

Since the hyphen character ("-") is not used, there is no need to worry about quoting apparent encapsulation boundaries within base64-encoded body parts.

4 The "Multipart" Content-Type

In the case of multiple part messages, a "multipart" Content-type field should appear in the RFC 822 message header.
The message body is then assumed to contain multiple parts separated by encapsulation boundaries. Each of the parts is defined, in essence, as a complete RFC 822 message in miniature. That is, what is found between the encapsulation boundaries is a header area, a blank line, and a body area, in accordance with the RFC 822 syntax for a message. However, it should be noted that NO header fields are actually required in these encapsulated messages. An encapsulation that starts with a blank line, therefore, is a legitimate encapsulation of a message with no header fields. In such a case, of course, the absence of a Content-type header field implies that the encapsulation is MAILASCII text.

Important to note is that the encapsulation boundary MUST NOT appear inside any of the encapsulated parts. Thus, it is crucial that the composing agent be able to choose and specify the boundary that will separate the parts. This is done using the resource specification in the Content-type header field.

The Content-type header field, as defined earlier in this document, has two important optional fields that may follow the type name. These fields are for a version number and a resource specification. In the case of the "multipart" content-type, this document defines version numbers 1-S and 1-P; if the version number is omitted or "null", it is to be assumed to be version 1-S. The two versions have identical syntax, but the "-P" is intended as a hint, to receivers, that the parts are intended to be viewed in parallel rather than sequentially. Implementations that cannot show the parts in parallel, or that choose not to do so, are free to treat all multipart messages of version "1-P" as if they were version "1-S". However, all implementations should check the version number, to ensure graceful behavior in the event that an incompatible future version of multipart messages is defined later.
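
The version check described above might be sketched as follows. This is only an illustration under simplifying assumptions: the names ContentType, parse_content_type, and multipart_display_mode are hypothetical, the field value is assumed to contain no RFC 822 comments or quoted strings, and the fields are split naively on semicolons and commas.

```python
from typing import List, NamedTuple, Optional


class ContentType(NamedTuple):
    type: str                    # upper-cased, since type values are not case sensitive
    ver_num: Optional[str]       # None when the version field is omitted
    resource_refs: List[str]     # empty when no resource reference is given


def parse_content_type(value: str) -> ContentType:
    """Split 'type; ver-num; resource-ref, ...' into its three fields."""
    fields = [f.strip() for f in value.split(";", 2)]
    ver_num = fields[1] if len(fields) > 1 and fields[1] else None
    refs = ([r.strip() for r in fields[2].split(",") if r.strip()]
            if len(fields) > 2 else [])
    return ContentType(fields[0].upper(), ver_num, refs)


def multipart_display_mode(ct: ContentType) -> str:
    """Map a multipart version number to 'sequential' or 'parallel'."""
    version = (ct.ver_num or "null").upper()
    if version in ("NULL", "1-S"):
        return "sequential"
    if version == "1-P":
        return "parallel"
    # An unknown version may be incompatible, so refuse rather than guess.
    raise ValueError("unsupported multipart version: %s" % ct.ver_num)


ct = parse_content_type("multipart; 1-p; tweedledee")
print(ct, multipart_display_mode(ct))
```

A user agent that cannot display parts in parallel could simply treat the "parallel" result as "sequential", as the text permits.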

The resource specification, which is always required for multipart messages, is used to specify the format of the encapsulation boundary. The encapsulation boundary is defined as two hyphen characters ("-", decimal code 45) followed by the resource-specification portion of the Content-type header field with any leading or trailing white space removed.

(DISCUSSION: The specification that white space be removed is intended to eliminate the possible introduction of ambiguity caused by the addition or deletion of white space by message transport agents. The hyphens are for rough compatibility with the earlier RFC 934 method of message encapsulation, and for ease of searching for the boundaries in some implementations. However, it should be noted that multipart messages are NOT completely compatible with RFC 934 encapsulations; in particular, they do not obey RFC 934 quoting conventions for embedded lines that begin with hyphens.)

Thus, a typical multipart content-type header field might look like this:

   Content-type: multipart; 1-S; gc0p4Jq0M2Yt08jU534c0p

This indicates that the message consists of several parts, each itself structured as an RFC 822 message, which are intended to be viewed one-at-a-time, and that the parts are separated by the line

   --gc0p4Jq0M2Yt08jU534c0p

The encapsulation boundaries must not appear within the encapsulations, and should be no longer than 70 characters, not counting the two leading hyphens.

It should be noted that no interpretation is specified for any lines preceding the first encapsulation boundary or following the last one. In general, these "prefix" and "postfix" areas of multipart messages should be regarded as comments, and implementations are free to discard them. However, it is recommended that composing agents use the prefix area to include a short textual message, in MAILASCII, explaining that what follows is an encapsulated multipart message, intended to be interpreted by software rather than by human eyes.
This message is for the benefit of people who might read the message with older user agents that do not properly interpret multipart messages.

The use of "Content-Type: Multipart" as a message part within another "Content-Type: Multipart" is explicitly allowed. In such cases, for obvious reasons, care must be taken to ensure that each nested multipart message uses a different boundary delimiter. See the example in the following section.

Overall, the body of a multipart message may be specified as follows:

   body := delimiter 1*encapsulation

   encapsulation := message CRLF delimiter

   delimiter := "--" CRLF

   message =

5 A Complex Multipart Example

What follows is the outline of a complex multipart message. This message has three parts to be displayed serially: an introductory plain text (MAILASCII) part, an embedded multipart message, and a closing "rich text" part in SGML, which includes additional header fields to indicate that it originally came from a different sender. The embedded multipart message has two parts to be displayed in parallel, a picture and an audio fragment.

   From: ...
   Subject: ...
   Content-type: multipart; 1-s; tweedledum

   This is a multipart message. If you are reading this text, you might want to consider changing to a user agent that understands how to properly display multipart messages.
   --tweedledum

   ...Introductory text goes here... [Note that the preceding blank line means no header fields were given and this is MAILASCII.]
   --tweedledum
   Content-type: multipart; 1-p; tweedledee

   This is a multipart message. If you are reading this text, you might want to consider changing to a user agent that understands how to properly display multipart messages.
   --tweedledee
   Content-type: u-law; 8000 HZ; X-NEXT
   Content-Encoding: Hexadecimal

   ... hex-encoded NeXT-format audio data goes here....
   --tweedledee
   Content-type: G3-FAX
   Content-Encoding: Base64

   ... base64-encoded FAX data goes here....
   --tweedledee
   --tweedledum
   From: ...
   Subject: ...
   Content-type: SGML; null
   Content-Encoding: Quoted-printable

   ... Closing text goes here ...
   --tweedledum

6 The Encoded-Variable Header Field

A particularly thorny problem, not addressed by the Content-Encoding header field specified earlier in this memo, is the problem of including data other than MAILASCII in a message header. It is tempting, to many, to simply declare that such inclusion is too problematic, and that message headers should always be entirely MAILASCII. After all, most of the information in the header is not intended for human consumption anyway. However, there are certain parts of the header that are intended entirely for human viewing, and these are the parts where MAILASCII is deemed most unsatisfactory. In particular, there is widespread desire to have the contents of the Subject field and the names of message senders and recipients appear in languages that cannot be represented in MAILASCII.

The heart of the problem is the fact that RFC 822 prescribes a great deal of syntax and semantics for the message header area, all of it based on MAILASCII. Tampering with this, it would seem, could introduce a great deal of complexity, as well as bugs involving backward compatibility. Instead, this memo proposes a mechanism by which the header area remains entirely MAILASCII, but encodes non-MAILASCII information in a manner from which it can easily be restored by conforming user agents.

The basic idea is that, in certain parts of the headers which are never machine-interpreted, the human-readable data might best be represented in a content-type other than MAILASCII. In such cases, the data are to be represented, in the header field, by a "variable reference" -- a placeholder for a value defined elsewhere in the message header area. The variables are defined by one or more "Encoded-Variable" headers, with a syntax as specified below.
Thus, for example, if a user's name includes characters that cannot be represented in MAILASCII, it can be replaced by the name of a variable that is defined elsewhere. To improve readability by UA's that only handle MAILASCII, it is recommended that the variable name itself be as close an approximation as possible to the correct name. Thus, for example, one might have:

   From: $Keld_JXrn_Simonsen
   Encoded-Variable: Keld_JXrn_Simonsen = quoted-printable, iso646, Keld_J&0Crn_Simonsen

*** NOTE: It would be nice to get the character set & hex code right for the above example.

Where multiple variables need to be defined, multiple Encoded-Variable header fields may be used.

It is important to constrain the use of encoded-variables to places where they will not interfere with the established syntax or semantics of header fields. For that reason, their use is explicitly restricted to the Subject and Comments header fields, and to the "phrase" portion of RFC 822 addresses. This implies a small redefinition of RFC 822's "optional-field", "mailbox", and "group" syntax:

   optional-field =
               /  "Message-ID"        ":"  msg-id
               /  "Resent-Message-ID" ":"  msg-id
               /  "In-Reply-To"       ":"  *(phrase / msg-id)
               /  "References"        ":"  *(phrase / msg-id)
               /  "Keywords"          ":"  #phrase
               /  "Subject"           ":"  var-text
               /  "Comments"          ":"  var-text
               /  "Encrypted"         ":"  1#2word
               /  extension-field              ; To be defined
               /  user-defined-field           ; May be pre-empted

   mailbox     =  addr-spec                    ; simple address
               /  var-phrase route-addr        ; name & addr-spec

   group       =  var-phrase ":" [#mailbox] ";"

The two new syntactic entities, "var-text" and "var-phrase", are defined as follows:

   var-text    =  *text / var-ref

   var-phrase  =  phrase / var-ref

   var-ref     =  "$" var-name

   var-name    =  atom

NOTE that the definition of "atom" permits underscores, but not spaces or any other "specials" as defined by RFC 822. Note also that this does not actually change the legal syntax defined by RFC 822, because a "var-ref" is itself a valid instance of "phrase" or "*text".
Thus, no correct existing parsers should be broken by the new definitions. However, the old parsers will not recognize a difference between a var-ref and any other instance of *text or phrase, and will therefore not do any variable substitution.

The syntax of the Encoded-Variable field is defined as follows:

   Encoded-variable = var-name "=" Content-Encoding "," Content-Type "," var-contents

   var-contents = *text

Here the var-contents is the encoded value of the variable, of a type given by Content-Type and encoded with the encoding given in Content-Encoding. Both a Content-Type and a Content-Encoding are required for each Encoded-Variable header field.

7 Cross-references Between Encapsulated Parts

Within a multipart message, as defined above, there is essentially no cross-encapsulation structure. However, multimedia mail systems such as Andrew [REF-ATK] have demonstrated the value of inter-part reference. All that is necessary, in order to make a multipart scheme work, is a mechanism to allow one encapsulated part to make reference to another.

Some have proposed the use of a new "Content-Label" header field within the encapsulated parts, in order to give each part a name by which it can be referenced. However, this is not necessary, as the established Message-ID header field can in fact be used for precisely this purpose. Each encapsulated part can include a Message-ID header field, which can then be used for reference purposes by related body parts.

8 Optional Content-size Header Field

In the discussions of earlier drafts of this memo, some people indicated a strong preference for using a size-counting scheme to delimit the boundaries between encapsulated parts of multipart messages. This was rejected because such schemes are not, in general, sufficiently robust across the SMTP transport layer. For example, line counts can be altered by line-wrapping MTA's, and byte counts can be altered in any number of ways.

However, there are restricted environments in which either or both of these counts can be relied upon, and in such environments it may be desirable to implement a count-based approach to delimiters. Therefore this memo specifies a conventional way to do this, in order to promote interoperability among systems that are able to take this approach.

In such cases, boundary delimiters, as defined above, are still required. However, the header area of an encapsulated part may include an optional Content-Size header which indicates where the encapsulated part ends, if its size has not been altered. The size may be measured in either bytes or lines. Those who use the Content-Size header field should still preserve the encapsulation boundaries, and should recognize that other agents are free to ignore it in favor of complete reliance on encapsulation boundaries.

The Content-Size header field is defined as follows:

   Content-Size = 1*DIGIT "lines" / 1*DIGIT "bytes"

Note that each encapsulated part should still end with a newline followed by an encapsulation boundary. However, a message store that wishes, for example, to use a storage format that is largely RFC 822-compliant, but includes binary storage of binary objects, can use the Content-Size header field to indicate whether or not the final newline is to be interpreted as part of the binary object. If the newline follows the number of bytes specified for the encapsulation, then it is not part of the encapsulation.

The size given by the Content-Size header field is the size of the encapsulation's body only, not counting the blank line that separates the header from the body. In other words, the four bytes CRLF CRLF, which separate header from body, are NOT counted as part of the content-size.
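
A receiving agent that chooses to honor this field might parse and check it roughly as follows. The sketch is illustrative only: parse_content_size and size_is_intact are hypothetical names, white space between the count and the unit is assumed to be allowed, and a "line" is assumed to be whatever Python's bytes.splitlines() yields, which is a guess about how lines would be counted.

```python
import re
from typing import Tuple

# Digits, optional white space, then the unit keyword (matched without regard to case).
_CONTENT_SIZE = re.compile(r"^\s*(\d+)\s*(lines|bytes)\s*$", re.IGNORECASE)


def parse_content_size(value: str) -> Tuple[int, str]:
    """Return (count, unit) for a Content-Size field value, unit in lower case."""
    match = _CONTENT_SIZE.match(value)
    if match is None:
        raise ValueError("malformed Content-Size value: %r" % value)
    return int(match.group(1)), match.group(2).lower()


def size_is_intact(body: bytes, value: str) -> bool:
    """Check whether a part's body still has the size its header claims.

    Only the body is measured; the blank line that separates header from
    body is not part of the count.
    """
    count, unit = parse_content_size(value)
    if unit == "bytes":
        return len(body) == count
    return len(body.splitlines()) == count


print(parse_content_size("42 bytes"))
print(size_is_intact(b"abc\r\ndef\r\n", "2 lines"))
```

When the check fails, the agent can fall back on the encapsulation boundaries, which remain required in any case.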
9 Summary

Using the Content-Type and Content-Encoding header fields, it is possible to include, in a standardized way, arbitrary types of data objects in RFC 822 mail messages, without breaking any of the existing restrictions imposed by RFC 821 and RFC 822. Using the "multipart" content-type, it is possible to mix multiple objects of different types in a single message. The additional optional header field, Content-Size, provides a conventional mechanism for an extension deemed desirable by many implementors. Finally, a limited mechanism is provided for including non-MAILASCII data in certain RFC 822 header fields.

For more information, the authors of this document may be contacted via Internet mail:

    Nathaniel Borenstein
    Ned Freed

10 Acknowledgements

This RFC is the result of the collective effort of a large number of people, at several IETF meetings and on the IETF-SMTP and IETF-822 mailing lists. Although any enumeration seems doomed to suffer from egregious omissions, the following are among the many contributors to this effort: Harald Alvestrand, Kevin Carosso, Mark Crispin, Dave Crocker, Walt Daniels, Kevin Donnelly, Johnny Eriksson, Craig Everhart, Bruce Howard, Risto Kankkunen, Neil Katin, Steve Kille, Anders Klemets, John Klensin, Vincent Lau, Timo Lehtinen, Rick McGowan, Mark Needleman, John Noerenberg, David Robinson, Jonathan Rosenberg, Jan Rynning, Mark Sherman, Keld Simonsen, Einar Stefferud, Michael Stein, Robert Ullman, Stuart Vance, Erik van der Poel, Greg Vaudreuil, Brian Wideen, Glenn Wright, and David Zimmerman. The authors apologize for any omissions from this list, which were certainly unintentional.

11 References

[REF-PS] Adobe Systems, Inc. PostScript Language Reference Manual. Addison-Wesley, Reading, Mass., 1985.
[REF-SGML] ISO TC97/SC18. Standard Generalized Markup Language. Tech. Rept. DIS 8879, ISO, 1986.
[REF-TEX] Knuth, Donald E. The TeXbook. Addison-Wesley, Reading, Mass., 1984.
[REF-TROFF] Ossanna, Joseph F.
NROFF/TROFF User's Manual. Bell Laboratories, Murray Hill, New Jersey, 1976. Computing Science Technical Report No.54.
[REF-SCRIBE] Unilogic. SCRIBE Document Production Software. Unilogic, 1985. Fourth Edition.
[REF-ISO646] International Standard--Information Processing--ISO 7-bit coded character set for information interchange, ISO 646:1983.
[REF-7BIT] International Standard--Information Processing--ISO 7-bit and 8-bit coded character sets--Code extension techniques, ISO 2022:1986.
[REF-ANSI] Coded Character Set--7-Bit American National Standard Code for Information Interchange, ANSI X3.4-1986.
[REF-X400] Schicker, Pietro, "Message Handling Systems, X.400", Message Handling Systems and Distributed Applications, E. Stefferud, O-j. Jacobsen, and P. Schicker, eds., North-Holland, 1989, pp. 3-41.
[RFC-821] Postel, J.B. Simple Mail Transfer Protocol. August, 1982, Network Information Center, RFC-821.
[RFC-822] Crocker, D. Standard for the format of ARPA Internet text messages. August, 1982, Network Information Center, RFC-822.
[RFC-934] Rose, M.T.; Stefferud, E.A. Proposed standard for message encapsulation. January, 1985, Network Information Center, RFC-934.
[RFC-1049] Sirbu, M.A. Content-type header field for Internet messages. March, 1988, Network Information Center, RFC-1049.
[RFC-1113] Linn, J. Privacy enhancement for Internet electronic mail: Part I - message encipherment and authentication procedures [Draft]. August, 1989, Network Information Center, RFC-1113.
[RFC-1148] Kille, S.E. Mapping between X.400(1988) / ISO 10021 and RFC 822. March, 1990, Network Information Center, RFC-1148.
[RFC-1154] Robinson, D.; Ullmann, R. Encoding header field for internet messages. April, 1990, Network Information Center, RFC-1154.
[REF-ATK] Borenstein, Nathaniel S., Multimedia Applications Development with the Andrew Toolkit, Prentice Hall, 1990.
[REF-CCITT84c] CCITT SG 5/VII, "Recommendations X.420," Message Handling Systems: Interpersonal Messaging User Agent Layer, October 1984.
[REF-CCITT/ISO88b] CCITT/ISO, "CCITT Recommendations X.420/ ISO IS 10021-7", Message Handling Systems: Interpersonal Messaging System,
[REF-ODA] **************
[REF-ULAW] ***************
[REF-ALAW] ***************
[REF-DES] ****************
[REF-PBM] ****************

Appendix A -- The Character Set for the MAILASCII Content-Type

As stated in this document, the MAILASCII content-type is based on a series of standards and on the historical standard practice in the Internet mail community. However, the precise meaning of this content-type has been the subject of some debate. In this appendix, therefore, we define the MAILASCII content-type. It is our belief that this definition corresponds with the default assumptions made for messages without Content-type headers as defined by RFC 822.

The message body is coded in the character set of American National Standard Code for Information Interchange, sometimes known as "7-bit ASCII" [REF-7BIT]. This is not an arbitrary seven-bit character code, but indicates that the message body uses character coding that uses the exact correspondence of codes to characters specified in ASCII. National use variations on ISO 646 [REF-ISO646] are not ASCII, and neither an explicit "ASCII" content type, nor "MAILASCII", nor the default (omission of a content-type) should be used when characters are coded using them.

(Discussion: RFC 821 very explicitly specifies "ASCII", and references an earlier version of the American National Standard cited in [REF-ANSI].
Whether that specification, rather than a reference to an International Standard, was done deliberately or out of convenience or ignorance, is no longer interesting: insofar as one of the purposes of specifying a content-type is to permit the receiver to unambiguously determine how the sender intended the coded message to be interpreted, assuming anything other than "strict ASCII" as the default would risk unintentional and incompatible changes to the semantics of messages now being transmitted. This also implies that messages containing characters coded according to national variations on ISO 646, or using code-switching procedures (e.g., those of ISO 2022), as well as 8-bit or multiple octet character encodings MUST use an appropriate content-type to be consistent with this specification.)

Because of the restrictions imposed on message bodies by RFC 822 and, in practice, by Message Transport Agents that are more-or-less compliant with RFC 821, implementors should be careful in several ways regarding MAILASCII text:

(1) Delimiters other than CR-LF pairs may be used in the local representation of a message on some systems. The persistence of CR-LF pairs should not be relied on.
(2) Isolated CR and LF characters are not well tolerated in general; they may be lost or converted to delimiters on some systems, and hence should not be relied on.
(3) TAB characters may be misinterpreted or may be automatically converted to variable numbers of spaces. This is unavoidable in some environments, notably those not based on the ASCII character set. Such conversion is STRONGLY DISCOURAGED, but it may occur, and users of MAILASCII format should not rely on the persistence of TAB characters.
(4) Lines longer than 80 characters may be wrapped in some environments. Line wrapping is STRONGLY DISCOURAGED, but unavoidable in some cases. Applications which depend on lines not being wrapped should use mechanisms other than unencoded MAILASCII bodyparts to transmit messages.
(5) Trailing "white space" characters (SPACE, TAB, etc.) on a line may be discarded by some transport agents, and hence should not be relied on.

See RFC 821, RFC 822, and RFC 1113 for additional information about canonical SMTP formats. Authors of software which composes "MAILASCII" in compliance with this RFC should be well-acquainted with SMTP formats.

The complete MAILASCII character set is listed below:

***** CONTROL CHARS????

  0 nul   16 dle   32 sp    48 0     64 @     80 P     96 `    112 p
  1 soh   17 dc1   33 !     49 1     65 A     81 Q     97 a    113 q
  2 stx   18 dc2   34 "     50 2     66 B     82 R     98 b    114 r
  3 etx   19 dc3   35 #     51 3     67 C     83 S     99 c    115 s
  4 eot   20 dc4   36 $     52 4     68 D     84 T    100 d    116 t
  5 enq   21 nak   37 %     53 5     69 E     85 U    101 e    117 u
  6 ack   22 syn   38 &     54 6     70 F     86 V    102 f    118 v
  7 bel   23 etb   39 '     55 7     71 G     87 W    103 g    119 w
  8 bs    24 can   40 (     56 8     72 H     88 X    104 h    120 x
  9 ht    25 em    41 )     57 9     73 I     89 Y    105 i    121 y
 10 nl    26 sub   42 *     58 :     74 J     90 Z    106 j    122 z
 11 vt    27 esc   43 +     59 ;     75 K     91 [    107 k    123 {
 12 np    28 fs    44 ,     60 <     76 L     92 \    108 l    124 |
 13 cr    29 gs    45 -     61 =     77 M     93 ]    109 m    125 }
 14 so    30 rs    46 .     62 >     78 N     94 ^    110 n    126 ~
 15 si    31 us    47 /     63 ?     79 O     95 _    111 o    127 del

From owner-ietf-822@dimacs.rutgers.edu Mon Apr 22 15:06:50 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08)
        id AA12817; Mon, 22 Apr 91 13:21:26 EDT
Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08)
        id AA12748; Mon, 22 Apr 91 13:20:39 EDT
Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7)
        id for ietf-822@dimacs.rutgers.edu; Mon, 22 Apr 91 13:20:22 EDT
Received: by greenbush.bellcore.com (4.12/4.7)
        id for ietf-822@dimacs.rutgers.edu; Mon, 22 Apr 91 13:23:26 edt
Date: Mon, 22 Apr 91 13:23:26 edt
From: nsb@thumper.bellcore.com (Nathaniel Borenstein)
Message-Id: <9104221723.AA06290@greenbush.bellcore.com>
To: ietf-822@dimacs.rutgers.edu
Subject: PostScript verison of Draft RFC
Content-Type: PostScript

%!PS-Adobe-1.0
%%Creator: greenbush:nsb (Nathaniel Borenstein,MRE 2A-274,4270,9938586,21462)
%%Title: stdin (ditroff)
%%CreationDate: Mon Apr 22 13:10:37 1991
%%EndComments
% lib/psdit.pro -- prolog for psdit (ditroff) files
% Copyright (c) 1984, 1985 Adobe Systems Incorporated. All Rights Reserved.
% last edit: shore Sat Nov 23 20:28:03 1985 % RCSID: $Header: psdit.pro,v 2.1 85/11/24 12:19:43 shore Rel $ /$DITroff 140 dict def $DITroff begin /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def /xi {0 72 11 mul translate 72 resolution div dup neg scale 0 0 moveto /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def F /pagesave save def}def /PB{save /psv exch def currentpoint translate resolution 72 div dup neg scale 0 0 moveto}def /PE{psv restore}def /arctoobig 90 def /arctoosmall .05 def /m1 matrix def /m2 matrix def /m3 matrix def /oldmat matrix def /tan{dup sin exch cos div}def /point{resolution 72 div mul}def /dround {transform round exch round exch itransform}def /xT{/devname exch def}def /xr{/mh exch def /my exch def /resolution exch def}def /xp{}def /xs{docsave restore end}def /xt{}def /xf{/fontname exch def /slotno exch def fontnames slotno get fontname eq not {fonts slotno fontname findfont put fontnames slotno fontname put}if}def /xH{/fontheight exch def F}def /xS{/fontslant exch def F}def /s{/fontsize exch def /fontheight fontsize def F}def /f{/fontnum exch def F}def /F{fontheight 0 le {/fontheight fontsize def}if fonts fontnum get fontsize point 0 0 fontheight point neg 0 0 m1 astore fontslant 0 ne{1 0 fontslant tan 1 0 0 m2 astore m3 concatmatrix}if makefont setfont .04 fontsize point mul 0 dround pop setlinewidth}def /X{exch currentpoint exch pop moveto show}def /N{3 1 roll moveto show}def /Y{exch currentpoint pop exch moveto show}def /S{show}def /ditpush{}def/ditpop{}def /AX{3 -1 roll currentpoint exch pop moveto 0 exch ashow}def /AN{4 2 roll moveto 0 exch ashow}def /AY{3 -1 roll currentpoint pop exch moveto 0 exch ashow}def /AS{0 exch ashow}def /MX{currentpoint exch pop moveto}def /MY{currentpoint pop exch moveto}def /MXY{moveto}def /cb{pop}def % action on unknown char -- nothing for now /n{}def/w{}def /p{pop showpage pagesave restore /pagesave save def}def /abspoint{currentpoint exch pop add exch currentpoint pop 
add exch}def /distance{dup mul exch dup mul add sqrt}def /dstroke{currentpoint stroke moveto}def /Dl{2 copy gsave rlineto stroke grestore rmoveto}def /arcellipse{/diamv exch def /diamh exch def oldmat currentmatrix pop currentpoint translate 1 diamv diamh div scale /rad diamh 2 div def currentpoint exch rad add exch rad -180 180 arc oldmat setmatrix}def /Dc{dup arcellipse dstroke}def /De{arcellipse dstroke}def /Da{/endv exch def /endh exch def /centerv exch def /centerh exch def /cradius centerv centerv mul centerh centerh mul add sqrt def /eradius endv endv mul endh endh mul add sqrt def /endang endv endh atan def /startang centerv neg centerh neg atan def /sweep startang endang sub dup 0 lt{360 add}if def sweep arctoobig gt {/midang startang sweep 2 div sub def /midrad cradius eradius add 2 div def /midh midang cos midrad mul def /midv midang sin midrad mul def midh neg midv neg endh endv centerh centerv midh midv Da currentpoint moveto Da} {sweep arctoosmall ge {/controldelt 1 sweep 2 div cos sub 3 sweep 2 div sin mul div 4 mul def centerv neg controldelt mul centerh controldelt mul endv neg controldelt mul centerh add endh add endh controldelt mul centerv add endv add centerh endh add centerv endv add rcurveto dstroke} {centerh endh add centerv endv add rlineto dstroke}ifelse}ifelse}def /Barray 200 array def % 200 values in a wiggle /D~{mark}def /D~~{counttomark Barray exch 0 exch getinterval astore /Bcontrol exch def pop /Blen Bcontrol length def Blen 4 ge Blen 2 mod 0 eq and {Bcontrol 0 get Bcontrol 1 get abspoint /Ycont exch def /Xcont exch def Bcontrol 0 2 copy get 2 mul put Bcontrol 1 2 copy get 2 mul put Bcontrol Blen 2 sub 2 copy get 2 mul put Bcontrol Blen 1 sub 2 copy get 2 mul put /Ybi /Xbi currentpoint 3 1 roll def def 0 2 Blen 4 sub {/i exch def Bcontrol i get 3 div Bcontrol i 1 add get 3 div Bcontrol i get 3 mul Bcontrol i 2 add get add 6 div Bcontrol i 1 add get 3 mul Bcontrol i 3 add get add 6 div /Xbi Xcont Bcontrol i 2 add get 2 div add def 
/Ybi Ycont Bcontrol i 3 add get 2 div add def /Xcont Xcont Bcontrol i 2 add get add def /Ycont Ycont Bcontrol i 3 add get add def Xbi currentpoint pop sub Ybi currentpoint exch pop sub rcurveto }for dstroke}if}def end /ditstart{$DITroff begin /nfonts 60 def % NFONTS makedev/ditroff dependent! /fonts[nfonts{0}repeat]def /fontnames[nfonts{()}repeat]def /docsave save def }def % character outcalls /oc {/pswid exch def /cc exch def /name exch def /ditwid pswid fontsize mul resolution mul 72000 div def /ditsiz fontsize resolution mul 72 div def ocprocs name known{ocprocs name get exec}{name cb} ifelse}def /fractm [.65 0 0 .6 0 0] def /fraction {/fden exch def /fnum exch def gsave /cf currentfont def cf fractm makefont setfont 0 .3 dm 2 copy neg rmoveto fnum show rmoveto currentfont cf setfont(\244)show setfont fden show grestore ditwid 0 rmoveto} def /oce {grestore ditwid 0 rmoveto}def /dm {ditsiz mul}def /ocprocs 50 dict def ocprocs begin (14){(1)(4)fraction}def (12){(1)(2)fraction}def (34){(3)(4)fraction}def (13){(1)(3)fraction}def (23){(2)(3)fraction}def (18){(1)(8)fraction}def (38){(3)(8)fraction}def (58){(5)(8)fraction}def (78){(7)(8)fraction}def (sr){gsave 0 .06 dm rmoveto(\326)show oce}def (is){gsave 0 .15 dm rmoveto(\362)show oce}def (->){gsave 0 .02 dm rmoveto(\256)show oce}def (<-){gsave 0 .02 dm rmoveto(\254)show oce}def (==){gsave 0 .05 dm rmoveto(\272)show oce}def end % an attempt at a PostScript FONT to implement ditroff special chars % this will enable us to % cache the little buggers % generate faster, more compact PS out of psdit % confuse everyone (including myself)! 50 dict dup begin /FontType 3 def /FontName /DIThacks def /FontMatrix [.001 0 0 .001 0 0] def /FontBBox [-260 -260 900 900] def% a lie but ... 
/Encoding 256 array def 0 1 255{Encoding exch /.notdef put}for Encoding dup 8#040/space put %space dup 8#110/rc put %right ceil dup 8#111/lt put %left top curl dup 8#112/bv put %bold vert dup 8#113/lk put %left mid curl dup 8#114/lb put %left bot curl dup 8#115/rt put %right top curl dup 8#116/rk put %right mid curl dup 8#117/rb put %right bot curl dup 8#120/rf put %right floor dup 8#121/lf put %left floor dup 8#122/lc put %left ceil dup 8#140/sq put %square dup 8#141/bx put %box dup 8#142/ci put %circle dup 8#143/br put %box rule dup 8#144/rn put %root extender dup 8#145/vr put %vertical rule dup 8#146/ob put %outline bullet dup 8#147/bu put %bullet dup 8#150/ru put %rule dup 8#151/ul put %underline pop /DITfd 100 dict def /BuildChar{0 begin /cc exch def /fd exch def /charname fd /Encoding get cc get def /charwid fd /Metrics get charname get def /charproc fd /CharProcs get charname get def charwid 0 fd /FontBBox get aload pop setcachedevice 2 setlinejoin 40 setlinewidth newpath 0 0 moveto gsave charproc grestore end}def /BuildChar load 0 DITfd put %/UniqueID 5 def /CharProcs 50 dict def CharProcs begin /space{}def /.notdef{}def /ru{500 0 rls}def /rn{0 840 moveto 500 0 rls}def /vr{0 800 moveto 0 -770 rls}def /bv{0 800 moveto 0 -1000 rls}def /br{0 750 moveto 0 -1000 rls}def /ul{0 -140 moveto 500 0 rls}def /ob{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath stroke}def /bu{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath fill}def /sq{80 0 rmoveto currentpoint dround newpath moveto 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath stroke}def /bx{80 0 rmoveto currentpoint dround newpath moveto 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath fill}def /ci{500 360 rmoveto currentpoint newpath 333 0 360 arc 50 setlinewidth stroke}def /lt{0 -200 moveto 0 550 rlineto currx 800 2cx s4 add exch s4 a4p stroke}def /lb{0 800 moveto 0 -550 rlineto currx -200 2cx s4 add exch s4 a4p stroke}def /rt{0 -200 moveto 0 550 rlineto currx 800 2cx s4 sub exch 
s4 a4p stroke}def /rb{0 800 moveto 0 -500 rlineto currx -200 2cx s4 sub exch s4 a4p stroke}def /lk{0 800 moveto 0 300 -300 300 s4 arcto pop pop 1000 sub 0 300 4 2 roll s4 a4p 0 -200 lineto stroke}def /rk{0 800 moveto 0 300 s2 300 s4 arcto pop pop 1000 sub 0 300 4 2 roll s4 a4p 0 -200 lineto stroke}def /lf{0 800 moveto 0 -1000 rlineto s4 0 rls}def /rf{0 800 moveto 0 -1000 rlineto s4 neg 0 rls}def /lc{0 -200 moveto 0 1000 rlineto s4 0 rls}def /rc{0 -200 moveto 0 1000 rlineto s4 neg 0 rls}def end /Metrics 50 dict def Metrics begin /.notdef 0 def /space 500 def /ru 500 def /br 0 def /lt 416 def /lb 416 def /rt 416 def /rb 416 def /lk 416 def /rk 416 def /rc 416 def /lc 416 def /rf 416 def /lf 416 def /bv 416 def /ob 350 def /bu 350 def /ci 750 def /bx 750 def /sq 750 def /rn 500 def /ul 500 def /vr 0 def end DITfd begin /s2 500 def /s4 250 def /s3 333 def /a4p{arcto pop pop pop pop}def /2cx{2 copy exch}def /rls{rlineto stroke}def /currx{currentpoint pop}def /dround{transform round exch round exch itransform} def end end /DIThacks exch definefont pop ditstart (psc)xT 576 1 1 xr 1(Times-Roman)xf 1 f 2(Times-Italic)xf 2 f 3(Times-Bold)xf 3 f 4(Times-BoldItalic)xf 4 f 5(Helvetica)xf 5 f 6(Helvetica-Bold)xf 6 f 7(Courier)xf 7 f 8(Courier-Bold)xf 8 f 9(Symbol)xf 9 f 10(DIThacks)xf 10 f 10 s 1 f xi %%EndProlog %%Page: 1 1 10 s 0 xH 0 xS 1 f 12 s 720 688(Network)N 1080(Working)X 1446(Group)X 1715(--)X 1803(Request)X 2137(for)X 2273(Comments:)X 2741(XXXX)X 16 s 1771 944(A)N 1895(M)X 2009(ultipart)X 2413(Content-Type)X 1554 1088(and)N 1771(Content-Encoding)X 2743(M)X 2857(echanism)X 1859 1232(for)N 2041(RFC)X 2314(822)X 2538(M)X 2652(essages)X 2 f 12 s 1850 1488(Nathaniel)N 2254(Borenstein,)X 2719(Bellcore)X 2062 1600(Ned)N 2241(Freed,)X 2519(Innosoft)X 2241 1824(April)N 2463(1991)X 1 f 3 f 14 s 600 2176(Status)N 928(of)X 1050(This)X 1289(M)X 1395(emo)X 1 f 12 s 720 2416(This)N 922(RFC)X 1134(suggests)X 1490(extensions)X 1927(to)X 2033(the)X 2182(RFC)X 2394(822)X 2569(message)X 
2926(representation)X 3503(protocol)X 3855(to)X 3962(allow)X 720 2528(multi-part)N 1148(textual)X 1452(and)X 1632(non-textual)X 2112(messages)X 2516(to)X 2632(be)X 2764(represented)X 3249(and)X 3429(exchanged)X 3882(without)X 720 2640(loss)N 908(of)X 1027(information.)X 1545(Discussion)X 2005(and)X 2183(suggestions)X 2670(for)X 2821(improvements)X 3411(are)X 3569(welcome.)X 4005(This)X 720 2752(memo)N 985(does)X 1185(not)X 1332(specify)X 1634(an)X 1749(Internet)X 2073(standard.)X 2471(Distribution)X 2960(of)X 3064(this)X 3227(memo)X 3492(is)X 3580(unlimited.)X 720 2976(If)N 808(this)X 971(RFC)X 1176(becomes)X 1537(a)X 1604(standard,)X 1978(it)X 2056(would)X 2320(affect)X 2564(the)X 2706(following)X 3104(other)X 3326(RFC's:)X 3 f 720 3200(Would)N 1021(Obsolete:)X 1 f 1459(RFC)X 1664(934,)X 1856(RFC)X 2061(1049,)X 2301(RFC)X 2506(1154)X 3 f 720 3312(Would)N 1021(Update:)X 1 f 1447(RFC)X 1652(822)X 3 f 720 3424(Would)N 1021(Affect:)X 1 f 1400(RFC)X 1605(1148)X 3 f 14 s 600 3664(Table)N 901(of)X 1023(Contents)X 1 f 12 s 1008 3904(Introduction)N 1008 4016(The)N 1182(Content-Type)X 1741(Header)X 2043(Field)X 1008 4128(The)N 1182(Content-Encoding)X 1912(Header)X 2214(Field)X 1008 4240(Quoted-Printable)N 1695(Content-Encoding)X 1008 4352(Quoted-Printable)N 1695(Content-Encoding)X 1008 4464(Base64)N 1315(Content-Encoding)X 1008 4576(The)N 1182("Multipart")X 1648(Content-Type)X 1008 4688(A)N 1101(Complex)X 1478(Multipart)X 1866(Example)X 1008 4800(The)N 1182(Encoded-Variable)X 1907(Header)X 2209(Field)X 1008 4912(Cross-References)N 1710(Between)X 2071(Encapsulated)X 2609(Parts)X 1008 5024(Optional)N 1369(Content-Size)X 1896(Header)X 2198(Field)X 1008 5136(Summary)N 1008 5248(Acknowledgements)N 1008 5360(References)N 1008 5472(Appendix)N 1411(A:)X 1555(The)X 1729(Character)X 2128(Set)X 2275(for)X 2411(the)X 2553(MAILASCII)X 3072(Content-Type)X 2 p %%Page: 2 2 12 s 0 xH 0 xS 1 f 2368 400(-)N 2424(2)X 2496(-)X 3 f 14 s 720 704(1)N 1104(Introduction)X 1 f 12 s 720 944(One)N 
909(of)X 1018(the)X 1165(limitations)X 1607(of)X 1716(RFC)X 1927(821/822)X 2272(based)X 2521(mail)X 2723(systems)X 3057(is)X 3151(the)X 3299(fact)X 3474(that)X 3649(they)X 3845(limit)X 4058(the)X 720 1056(contents)N 1079(of)X 1197(electronic)X 1616(mail)X 1825(messages)X 2225(to)X 2337(relatively)X 2739(short)X 2968(lines)X 3187(of)X 3304(seven-bit)X 3694(ASCII.)X 4005(This)X 720 1168(forces)N 986(a)X 1060(user)X 1251(to)X 1357(convert)X 1677(any)X 1847(non-textual)X 2317(data)X 2509(that)X 2685(she)X 2844(may)X 3041(wish)X 3253(to)X 3359(send)X 3566(into)X 3748(a)X 3823(seven-bit)X 720 1280(ASCII)N 1033(representation)X 1641(before)X 1949(invoking)X 2353(her)X 2538(local)X 2788(mail)X 3022(UA)X 3222(\(User)X 3497(Agent)X 3794(program\).)X 720 1392(Examples)N 1162(of)X 1304(encodings)X 1756(currently)X 2166(used)X 2404(in)X 2541(the)X 2721(Internet)X 3083(include)X 3430(pure)X 3664(hexadecimal,)X 720 1504(uuencode,)N 1166(the)X 1337(3-in-4)X 1625(base)X 1849(64)X 1998(scheme)X 2339(speci\256ed)X 2733(in)X 2860(RFC)X 3093(1113,)X 3361(the)X 3531(Andrew)X 3892(Toolkit)X 720 1616(Representation)N 1322([REF-ATK],)X 1839(and)X 2002(many)X 2240(others.)X 720 1840(This)N 925(limitation)X 1335(becomes)X 1706(even)X 1922(more)X 2154(apparent)X 2520(as)X 2634(gateways)X 3026(are)X 3178(designed)X 3554(to)X 3663(allow)X 3911(for)X 4058(the)X 720 1952(exchange)N 1109(of)X 1214(mail)X 1411(messages)X 1798(between)X 2143(RFC)X 2348(822)X 2516(hosts)X 2737(and)X 2900(X.400)X 3161(hosts.)X 3430(X.400)X 3691([REF-X400])X 720 2064(speci\256es)N 1084(mechanisms)X 1593(for)X 1738(the)X 1889(inclusion)X 2275(of)X 2389(non-textual)X 2862(body)X 3088(parts)X 3309(within)X 3589(electronic)X 4004(mail)X 720 2176(messages.)N 1182(The)X 1383(current)X 1707(standards)X 2121(for)X 2284(the)X 2453(mapping)X 2841(of)X 2972(X.400)X 3260(messages)X 3674(to)X 3800(RFC)X 4032(822)X 720 2288(messages)N 1119(specify)X 1433(that)X 1614(either)X 1870(X.400)X 2143(non-textual)X 2618(body)X 2847(parts)X 
3071(should)X 3364(be)X 3492(converted)X 3909(to)X 4021(\(not)X 720 2400(encoded)N 1076(in\))X 1218(an)X 1344(ASCII)X 1628(format,)X 1943(or)X 2057(that)X 2236(they)X 2436(should)X 2726(be)X 2851(discarded,)X 3278(notifying)X 3665(the)X 3817(RFC)X 4032(822)X 720 2512(user)N 909(that)X 1083(discarding)X 1514(has)X 1672(occurred.)X 2087(This)X 2288(is)X 2382(clearly)X 2675(undesirable,)X 3173(as)X 3283(information)X 3768(that)X 3943(a)X 4016(user)X 720 2624(may)N 923(wish)X 1141(to)X 1253(receive)X 1568(is)X 1668(lost.)X 1891(Even)X 2125(though)X 2428(a)X 2507(user's)X 2772(UA)X 2946(may)X 3148(not)X 3307(have)X 3525(the)X 3679(capability)X 4096(of)X 720 2736(dealing)N 1032(with)X 1231(the)X 1377(non-textual)X 1844(body)X 2064(part,)X 2266(the)X 2412(user)X 2600(might)X 2853(have)X 3063(some)X 3294(mechanism)X 3761(external)X 4101(to)X 720 2848(the)N 875(UA)X 1050(that)X 1232(can)X 1403(extract)X 1703(useful)X 1975(information)X 2467(from)X 2691(the)X 2846(body)X 3075(part.)X 3310(Moreover,)X 3750(it)X 3841(does)X 4053(not)X 720 2960(allow)N 973(for)X 1124(the)X 1281(fact)X 1465(that)X 1649(the)X 1806(message)X 2171(may)X 2376(eventually)X 2817(be)X 2947(gatewayed)X 3398(back)X 3619(into)X 3808(an)X 3939(X.400)X 720 3072(MHS,)N 975(where)X 1234(the)X 1376(non-textual)X 1839(information)X 2318(would)X 2582(de\256nitely)X 2970(become)X 3294(useful)X 3553(again.)X 720 3296(In)N 828(devising)X 1182(an)X 1301(encapsulation)X 1859(scheme,)X 2200(two)X 2372(things)X 2635(must)X 2850(be)X 2969(considered:)X 3442(how)X 3636(to)X 3740(convert)X 4058(the)X 720 3408(non-textual)N 1197(data)X 1396(to)X 1509(a)X 1590(representation)X 2174(which)X 2446(may)X 2649(be)X 2777(transmitted)X 3248(over)X 3456(a)X 3536(seven-bit)X 3926(SMTP)X 720 3520(connection)N 1175(without)X 1501(loss)X 1682(of)X 1794(data,)X 2011(and)X 2182(how)X 2379(to)X 2486(preserve)X 2844(information)X 3331(about)X 3577(the)X 3727(structure)X 4096(of)X 720 3632(the)N 862(data)X 1047(itself.)X 1312(This)X 
1507("structural")X 1973(information)X 2452(must)X 2663(include,)X 2995(at)X 3089(a)X 3156(minimum,)X 3579(the)X 3721(type)X 3911(of)X 4015(data)X 720 3744(involved.)N 1120(This)X 1330(type)X 1535(information)X 2029(may)X 2234(be)X 2364(something)X 2805(recognized)X 3268(by)X 3404(many)X 3658(systems)X 4002(or)X 4122(it)X 720 3856(may)N 910(be)X 1025(some)X 1252(type)X 1442(of)X 1546(data)X 1731(speci\256c)X 2049(to)X 2148(a)X 2215(single)X 2469(operating)X 2857(system.)X 720 4080(This)N 936(memo)X 1222(describes)X 1625(several)X 1943(mechanisms)X 2464(that)X 2654(combine)X 3031(to)X 3151(solve)X 3400(these)X 3644(problems.)X 4096(In)X 720 4192(particular,)N 1139(it)X 1218(describes)X 1601(an)X 1717(encapsulation)X 2272(mechanism)X 2736(that)X 2905(may)X 3095(be)X 3210(used)X 3410(to)X 3509(describe)X 3854(multiple)X 720 4304(part)N 894(\("multipart"\))X 1414(messages.)X 1825(The)X 1999(parts)X 2210(themselves)X 2663(may)X 2854(contain)X 3163(textual)X 3451(or)X 3556(nontextual)X 3988(data;)X 720 4416(non-textual)N 1219(data)X 1440(is)X 1564(encoded)X 1945(in)X 2080(a)X 2183(form)X 2430(that)X 2635(can)X 2828(survive)X 3170(mailers)X 3513(unaware)X 3898(of)X 4037(this)X 720 4528(speci\256cation.)N 1263(This)X 1466(memo)X 1739(also)X 1926(de\256nes)X 2230(two)X 2406(RFC)X 2619(822)X 2795(header)X 3084(\256elds)X 3324(to)X 3431(be)X 3554(used)X 3762(to)X 3870(indicate)X 720 4640(the)N 871(inclusion)X 1257(of)X 1370(non-textual)X 1842(information)X 2330(in)X 2437(a)X 2512(mail)X 2716(message:)X 3101(Content-Type)X 3668(and)X 3839(Content-)X 720 4752(Encoding.)N 1168(Additionally,)X 1734(this)X 1928(memo)X 2224(proposes)X 2620(an)X 2767(Encoded-Variable)X 3524(header)X 3837(\256eld)X 4064(for)X 720 4864(including)N 1124(non-textual)X 1603(or)X 1723(international)X 2251(text)X 2436(information)X 2931(in)X 3046(certain)X 3348(parts)X 3574(of)X 3693(the)X 3850(message)X 720 4976(header)N 1015(area.)X 1262(Finally,)X 1597(this)X 1774(memo)X 2053(de\256nes)X 2363(an)X 
2492(optional)X 2846(header)X 3141(\256eld,)X 3375(Content-Size,)X 3941(which)X 720 5088(may)N 910(be)X 1025(used)X 1225(within)X 1495(multipart)X 1873(messages.)X 3 p %%Page: 3 3 12 s 0 xH 0 xS 1 f 2368 400(-)N 2424(3)X 2496(-)X 3 f 14 s 720 704(2)N 1104(The)X 1318(Content-Type)X 2013(Header)X 2396(Field)X 1 f 12 s 720 944(The)N 905(Content-Type)X 1475(header)X 1767(\256eld)X 1974(was)X 2159(previously)X 2601(de\256ned)X 2920(in)X 3031(RFC)X 3248(1049,)X 3500(and)X 3675(is)X 3775(reaf\256rmed)X 720 1056(here.)N 967(The)X 1150(remainder)X 1574(of)X 1687(this)X 1859(section)X 2165(is)X 2261(derived)X 2582(from)X 2801(RFC)X 3014(1049,)X 3262(and,)X 3457(where)X 3724(different,)X 4112(is)X 720 1168(intended)N 1076(to)X 1175(supersede)X 1578(it.)X 720 1392(The)N 895(Content-type:)X 1474(header)X 1756(\256eld)X 1952(consists)X 2282(of)X 2388(up)X 2510(to)X 2611(four)X 2797(parameter)X 3209(values.)X 3529(The)X 3705(\256rst,)X 3904(or)X 4010(type)X 720 1504(parameter)N 1165(names)X 1470(the)X 1647(type,)X 1896(format,)X 2236(or)X 2375(structuring)X 2851(technique;)X 3311(the)X 3487(second,)X 3836(optional,)X 720 1616(parameter)N 1135(is)X 1228(a)X 1300(version)X 1612(number,)X 1959(ver-num,)X 2338(which)X 2602(indicates)X 2974(a)X 3046(particular)X 3445(version)X 3757(or)X 3866(revision)X 720 1728(of)N 841(the)X 1000(standardized)X 1528(format.)X 1873(The)X 2063(third)X 2285(parameter)X 2711(is)X 2815(a)X 2898(resource)X 3264(reference,)X 3687(resource-ref,)X 720 1840(which)N 995(may)X 1201(indicate)X 1547(a)X 1630(standard)X 1996(database)X 2369(of)X 2490(information)X 2986(to)X 3102(be)X 3234(used)X 3451(in)X 3567(interpreting)X 4058(the)X 720 1952(information.)N 1247(The)X 1421(last)X 1579(parameter)X 1989(is)X 2077(a)X 2144(comment.)X 720 2176(In)N 824(the)X 966(Extended)X 1354(BNF)X 1564(notation)X 1904(of)X 2008(RFC-822,)X 2413(we)X 2549(have:)X 7 f 10 s 720 2384(Content-Type:=)N 1440(type)X 1680([";")X 1920(ver-num)X 2304([";")X 2544(1#resource-ref]])X 1488 
2480([comment])N 720 2672(ver-num:=)N 1440(local-part)X 720 2864(resource-ref:=)N 1488(local-part)X 720 3056(type)N 1056(:=)X 1200("POSTSCRIPT")X 1824(/)X 1200 3152("SCRIBE")N 1632(/)X 1200 3248("SGML")N 1536(/)X 1200 3344("TeX")N 1488(/)X 1200 3440("TROFF")N 1584(/)X 1200 3536("DVI")N 1488(/)X 1200 3632("ODA")N 1488(/)X 1200 3728("MULTIPART")N 1776(/)X 1200 3824("MAILASCII")N 1776(/)X 1200 3920(iso-charset-type)N 2016(/)X 1200 4016("U-LAW")N 1584(/)X 1200 4112("A-LAW")N 1584(/)X 1200 4208("PBM")N 1488(/)X 1200 4304("PGM")N 1488(/)X 1200 4400("PPM")N 1488(/)X 1200 4496("DES-MESSAGE")N 1872(/)X 1200 4592(x400-type)N 1680(/)X 1200 4688(x400-1984-type)N 1920(/)X 1200 4784(x400-1988-type)N 1920(/)X 1200 4880("X-"atom)N 720 5072(iso-charset-type)N 1536(:=)X 1680("ISO-IR-")X 2160(1*DIGIT)X 720 5264(x400-type)N 1200(:=)X 1344("IA5-Text")X 1872(/)X 2496(;)X 2592([0])X 2784(IA5Text,)X 3216(IA5TextBodyPart)X 1200 5360("Voice")N 1584(/)X 2352(;)X 2448([2])X 2640(Voice,)X 2976(VoiceBodyPart)X 1200 5456("G3-Fax")N 1632(/)X 2352(;)X 2448([3])X 2640(G3Fax,)X 2976(G3FacsimileBodyPart)X 1200 5552("Teletex")N 1680(/)X 2352(;)X 2448([5])X 2640(TTX,)X 2880(TeletexBodyPart)X 4 p %%Page: 4 4 10 s 0 xH 0 xS 7 f 1 f 12 s 2368 384(-)N 2424(4)X 2496(-)X 7 f 10 s 1200 672("Videotex")N 1728(/)X 2352(;)X 2448([6])X 2640(Videotex,)X 3120(VideotexBodyPart)X 1200 768("Nationally-Defined")N 2208(/)X 2352(;)X 2448([7])X 2640(NationallyDefined,)X 2352 864(;)N 2640(NationallyDefinedBodyPart)X 1200 960("Encrypted")N 1776(/)X 2352(;)X 2448([8])X 2640(Encrypted,)X 3168(EncryptedBodypart)X 1200 1056("Message")N 2352(;)X 2448([9])X 2640(ForwardedIPMessageMessage,)X 2352 1152(;)N 2640(MessageBodyPart)X 720 1344(x400-1984-type)N 1440(:=)X 1584("Telex")X 1968(/)X 2352(;)X 2448([1])X 2640(TLX)X 1584 1440("TIF0")N 1920(/)X 2352(;)X 2448([4])X 2640(TIF0)X 1584 1536("SFD")N 1872(/)X 2352(;)X 2448([10])X 2688(SFD)X 1584 1632("TIF1")N 2352(;)X 2448([11])X 2688(TIF1)X 720 1824(x400-1988-type)N 1440(:=)X 
1584("G4-Class1")X 2160(/)X 2736(;)X 2832([4])X 3024(G4Class1BodyPart)X 1584 1920("Mixed-Mode")N 2208(/)X 2736(;)X 2832([11])X 3072(MixedMode)X 1584 2016("Bilaterally-Defined")N 2640(/)X 2736(;)X 2832([14])X 3072(BilaterallyDefined)X 1584 2112("Externally-Defined")N 2736(;)X 2832([15])X 3072(ExternallyDefined)X 1 f 12 s 720 2416(These)N 974(values)X 1244(are)X 1386(not)X 1533(case)X 1723(sensitive.)X 2132(POSTSCRIPT,)X 2739(Postscript,)X 3166(and)X 720 2528(POStscriPT)N 1230(are)X 1405(all)X 1559(equivalent.)X 2066(Additional)X 2535("standard")X 2996(Content-type)X 3557(values)X 3861(may)X 4085(be)X 720 2640(registered)N 1137(with)X 1345(Internet)X 1682(Assigned)X 2076(Numbers)X 2464(Coordinator)X 2965(at)X 3071(USC-ISI.)X 3490(Those)X 3761(wishing)X 4101(to)X 720 2752(register)N 1033(such)X 1233(values)X 1503(should)X 1783(contact:)X 2086 2976(Joyce)N 2329(K.)X 2446(Reynolds)X 1762 3088(USC)N 1972(Information)X 2456(Sciences)X 2817(Institute)X 2039 3200(4676)N 2255(Admiralty)X 2675(Way)X 1802 3312(Marina)N 2104(del)X 2246(Rey,)X 2449(CA)X 2630(90292-6695)X 1680 3536(213-822-1511)N 2320(JKReynolds@ISI.EDU)X 720 3760(The)N 894(speci\256c)X 1212(prede\256ned)X 1642("type")X 1910(\256elds)X 2142(are)X 2284(explained)X 2683(below:)X 3 f 720 3984("X-"atom)N 1 f 1162(--)X 1254(Any)X 1447(type)X 1641(value)X 1878(beginning)X 2291(with)X 2490(the)X 2636(characters)X 3055("X-")X 3262(is)X 3354(a)X 3425(private)X 3721(value,)X 3982(to)X 4085(be)X 720 4096(used)N 930(by)X 1060(consenting)X 1510(mail)X 1715(systems)X 2052(by)X 2181(mutual)X 2482(agreement.)X 2965(Any)X 3163(format)X 3453(without)X 3780(a)X 3856(rigorous)X 720 4208(and)N 883(public)X 1148(de\256nition)X 1541(should)X 1821(be)X 1936(named)X 2217(with)X 2412(an)X 2527("X-")X 2730(pre\256x.)X 3 f 720 4432(POSTSCRIPT)N 1 f 1371(--)X 1484(Indicates)X 1881(the)X 2049(enclosed)X 2436(document)X 2866(consists)X 3220(of)X 3350(information)X 3855(encoded)X 720 4544(using)N 957(the)X 1104(Postscript)X 1512(Page)X 
Definition Language developed by Adobe Systems, Inc. [REF-PS]. For type
"postscript" the valid ver-num fields are "1.0", "2.0", and "null", and
the valid resource-ref fields include, but are not limited to,
"laserprep2.9", "laserprep3.0", "laserprep3.1", and "laserprep4.0".

SCRIBE -- Indicates the document contains embedded formatting
information according to the syntax used by the Scribe document
formatting language distributed by the Unilogic Corporation
[REF-SCRIBE]. For type "scribe" the valid ver-num fields are "null",
"3", "4", "5", etc.

SGML -- Indicates the document contains structuring information
according to the rules specified for the Standard Generalized Markup
Language, IS 8879, as published by the International Organization for
Standardization [REF-SGML]. Documents structured according to the ISO
DIS 8613--Office Document Architecture and Interchange Format--may also
be encoded using SGML syntax. For type "sgml" the
valid ver-num fields are "IS.8879.1986" and "null".

TeX -- Indicates the document contains embedded formatting information
according to the syntax of the TeX document production language
[REF-TEX].

TROFF -- Indicates the document contains embedded formatting
information according to the syntax specified for the TROFF formatting
package developed by AT&T Bell Laboratories [REF-TROFF]. For type
"troff" the valid resource-ref fields include, but are not limited to,
"eqn", "tbl", "me", and the names of other troff macro packages.

ODA -- Indicates that the body is an ODA document, containing formatted
information encoded according to the Office Document Architecture
[REF-ODA]. If needed, a document application profile is to be included
as part of the message body.

DVI -- Indicates the document contains information according to the
device independent file format produced by TROFF or TeX.

MULTIPART -- Indicates the document
contains multiple encapsulated messages, each of which may be of a
different content-type. The precise syntax of a "multipart" message is
defined later in this RFC, as are the possible values for its ver-num
and resource-ref fields.

U-LAW or A-LAW -- Indicates that the document contains audio data in
U-law [REF-ULAW] or A-law [REF-ALAW], respectively. U-law and A-law are
the American and European audio telephony standards. If one of these
content-types is used, the ver-num field can be used to give a sampling
rate in Hertz, optionally followed by the letters "HZ". Although audio
header formats are not yet standardized, the resource-ref field can be
used to specify an audio header format. Thus an appropriate
content-type header for audio might be something like "Content-type:
u-law; 8000 HZ; X-Next".

PBM or PGM or PPM -- Indicates the document contains image data encoded
in the
Portable Bitmap format [REF-PBM] for black and white, grey scale, or
color images.

DES-MESSAGE -- Indicates that the body is an encapsulated message
encrypted with DES encryption [REF-DES]. An encrypted message is
specified, rather than simply encrypted text, because this permits the
encrypted object to contain a Content-type header and thus to contain
encrypted data of any type. If all that is desired is encrypted text,
the header area of the encapsulated message can be blank (i.e. once
decrypted, it begins with CRLF).

ISO-CHARSET-TYPE -- Indicates the document contains text in an ISO
standard character set, identified by its International Registration
number. Each ISO character set defines a new standard mail content
type, given by the string "ISO-IR-" followed by the numeric value of
the character set. Thus, for example, a content-type of "ISO-IR-6"
specifies a character set that is
extremely similar, and perhaps identical, to MAILASCII. However, it
should be noted that even when the Content-type is an ISO-IR- character
set type, certain control characters will always be construed according
to the guidelines of RFC 821 and RFC 822. In particular, character
positions 13, 10, and 32 will always be interpreted as CR, LF, and
SPACE, respectively.

X400-TYPE -- Indicates the document contains an ASN.1 representation of
an X.400 bodypart. The type field may be either "1984", indicating that
the representation is defined in [REF-CCITT84c], or "1988", indicating
that the encoding is defined in [REF-CCITT/ISO88b].

X400-1984-TYPE -- Indicates that the document contains an ASN.1
representation of an X.400 bodypart specific to the 1984 version of the
standard [REF-CCITT84c]. The type field must be "1984" if specified.

X400-1988-TYPE -- Indicates that the document contains an ASN.1
representation of
an X.400 bodypart specific to the 1988 version of the standard
[REF-CCITT/ISO88b]. The type field must be "1988" if specified.

MAILASCII -- Indicates the document contains only unencoded 7 bit US
ASCII text, the default content-type for RFC 822 mail. This
content-type has been the subject of some confusion and ambiguity in
the past. Its definition is spelled out in Appendix A.

If no Content-type header field is present, "MAILASCII" is assumed.
That is, the name "MAILASCII" is intended to refer to the default
message body type as defined by RFC 822.

It should be noted that the list of Content-type values given above is
expected to be augmented in time, and that such additions will be
registered at the address given above. We have simply attempted, in
this RFC, to give as many standard Content-type definitions as possible
given the current state of our knowledge. The Content-type
values defined above are a superset of the values defined by RFC 1049.

Those wishing to transmit FAX by Internet mail should note that G3-FAX
is one of the Content-types defined for X.400 support. It is thus
appropriate to use "Content-type: G3-FAX" for such data.

3  The Content-Encoding Header Field

Many content-types are represented, in their natural format, as 8-bit
or binary data. Such data cannot be transmitted over existing Internet
mail mechanisms because both RFC 821 and RFC 822 restrict mail messages
to 7 bit data with reasonably short lines. It is necessary, therefore,
to define a standard mechanism for encoding such data in an acceptable
manner.

This RFC specifies that this encoding will be done by a new
"Content-Encoding" header field. The Content-Encoding field is used to
indicate the type of transformation that has been used to represent
the message body in an acceptable manner. Unlike Content-types, which
are expected to proliferate, it is expected that there will never be
more than a few different Content-Encoding values, both because there
is less need for variation and because the effect of variation in
Content-Encoding would be more problematic.

However, establishing only a single Content-Encoding mechanism does not
seem possible. In particular, there is a tradeoff between the desire
for a compact and efficient encoding of binary data and the desire for
a readable encoding of data that is mostly, but not entirely, MAILASCII
text. For this reason, at least two encoding mechanisms are necessary,
a "readable" encoding and a "dense" encoding. This RFC also specifies a
third encoding which is neither readable nor dense, but is the simplest
to encode and decode. A fourth encoding, for compressed ("super-dense")
data, might reasonably be defined
at a later date.

The Content-Encoding field is designed to specify a two-way mapping
between the "native" representation of a type of data and a
representation that can be readily exchanged using 7 bit mail transport
protocols as defined by RFC 821 (SMTP). This field has not been defined
by any previous RFC. The field's value is a single atom specifying the
type of encoding, as enumerated below. Formally:

Content-Encoding:=      "BASE64"/
                        "HEXADECIMAL"/
                        "QUOTED-PRINTABLE"/
                        "8BIT"/"BINARY"/
                        "7BIT"/"X-"atom

These values are not case sensitive. That is, Hexadecimal and
HEXADECIMAL and heXadeCimAl are all equivalent. An encoding type of
7BIT implies that the message is already in a seven-bit ASCII
representation. This value is assumed if the Content-Encoding header
field is not present. If the message is stored or transported via a
mechanism that permits 8-bit data, a Content-Encoding of "8bit"
should nonetheless be used. If the message is stored or transported via
a mechanism that permits arbitrary binary data, a Content-Encoding of
"binary" should nonetheless be used. (DISCUSSION: The distinction
between the Content-Encoding values of "binary", "8bit", and "7bit" may
seem unimportant in an 8-bit binary environment, but clear labeling
will be of enormous value to gateways between 8-bit and 7-bit systems.
The difference between "8bit" and "binary" is that "8bit" implies
adherence to SMTP limits on line length and CR/LF semantics, whereas
"binary" does not.)

Implementors may define new content encoding values, but should prefix
them with "x-" to indicate their non-standard status, e.g.
"Content-Encoding: x-my-new-encoding". However, unlike Content-types,
the creation of new Content-Encoding values is explicitly discouraged,
as it seems likely to hinder inter-operability with little potential
benefit.

If a
Content-Encoding header field appears as part of a message header, it
applies to the entire message body, whether or not that body is of type
"multipart". If it is of type multipart, the encoding applies
recursively to all of the encapsulated parts, including their
encapsulated headers. If a Content-Encoding header field appears as
part of an encapsulation's headers, it applies only to the body of the
encapsulated part. If the encapsulated part is itself of type
"multipart", the encoding applies recursively to all of the
encapsulated parts within that encapsulated part.

The following sections will define the standard encoding mechanisms.

3.1  Quoted-Printable Content-Encoding

The Quoted-Printable encoding is intended to represent data that is
largely, but not entirely, 7 bit ASCII. Printable ASCII portions of
body parts encoded in this way should be recognizable by humans, if
necessary, without translation.

In this encoding, ASCII
characters 9 (tab), 10 (nl), 13 (cr), 32 through 37, inclusive, 39
through 91, and 93 through 127, inclusive, are unchanged. All other
characters, including characters 38 and 92, are to be represented using
one of the following quotation styles or special cases:

    Style #1: Any 8 bit value may be represented as a "\" followed by a
    two digit hexadecimal representation of the character's ASCII
    value. Thus, for example, character 12 (control-L, or formfeed) can
    be represented by "\0C", the ampersand character (38) can be
    represented by "\26", and the backslash character (92) itself can
    be represented by "\5C".

    Style #2: An 8 bit value from 160 through 255 may, alternatively,
    be represented by an ampersand character followed by the character
    obtained by the removal of the high order bit, i.e. by subtracting
    128 from the value. Thus the 8 bit value 193 may be represented as
    "&A".
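
    As an illustration only, the two quotation styles just described
    can be sketched in Python. This is a minimal sketch and not part of
    the specification: the function names (is_literal, encode_style1,
    encode_style2, decode) are hypothetical, and the special cases and
    line-length recommendations of this section are deliberately left
    out.

```python
def is_literal(code: int) -> bool:
    """True if this character code is passed through unchanged."""
    return (code in (9, 10, 13)
            or 32 <= code <= 37
            or 39 <= code <= 91
            or 93 <= code <= 127)


def encode_style1(data: bytes) -> str:
    """Style #1: quote each non-literal byte as a backslash plus two hex digits."""
    out = []
    for b in data:
        out.append(chr(b) if is_literal(b) else "\\%02X" % b)
    return "".join(out)


def encode_style2(data: bytes) -> str:
    """Style #2 for bytes 160..255 ('&' plus byte - 128); style #1 otherwise."""
    out = []
    for b in data:
        if is_literal(b):
            out.append(chr(b))
        elif 160 <= b <= 255:
            out.append("&" + chr(b - 128))
        else:
            out.append("\\%02X" % b)
    return "".join(out)


def decode(text: str) -> bytes:
    """Undo both styles; assumes well-formed input without special cases."""
    out = bytearray()
    i = 0
    while i < len(text):
        c = text[i]
        if c == "\\":
            # Backslash is followed by exactly two hexadecimal digits.
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        elif c == "&":
            # Ampersand is followed by one character; restore the high bit.
            out.append(ord(text[i + 1]) + 128)
            i += 2
        else:
            out.append(ord(c))
            i += 1
    return bytes(out)
```

    Under these assumptions, encode_style1(b"\x0c") yields the "\0C" of
    the example above, and encode_style2(bytes([193])) yields "&A".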

    Note that these two styles may be freely intermixed. Style #1 is
    preferred for characters 128 through 159, because style #2 might
    include control characters (e.g. TAB) that are altered by some MTAs
    (see NOTES TO IMPLEMENTERS, below). Style #2 is provided for
    improved readability of some 8-bit character sets in which turning
    on the 8th bit produces a character similar to the corresponding 7
    bit character, e.g. the 8th bit simply adds an umlaut. In such
    cases, style #2 is somewhat more readable, but should be used
    carefully, as explained in the NOTES TO IMPLEMENTERS.

    Additionally, there are two special cases that may be represented
    otherwise:

    Special case #1: The literal ampersand and backslash characters may
    themselves be quoted by backslashes. Thus, the backslash may be
    represented as "\\" and the ampersand as "\&". Note that this is
    not ambiguous with regard
    to the first clause, because neither "\" nor "&" is part of the
    hexadecimal alphabet.

    Special case #2: A backslash at the end of a line may be used to
    indicate a non-significant line break. That is, if one needs to
    include a long line without line breaks, but is concerned that MTAs
    will break the line into multiple lines, a message encoded with the
    quoted-printable encoding may include "soft" line breaks by
    preceding the line break with a backslash. Thus if the "raw" form
    of the line is a single line that says:

    Now's the time for all men to come to the aid of their country.
    Now's the time for all men to come to the aid of their country.
    Now's the time for all men to come to the aid of their country.

    This could be represented, in the quoted-printable encoding, as

    Now's the time for all men to come to the aid of their country. \
    Now's the time for all men to come to the aid of their country. \
    Now's the time for all men to come to the aid of their country.

    This provides a mechanism with which long lines can be encoded in
    such a way as to be restored by the user agent.

NOTES TO IMPLEMENTERS of encoding agents: for maximum portability
across MTAs, it is recommended that any long lines be represented using
"soft" line breaks which are inserted before any line reaches the 80th
character. It is also recommended that trailing white space (white
space at the end of a line) not be relied upon, as some MTAs freely
delete such trailing white space. (Such a line may be represented, if
necessary, using the above rules, by appending a backslash to the end
of the line, and following it with a blank line.) It is also
recommended that
the persistence of character codes less than 32 should not be relied
on, particularly the TAB, CR, and LF characters. Where such characters
would be required for representation in style #2, it is recommended
that style #1 be used.

NOTE ABOUT CR AND LF in encoded messages: The use of CR or LF
characters that are not part of a CR/LF sequence is NOT PERMITTED in
messages that use the Quoted-Printable encoding. (Their presence is not
an issue for the other encodings.) Sequences such as CR LF LF are also
invalid; the correct sequence is CR LF CR LF. The effect in an encoded
message of a CR without a following LF, or an LF without a preceding
CR, is undefined. Although RFC-822 defines these as ordinary characters
when used outside of the CR/LF sequence, some implementations treat one
(or both) as equivalent to newline or as error characters that are
discarded.
Messages which contain embedded bare CR or LF characters should use
encoding style #1 to encode these characters "safely". (Discussion:
Some environments use a bare CR or bare LF as the local newline
convention. If a message contains embedded bare CR or LF characters, it
is impossible to transform it from Internet to local conventions
without interfering with this local convention.)

Since the hyphen character ("-") is represented as itself in the
Quoted-Printable encoding, care must be taken, when encapsulating a
quoted-printable encoded message in a multipart message, to ensure that
the encapsulation boundary does not appear anywhere in the message. See
the definition of multipart messages, later in this document.

3.2  Hexadecimal Content-Encoding

The Hexadecimal Content-Encoding is intended to represent arbitrary
data that is not humanly-readable in a printable 7-bit form that can be
passed through
7-bit mail transport agents. It transforms a byte stream into a series
of two-digit hexadecimal values. Thus, the sequence of the five 8-bit
values "ABC control-L newline" would be represented by "4142430C0A".
Since newlines are themselves encoded as 0A, non-data newlines may be
scattered freely to break the stream into multiple lines. In fact, it
is recommended that newlines be included at least every 60 characters
(that is, every 30 encoded bytes). Such newlines will be discarded by
the decoder.

The hexadecimal encoding is a simple way to represent arbitrary 8-bit
data in 7-bit mail, but not a very efficient one, as it doubles the
size of the data. The Base64 encoding, to be described below, is a
reasonably simple alternative that only increases the size of the data
by 33 percent. The hexadecimal encoding is permitted explicitly
because there are widespread utilities for converting binary files to
hexadecimal.

Since the hyphen character ("-") is not used in hexadecimal encodings,
there is no need to worry about quoting apparent encapsulation
boundaries within hexadecimal-encoded body parts.

When encoding a bit stream via the hexadecimal encoding, the bit
stream should be presumed to be ordered with the most significant bit
first. That is, the first bit in the stream will be the high-order bit
in the first byte, and the eighth bit will be the low-order bit in the
first byte, and so on.

The Hexadecimal alphabet is defined as "0123456789ABCDEF". Upper case
letters A-F should be used by encoders, though it is acceptable if a
decoder ignores case.

3.3  Base64 Content-Encoding

The Base64 Content-Encoding is designed to represent arbitrary 8-bit
data in a form that is not humanly readable. The encoding and decoding
algorithms are simple, and the encoded data is only about 33 percent
larger than the unencoded data. This encoding is also used in Privacy
Enhanced Mail applications; it is described in RFC 1113. The ability
in RFC 1113 to embed clear text within such an encoding is not allowed
in this context, however. The following description of the encoding is
adapted from RFC 1113; apart from the exclusion of the "*" mechanism
for embedded clear text, there are no significant technical changes.

A 64-character subset of International Alphabet IA5 is used, enabling
6 bits to be represented per printable character. (The proposed subset
of characters is represented identically in IA5 and ASCII.) One
additional character, "=", is used to signify special processing
functions. The character "=" is used for padding within the printable
encoding procedure. The encoding function's output is delimited into
text lines (using local conventions), with each line except the last
containing exactly 64 printable characters and the final line
containing 64 or fewer printable characters. (This line length is
easily printable and is guaranteed to satisfy SMTP's 1000 character
transmitted line length limit.)

The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters. Proceeding from left to right, a
24-bit input group is formed by concatenating three 8-bit input
groups; this is then treated as four concatenated 6-bit groups. When
encoding a bit stream via the base64 encoding, the bit stream should
be presumed to be ordered with the most significant bit first. That
is, the first bit in the stream will be the high-order bit in the
first byte, and the eighth bit will be the low-order bit in the first
byte, and so on.

Each 6-bit group is used as an index into an array of 64 printable
characters. The character referenced by
the index is placed in the output string. These characters, identified
in Table 1 below, are selected so as to be universally representable,
and the set excludes characters with particular significance to SMTP
(e.g., ".", "<CR>", "<LF>").

                               Table 1

    Value Encoding  Value Encoding  Value Encoding  Value Encoding
        0 A            17 R            34 i            51 z
        1 B            18 S            35 j            52 0
        2 C            19 T            36 k            53 1
        3 D            20 U            37 l            54 2
        4 E            21 V            38 m            55 3
        5 F            22 W            39 n            56 4
        6 G            23 X            40 o            57 5
        7 H            24 Y            41 p            58 6
        8 I            25 Z            42 q            59 7
        9 J            26 a            43 r            60 8
       10 K            27 b            44 s            61 9
       11 L            28 c            45 t            62 +
       12 M            29 d            46 u            63 /
       13 N            30 e            47 v
       14 O            31 f            48 w         (pad) =
       15 P            32 g            49 x
       16 Q            33 h            50 y
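
The grouping and table lookup described above can be sketched in
Python as follows. This is an illustration under stated assumptions,
not part of the draft: the names base64_encode, ALPHABET, PAD and
LINE_LENGTH are hypothetical, output lines are joined with "\n" as a
stand-in for the local newline convention, and a short final group is
handled by the padding rule that this section goes on to describe.

```python
# Illustrative sketch only; all names here are hypothetical.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"   # values 0-25
    "abcdefghijklmnopqrstuvwxyz"   # values 26-51
    "0123456789+/"                 # values 52-63
)
PAD = "="
LINE_LENGTH = 64  # printable characters per output line


def base64_encode(data: bytes) -> str:
    """Encode data using the 64-character alphabet of Table 1."""
    chars = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Concatenate up to three octets, most significant bit first,
        # adding zero bits on the right when the group is short.
        bits = int.from_bytes(group, "big") << (8 * (3 - len(group)))
        # Treat the 24 bits as four 6-bit indexes into the alphabet.
        quad = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Output positions that carry no input data are set to "=".
        used = len(group) + 1
        chars.extend(quad[:used] + [PAD] * (4 - used))
    text = "".join(chars)
    # Break the output into lines of at most LINE_LENGTH characters.
    lines = [text[i:i + LINE_LENGTH] for i in range(0, len(text), LINE_LENGTH)]
    return "\n".join(lines)
```

For example, base64_encode(b"foo") returns "Zm9v", and the one-octet
input b"f" yields "Zg==", two data characters followed by two pads.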

Special processing is performed if fewer than 24 bits are available at
the end of a message or encapsulated part of a message. A full
encoding quantum is always completed at the end of a message. When
fewer than 24 input bits are available in an input group, zero bits
are added (on the right) to form an integral number of 6-bit groups.
Output character positions which are not required to represent actual
input data are set to the character "=". Since all canonically encoded
output is an integral number of octets, only the following cases can
arise: (1) the final quantum of encoding input is an integral multiple
of 24 bits; here, the final unit of encoded output will be an integral
multiple of 4 characters with no "=" padding, (2) the final quantum of
encoding input is exactly 8 bits; here, the final unit of
encoded output will be two characters followed by two "=" padding
characters, or (3) the final quantum of encoding input is exactly 16
bits; here, the final unit of encoded output will be three characters
followed by one "=" padding character.

Since the hyphen character ("-") is not used, there is no need to
worry about quoting apparent encapsulation boundaries within
base64-encoded body parts.

4  The "Multipart" Content-Type

In the case of multiple part messages, a "multipart" Content-type
field should appear in the RFC 822 message header. The message body is
then assumed to contain multiple parts separated by encapsulation
boundaries. Each of the parts is defined, in essence, as a complete
RFC 822 message in miniature. That is, what is found between the
encapsulation boundaries is a header area, a blank line, and a body
area, in accordance with the RFC 822 syntax for a message.
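
The header-area, blank-line, body-area layout of a single encapsulated
part can be sketched in Python as follows. This is a rough
illustration, not part of the draft: the function name split_part is
hypothetical, the part is assumed to have been converted to "\n" line
endings already, and header parsing is simplified to "name: value"
lines with whitespace-led continuation lines.

```python
def split_part(part: str) -> tuple[dict[str, str], str]:
    """Split one encapsulated part into (header fields, body)."""
    if part.startswith("\n"):
        # The part begins with the blank line: no header fields at all.
        return {}, part[1:]
    # The first blank line separates the header area from the body area.
    head, _, body = part.partition("\n\n")
    fields: dict[str, str] = {}
    last = None
    for line in head.split("\n"):
        if not line:
            continue
        if line[:1] in (" ", "\t") and last is not None:
            # Continuation of the previous header field.
            fields[last] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            last = name.strip().lower()
            fields[last] = value.strip()
    return fields, body
```

Field names are lower-cased in this sketch so that lookups do not
depend on the capitalization used by the sender.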
However, it should be noted that NO header fields are actually
required in these encapsulated messages. An encapsulation that starts
with a blank line, therefore, is a legitimate encapsulation of a
message with no header fields. In such a case, of course, the absence
of a Content-type header field implies that the encapsulation is
MAILASCII text.

It is important to note that the encapsulation boundary MUST NOT
appear inside any of the encapsulated parts. Thus, it is crucial that
the composing agent be able to choose and specify the boundary that
will separate the parts. This is done using the resource specification
in the Content-type header field.

The Content-type header field, as defined earlier in this document,
has two important optional fields that may follow the type name. These
fields are for a version number and a resource specification. In the
case of the "multipart" content-type, this
document defines version numbers 1-S and 1-P; if the version number is
omitted or "null", it is to be assumed to be version 1-S. The two
versions have identical syntax, but the "-P" is intended as a hint to
receivers that the parts are intended to be viewed in parallel rather
than sequentially. Implementations that cannot show the parts in
parallel, or that choose not to do so, are free to treat all multipart
messages of version "1-P" as if they were version "1-S". However, all
implementations should check the version number, to ensure graceful
behavior in the event that an incompatible future version of multipart
messages is defined later.

The resource specification, which is always required for multipart
messages, is used to specify the format of the encapsulation boundary.
The encapsulation boundary is defined as two hyphen characters ("-",
decimal code 45) followed by the
resource-specification portion of the Content-type header field with
any leading or trailing white space removed. (DISCUSSION: The
specification that white space be removed is intended to eliminate the
possible introduction of ambiguity caused by the addition or deletion
of white space by message transport agents. The hyphens are for rough
compatibility with the earlier RFC 934 method of message
encapsulation, and for ease of searching for the boundaries in some
implementations. However, it should be noted that multipart messages
are NOT completely compatible with RFC 934 encapsulations; in
particular, they do not obey RFC 934 quoting conventions for embedded
lines that begin with hyphens.)

Thus, a typical multipart content-type header field might look like
this:

    Content-type: multipart; 1-S; gc0p4Jq0M2Yt08jU534c0p

This indicates that the message consists of several parts, each itself
structured as an
RFC 822 message, which are intended to be viewed one-at-a-time, and
that the parts are separated by the line

    --gc0p4Jq0M2Yt08jU534c0p

The encapsulation boundaries must not appear within the
encapsulations, and should be no longer than 70 characters, not
counting the two leading hyphens.

It should be noted that no interpretation is specified for any lines
preceding the first encapsulation boundary or following the last one.
In general, these "prefix" and "postfix" areas of multipart messages
should be regarded as comments, and implementations are free to
discard them. However, it is recommended that composing agents use the
prefix area to include a short textual message, in MAILASCII,
explaining that what follows is an encapsulated multipart message,
intended to be interpreted by software rather than by human eyes. This
message is for the benefit of people who might read the message with
older user agents
that do not properly interpret multipart messages.

The use of "Content-Type: Multipart" as a message part within another
"Content-Type: Multipart" is explicitly allowed. In such cases, for
obvious reasons, care must be taken to ensure that each nested
multipart message uses a different boundary delimiter. See the example
in the following section.

Overall, the body of a multipart message may be specified as follows:

    body := delimiter 1*encapsulation

    encapsulation := message CRLF delimiter

    delimiter := "--"                          CRLF

    message =

5  A Complex Multipart Example

What follows is the outline of a complex multipart message. This
message has three parts to be displayed serially: an introductory
plain text (MAILASCII) part, an embedded multipart message, and a
closing "rich text" part in SGML, which includes additional header
fields to indicate that it originally came from
a different sender. The embedded multipart message has two parts to be
displayed in parallel, a picture and an audio fragment.

    From: ...
    Subject: ...
    Content-type: multipart; 1-s; tweedledum

    This is a multipart message.
    If you are reading this text, you might want to
    consider changing to a user agent that understands
    how to properly display multipart messages.
    --tweedledum

    ...Introductory text goes here...
    [Note that the preceding blank line means
    no header fields were given and this is MAILASCII.]
    --tweedledum
    Content-type: multipart; 1-p; tweedledee

    This is a multipart message.
    If you are reading this text, you might want to
    consider changing to a user agent that understands
    how to properly display multipart messages.
    --tweedledee
    Content-type: u-law; 8000 HZ; X-NEXT
    Content-Encoding: Hexadecimal

    ... hex-encoded NeXT-format audio data goes here....
    --tweedledee
    Content-type: G3FAX
    Content-Encoding: Base64

    ... base64-encoded FAX data goes here....
    --tweedledee
    --tweedledum
    From: ...
    Subject: ...
    Content-type: SGML; null
    Content-Encoding: Quoted-printable

    ... Closing text goes here ...
    --tweedledum

6  The Encoded-Variable Header Field

A particularly thorny problem, not addressed by the Content-Encoding
header field specified earlier in this memo, is the problem of
including data other than MAILASCII in a message header.

It is tempting, to many, to simply declare that such inclusion is too
problematic, and that message headers should always be entirely
MAILASCII. After all, most of the information in the header is not
intended for human consumption anyway. However, there are certain
parts of the header that are intended entirely for human viewing, and
these are the parts where MAILASCII is deemed most unsatisfactory. In
particular, there is widespread desire to have the contents of
the Subject field and the names of message senders and recipients
appear in languages that cannot be represented in MAILASCII.

The heart of the problem is the fact that RFC 822 prescribes a great
deal of syntax and semantics for the message header area, all of it
based on MAILASCII. Tampering with this, it would seem, could
introduce a great deal of complexity, as well as bugs involving
backward compatibility.

Instead, this memo proposes a mechanism by which the header area
remains entirely MAILASCII, but encodes non-MAILASCII information in a
manner from which it can easily be restored by conforming user agents.

The basic idea is that, in certain parts of the headers which are
never machine-interpreted, the human-readable data might best be
represented in a content-type other than MAILASCII. In such cases, the
data are to be represented, in the header field, by a "variable
reference" -- a
placeholder for a value defined elsewhere in the message header area.
The variables are defined by one or more "Encoded-Variable" headers,
with a syntax as specified below.

Thus, for example, if a user's name includes characters that cannot be
represented in MAILASCII, it can be replaced by the name of a variable
that is defined elsewhere. To improve readability by UAs that only
handle MAILASCII, it is recommended that the variable name itself be
as close an approximation as possible to the correct name. Thus, for
example, one might have:

    From: $Keld_JXrn_Simonsen
    Encoded-Variable: Keld_JXrn_Simonsen = quoted-printable, iso646,
            Keld_J&0Crn_Simonsen

*** NOTE: It would be nice to get the character set & hex code right
for the above example.

Where multiple variables need to be defined, multiple Encoded-Variable
header fields may be used.

It is important to constrain the
use of encoded-variables to places where they will not interfere with
the established syntax or semantics of header fields. For that reason,
their use is explicitly restricted to the Subject and Comments header
fields, and to the "phrase" portion of RFC 822 addresses. This implies
a small redefinition of RFC 822's "optional-field", "mailbox", and
"group" syntax:

    optional-field =
                /  "Message-ID"        ":"   msg-id
                /  "Resent-Message-ID" ":"   msg-id
                /  "In-Reply-To"       ":"  *(phrase / msg-id)
                /  "References"        ":"  *(phrase / msg-id)
                /  "Keywords"          ":"  #phrase
                /  "Subject"           ":"  var-text
                /  "Comments"          ":"  var-text
                /  "Encrypted"         ":" 1#2word
                /  extension-field              ; To be defined
                /  user-defined-field           ; May be pre-empted

    mailbox     =  addr-spec                    ; simple address
                /  var-phrase route-addr        ; name & addr-spec

    group       =  var-phrase ":" [#mailbox] ";"

The two new syntactic entities, "var-text" and "var-phrase", are
defined as follows:
3200(var-text)N 1152(=)X 1296(*text)X 1584(/)X 1680(var-ref)X 720 3392(var-phrase)N 1248(=)X 1392(phrase)X 1728(/)X 1824(var-ref)X 720 3584(var-ref)N 1104(=)X 1248("$")X 1440(var-name)X 720 3776(var-name)N 1152(=)X 1248(atom)X 1 f 12 s 720 3984(NOTE)N 1016(that)X 1201(the)X 1360(de\256nition)X 1770(of)X 1891("atom")X 2203(permits)X 2533(underscores,)X 3057(but)X 3221(not)X 3385(spaces)X 3677(or)X 3798(any)X 3978(other)X 720 4096("specials")N 1132(as)X 1241(de\256ned)X 1553(by)X 1678(RFC)X 1888(822.)X 2109(Note)X 2325(also)X 2509(that)X 2683(this)X 2851(does)X 3056(not)X 3207(actually)X 3541(change)X 3842(the)X 3988(legal)X 720 4208(syntax)N 1003(de\256ned)X 1318(by)X 1446(RFC)X 1659(822,)X 1860(because)X 2198(a)X 2274("var-ref")X 2647(is)X 2744(itself)X 2970(a)X 3046(valid)X 3272(instance)X 3621(of)X 3734("phrase")X 4096(or)X 720 4320("*text".)N 1086(Thus,)X 1349(no)X 1492(correct)X 1807(existing)X 2159(parsers)X 2478(should)X 2781(be)X 2919(broken)X 3233(by)X 3376(the)X 3540(new)X 3746(de\256nitions.)X 720 4432(However,)N 1123(the)X 1268(old)X 1418(parsers)X 1717(will)X 1894(not)X 2044(recognize)X 2446(a)X 2516(difference)X 2934(between)X 3283(a)X 3354(var-ref)X 3644(and)X 3811(any)X 3978(other)X 720 4544(instance)N 1060(of)X 1164(*text)X 1381(or)X 1485(phrase,)X 1784(and)X 1947(will)X 2121(therefore)X 2493(not)X 2640(do)X 2760(any)X 2923(variable)X 3258(substitution.)X 720 4768(The)N 894(syntax)X 1169(of)X 1273(the)X 1415(Encoded-Variable)X 2140(\256eld)X 2335(is)X 2423(de\256ned)X 2730(as)X 2834(follows:)X 7 f 10 s 720 4976(Encoded-variable)N 1536(=)X 1632(var-name)X 2064("=")X 2256(Content-Encoding)X 1632 5072(",")N 1824(Content-Type)X 2448(",")X 2640(var-contents)X 720 5264(var-contents)N 1344(=)X 1440(*text)X 1 f 12 s 720 5472(Here)N 939(the)X 1090(var-contents)X 1599(is)X 1696(the)X 1847(encoded)X 2201(value)X 2443(of)X 2556(the)X 2707(variable,)X 3075(of)X 3188(a)X 3264(type)X 3463(given)X 3710(by)X 3839(Content-)X 720 5584(Type)N 950(and)X 1121(encoded)X 
with the encoding given in Content-Encoding. Both a Content-Type and a
Content-Encoding are required for each Encoded-Variable header field.

7  Cross-references Between Encapsulated Parts

Within a multipart message, as defined above, there is essentially no
cross-encapsulation structure. However, multimedia mail systems such as
Andrew [REF-ATK] have demonstrated the value of inter-part reference.
All that is necessary, in order to make a multipart scheme work, is a
mechanism to allow one encapsulated part to make reference to another.
Some have proposed the use of a new "Content-Label" header field within
the encapsulated parts, in order to give each part a name by which it
can be referenced. However, this is not necessary, as the established
Message-ID header field can in fact be used for precisely this purpose.
Each encapsulated part can include a Message-ID header field, which can
then be used for reference purposes by related body parts.

8  Optional Content-size Header Field

In the discussions of earlier drafts of this memo, some people
indicated a strong preference for using a size-counting scheme to
delimit the boundaries between encapsulated parts of multipart
messages. This was rejected because such schemes are not, in general,
sufficiently robust across the SMTP transport layer. For example, line
counts can be altered by line-wrapping MTA's, and byte counts can be
altered in any number of ways. However, there are restricted
environments in which either or both of these counts can be relied
upon, and in such environments it may be desirable to implement a
count-based approach to delimiters. Therefore this memo specifies a
conventional way to do this, in order to promote interoperability among
systems that are able to take this approach.

In such cases, boundary delimiters,
as defined above, are still required. However, the header area of an
encapsulated part may include an optional Content-Size header which
indicates where the encapsulated part ends, if its size has not been
altered. The size may be measured in either bytes or lines. Those who
use the Content-Size header field should still preserve the
encapsulation boundaries, and should recognize that other agents are
free to ignore it in favor of complete reliance on encapsulation
boundaries.

The Content-Size header field is defined as follows:

     Content-Size = 1*DIGIT "lines"
                  / 1*DIGIT "bytes"

Note that each encapsulated part should still end with a newline
followed by an encapsulation boundary. However, a message store that
wishes, for example, to use a storage format that is largely RFC
822-compliant, but includes binary storage of binary objects, can use
the Content-Size header field to indicate whether or not the
final newline is to be interpreted as part of the binary object. If the
newline follows the number of bytes specified for the encapsulation,
then it is not part of the encapsulation.

The size given by the Content-Size header field is the size of the
encapsulation's body only, not counting the blank line that separates
the header from the body. In other words, the four bytes CRLF CRLF,
which separate header from body, are NOT counted as part of the
content-size.

9  Summary

Using the Content-Type and Content-Encoding header fields, it is
possible to include, in a standardized way, arbitrary types of data
objects in RFC 822 mail messages, without breaking any of the existing
restrictions imposed by RFC 821 and RFC 822. Using the "multipart"
content-type, it is possible to mix multiple objects of different types
in a single message. The additional
optional header field, Content-Size, provides a conventional mechanism
for an extension deemed desirable by many implementors. Finally, a
limited mechanism is provided for including non-MAILASCII data in
certain RFC 822 header fields.

For more information, the authors of this document may be contacted via
Internet mail:

               Nathaniel Borenstein
               Ned Freed

10  Acknowledgements

This RFC is the result of the collective effort of a large number of
people, at several IETF meetings and on the IETF-SMTP and IETF-822
mailing lists. Although any enumeration seems doomed to suffer from
egregious omissions, the following are among the many contributors to
this effort: Harald Alvestrand, Kevin Carosso, Mark Crispin, Dave
Crocker, Walt Daniels, Kevin Donnelly, Johnny Eriksson, Craig Everhart,
Bruce Howard, Risto Kankkunen, Neil Katin, Steve Kille, Anders Klemets,
John Klensin, Vincent Lau, Timo Lehtinen, Rick McGowan, Mark Needleman,
John Noerenberg,
David Robinson, Jonathan Rosenberg, Jan Rynning, Mark Sherman, Keld
Simonsen, Einar Stefferud, Michael Stein, Robert Ullman, Stuart Vance,
Erik van der Poel, Greg Vaudreuil, Brian Wideen, Glenn Wright, and
David Zimmerman. The authors apologize for any omissions from this
list, which were certainly unintentional.

11  References

[REF-PS] Adobe Systems, Inc. Postscript Language Reference Manual.
Addison-Wesley, Reading, Mass., 1985.

[REF-SGML] ISO TC97/SC18. Standard Generalized Markup Language. Tech.
Rept. DIS 8879, ISO, 1986.

[REF-TEX] Knuth, Donald E. The TeXbook. Addison-Wesley, Reading, Mass.,
1984.

[REF-TROFF] Ossanna, Joseph F. NROFF/TROFF User's Manual. Bell
Laboratories, Murray Hill, New Jersey, 1976. Computing Science
Technical Report No.54.

[REF-SCRIBE] Unilogic. SCRIBE Document Production Software. Unilogic,
1985. Fourth Edition.

[REF-ISO646] International Standard--Information Processing--ISO 7-bit
coded character set for information interchange, ISO
646:1983.

[REF-7BIT] International Standard--Information Processing--ISO 7-bit
and 8-bit coded character sets--Code extension techniques, ISO
2022:1986.

[REF-ANSI] Coded Character Set--7-Bit American National Standard Code
for Information Interchange, ANSI X3.4-1986.

[REF-X400] Schicker, Pietro, "Message Handling Systems, X.400", Message
Handling Systems and Distributed Applications, E. Stefferud, O-j.
Jacobsen, and P. Schicker, eds., North-Holland, 1989, pp. 3-41.

[RFC-821] Postel, J.B. Simple Mail Transfer Protocol. August, 1982,
Network Information Center, RFC-821.

[RFC-822] Crocker, D. Standard for the format of ARPA Internet text
messages. August, 1982, Network Information Center, RFC-822.

[RFC-934] Rose, M.T.; Stefferud, E.A. Proposed standard for message
encapsulation. January, 1985, Network Information Center, RFC-934.

[RFC-1049] Sirbu, M.A. Content-type header field for Internet messages.
March, 1988, Network Information Center, RFC-1049.

[RFC-1113] Linn, J. Privacy enhancement for Internet electronic mail:
Part I -
message encipherment and authentication procedures [Draft]. August,
1989, Network Information Center, RFC-1113.

[RFC-1148] Kille, S.E. Mapping between X.400(1988) / ISO 10021 and RFC
822. March, 1990, Network Information Center, RFC-1148.

[RFC-1154] Robinson, D.; Ullmann, R. Encoding header field for internet
messages. April, 1990, Network Information Center, RFC-1154.

[REF-ATK] Borenstein, Nathaniel S., Multimedia Applications Development
with the Andrew Toolkit, Prentice Hall, 1990.

[REF-CCITT84c] CCITT SG 5/VII, "Recommendations X.420," Message
Handling Systems: Interpersonal Messaging User Agent Layer, October
1984.

[REF-CCITT/ISO88b] CCITT/ISO, "CCITT Recommendations X.420/ ISO IS
10021-7", Message Handling Systems: Interpersonal Messaging System,

[REF-ODA] **************

[REF-ULAW] ***************

[REF-ALAW] ***************

[REF-DES] ****************

[REF-PBM] ****************

Appendix A -- The Character Set for the MAILASCII Content-Type

As stated in this document, the
MAILASCII content-type is based on a series of standards and on the
historical standard practice in the Internet mail community. However,
the precise meaning of this content-type has been the subject of some
debate. In this appendix, therefore, we define the MAILASCII
content-type. It is our belief that this definition corresponds with
the default assumptions made for messages without Content-type headers
as defined by RFC 822.

The message body is coded in the character set of American National
Standard Code for Information Interchange, sometimes known as "7-bit
ASCII" [REF-7BIT]. This is not an arbitrary seven-bit character code,
but indicates that the message body uses character coding that uses the
exact correspondence of codes to characters specified in ASCII.
National use variations on ISO646 [REF-ISO646] are not ASCII, and
neither an explicit "ASCII" content type, nor "MAILASCII", nor the
default (omission of a content-type)
should be used when characters are coded using them. (Discussion:
RFC821 very explicitly specifies "ASCII", and references an earlier
version of the American National Standard cited in [REF-ANSI]. Whether
that specification, rather than a reference to an International
Standard, was done deliberately or out of convenience or ignorance, is
no longer interesting: insofar as one of the purposes of specifying a
content-type is to permit the receiver to unambiguously determine how
the sender intended the coded message to be interpreted, assuming
anything other than "strict ASCII" as the default would risk
unintentional and incompatible changes to the semantics of messages now
being transmitted. This also implies that messages containing
characters coded according to national variations on ISO646, or using
code-switching procedures (e.g., those of ISO2022), as well as 8-bit or
multiple octet character encodings MUST use an appropriate content-type
to be
consistent with this specification.)

Because of the restriction imposed on message bodies by RFC 822 and, in
practice, by Message Transport Agents that are more-or-less compliant
with RFC 821, implementors should be careful in several ways regarding
MAILASCII text:

     (1) Delimiters other than CR-LF pairs may be used in the local
     representation of a message on some systems. The persistence of
     CR-LF pairs should not be relied on.

     (2) Isolated CR and LF characters are not well tolerated in
     general; they may be lost or converted to delimiters on some
     systems, and hence should not be relied on.

     (3) TAB characters may be misinterpreted or may be automatically
     converted to variable numbers of spaces. This is unavoidable in
     some environments, notably those not based on the ASCII character
     set. Such conversion is STRONGLY DISCOURAGED, but it may occur,
     and users
     of MAILASCII format should not rely on the persistence of TAB
     characters.

     (4) Lines longer than 80 characters may be wrapped in some
     environments. Line wrapping is STRONGLY DISCOURAGED, but
     unavoidable in some cases. Applications which depend on lines not
     being wrapped should use mechanisms other than unencoded MAILASCII
     bodyparts to transmit messages.

     (5) Trailing "white space" characters (SPACE, TAB, etc.) on a line
     may be discarded by some transport agents, and hence should not be
     relied on.

See RFC 821, RFC 822, and RFC1113 for additional information about
canonical SMTP formats. Authors of software which composes "MAILASCII"
in compliance with this RFC should be well-acquainted with SMTP
formats.

The complete MAILASCII character set is listed below:
***** CONTROL CHARS????

  0 nul    16 dle    32 sp    48 0    64 @    80 P    96 `   112 p
  1 soh    17 dc1    33 !     49 1    65 A    81 Q    97 a   113 q
  2 stx    18 dc2    34 "     50 2    66 B    82 R    98 b   114 r
  3 etx    19 dc3    35 #     51 3    67 C    83 S    99 c   115 s
  4 eot    20 dc4    36 $     52 4    68 D    84 T   100 d   116 t
  5 enq    21 nak    37 %     53 5    69 E    85 U   101 e   117 u
  6 ack    22 syn    38 &     54 6    70 F    86 V   102 f   118 v
  7 bel    23 etb    39 '     55 7    71 G    87 W   103 g   119 w
  8 bs     24 can    40 (     56 8    72 H    88 X   104 h   120 x
  9 ht     25 em     41 )     57 9    73 I    89 Y   105 i   121 y
 10 nl     26 sub    42 *     58 :    74 J    90 Z   106 j   122 z
 11 vt     27 esc    43 +     59 ;    75 K    91 [   107 k   123 {
 12 np     28 fs     44 ,     60 <    76 L    92 \   108 l   124 |
 13 cr     29 gs     45 -     61 =    77 M    93 ]   109 m   125 }
 14 so     30 rs     46 .     62 >    78 N    94 ^   110 n   126 ~
 15 si     31 us     47 /     63 ?    79 O    95 _   111 o   127 del

From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 02:40:09 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08)
        id AA07913; Tue, 23 Apr 91 02:02:50 EDT
Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08)
        id AA07904; Tue, 23 Apr 91 02:02:38 EDT
Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4)
        id AA11599; Tue, 23 Apr 91 15:02:56 +0900
Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW)
        id AA01457; Tue, 23 Apr 91 15:01:53 +0900
Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ)
        id AA10647; Tue, 23 Apr 91 14:58:29 JST
Return-Path:
Message-Id: <9104230558.AA10647@sran8.sra.co.jp>
Reply-To: erik@sra.co.jp
From: erik@sra.co.jp (Erik M. van der Poel)
To: ietf-822@dimacs.rutgers.edu
Subject: ISO-CHARSET-TYPE -- some comments
Date: Tue, 23 Apr 91 14:58:24 +0900
Sender: erik@sran8.sra.co.jp

> ISO-CHARSET-TYPE -- Indicates the document contains text in an ISO
> standard character set by ints International Registration number. Each
> ISO character set defines a new standard mail content type, given by the
> string "ISO-IR-" followed by the numeric value of the character set.
> Thus, for example, a content-type of "ISO-IR-6" specifies a character
> set that is extremely similar, and perhaps identical, to MAILASCII.

ISO-IR-6 only contains the 94 characters between space and delete
*exclusive*, and is therefore not even close to MAILASCII, which
contains all the control characters and space as well.

> However, it should be noted that even when the Content-type is an
> ISO-IR- character set type, certain control characters will always be
> construed according to the guidelines of RFC 821 and RFC 822.
> In particular, character positions 13, 10, and 32 will always be
> interpreted at times as CR, LF, and SPACE, respectively.

"always ... at times" ??? :-)

Wouldn't it be better to specify that only the single-byte 94-character
sets are allowed in ISO-IR-n? I don't think the multibyte sets should
be allowed since they would be encoded together with ASCII within the
framework of ISO 2022, which I think needs its own Content-Type name.
Also, the single-byte 96-character sets should not be allowed since
they have the 8th bit up and are usually used together with ASCII to
form e.g. Latin-1. I think Latin-1 should be given its own
Content-Type, and it can be encoded in Quoted-Printable, or something
similar (e.g. quoted-readable \"u). Finally, we shouldn't allow control
character sets, for the reasons that you give above.

Would it be helpful to include a list of currently allowed character
sets in an Appendix or something?

Another approach would be to acknowledge that what we are really trying
to support are the national ISO 646 variants. So you might give the
Content-Type a name like this:

        ISO-646-

E.g. ISO-646-4 (for the United Kingdom)

Comments? Keld?

Regards,
Erik

From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 03:10:11 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08)
        id AA08355; Tue, 23 Apr 91 02:27:45 EDT
Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08)
        id AA08351; Tue, 23 Apr 91 02:27:37 EDT
Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4)
        id AA11759; Tue, 23 Apr 91 15:27:56 +0900
Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW)
        id AA01767; Tue, 23 Apr 91 15:26:51 +0900
Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ)
        id AA10699; Tue, 23 Apr 91 15:23:27 JST
Return-Path:
Message-Id: <9104230623.AA10699@sran8.sra.co.jp>
Reply-To: erik@sra.co.jp
From: erik@sra.co.jp (Erik M. van der Poel)
To: ietf-822@dimacs.rutgers.edu
Subject: Quoted-Printable -- some problems
Date: Tue, 23 Apr 91 15:23:24 +0900
Sender: erik@sran8.sra.co.jp

> In this encoding, ASCII characters 9 (tab), 10 (nl), 13 (cr), 32 through
> 37, inclusive, 39 through 91, and 93 through 127, inclusive, are
> unchanged.

Is it really safe to allow delete (127) to pass through unchanged?
Doesn't it have a special meaning on some systems (like deleting the
previous character)? Also, why don't you allow Control-L to pass
through? (You even use it yourself e.g. in your text version of the
RFC-XXXX draft.)

> Style #2: An 8 bit value from 160 through 255 may, alternately, be
> represented by an ampersand character followed by the character
> obtained by the removal of the high order bit, i.e. by subtracting
> 128 from the value. Thus the 8 bit value 193 may be represented as
> "&A".

I think this should be 161 to 254. 160 - 128 is 32 (space), which may
cause problems if it is at the end of a line, since some systems strip
trailing spaces, as you say somewhere else in the draft. Also, 255 -
128 is 127 (delete), which is dangerous as I explained above.
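
The arithmetic behind this objection can be sketched in a few lines of
Python. This is an illustration of the quoted "Style #2" rule only, not
code from the draft; the names amp_encode and risky_values are
hypothetical and were chosen for this example.

```python
SPACE, DELETE = 32, 127


def amp_encode(value):
    """Encode one 8-bit value per the quoted Style #2 rule:
    an ampersand followed by the character with the high bit removed."""
    if not 160 <= value <= 255:
        raise ValueError("Style #2 covers only values 160 through 255")
    return "&" + chr(value - 128)


def risky_values():
    """Values in 160..255 whose encoded form ends in SPACE or DEL."""
    return [v for v in range(160, 256) if v - 128 in (SPACE, DELETE)]


if __name__ == "__main__":
    print(repr(amp_encode(193)))  # '&A', the example quoted from the draft
    print(risky_values())         # [160, 255]
```

Only the two endpoints of the range produce a trailing space or a
delete character, which is why narrowing the range to 161 through 254
is enough to avoid both cases.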
Regards, Erik From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 04:40:12 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11999; Tue, 23 Apr 91 04:31:32 EDT Received: from alpha.Xerox.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11995; Tue, 23 Apr 91 04:31:27 EDT Received: from holmes.parc.xerox.com ([13.1.100.162]) by alpha.xerox.com with SMTP id <16235>; Tue, 23 Apr 1991 01:31:17 PDT Received: by holmes.parc.xerox.com id <33025>; Tue, 23 Apr 1991 01:32:05 -0700 Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.holmes.parc.xerox.com.sun4.40 via MS.5.6.holmes.parc.xerox.com.sun4_40; Tue, 23 Apr 1991 01:32:03 -0700 (PDT) Message-Id: Date: Tue, 23 Apr 1991 01:32:03 PDT Sender: Bill Janssen From: Bill Janssen To: ietf-822@dimacs.rutgers.edu, Nathaniel Borenstein Subject: Re: A Kinder, Gentler New Draft In-Reply-To: <4c4lj560M2YtQKUHpL@thumper.bellcore.com> References: <4c4lj560M2YtQKUHpL@thumper.bellcore.com> One of the more common Content-Encodings I'm seeing right now is uuencoded-compressed. Would this be Content-Encoding "BASE64"? How does one talk about the compression? Or is it simply not a valid encoding? I'd be loath for it to go away, as it can be readily (though clumsily) sent by, and received by, people without fancy UAs. 
Bill From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 05:40:12 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12819; Tue, 23 Apr 91 05:17:24 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12810; Tue, 23 Apr 91 05:17:17 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA12512; Tue, 23 Apr 91 18:17:38 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA03306; Tue, 23 Apr 91 18:16:37 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA10854; Tue, 23 Apr 91 18:13:12 JST Return-Path: Message-Id: <9104230913.AA10854@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: Encoded-Variable header Date: Tue, 23 Apr 91 18:13:10 +0900 Sender: erik@sran8.sra.co.jp > From: $Keld_JXrn_Simonsen > Encoded-Variable: Keld_JXrn_Simonsen = quoted-printable, iso646, > Keld_J&0Crn_Simonsen I think it might be a good idea to carefully distinguish the quoted printable used for headers (as above), and the quoted printable used within the message body. The description for Quoted-Printable talks about backslash `\', but this should not be used in headers, since RFC822 attaches a special meaning to backslash. So how about creating a new encoding called Strictly-Quoted (or whatever) which carefully avoids backslash, parentheses, etc? Actually, the above example is wrong since there is a Hex code after the ampersand, while the Quoted-Printable section says that ampersand is used together with a single character e.g. &A. But then having a single character after & may be dangerous e.g. &(. > *** NOTE: It would be nice to get the character set & hex code right > for the above example. I think the character set that Keld uses in his name is a variant of ISO 646. I don't know what the ISO registration number is. Maybe 9-1 or 9-2? Keld, do you know? 
Also, I don't think the vertical bar `|' in Keld's name needs quoting, since vertical bar is not a special RFC822 character (which is why Keld was able to use it). The `|' in Keld's name is not a bit-stripped character. It's a 7-bit national variant. Actually, maybe we can do all this with one new header called Header-Type (or something): From: Keld J|rn Simonsen Header-Type: iso646-9-1 The encoding is always Strictly-Quoted, and the type is specified by the Header-Type. This applies only to the people's names throughout the header and also the Subject line. Here is another example: From: erik@sra.co.jp (Q1B$B%Q28%j%C%/Q1BQ28B) Header-Type: message2022 This is my name in Japanese, in the special form of ISO 2022, Strictly-Quoted with Q as the escape character. (The Q is just an example. We might want to use a different escape character.) The Japanese will probably want to use both a Romanized form of their name and the quoted Japanese: From: taro@foo.ac.jp (Taro Suzuki Q1B$BNkLZBQ40OQ3AQ1BQ28B) Regards, Erik From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 08:10:12 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA15433; Tue, 23 Apr 91 07:59:08 EDT Received: from CBROWN.CLAREMONT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA15429; Tue, 23 Apr 91 07:59:04 EDT Date: Tue, 23 Apr 1991 04:58 PDT From: "Ned Freed, Postmaster" Subject: Re: A Kinder, Gentler New Draft To: janssen@parc.xerox.com Cc: ietf-822@dimacs.rutgers.edu Message-Id: X-Envelope-To: ietf-822@dimacs.rutgers.edu X-Vms-To: IN%"janssen@parc.xerox.com" X-Vms-Cc: IN%"ietf-822@dimacs.rutgers.edu" For various reasons, including the patent status of commonly used compression schemes and related legal matters, we elected not to specify any standard compression systems in the RFC. BASE64 is similar to UUENCODE but not the same; it is fully documented so you can compare the encoding yourself if you want to. 
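One way to make that comparison yourself is the short Python sketch below. It uses the standard library's binascii.b2a_uu and base64.b64encode, which implement today's common forms of the two encodings and may differ in detail from the 1991 draft text. The three zero octets are just an illustrative input.

```python
import base64
import binascii

# Three zero octets: the classic uuencode alphabet maps the 6-bit value 0
# to a space, while the base64 alphabet maps it to "A".
data = b"\x00\x00\x00"

uu_line = binascii.b2a_uu(data)      # one uuencoded line, newline-terminated
b64_line = base64.b64encode(data)    # base64 text, no line terminator

print(repr(uu_line))
print(repr(b64_line))

# A uuencoded line can end in blanks; a base64 line never contains one.
print("uuencode contains a space:", b" " in uu_line)
print("base64 contains a space:  ", b" " in b64_line)
```

Any mail path that strips trailing blanks can therefore alter the first form but not the second.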
There are a number of reasons for avoiding UUENCODE -- its use of spaces makes it less than reliable in mail systems (trailing blank suppression is a fairly common phenomenon), the chosen scheme (BASE64) is compatible with the scheme used for privacy enhanced mail, and finally the BASE64 scheme is somewhat cleaner in several ways than what UUENCODE uses.

Another possible encoding candidate is BASE85. This is specified in the PostScript level 2 documentation. It is 5% better than BASE64, but does require multiplication and division to compute. It does not have problems with the characters it uses. Anyone interested in this additional encoding?

Ned

From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 09:40:11 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17426; Tue, 23 Apr 91 09:19:59 EDT
Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17422; Tue, 23 Apr 91 09:19:57 EDT
Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 23 Apr 91 09:19:54 EDT
Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 23 Apr 91 09:23:20 edt
Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Tue, 23 Apr 1991 09:23:16 -0400 (EDT)
Message-Id:
Date: Tue, 23 Apr 1991 09:23:16 -0400 (EDT)
From: Nathaniel Borenstein
To: ietf-822@dimacs.rutgers.edu
Subject: Re: A Kinder, Gentler New Draft
References: <4c4lj560M2YtQKUHpL@thumper.bellcore.com>, ,

I'm a big fan of nuking uuencoding, because it isn't really standard -- different implementations are weirdly incompatible, and most versions don't work in a pipe. I implemented all 3 of the new encodings in a total of about an hour and a half, though, so I'm confident that such tools can become quite widespread. (I might even be able to give them away, but there are legal issues involved.)
As far as compression goes, I would like to see a Content-type: compressed-message, indicating that the body is a compressed 822 message. Doing it this way allows the compressed message to be of another arbitrary content-type, e.g. audio. The only reason the RFC doesn't currently define one is that I don't know enough to do so. If you or someone else can find a good reference on a standard compression algorithm (LZW?) we could make it a predefined content-type in this RFC. Wanna give it a shot?

From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 10:10:12 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17612; Tue, 23 Apr 91 09:25:34 EDT
Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17608; Tue, 23 Apr 91 09:25:32 EDT
Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 23 Apr 91 09:25:28 EDT
Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 23 Apr 91 09:28:52 edt
Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Tue, 23 Apr 1991 09:28:46 -0400 (EDT)
Message-Id:
Date: Tue, 23 Apr 1991 09:28:46 -0400 (EDT)
From: Nathaniel Borenstein
To: ietf-822@dimacs.rutgers.edu
Subject: Re: Encoded-Variable header
In-Reply-To: <9104230913.AA10854@sran8.sra.co.jp>
References: <9104230913.AA10854@sran8.sra.co.jp>

I think that the problems with the Encoded-Variable header may be slightly exaggerated. Recall that the encoding only happens in two places: comments and route-phrases. In each case, you can use double quotes to encapsulate virtually anything that doesn't include double quotes, and you can use the \22 hex code notation for embedded double quotes.
The problem with the "Header-type" solution is that you might have multiple header fields in this boat, using different character sets, and how do you associate the right "Header-type" with the right other header field? Finally, I stand corrected on Keld's name. I had assumed that the | was actually an 8 bit character. It still makes an interesting example, though, if we can do it right. Even better would be if someone has an example that really does use an 8-bit value. Any takers? -- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 13:40:13 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23923; Tue, 23 Apr 91 13:03:06 EDT Received: from TWG.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23892; Tue, 23 Apr 91 13:02:14 EDT Received: from Obelix.twg.com by twg.com with SMTP ; Tue, 23 Apr 91 09:51:30 PST Received: from obelix.twg.com by Obelix.TWG.COM id aa00884; 23 Apr 91 9:51 PDT To: ietf-822@dimacs.rutgers.edu Subject: Re: A Kinder, Gentler New Draft In-Reply-To: Your message of Tue, 23 Apr 91 01:32:03 -0700. Date: Tue, 23 Apr 91 09:51:16 -0700 From: David Herron Message-Id: <9104230951.aa00884@Obelix.TWG.COM> > One of the more common Content-Encodings I'm seeing right now is > uuencoded-compressed. Would this be Content-Encoding "BASE64"? How > does one talk about the compression? Or is it simply not a valid > encoding? No, the encoding table specified for Base64 does not match the encoding table which uuencode uses. The technique is extraordinarily similar, however. It would be a Very Good thing if uuencode were included in the standard, regardless of its warts (using trailing blanks for instance). Specifying that the uuencode used be the version which does a checksum for each line avoids the trailing blank problem and provides a bit more assurance that "it worked" ... As for compression .. It belongs in either Content-Encoding: or Content-Type:. 
What you really have is a sequence of possible encodings which can be placed on a file, as we all know. It would be good to be able to specify a sequence of encodings which were applied to the data. We might as well separate the ultimate data type from the encodings used, as the end-user is going to be mainly interested in that and rather uninterested in what magic was performed to get it to him/her.

Suggestion:

    Content-Type := type [ ";" ver-num [ ";" 1#resource-ref ]]
        [In other words, unchanged from before]

    Content-Encoding := 1#content-encoding-type

    content-encoding-type := "Base64" / "UUENCODE" / "UUENCODE-CRC"
        / "HEXADECIMAL" / "QUOTED-PRINTABLE" / "8BIT" / "BINARY"
        / "7BIT" / "LZW-COMPRESS" / "HUFF-COMPRESS"
        / "other-COMPRESS" / "X-"atom

Example:

    Content-Type: PPM
    Content-Encoding: uuencode-crc lzw-compress

I am having a hard time coming up with an example which would include more levels of encoding. But why limit ourselves needlessly?

The Content-Encoding: likely should be a comma-separated list. This is to allow for richer encoding types.
Repeating the example above:

    Content-Type: PPM
    Content-Encoding: uuencode crc, lzw-compress 12bit

David

From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 14:40:13 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26949; Tue, 23 Apr 91 14:22:13 EDT
Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26940; Tue, 23 Apr 91 14:22:09 EDT
Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 23 Apr 91 14:22:00 EDT
Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Tue, 23 Apr 91 14:25:27 edt
Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Tue, 23 Apr 1991 14:25:23 -0400 (EDT)
Message-Id:
Date: Tue, 23 Apr 1991 14:25:23 -0400 (EDT)
From: Nathaniel Borenstein
To: ietf-822@dimacs.rutgers.edu
Subject: Re: A Kinder, Gentler New Draft
In-Reply-To: <9104230951.aa00884@Obelix.TWG.COM>
References: <9104230951.aa00884@Obelix.TWG.COM>

I'd really like to keep the content-encoding simple. I'm not convinced that allowing multiple nested encodings is beneficial enough to be worth the extra cost involved.

As to compression, I would personally be happy to include either a compressed-message content-type or a compressed content-encoding, if someone knowledgeable about compression can write up a good, exhaustive, implementable, legally not-a-rat-hole specification. I'm simply not qualified. The same "I want to see a rigorous spec" philosophy applies, in my mind, to specifying the use of uuencode.
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 15:10:12 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28248; Tue, 23 Apr 91 14:54:27 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28238; Tue, 23 Apr 91 14:54:17 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa11986; 23 Apr 91 14:42 EDT To: Nathaniel Borenstein , Ned Freed Cc: gvaudreuil@nri.reston.va.us, ietf-822@dimacs.rutgers.edu Subject: Re: TEXT version of Draft RFC In-Reply-To: Your message of "Mon, 22 Apr 91 13:23:51 EDT." Date: Tue, 23 Apr 91 14:42:45 -0400 From: Greg Vaudreuil Message-Id: <9104231442.aa11986@NRI.NRI.Reston.VA.US> Nathaniel, Ned, After reading the last revision of the draft document, I have noticed several missing points. As I understood the original proposal, a user agent to be conformant to the protocol needed to implement #all# the encoding types to allow interoperation. This was to allow a sender to choose the most efficient encoding without regard to the abilities of the receiver. This text and related discussion does not appear in the document. This is the reason (I believe) that there are proposals for many new encoding types! Related to the first point, there was discussion about having a "mandatory" encoding type for text-like stuff so a minimal user agent, designed to only do text processing did not need to have the encoding-decoding complexity. Consensus was not reached. To clip the minutes: A strawman poll was taken with the following options. 1. Body part ``a'' must be sent with encoding type ``y'' 2. Body part ``a'' should be sent with encoding type ``y'', but may be sent with any encoding x,y,z 3. Body part ``a'' can be sent with any encoding x,y,z 4. Body parts a, b, c can be sent in any encoding x,y,z except for body part ``d'' which must be sent in ``x'' There was no majority, with most expressing preference for (2), and equal number expressing either (3) or (4). 
This needs to be addressed in the document. Third, none of the content-types are for 8 bit text. Only ASCII is specified as a defined content type. I realize that this is not yet settled on the mailing list, but it would be nice to have at least a strawman available for other content-types as well as examples. Possible examples include the 8859-n family, the 2022 family, 646 national variant family, and the 10646 set. It is not necessary to pick a #standard# way to encoding character sets, but it would be very useful to at least demonstrate the various encodings. Now, I have one very strong feeling in terms of selecting a character set. One of the hallmarks of Internet protocols is that they are implementable as written. Profiling documents should not be required for the one implementation to work with another. In the case of this document, this is no longer true. I can implement this RFCXXXX, and not be able to send mail to another person implementing this document, even if we are using the same language, unless we have a prior agreement about what character set we are using. I can write French in Latin 1, a 2022 variant, or in Unicode, and unless all implementations have support for all possible character sets, I cannot count on interoperating. That said, I'd like to see a "common" character set defined in this document. At this point, that seems to point to either 10646, or Unicode. Both have their dis-advantages, but they are implementable. Use of other codes are also acceptable. Another option is to specify character sets for specific domains of use, ie. Romance languages use Latin-1. I think this gets very messy very fast. Comments? 
From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 15:40:13 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29715; Tue, 23 Apr 91 15:31:45 EDT Received: from transarc.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29708; Tue, 23 Apr 91 15:31:41 EDT Received: by transarc.com (5.54/3.15) id for ietf-822@dimacs.rutgers.edu; Tue, 23 Apr 91 15:31:24 EDT Received: via switchmail; Tue, 23 Apr 1991 15:31:21 -0400 (EDT) Received: from apollo.transarc.com via qmail ID ; Tue, 23 Apr 1991 15:31:10 -0400 (EDT) Received: from apollo.transarc.com via qmail ID ; Tue, 23 Apr 1991 15:31:03 -0400 (EDT) Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.apollo.transarc.com.pmax.3 via MS.5.6.apollo.transarc.com.pmax_3; Tue, 23 Apr 1991 15:31:01 -0400 (EDT) Message-Id: Date: Tue, 23 Apr 1991 15:31:01 -0400 (EDT) From: Craig_Everhart@transarc.com To: ietf-822@dimacs.rutgers.edu, Nathaniel Borenstein Subject: Re: A Kinder, Gentler New Draft In-Reply-To: References: <9104230951.aa00884@Obelix.TWG.COM> Multiple nested encodings is probably a good idea, at least to specify. And the Content-encoding: list needs to be comma-separated if you're ever going to be able to define anything that takes an argument, e.g.: x-mumble-signature 13827423747234 239324237487234 as a legitimate ``content-encoding''. Yes, this suggests an ``encoding'' that is the identity transformation if the bits that flow through it produce mumble-signature value 13827423747234 when encoded with signed with parameter 239324237487234, but it's a legitimate desire that the standard should allow. 
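As a rough illustration of why the commas matter, the Python sketch below splits a Content-encoding value on commas and treats the remaining blank-separated tokens of each item as arguments. Nothing here comes from the draft: the function name parse_content_encoding is hypothetical, and the rule that the first token is the encoding name and the rest are its arguments is an assumption read off the examples in this thread.

```python
def parse_content_encoding(value: str) -> list[tuple[str, list[str]]]:
    """Split a comma-separated Content-encoding value into (name, args) steps.

    Assumption: within each comma-separated item the first token names the
    encoding and any further blank-separated tokens are its arguments.
    """
    steps = []
    for item in value.split(","):
        tokens = item.split()
        if not tokens:
            continue  # tolerate stray commas
        steps.append((tokens[0].lower(), tokens[1:]))
    return steps


if __name__ == "__main__":
    print(parse_content_encoding("uuencode crc, lzw-compress 12bit"))
    print(parse_content_encoding(
        "x-mumble-signature 13827423747234 239324237487234"))
```

Without the commas, a parser could not tell whether "12bit" is an argument to lzw-compress or a third encoding in the list.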
Craig From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 16:10:13 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00154; Tue, 23 Apr 91 15:41:56 EDT Received: from dkuug.dk by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00141; Tue, 23 Apr 91 15:41:46 EDT Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8) id AA07836; Tue, 23 Apr 91 21:41:26 +0200 Date: Tue, 23 Apr 91 21:41:26 +0200 From: Keld J|rn Simonsen Message-Id: <9104231941.AA07836@dkuug.dk> To: erik@sra.co.jp, ietf-822@dimacs.rutgers.edu Subject: Re: ISO-CHARSET-TYPE -- some comments X-Charset: ASCII X-Char-Esc: 29 Erik (living in Japan) writes: > Another approach would be to acknowledge that what we are really > trying to support are the national ISO 646 variants. So you might give > the Content-Type a name like this: > > ISO-646- > > E.g. > > ISO-646-4 (for the United Kingdom) > > Comments? Keld? I don't think that the main thing is support for national ISO 646 variants. The main thing is support for other characters/letters than ASCII. And then support for the character sets that is used on the machines that the users use... Be it latin1, latin2, Greek, national ISO 646 variant, japanese encoding, IBM codepages ..., and hopefully without loss of interoperability and without information loss. I would rather stick to the more general naming than ISO-646-4, such as ISO-IR-4 - as I think the 8-bit sets may become extremely important. The ISO-IR-nr is not my invention, BTW, but an EWOS PT recommendation - as mentioned earlier. 
Keld From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 16:40:14 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00395; Tue, 23 Apr 91 15:47:56 EDT Received: from dkuug.dk by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00388; Tue, 23 Apr 91 15:47:42 EDT Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8) id AA07412; Tue, 23 Apr 91 21:21:46 +0200 Date: Tue, 23 Apr 91 21:21:46 +0200 From: Keld J|rn Simonsen Message-Id: <9104231921.AA07412@dkuug.dk> To: erik@sra.co.jp, ietf-822@dimacs.rutgers.edu Subject: Re: Encoded-Variable header X-Charset: ASCII X-Char-Esc: 29 > > From: $Keld_JXrn_Simonsen > > Encoded-Variable: Keld_JXrn_Simonsen = quoted-printable, iso646, > > Keld_J&0Crn_Simonsen > > I think it might be a good idea to carefully distinguish the quoted > printable used for headers (as above), and the quoted printable used > within the message body. The description for Quoted-Printable talks > about backslash `\', but this should not be used in headers, since > RFC822 attaches a special meaning to backslash. So how about creating > a new encoding called Strictly-Quoted (or whatever) which carefully > avoids backslash, parentheses, etc? > > Actually, the above example is wrong since there is a Hex code after > the ampersand, while the Quoted-Printable section says that ampersand > is used together with a single character e.g. &A. But then having a > single character after & may be dangerous e.g. &(. > > > > *** NOTE: It would be nice to get the character set & hex code right > > for the above example. > > I think the character set that Keld uses in his name is a variant of > ISO 646. I don't know what the ISO registration number is. Maybe 9-1 > or 9-2? Keld, do you know? It is ISO-IR-60 (Norwegian). Actually the character in my middle name appears in quite some ISO registered character sets. It is a "LATIN SMALL LETTER O WITH STROKE" in the 10646 sense. 
In different notations: ALFA-hex &x mnemonic &o/ > Also, I don't think the vertical bar `|' in Keld's name needs quoting, > since vertical bar is not a special RFC822 character (which is why > Keld was able to use it). The `|' in Keld's name is not a bit-stripped > character. It's a 7-bit national variant. Yes, it is allowed (as an ASCII code) according to 822, and I use it in the Danish/Norwegian 7-bit national ISO 646 variant sense. (Oh, you are only allowed to use strict ASCII in mail? Well, that was not a requirement in the old uucp world, where I come from:-) The use of the national ISO 646 variant in email has been quite commonplace all over Scandinavia (incl Finland) since the birth of email here. The Icelandics went to latin1, because they cannot do with just 7-bit. And actually the other Nordic people want to be able to use something better than 7-bit, this is what all this discussion on this list is all about... > Actually, maybe we can do all this with one new header called > Header-Type (or something): > > From: Keld J|rn Simonsen > Header-Type: iso646-9-1 Yes NETF wanted a header like that, but only 10646cm5 or TEXT-HEX. Keld From owner-ietf-822@dimacs.rutgers.edu Tue Apr 23 17:10:14 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02580; Tue, 23 Apr 91 16:32:28 EDT Received: from dkuug.dk by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02567; Tue, 23 Apr 91 16:32:13 EDT Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8) id AA09292; Tue, 23 Apr 91 22:31:47 +0200 Date: Tue, 23 Apr 91 22:31:47 +0200 From: Keld J|rn Simonsen Message-Id: <9104232031.AA09292@dkuug.dk> To: gvaudre@nri.reston.va.us, net@ymir.claremont.edu Subject: Re: TEXT version of Draft RFC Cc: gvaudreuil@nri.reston.va.us, ietf-822@dimacs.rutgers.edu X-Charset: ASCII X-Char-Esc: 29 Greg writes: > Third, none of the content-types are for 8 bit text. Only ASCII is > specified as a defined content type. 
I realize that this is not yet > settled on the mailing list, but it would be nice to have at least a > strawman available for other content-types as well as examples. > Possible examples include the 8859-n family, the 2022 family, 646 > national variant family, and the 10646 set. > It is not necessary to pick a #standard# way to encoding character > sets, but it would be very useful to at least demonstrate the various > encodings. I have earlier provided an extensive list of character sets and a way of encoding them, which are actually also put forward in ISO as a proposal for encoding them. Almost all of the ECMA registry is covered, along with some 40 vendor defined character sets, and C routines to handle conversions between them. The code and data are essentially free, also for commercial use. > Now, I have one very strong feeling in terms of selecting a character > set. One of the hallmarks of Internet protocols is that they are > implementable as written. Profiling documents should not be required > for the one implementation to work with another. In the case of this > document, this is no longer true. I can implement this RFCXXXX, and > not be able to send mail to another person implementing this document, > even if we are using the same language, unless we have a prior > agreement about what character set we are using. I can write French > in Latin 1, a 2022 variant, or in Unicode, and unless all > implementations have support for all possible character sets, I cannot > count on interoperating. With the abovementioned code and data, true interoperability on all these character sets could be achieved. On the other hand, I would recommend that only a selected list of character sets should be generally accepted. NETF and EUnet has decided for 2 such universal accepted character sets namely ASCII and 10646 in compaction method 5 level 2. If this list should be extended, I would recommend the 8859 series and nothing more. Well, Japanese, Chinese ... 
> That said, I'd like to see a "common" character set defined in this > document. At this point, that seems to point to either 10646, or > Unicode. Both have their dis-advantages, but they are implementable. > Use of other codes are also acceptable. True, both can do the job as a "common" character set. Only one has a status as a (nearly completed) de jure standard. NETF found that 10646 had some distinct advantages such as interoperability with ASCII and an economic compaction method giving in essence no extra cost in transmission volume for most European work. NETF/NORDUnet actually ruled out UNICODE unanimously. > Another option is to specify character sets for specific domains of > use, ie. Romance languages use Latin-1. I think this gets very messy > very fast. I think one should not bind languages and character sets together. Papers I see written here often contain Greek or mathematical special symbols. Latin1 cannot accomodate that although the languages preferred here are Danish and English. Keld From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 01:10:16 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17812; Wed, 24 Apr 91 00:45:56 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA17808; Wed, 24 Apr 91 00:45:49 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA16151; Wed, 24 Apr 91 13:46:11 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA09822; Wed, 24 Apr 91 13:45:08 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA11694; Wed, 24 Apr 91 13:41:42 JST Return-Path: Message-Id: <9104240441.AA11694@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: Re: ISO-CHARSET-TYPE -- some comments Date: Wed, 24 Apr 91 13:41:39 +0900 Sender: erik@sran8.sra.co.jp OK, let me try to clarify what I meant about ISO-IR-n. Let's start with Latin-1 as an example. 
This is used as an 8-bit code, maximally with C0, G0, C1 and G1 (though C1 may not be as frequently used -- I don't know), so the registration numbers would be something like: C0 control ISO-IR-1 G0 graphic ISO-IR-6 C1 control ISO-IR-77 G1 graphic ISO-IR-100 These numbers may be wrong, but the point is that "Latin-1" (whatever *that* is) can be construed to include the above *four* sets. So how would you write the header? Content-Type: ISO-IR-1,6,77,100 ??? This is why I'm suggesting a separate Content-Type for Latin-1. Similarly, the Japanese use more than one character set at once, so I want a separate Content-Type for them. (I suggested the name message2022. More about this in another message.) Now to come back to ISO 646. These national variants are used as 7-bit sets, with the "usual" set for C0. So my suggestion was to fix the C0 part, and make the G0 part variable, the value given by the header: Content-Type: ISO-646- I hope this is clear now. Erik PS I foresee a problem with the Norwegian ISO-IR-60, however. This is a 96-character set, and as such may not be used in G0 (if you follow ISO 2022 strictly). Uh-oh... From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 02:40:19 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA19141; Wed, 24 Apr 91 01:46:28 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA19136; Wed, 24 Apr 91 01:46:25 EDT Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA28726; Tue, 23 Apr 91 22:46:22 PDT Received: from vision.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA17471; Tue, 23 Apr 91 22:46:28 PDT Received: by vision.Eng.Sun.COM (4.1/SMI-4.1) id AA06781; Tue, 23 Apr 91 22:46:22 PDT Date: Tue, 23 Apr 91 22:46:22 PDT From: Vincent.Lau@eng.sun.com (Vincent Lau) Message-Id: <9104240546.AA06781@vision.Eng.Sun.COM> To: ietf-822@dimacs.rutgers.edu Subject: Comments on Draft RFC Section 2: Content-Type Header Field 1. 
In addition to the proposed standard types, RFC-XXXX should include a "822-message" type. This "822-message" type indicates the enclosed document consist of an entire embedded RFC-822 compliant message (i.e. header fields, blank line and body). Note, it is different from MAILASCII which only refers to the default message body type. 2. A while ago an issue was brought up that there was a need to identify the character-set in a separate header field, instead of overloading the Content-Type field. The issue may not be obvious to the human-language text message body, but it becomes obvious when non-US-ASCII characters are embedded in some human readable typed message (such as TROFF.) The whole idea is to use Content-Type to bring up a proper editor, viewer or text processor and use a new field (e.g. Codeset) to identify the character set for additional information. This generalized idea also extends to any human-language text body part. The value of "Codeset" may be in "ISO-IR-x" format, or the registered names for ISO10646 or Unicode, or "X-"atom. If this field does not exist, the original contents will be assumed to be in human non-readable format. In following example, "text" is picked arbitrarily to identify the human-language text body part in IA5-TEXT. And troff is in US-ASCII. --gc0p4Jq0M2Yt08jU534c0p Content-Type: troff; null; tbl Codeset: ISO-IR-6 ...... --gc0p4Jq0M2Yt08jU534c0p Content-Type: text Codeset: ISO-IR-2 ...... --gc0p4Jq0M2Yt08jU534c0p 3. Content-Type should not contain different name-spaces for compressed data. It is because compression is an operation, not a data type. If compressed information is needed, it should be in a separated header field. See additional comment in Section 3. If this concept is acceptable, RFC-XXXX should state this clearly. 4. U-LAW and A-LAW. Should it be typed as "audio" and put "U-LAW"/"A-LAW" in the resource-ref field? 5. How do we encapsulate X.400 body parts? 
Should Content-Type just identify the type and use "Content-Encoding" to say that it's ASN.1 encoded?

6. There is a typo in the DES-MESSAGE paragraph (page 5): "encrytped" -> "encrypted".

Section 3: The Content-Encoding Header Field

1. Since "BASE64" is a better encoding scheme than "HEXADECIMAL", we would recommend dropping "HEXADECIMAL". It is understood that a "HEX" converter is easily implemented and may be widely available. Here is the concern: people may use the same argument to say that "uuencode" is also widely available, so RFC-XXXX should include it. In general, we would like to keep the number of encoding operations as small as possible.

2. Hopefully, the following chart may provide a clearer definition of encoding for UA's and gateways.

                TRANSPORT |  BINARY-SMTP   8BIT-SMTP   7BIT-SMTP
    SOURCE DATA           |
    ----------------------+------------------------------------------
    7-bit data            |  7BIT          7BIT        7BIT
    8-bit data            |  8BIT          8BIT        QUOTED-PRINTABLE
    Binary data           |  BINARY        BASE64      BASE64

3. There are 2 types of encodings: transport dependent and data dependent. "Transport dependent" encoding is an encoding scheme to work around the limitations of the transport protocol. The "Content-Encoding" header field describes the transport dependent encoding.

4. "Data dependent" encoding (conversion) may include encryption, compression or ASN.1 encoding. This type of encoding is totally independent of the transport protocol. We encourage RFC-XXXX to propose a new header field for data dependent encoding, e.g. "Content-Conversion". This field may be multi-valued. Note, during the encoding operations, all operations specified in "Content-Conversion" must be performed prior to the operation in "Content-Encoding". During the decoding operations, the reverse operation in "Content-Encoding" must be performed before "Content-Conversion."

Section 4: The "Multipart" Content-Type

1. "1-S" and "1-P".
I can sympathize with the wish to include hints about how to display the attachment stream, but I don't think that serial vs parallel is enough information. Given this, I'm against putting in half a solution. Should someone be able to complete a design that would take advantage of parallel display of the body parts, we could add that in later. This new design could define new message headers with the proper information, like Parallel-headers: , where aaa and bbb are content-id fields from the various body parts. 2. From the previous lengthy dicussions about "prefix" area, there were some strong reasons that "prefix" and "postfix" areas should be dropped. Section 5: A Complex Multipart Example 1. In the example, there shouldn't be any "From: ..." and "Subject: ..." in the SGML body part. Section 6: The Encoded-Variable Header Field 1. In the example, Encoded-Variable: Keld_JXrn_Simonsen = quoted-printable, iso646, Keld_J&0Crn_Simonsen Shouldn't "iso646" be replaced by "ISO-IR-x" format? I think having a uniform name space for character set is important. Section Appendix A: The Character Set for the MAILASCII Content-Type 1. State clearly that the five-point section on page 21 documents the common non-compliant implementations of SMTP that we are working around; it is *not* a set recommended practices. It is a hall of shame, not a hall of fame. 2. (4) should be rewritten as: Lines longer than 80 characters may be wrapped *or truncated* in some environments. Line wrapping *and truncation* are STRONGLY DISCOURAGED... 3. (5) should be added with: Discarding trailing "white space" is STRONGLY DISCOURAGED. Addtional comment: 1. An *optional* "Content-Label" header field in each body part is proposed here. This field allows the user to specify a name for a body part and to refer a body part by name in other context. 
-Vincent Lau & Neil Katin From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 03:10:20 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20222; Wed, 24 Apr 91 03:01:27 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20215; Wed, 24 Apr 91 03:01:17 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA16652; Wed, 24 Apr 91 16:01:32 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA11615; Wed, 24 Apr 91 16:00:29 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA11822; Wed, 24 Apr 91 15:57:02 JST Return-Path: Message-Id: <9104240657.AA11822@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: Re: Encoded-Variable header Date: Wed, 24 Apr 91 15:56:57 +0900 Sender: erik@sran8.sra.co.jp > In each case, you can use double > quotes to encapsulate virtually anything that doesn't include double > quotes, and you can use the \22 hex code notation for embedded double > quotes. OK, maybe there isn't really a problem, but I am a little bit concerned about the description in the draft. Since the stuff about quoting with <"> is already in RFC822, maybe we can just include a pointer to RFC822, so that implementors will know what to watch out for. Blindly applying the Quoted-Printable conversion *can* lead to problems, e.g.: From: foo@blurfl.co.jp (Foo B&)r) Note the &), which is a legal Quoted-Printable encoding. All I'm asking for is a clear RFC, to avoid pitfalls. > The problem with the "Header-type" solution is that you might have > multiple header fields in this boat, using different character sets, and > how do you associate the right "Header-type" with the right other header > field? Well, one obvious way to solve this problem is to use ISO 2022, which can cover most of the characters in the world. 
But perhaps a more acceptable way would be to use something like the Encoded-Variable method. How about a slight simplification? E.g.: From: Keld J|rn Simonsen Header-Encoding: From / J|rn = ISO-646-60 I included the "From" so that the software doesn't blindly apply conversions everywhere in the header. Also, it should be possible to have one or more pairs after the header name: header "/" word "=" content-type *["," word "=" content-type ] Note that the encoding is always Quoted-Printable (or a slightly restricted form of Quoted-Printable). > Even better would be if someone has an > example that really does use an 8-bit value. Keld's middle name in Latin-1 would be J\F8rn. Erik From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 09:10:22 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28059; Wed, 24 Apr 91 08:59:48 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28049; Wed, 24 Apr 91 08:59:41 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA17945; Wed, 24 Apr 91 22:00:00 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA15519; Wed, 24 Apr 91 21:59:00 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA12337; Wed, 24 Apr 91 21:55:27 JST Return-Path: Message-Id: <9104241255.AA12337@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: support for all character sets Date: Wed, 24 Apr 91 21:55:21 +0900 Sender: erik@sran8.sra.co.jp Greg goes: > I can write French > in Latin 1, a 2022 variant, or in Unicode, and unless all > implementations have support for all possible character sets, I cannot > count on interoperating. What exactly do you mean by "support"? Do you mean the ability to convert to another character set in the list? What happens when we want to add a new character set to the list? 
> That said, I'd like to see a "common" character set defined in this > document. At this point, that seems to point to either 10646, or > Unicode. I think 10646 and Unicode should not be mentioned until they become standards. Well, we *can* mention them now, but only in a footnote saying that we might add these to the list later on. Regards, Erik From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 09:40:23 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29014; Wed, 24 Apr 91 09:32:32 EDT Received: from TIS.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29010; Wed, 24 Apr 91 09:32:28 EDT Received: from TIS.COM by TIS.COM (4.1/SUN-5.64) id AA07350; Wed, 24 Apr 91 09:32:32 EDT Message-Id: <9104241332.AA07350@TIS.COM> Reply-To: James M Galvin To: Nathaniel Borenstein Cc: ietf-822@dimacs.rutgers.edu Subject: Re: TEXT version of Draft RFC In-Reply-To: Your message of Mon, 22 Apr 91 13:23:51 EDT. Date: Wed, 24 Apr 91 09:32:21 -0400 From: James M Galvin -----BEGIN PRIVACY-ENHANCED MESSAGE----- Proc-Type: 4,MIC-CLEAR Originator-ID: galvin@tis.com:MEYxCzAJBgNVBAYTAlVTMSQwIgYDVQQKExt UcnVzdGVkIEluZm9ybWF0aW9uIFN5c3RlbXMxETAPBgNVBAsTCEdsZW53b29k:3 Certificate: MIIBljCCAUMCAQMwCgYGKw4HAgMBBQAwRjELMAkGA1UEBhMCVVMx JDAiBgNVBAoTG1RydXN0ZWQgSW5mb3JtYXRpb24gU3lzdGVtczERMA8GA1UECxMI R2xlbndvb2QwHhcNOTEwMzI4MTQxMDI4WhcNOTMwMzI3MTQxMDI4WjBwMQswCQYD VQQGEwJVUzEkMCIGA1UEChMbVHJ1c3RlZCBJbmZvcm1hdGlvbiBTeXN0ZW1zMREw DwYDVQQLEwhHbGVud29vZDEOMAwGA1UECxMFUEVNLTExGDAWBgNVBAMTD0phbWVz IE0uIEdhbHZpbjBYMAoGBFUIAQECAgH8A0oAMEcCQAvaHI82cIkTW/ji7qv3dnwe Xpr9GbmNX6zN4bitpUymN/DqdiPeJMTW79yYOcr8b8XmaKTdvrrVa7jbEED/orkC AwEAATAKBgYrDgcCAwEFAANBAAUgfAfkBpGFOzlvmhx7HHf4q+e+28G1/VSMsq7q 4YJiwOFN8wRv0QXmOFK+Xz9oHvzBXEX25npJfE4sWjJseXI= Issuer-Certificate: MIIBWTCCAQYCAQIwCgYGKw4HAgMBBQAwMzELMAkGA1UEB hMCVVMxJDAiBgNVBAoTG1RydXN0ZWQgSW5mb3JtYXRpb24gU3lzdGVtczAeFw05M TAzMjYyMDEyMjJaFw05MzAzMjUyMDEyMjJaMEYxCzAJBgNVBAYTAlVTMSQwIgYDV 
QQKExtUcnVzdGVkIEluZm9ybWF0aW9uIFN5c3RlbXMxETAPBgNVBAsTCEdsZW53b 29kMFgwCgYEVQgBAQICAfwDSgAwRwJADVnCBz7OjnFvXJ6vo/pCQ2tb9acMBSEG5 0mBYw7qy52WMFbzfLYEFC1uAQvy+/sT1Isj9QXOgRc3LXKSBZJcgwIDAQABMAoGB isOBwIDAQUAA0EACDEK4GF928N+NEcb5YrACZE2cUqxbLAvT8USPCynsansVaB/R K4UID6z4XnEC0qj6RAlFwFlzuctbeAwaiih7Q== MIC-Info: RSA-MD4,RSA,CjH+dT1j2W0dwHLLQHtMeeHdz1QMVC0caIbUFVxJBP9 KwsO5G6CGyMfZECpZ9ynra3wU6hEuJK+X0AHvuIVyJw== Would you consider "grandfathering" a content-type of "PEM". This is the mechanism by which all the nasty PEM headers, which I have "graciously" provided you above, can be moved into the header portion of a message so they can be under the control of a user agent. This is indeed a sore point from a user interface point of view. Jim -----END PRIVACY-ENHANCED MESSAGE----- From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 10:40:24 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29368; Wed, 24 Apr 91 09:46:19 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29362; Wed, 24 Apr 91 09:46:16 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa04336; 24 Apr 91 9:40 EDT To: erik@sra.co.jp Cc: ietf-822@dimacs.rutgers.edu Subject: Re: support for all character sets In-Reply-To: Your message of "Wed, 24 Apr 91 21:55:21 +0900." <9104241255.AA12337@sran8.sra.co.jp> Date: Wed, 24 Apr 91 09:40:32 -0400 From: Greg Vaudreuil Message-Id: <9104240940.aa04336@NRI.NRI.Reston.VA.US> >What exactly do you mean by "support"? Do you mean the ability to >convert to another character set in the list? > I mean simply the ability to display the specified character set. Unless my UA knows about ISO 2022-foo, it can't likely display the glyphs can it? >I think 10646 and Unicode should not be mentioned until they become >standards. Well, we *can* mention them now, but only in a footnote >saying that we might add these to the list later on. This is not true. 
This RFCXXXX document cannot become a full standard until everything
referenced in it is a full standard in its respective standards
organization. But... We can still reference things like ISO 10646, and
expect that it becomes an IS before RFCXXXX becomes a Full Standard.
Even in the best case, this will be over a year from the time RFCXXXX
becomes a proposed standard RFC, likely 2 years from now given the
fundamental importance this document has in a very popular Internet
service. I would expect 10646 to clear its final hurdle in that amount
of time... or am I unduly charitable to ISO?

Greg V.

From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 11:10:23 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00974; Wed, 24 Apr 91 10:39:05 EDT
Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00969; Wed, 24 Apr 91 10:39:02 EDT
Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Wed, 24 Apr 91 10:38:56 EDT
Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Wed, 24 Apr 91 10:42:25 edt
Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Wed, 24 Apr 1991 10:42:22 -0400 (EDT)
Message-Id:
Date: Wed, 24 Apr 1991 10:42:22 -0400 (EDT)
From: Nathaniel Borenstein
To: James M Galvin
Subject: Re: TEXT version of Draft RFC
Cc: ietf-822@dimacs.rutgers.edu
In-Reply-To: <9104241332.AA07350@TIS.COM>
References: <9104241332.AA07350@TIS.COM>

I don't know enough about PEM to know the right way to do this. Can all
the necessary information be fit into the content-type, version-number,
and resource-reference? If so, all we need is some clear prose
explaining it. As with compression, I'm not the right person to do
this, but I think that a PEM-message content-type is quite possibly a
reasonable thing to define.
Bear in mind, also, that people can define additional content-types after this RFC is finished. The list in this RFC is intended to provide us with a good shared starting point, but it is not intended to preclude future content-type definitions. -- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 11:40:23 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA01505; Wed, 24 Apr 91 10:56:49 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA01501; Wed, 24 Apr 91 10:56:47 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Wed, 24 Apr 91 10:56:44 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Wed, 24 Apr 91 11:00:05 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Wed, 24 Apr 1991 11:00:02 -0400 (EDT) Message-Id: Date: Wed, 24 Apr 1991 11:00:02 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: Comments on Draft RFC In-Reply-To: <9104240546.AA06781@vision.Eng.Sun.COM> References: <9104240546.AA06781@vision.Eng.Sun.COM> I'm still not convinced on the "Codeset" issue -- it seems to me that the set of cases where the content-type does not specify a character set is very small, and can be handled, if necessary, by multiple content-types, e.g. "troff-iso-10646". I like the "822-message" and "audio" content-type ideas. I think that a natural analogy to the 822-message you propose is a "compressed-message" content-type, which contains a compressed encapsulated message which may, using its own content-type, contain any other content-type. If we take this approach, we can keep content-encoding simple *and* avoid separate compression-related headers. I'll let Ned comment on your X.400 body part question. I am by no means wedded to preserving the hexadecimal content-encoding. 
Indeed, since I've argued strongly for minimizing the number of content-encodings, it would be hypocritical of me to advocate keeping it. How do other people feel? I think Vincent's distinction between transport-dependent and data-dependent encodings is a very clarifying notion. I have always thought of the Content-encoding as a transport-dependent encoding, and that data-dependent encoding can be fully specified with the content-type header. Do we yet have any evidence that this is not the case? In the absence of a compelling argument, I'd prefer to avoid the additional complication of a Content-conversion field. As far as parallelism in multipart messages goes, I don't really see it as "half a solution". Given that multipart messages can include multipart messages, you can do some extremely powerful things with the simple distinction between parallel and serial multipart messages. Moreover, it is explicitly stated that implementations are free to ignore the request for parallelism. Given this, it strikes me as harmless, and potentially very useful. Excerpts from internet.ietf-822: 23-Apr-91 Comments on Draft RFC Vincent Lau@eng.sun.com (6094) > 2. From the previous lengthy dicussions about "prefix" area, there were > some strong reasons that "prefix" and "postfix" areas should be > dropped. They *were* dropped. Or am I misinterpreting your use of the terms? I had actually written a version of the RFC that included the Content-Label proposal, but then I got rid of it when I realized that one could easily use the established Message-ID field for this purpose. Is there any advantage to defining a Content-label rather than using Message-ID in the encapsulated parts? Most of your other comments strike me as non-controversial, and I'll certainly try to work them into the next draft. Thanks! 
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 13:10:24 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05690; Wed, 24 Apr 91 12:57:37 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05684; Wed, 24 Apr 91 12:57:35 EDT Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA01995; Wed, 24 Apr 91 12:57:25 EDT Received: from nma.com by nrtc.nrtc.northrop.com id ad12355; 24 Apr 91 8:56 PST Received: from odin.nma.com by nma.com id aa20014; 24 Nov 91 9:54 PST To: Nathaniel Borenstein Cc: James M Galvin , ietf-822@dimacs.rutgers.edu Subject: Re: TEXT version of Draft RFC In-Reply-To: Your message of Wed, 24 Apr 91 10:42:22 -0500. Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Wed, 24 Apr 91 09:53:37 MDT Message-Id: <4585.672512017@nma.com> Sender: stef@nma.com > Bear in mind, also, that people can define additional content-types > after this RFC is finished. The list in this RFC is intended to > provide us with a good shared starting point, but it is not intended > to preclude future content-type definitions. -- Nathaniel Then, we better spell out carefully how additional content-types get defined, and who elevates them to "standards" (if any), et al. Should we be putting our initial set into an appendix, instead of into the main text of RFC-XXXX? Don't we need to cleanly separate the timeless "types" from their "instances" in the RFC? Have we really specified (and is the IANA creating) a new register of context types, with all the registration and publication rules that have to go with it. I recall some handwaving language in the RFC that says that it is Somebody Else's Problem (SEP Technology Strikes Again), but has "somebody else" picked up the implied responsibility? What has to be done, if anything to make it happen? 
I believe in magic, but ...\Stef

From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 14:10:24 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA06287; Wed, 24 Apr 91 13:15:48 EDT
Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA06281; Wed, 24 Apr 91 13:15:44 EDT
Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA03099; Wed, 24 Apr 91 13:15:31 EDT
Received: from nma.com by nrtc.nrtc.northrop.com id aa12464; 24 Apr 91 9:14 PST
Received: from odin.nma.com by nma.com id aa20038; 24 Nov 91 10:07 PST
To: Nathaniel Borenstein
Cc: ietf-822@dimacs.rutgers.edu
Subject: Re: Comments on Draft RFC
In-Reply-To: Your message of Wed, 24 Apr 91 11:00:02 -0500.
Reply-To: Stef@ics.uci.edu
From: Einar Stefferud
Date: Wed, 24 Apr 91 10:06:16 MDT
Message-Id: <4597.672512776@nma.com>
Sender: stef@nma.com

I had the impression at one time that we were building a system of
identifiers that let us cascade a set of processes to be applied to
various body part objects. Now, I seem to see a proliferation of
singular compound content-type attribute values.

> I'm still not convinced on the "Codeset" issue -- it seems to me that
> the set of cases where the content-type does not specify a character set
> is very small, and can be handled, if necessary, by multiple
> content-types, e.g. "troff-iso-10646".

So, do we also have "TeX-iso-10646" and "SCRIBE-iso-10646" ad nauseam?
Where do all these get registered, and who decides what they mean?
Does this group have to stay in session forever to decide these things?

> I think that a natural analogy to the 822-message you propose is a
> "compressed-message" content-type, which contains a compressed
> encapsulated message which may, using its own content-type, contain any
> other content-type. If we take this approach, we can keep
> content-encoding simple *and* avoid separate compression-related headers.

This is more of what is bothering me...
Do we need to pre-define every possible combination that people will use? Cheers...\Stef From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 15:40:26 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11955; Wed, 24 Apr 91 15:26:57 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11948; Wed, 24 Apr 91 15:26:53 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for Stef@ics.uci.edu; Wed, 24 Apr 91 13:52:26 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Wed, 24 Apr 91 13:55:56 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Wed, 24 Apr 1991 13:55:52 -0400 (EDT) Message-Id: Date: Wed, 24 Apr 1991 13:55:52 -0400 (EDT) From: Nathaniel Borenstein To: Stef@ics.uci.edu Subject: Re: Comments on Draft RFC Cc: ietf-822@dimacs.rutgers.edu In-Reply-To: <4597.672512776@nma.com> References: <4597.672512776@nma.com> Excerpts from mail: 24-Apr-91 Re: Comments on Draft RFC Einar Stefferud@ics.uci. (1192) > So, do we also have "TeX-iso-10646" and "SCRIBE'iso-10646" ad nauseum? > Where do all these get registered, and who decides what they mean? > Does this group have to stay in sesion forever to decide these things? As per my previous message, I think that the established registration procedures will suffice. I really doubt that there are going to be a lot of these types, anyway. > Do we need to pre-define every possible combination that people will use? Not if we define things as "building blocks". Thus, for example, we don't need to define anything special about compression headers if we have a compressed-message content-type, because all the other mechanisms can then work on it recursively. Similarly for encrypted messages. It seems to me that this is a simple and elegant solution that avoids making the headers even more complex than they already are.. 
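
A minimal sketch of the "building blocks" idea described above, not taken
from any draft of RFC-XXXX: a "compressed-message" body part is unwrapped
one layer at a time, undoing the transport encoding named in
Content-Encoding first and then letting the content-type decide whether
another embedded message follows. The choice of zlib as the compression
method, the handling of base64 only, the "text" fallback when no
Content-Type is present, and the helper names split_message, wrap and
unwrap are all assumptions made purely for illustration.

```python
import base64
import zlib


def split_message(raw: bytes):
    """Split a raw message into (headers, body).

    Header names are lower-cased; continuation lines are not handled,
    which is enough for this illustration.
    """
    head, _, body = raw.partition(b"\n\n")
    headers = {}
    for line in head.decode("ascii").splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def wrap(raw: bytes) -> bytes:
    """Hypothetical sender side: compress a whole message, then base64 it."""
    payload = base64.b64encode(zlib.compress(raw))
    return (b"Content-Type: compressed-message\n"
            b"Content-Encoding: base64\n\n" + payload)


def unwrap(raw: bytes, max_depth: int = 8):
    """Peel nested compressed-message layers.

    Returns (content_type, body) of the innermost message.
    """
    headers, body = split_message(raw)
    for _ in range(max_depth):
        # Transport-dependent step: undo Content-Encoding first.
        if headers.get("content-encoding", "").lower() == "base64":
            body = base64.b64decode(body)
        # Type-dependent step: only compressed-message recurses.
        if headers.get("content-type", "").lower() != "compressed-message":
            return headers.get("content-type", "text"), body
        headers, body = split_message(zlib.decompress(body))
    raise ValueError("too many nested compressed-message layers")


if __name__ == "__main__":
    inner = b"Content-Type: text\n\nhello\n"
    print(unwrap(wrap(wrap(inner))))  # ('text', b'hello\n')
```

The same loop would serve an encrypted-message type by swapping the
decompression call for a decryption call, which is the sense in which the
other mechanisms "work on it recursively". It also shows the cost Stef is
asking about: the receiver has to know each wrapping content-type by name.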
From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 16:10:25 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12001; Wed, 24 Apr 91 15:27:50 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11995; Wed, 24 Apr 91 15:27:46 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for Stef@ics.uci.edu; Wed, 24 Apr 91 13:46:52 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for Craig_Everhart@transarc.com; Wed, 24 Apr 91 13:50:21 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Wed, 24 Apr 1991 13:50:14 -0400 (EDT) Message-Id: Date: Wed, 24 Apr 1991 13:50:14 -0400 (EDT) From: Nathaniel Borenstein To: Stef@ics.uci.edu Subject: Re: TEXT version of Draft RFC Cc: James M Galvin , ietf-822@dimacs.rutgers.edu, Craig_Everhart@transarc.com In-Reply-To: <4585.672512017@nma.com> References: <4585.672512017@nma.com> Excerpts from mail: 24-Apr-91 Re: TEXT version of Draft RFC Einar Stefferud@ics.uci. (1025) > Should we be putting our initial set into an appendix, instead of into > the main text of RFC-XXXX? Don't we need to cleanly separate the > timeless "types" from their "instances" in the RFC? I had thought about this. I didn't come up with a strong feeling one way or the other. > Have we really specified (and is the IANA creating) a new register of > context types, with all the registration and publication rules that > have to go with it. I recall some handwaving language in the RFC that > says that it is Somebody Else's Problem (SEP Technology Strikes > Again), but has "somebody else" picked up the implied responsibility? > What has to be done, if anything to make it happen? I believe in > magic, but ...\Stef The current draft simply reproduces the language from RFC 1049, which gives someone to contact (Joyce Reynolds) to register content-types. I assumed that all of this was set up carefully for RFC 1049. 
Perhaps someone more knowledgable can comment? Perhaps Craig, who is one of the 1049 authors? From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 16:40:26 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12772; Wed, 24 Apr 91 15:48:07 EDT Received: from transarc.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12766; Wed, 24 Apr 91 15:48:02 EDT Received: by transarc.com (5.54/3.15) id for ietf-822@dimacs.rutgers.edu; Wed, 24 Apr 91 15:47:35 EDT Received: via switchmail; Wed, 24 Apr 1991 15:47:34 -0400 (EDT) Received: from apollo.transarc.com via qmail ID ; Wed, 24 Apr 1991 15:39:28 -0400 (EDT) Received: from apollo.transarc.com via qmail ID ; Wed, 24 Apr 1991 15:39:22 -0400 (EDT) Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.apollo.transarc.com.pmax.3 via MS.5.6.apollo.transarc.com.pmax_3; Wed, 24 Apr 1991 15:39:19 -0400 (EDT) Message-Id: Date: Wed, 24 Apr 1991 15:39:19 -0400 (EDT) From: Craig_Everhart@transarc.com To: Stef@ics.uci.edu, Nathaniel Borenstein Subject: Re: TEXT version of Draft RFC Cc: James M Galvin , ietf-822@dimacs.rutgers.edu In-Reply-To: References: <4585.672512017@nma.com> Um, I was only in a consulting position on RFC 1049, and wasn't listed on the cover page. Marvin Sirbu was the sole author. As far as I know, and I could be wrong, mentioning it in the RFC and getting the RFC approved is the mechanism by which these things happen. You could query IANA@isi.edu directly. Actually, I doubt that IANA is the right registry. It's a *number* registry, and it creates the Internet Numbers document, in which numbers are assigned for use by various protocols, and there's some sort of person (sometimes an RFC) behind each assignment. If you want to use a number, you go off to some person or publication to find out how to do that. In this case, you want further extensions to be coordinated with everybody who speaks RFC-XXXX. 
I'd suggest that extensions be documented solely in subsequent RFCs, that then refer back to RFC-XXXX and describe exactly what extension is being made. You really have to address the interoperability problem. People are now using Content-type: values for all kinds of things. There's ``Content-type: text'' coming out of AT&T, and there's some weird `Content-type: x-uuencode-apple2-random-stuff'' coming out of Erik Fair's apple.com post office. In Andrew-land, we never thought we'd ever see other Content-type: values other than what we were used to putting there, but now there's this problem in that the pre-3/91 version of AMDS wants to reject any message that has a Content-type:, is going off-Andrew, and whose content-type isn't X-BE2 in a version that it recognizes. Thus, info-appletalk@andrew.cmu.edu would reject Erik Fair's odd postings, rather than just forwarding them without comment. You must fill in the edges of how mailers should behave when given documents that they don't understand! Do MUAs really have to understand *all* those content-types? Do MTAs? Craig From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 17:10:25 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14996; Wed, 24 Apr 91 16:44:36 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14992; Wed, 24 Apr 91 16:44:33 EDT Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA07875; Wed, 24 Apr 91 13:44:31 PDT Received: from vision.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA05127; Wed, 24 Apr 91 13:44:34 PDT Received: by vision.Eng.Sun.COM (4.1/SMI-4.1) id AA07589; Wed, 24 Apr 91 13:44:32 PDT Date: Wed, 24 Apr 91 13:44:32 PDT From: Vincent.Lau@eng.sun.com (Vincent Lau) Message-Id: <9104242044.AA07589@vision.Eng.Sun.COM> To: ietf-822@dimacs.rutgers.edu Subject: Re: Comments on Draft RFC > > 2. 
From the previous lengthy dicussions about "prefix" area, there were
> > some strong reasons that "prefix" and "postfix" areas should be
> > dropped.
>
> They *were* dropped. Or am I misinterpreting your use of the terms?
>

RFC-XXXX states:

   "In general, these 'prefix' and 'postfix' areas of multipart
   messages should be regarded as comments, and implementations are
   free to discard them. However, it is recommended that composing
   agents use the prefix area to include a short textual message..."

I would like to see a stronger statement that implementations *should*
discard the "prefix" and "postfix" areas. Don't recommend the use of
them. I am afraid that a sending UA will put an important (judgment
call) message in the prefix area and the receiving UA will discard it.
In my opinion, these 2 UA's are then *not* interoperable.

> I had actually written a version of the RFC that included the
> Content-Label proposal, but then I got rid of it when I realized that
> one could easily use the established Message-ID field for this purpose.
> Is there any advantage to defining a Content-label rather than using
> Message-ID in the encapsulated parts?
>

Content-Label is set by the *sending* UA (e.g. the sender assigns a
label "Phone_Message" to an audio typed body part) and it is more
user-friendly than Message-ID when used in a reply message. This field
exists in each body part, but it is optional. Message-ID differs from
Content-Label in that Message-ID refers to the whole message, while
Content-Label refers to a specific body part in a message.
-Vincent

From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 18:10:25 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16946; Wed, 24 Apr 91 17:40:55 EDT
Received: from TWG.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16929; Wed, 24 Apr 91 17:40:17 EDT
Received: from Obelix.twg.com by twg.com with SMTP ; Wed, 24 Apr 91 14:19:47 PST
Received: from obelix.twg.com by Obelix.TWG.COM id aa07746; 24 Apr 91 14:19 PDT
To: ietf-822@dimacs.rutgers.edu
Subject: Re: Comments on Draft RFC
Date: Wed, 24 Apr 91 14:19:13 -0700
From: David Herron
Message-Id: <9104241419.aa07746@Obelix.TWG.COM>

Stef: had the impression at one time that we were bulding a system of
Stef: indentifiers that let us cascade a set of processes to be applied to
Stef: various body part objects.

Yes! This is a very good way of describing what I suggested yesterday.
I see no good reason to limit ourselves to two transformations in order
to reach a mailable object. Or 3, or 4, or ...

The syntax to have an arbitrarily sized list of transformations is easy
to put in. It just borrows the same comma-separated paradigm used in
many other header lines. Occasionally To: lines get to be Really Really
long, but normally they stay at <= ~5 and everything is fine either
way. It should, IMHO, be the same with Content-Encoding:.

To bring up `uuencode' again. Someone (Nathaniel?) yesterday mentioned
that his impression was that it wasn't strongly "standardized", and
therefore not something to include. My experience is that while there
are a few slight variants on the theme, all (with possibly one caveat)
were compatible with one another. Most of the extensions were to place
extra information after the claimed end-of-line. That is, each line
starts with a character saying how many data bytes are on this line.
The uudecode program reads that many characters off and proceeds to the
next line.
Which means that any extra data is not seen by a uudecode which doesn't
expect that extra data to be there.

The stock uuencode which has been distributed with 4.{2,3}BSD does have
a problem with using trailing blanks. One of the "extensions" was to
simply add a non-blank character to the end of the line. Another (this
is the `caveat' mentioned above) changed the encoding so that it did
not use blanks; this one is likely incompatible. Another added a
checksum to each line. They are all intercompatible; I have used them
all and stared at all the source code, etc.

I strongly feel that uuencode should be mentioned as one of the
possible encodings. Not because it's any more efficient or "better"
than the others, but because it's widely implemented, widely available,
and widely known. A recipient who does not have an RFC-1154++ compliant
UA, and so cannot decode these funky messages we're talking about,
likely will not know what to do with this weirdo looking stuff in the
middle of the message. But by using something familiar to lots of
people, there is a greater chance that the message will be decodable.

David

From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 19:40:27 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20659; Wed, 24 Apr 91 19:39:40 EDT
Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20653; Wed, 24 Apr 91 19:39:37 EDT
Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28479; Wed, 24 Apr 91 19:39:27 EDT
Received: from nma.com by nrtc.nrtc.northrop.com id ab14756; 24 Apr 91 15:38 PST
Received: from odin.nma.com by nma.com id aa20452; 24 Nov 91 16:02 PST
To: Vincent Lau
Cc: ietf-822@dimacs.rutgers.edu
Subject: Re: Comments on Draft RFC
In-Reply-To: Your message of Wed, 24 Apr 91 13:44:32 -0800.
<9104242044.AA07589@vision.Eng.Sun.COM> Reply-To: Stef@ics.uci.edu From: Einar Stefferud Date: Wed, 24 Apr 91 16:01:18 MDT Message-Id: <4971.672534078@nma.com> Sender: stef@nma.com I agree that the perfix and postfix stuff needs to be very clear, else people will use it with bad effects, as noted by Vincent. I also agree that "content-label" and "message-id" are quite different in look and feel. Also, does not Message-ID require some specific formatting and content per RFC822? Guaranteed unique and such... Best...\Stef From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 21:40:28 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23421; Wed, 24 Apr 91 21:17:30 EDT Received: from alpha.Xerox.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23417; Wed, 24 Apr 91 21:17:26 EDT Received: from holmes.parc.xerox.com ([13.1.100.162]) by alpha.xerox.com with SMTP id <16473>; Wed, 24 Apr 1991 18:17:15 PDT Received: by holmes.parc.xerox.com id <33025>; Wed, 24 Apr 1991 18:18:17 -0700 Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.holmes.parc.xerox.com.sun4.40 via MS.5.6.holmes.parc.xerox.com.sun4_40; Wed, 24 Apr 1991 18:18:11 -0700 (PDT) Message-Id: Date: Wed, 24 Apr 1991 18:18:11 PDT Sender: Bill Janssen From: Bill Janssen To: Stef@ics.uci.edu, Nathaniel Borenstein Subject: Re: compressed as Content-Encoding, rather than Content-Type Cc: ietf-822@dimacs.rutgers.edu In-Reply-To: References: <4597.672512776@nma.com> Excerpts from ext.ietf-822: 24-Apr-91 Re: Comments on Draft RFC Nathaniel Borenstein@thu (965) > > Do we need to pre-define every possible combination that people will use? > Not if we define things as "building blocks". Thus, for example, we > don't need to define anything special about compression headers if we > have a compressed-message content-type, because all the other mechanisms > can then work on it recursively. Similarly for encrypted messages. 
It > seems to me that this is a simple and elegant solution that avoids > making the headers even more complex than they already are.. But if we take this mechanism to extremes, we really don't need Content-Encoding at all, as we can keep on recursively applying decoding mechanisms. I feel that "compressed" should be another Content-Encoding, rather than part of the Content-Type. Bill From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 22:10:27 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23863; Wed, 24 Apr 91 21:35:26 EDT Received: from alpha.Xerox.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23856; Wed, 24 Apr 91 21:35:22 EDT Received: from holmes.parc.xerox.com ([13.1.100.162]) by alpha.xerox.com with SMTP id <16473>; Wed, 24 Apr 1991 18:35:09 PDT Received: by holmes.parc.xerox.com id <33025>; Wed, 24 Apr 1991 18:36:19 -0700 Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.holmes.parc.xerox.com.sun4.40 via MS.5.6.holmes.parc.xerox.com.sun4_40; Wed, 24 Apr 1991 18:36:05 -0700 (PDT) Message-Id: Date: Wed, 24 Apr 1991 18:36:05 PDT Sender: Bill Janssen From: Bill Janssen To: ietf-822@dimacs.rutgers.edu, Nathaniel Borenstein Subject: Re: Comments on Draft RFC In-Reply-To: References: <9104240546.AA06781@vision.Eng.Sun.COM> Excerpts from ext.ietf-822: 24-Apr-91 Re: Comments on Draft RFC Nathaniel Borenstein@thu (2624) > I think Vincent's distinction between transport-dependent and > data-dependent encodings is a very clarifying notion. I have always > thought of the Content-encoding as a transport-dependent encoding, and > that data-dependent encoding can be fully specified with the > content-type header. I like the thought of putting all the information about the "type" of document into the Content-Type header, and putting all the information about how to extract a document of that type from the message, into the Content-Encoding header. 
I think of the Content-Encoding as describing a small set of actions that a UA might be expected to perform, while the Content-Type might involve arbitrary other formatting, display, and interaction systems. The Content-Encoding is specified mainly for the purpose of getting a document of Content-Type from the sender UA to the receiver UA without change. I also like Vincent's notion of dropping HEXADECIMAL in favor of BASE64 or BASE85. Do we then assume that basically every UA is able to deal with BASE64 (via free code, etc.)? I believe that this would then quickly make UUENCODED obsolete. I also feel that there should be the standard Content-Encoding "COMPRESSED", for which I do not have a rigorous definition, but which would mean something like "run through UNIX 'compress' (and then encoded with BASE64?)". I appreciate the legal tangles with compress, but also know there's a lot of copies out there. Of course, perhaps a standard compression routine could also be donated, and made freely available, so that it is not necessary to use UNIX compress. 
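As a rough illustration of the "COMPRESSED" idea above (compress first, then BASE64 so the result survives a 7-bit transport), here is a minimal sketch. It rests on assumptions that are not in the message: Python's zlib (DEFLATE) stands in for UNIX compress, which uses a different (LZW) algorithm, and the names encode_compressed and decode_compressed are hypothetical.

```python
import base64
import zlib


def encode_compressed(body: bytes) -> bytes:
    """Compress the body, then BASE64 it so only 7-bit characters remain.

    Assumption: zlib is a stand-in for UNIX 'compress'; the two formats
    are not interchangeable.
    """
    return base64.encodebytes(zlib.compress(body))


def decode_compressed(wire: bytes) -> bytes:
    """Undo the two steps in reverse order: BASE64 first, then decompress."""
    return zlib.decompress(base64.decodebytes(wire))


if __name__ == "__main__":
    original = b"a fairly repetitive line of text\n" * 200
    wire = encode_compressed(original)
    print(len(original), "bytes became", len(wire), "bytes on the wire")
    assert decode_compressed(wire) == original
```

The receiving side has to undo the steps in the opposite order from the sender, which is the ordering question any list of encodings would need to pin down.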
Bill From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 22:40:26 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA24397; Wed, 24 Apr 91 21:55:54 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA24393; Wed, 24 Apr 91 21:55:50 EDT Received: from Eng.Sun.COM (exodus-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA21163; Wed, 24 Apr 91 18:55:46 PDT Received: from skylark.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA01889; Wed, 24 Apr 91 18:54:54 PDT Received: by skylark.Eng.Sun.COM (4.1/SMI-4.1) id AA20249; Wed, 24 Apr 91 18:54:25 PDT Date: Wed, 24 Apr 91 18:54:25 PDT From: Neil.Katin@eng.sun.com (Neil Katin) Message-Id: <9104250154.AA20249@skylark.Eng.Sun.COM> To: nsb@thumper.bellcore.com Subject: Re: Comments on Draft RFC Cc: ietf-822@dimacs.rutgers.edu > Date: Wed, 24 Apr 1991 13:55:52 -0400 (EDT) > From: Nathaniel Borenstein > Subject: Re: Comments on Draft RFC > > Excerpts from mail: 24-Apr-91 Re: Comments on Draft RFC Einar > Stefferud@ics.uci. (1192) > > > So, do we also have "TeX-iso-10646" and "SCRIBE-iso-10646" ad nauseam? > > Where do all these get registered, and who decides what they mean? > > Does this group have to stay in session forever to decide these things? > > As per my previous message, I think that the established registration > procedures will suffice. I really doubt that there are going to be a > lot of these types, anyway. Nathaniel, I strongly disagree with your proposal. It is wrong from the perspective of both user agents and country gateways. Of course, "right" and "wrong" assume some underlying model of how these things will be used. I'll explain how I intend to use this info in our UA's and gateways, and why your solution is not suitable for us. Think about who needs to know about the character set, who needs to know about the "type" of the attachment, and what relationship there is between the two. 
In X.400 land (and at selected gateways in SMTP land) there are gateway functions that attempt to translate from one character set to another. Some of these translations must be done (EBCDIC->ASCII, ISO10646->LATIN1, etc) for anything to be done with the information on the other side of the gateway. These gateways clearly *only care about the character set*, and couldn't care less about the type of the attachment. In your proposal, how should these gateways recognize what character set a body part is in? These gateways either need to understand about character set postfixes ("*-iso-10646" means that the doc is in 10646 char set?) or the gateways need to know that "tex-iso-10646" maps to the iso-10646 character set via some table lookup. The second case is clearly unworkable -- the gateway needs to know about *every type in the universe* if it wants to do character set conversion; the former proposal moves the complexity to parsing the type name space instead of a separate header field; it also means that type names are not an atomic object; instead they need to be parsed before they are interpreted. The user agent has all the problems of the gateway (assuming it will attempt to translate character sets) and then it has the UA specific problems. My model is that the UA will use the type field to try and figure out what editor (viewer?) to use to display the body part to the user. But the type field is no longer a simple object in your proposal. Instead it is now a compound object that must be parsed. In case it is not clear yet, I believe that any conceivable benefits to having one fewer header fields are overwhelmed by the cost of having a "complex" type field that needs to be parsed. Also, you keep on asserting that there won't be very many of these. I disagree with that part too. There are currently three different multi-charset "standards" available (iso10646, iso2022, unicode), as well as a plethora of national 8 bit char sets. 
Trying to build compound type names puts us in the position of needing N x M type names, where N is the actual number of types and M is the number of character sets. This is a "bad thing" to inflict on people. > > Do we need to pre-define every possible combination that people will use? > > Not if we define things as "building blocks". Thus, for example, we > don't need to define anything special about compression headers if we > have a compressed-message content-type, because all the other mechanisms > can then work on it recursively. Similarly for encrypted messages. It > seems to me that this is a simple and elegant solution that avoids > making the headers even more complex than they already are.. OK. What we haven't asked is "what should be the philosophy of what we should put into separate header fields?" My opinion is that we partition problems based on who will need to interpret the information, and how hard it will be for UAs and gateways to interpret this information. By my model, Content-encoding is clearly proper to put in a header field. It has a clearly defined purpose -- to render a body part capable of being passed through a particular transport. Implicit in the design is the ability to make an SMTP++ gateway that can easily translate an RFC-XXXX message down to something that can be passed through old-style SMTP. Charset also passes my muster: it identifies the mapping between the binary values in the body part and conceptual glyphs that one might paint on the screen. But where does "compression" fit in? An encapsulated message doesn't seem to hack it, because one probably wants the ability to compress different body parts in different ways (an image compression algorithm will be different from a good text compressor). I think of "compression" in general as a filter function: a black box to the email UA that you pass the body part through before using it. 
By that model the "filter" doesn't have to be something that compresses; it could be something that decrypts instead. Is this general enough to pass muster for its own header field? It does to me, but I feel less strongly about this one. Neil Katin From owner-ietf-822@dimacs.rutgers.edu Wed Apr 24 23:48:39 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25874; Wed, 24 Apr 91 22:54:36 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25865; Wed, 24 Apr 91 22:54:30 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA20748; Thu, 25 Apr 91 11:54:51 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA20079; Thu, 25 Apr 91 11:53:48 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA13069; Thu, 25 Apr 91 11:50:20 JST Return-Path: Message-Id: <9104250250.AA13069@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: Re: compressed as Content-Encoding, rather than Content-Type Date: Thu, 25 Apr 91 11:50:19 +0900 Sender: erik@sran8.sra.co.jp > > > Do we need to pre-define every possible combination that people will use? > > > > Not if we define things as "building blocks". Thus, for example, we > > don't need to define anything special about compression headers if we > > have a compressed-message content-type, because all the other mechanisms > > can then work on it recursively. Similarly for encrypted messages. It > > seems to me that this is a simple and elegant solution that avoids > > making the headers even more complex than they already are.. > > But if we take this mechanism to extremes, we really don't need > Content-Encoding at all, as we can keep on recursively applying decoding > mechanisms. I feel that "compressed" should be another > Content-Encoding, rather than part of the Content-Type. But what if we want to compress *and* base64 it? 
Are you suggesting: Content-Encoding: base64, compress I agree with Nathaniel that it might be more elegant to recursively apply RFC-XXXX rules. E.g. at the outermost level you might have: Body-Type: base64 Then, when you decode the *body*, you end up with another RFC-XXXX format message, whose header might include: Body-Type: compress Next, when you uncompress the *body* of this message, you end up with yet another RFC-XXXX format message, whose header might include: Body-Type: tar I would think that it is not necessary to have two separate headers like Content-Type and Content-Encoding. We can do this with one header called Body-Type. I gave it a new name so that it doesn't collide with existing implementations of Content-Type such as Erik Fair's and AT&T's. Maybe we don't need to give it a new name... Another way of doing this would be something like: Body-Type: base64, compress, tar but then we would need a way of combining certain orthogonal types such as TeX and Latin-1: Body-Type: base64, compress, TeX/Latin-1 If we use the recursive method, we can separate the orthogonal types: Body-Type: TeX Codeset: Latin-1 Comments? Erik From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 01:10:27 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29241; Thu, 25 Apr 91 00:59:59 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA29237; Thu, 25 Apr 91 00:59:53 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA21238; Thu, 25 Apr 91 14:00:12 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA21206; Thu, 25 Apr 91 13:59:08 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA13243; Thu, 25 Apr 91 13:55:40 JST Return-Path: Message-Id: <9104250455.AA13243@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. 
van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: recursive RFC-XXXX Date: Thu, 25 Apr 91 13:55:38 +0900 Sender: erik@sran8.sra.co.jp Whoops, I just realized that there could be a problem with the recursive RFC-XXXX format method. If I send a uuencoded, compressed tar file using the recursive method, and the receiver does not have one of those new UAs, he/she will first have to uudecode the body, then invoke a binary editor to remove the header from the result, then uncompress, then remove another header, and so on. I don't think it's reasonable to expect the poor user to have a binary editor, nor can we expect the user to be able to edit out the header perfectly. So we should stick to non-recursive methods, e.g.: Body-Type: uuencode, compress, tar Of course, we can adopt the recursive method so that there is an incentive for users to upgrade to a better UA, but I think it would cause too much of a hassle. The receiver would have to either (a) take the trouble to edit out the headers, or (b) ask the sender to re-send in a better format, which would be a hassle for the sender. Or (c) get a new UA, but that may take some time. Erik From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 05:10:29 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02347; Thu, 25 Apr 91 04:28:36 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02343; Thu, 25 Apr 91 04:28:26 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA22130; Thu, 25 Apr 91 17:28:43 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA23205; Thu, 25 Apr 91 16:13:03 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA13413; Thu, 25 Apr 91 16:09:31 JST Return-Path: Message-Id: <9104250709.AA13413@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. 
van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: Re: support for all character sets Date: Thu, 25 Apr 91 16:09:29 +0900 Sender: erik@sran8.sra.co.jp > >What exactly do you mean by "support"? Do you mean the ability to > >convert to another character set in the list? > > I mean simply the ability to display the specified character set. Is this really the province of an 822-like RFC? 822 seems to me to be more like something that specifies the format of messages, including the header names, how to write headers, etc. On the other hand, I can fully understand your desire for interoperability. But I don't think we can really expect everyone to have all fonts. There will always be at least a small number of people with dumb terminals that can only display e.g. ASCII. Or people with 1.5 Megabyte RAM X terminals, that can't display Japanese. When people create standards, they usually try to limit the specs to something that is implementable, both technically and politically. From this viewpoint, if we want to specify that the characters be displayable, we would have to limit the character set to the least common denominator, probably something like the invariant set of ISO 646. Now, I ask, is this really what we want? (It's not what you want, you even mentioned Unicode!) So, I think you should relax the displayability requirement, and just say that the UAs must be able to interconvert between the listed codesets. Of course, there will be messages that cannot be converted to one of the desired codesets. In these cases, I would suggest not converting at all, perhaps telling the user that it couldn't be converted. 
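The fallback suggested above (convert between codesets when possible, otherwise leave the body alone and tell the user) might look roughly like the sketch below. The function name convert_codeset and the use of Python codec names such as "latin-1" and "ascii" are assumptions made for illustration; nothing here comes from the draft.

```python
from typing import Tuple


def convert_codeset(raw: bytes, source: str, target: str) -> Tuple[bytes, str, bool]:
    """Try to convert raw bytes from the source codeset to the target codeset.

    Returns (bytes, codeset_of_those_bytes, converted). If any character has
    no representation in the target, or a codeset name is unknown, the
    original bytes are returned untouched so no information is lost.
    """
    try:
        return raw.decode(source).encode(target), target, True
    except (UnicodeError, LookupError):
        return raw, source, False


if __name__ == "__main__":
    body = "caf\u00e9".encode("latin-1")
    data, codeset, converted = convert_codeset(body, "latin-1", "ascii")
    if not converted:
        # The UA would report this to the user rather than guess.
        print("could not convert; body left in", codeset)
```

Returning the untouched bytes together with their original codeset keeps the failure visible to the UA instead of silently substituting characters.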
Erik From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 05:40:30 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02533; Thu, 25 Apr 91 04:44:28 EDT Received: from lth.se by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02529; Thu, 25 Apr 91 04:44:19 EDT Received: from Himmelsborg.dna.lth.se by lth.se (5.65+bind 1.7+ida 1.4.2/LTH-4-NS); Thu, 25 Apr 91 10:44:07 +0200 (MET) Received: by dna.lth.se (5.65+bind 1.7+ida 1.4.2/DNA-4-NS); Thu, 25 Apr 91 10:44:06 +0200 (MET) Date: Thu, 25 Apr 91 10:44:06 +0200 From: Dan Oscarsson Message-Id: <9104250844.AA15141@dna.lth.se> To: ietf-822@dimacs.rutgers.edu Subject: Some comments to new draft Part 2: Content-Type header There are too many of them. There should be a few general ones, international standards if possible. Why have scribe,tex,troff,dvi,pbm,pgm,ppm? They are very special. Why not then FrameMaker? I am not sure that the pbm, p... are the best standard for images. We should use an international standard (no fax standard, please) for images. There ought to be a few iso character set standards defined. ISO 8859-1 and ISO 10646. And MAILASCII, ISO 8859-1 and ISO 10646 should be the recommended types for text bodies. About the postscript type: Why talk about laserpreps? This is something Apple is using? Each postscript body should be self-contained. It may not depend on some separate code having been loaded into the printer. Part 3.1: Quoted-Printable Why use hexadecimal in Style #1? If style #1 is used for readability I prefer octal codes. If style #1 is for efficiency each 4 bits could be coded as "A"+value, this gives faster encoding/decoding. 
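To make the "A"+value suggestion concrete, here is a minimal sketch of one way it could work: each byte is split into two 4-bit values, and each value is written as the letter that many positions after "A" (so 0 is A and 15 is P). The function names nibble_encode and nibble_decode are hypothetical, and the high-nibble-first order is an assumption, since the message does not specify one.

```python
def nibble_encode(data: bytes) -> str:
    """Encode each byte as two letters in the range A..P, high nibble first."""
    out = []
    for byte in data:
        out.append(chr(ord("A") + (byte >> 4)))    # upper 4 bits
        out.append(chr(ord("A") + (byte & 0x0F)))  # lower 4 bits
    return "".join(out)


def nibble_decode(text: str) -> bytes:
    """Reverse nibble_encode; reject input that is not pairs of A..P letters."""
    if len(text) % 2:
        raise ValueError("encoded text must have an even number of characters")
    out = bytearray()
    for i in range(0, len(text), 2):
        hi = ord(text[i]) - ord("A")
        lo = ord(text[i + 1]) - ord("A")
        if not (0 <= hi <= 15 and 0 <= lo <= 15):
            raise ValueError("character outside A..P")
        out.append((hi << 4) | lo)
    return bytes(out)


if __name__ == "__main__":
    print(nibble_encode(b"Hi"))  # prints EIGJ
```

Decoding needs only a subtraction per character, with no table lookup or case handling as hexadecimal digits would require, which is presumably the speed advantage being claimed.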
Dan -- Dan Oscarsson Department of Computer Science Lund Institute of Technology e-mail: Dan.Oscarsson@dna.lth.se Box 118 S-221 00 Lund, Sweden From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 08:10:31 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05703; Thu, 25 Apr 91 08:08:12 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05699; Thu, 25 Apr 91 08:08:07 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA23133; Thu, 25 Apr 91 21:08:25 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA27189; Thu, 25 Apr 91 21:07:26 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA14176; Thu, 25 Apr 91 21:03:58 JST Return-Path: Message-Id: <9104251204.AA14176@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: support for all content types Date: Thu, 25 Apr 91 21:03:56 +0900 Sender: erik@sran8.sra.co.jp > ... unless all > implementations have support for all possible character sets, I cannot > count on interoperating. What about all the other Content-Types in the draft? Should all implementations support all types, including SGML, audio, and what have you? Erik PS :-) From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 08:40:30 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05696; Thu, 25 Apr 91 08:02:56 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05692; Thu, 25 Apr 91 08:02:49 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA23122; Thu, 25 Apr 91 21:03:04 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA27095; Thu, 25 Apr 91 21:02:04 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA14165; Thu, 25 Apr 91 20:58:36 JST Return-Path: Message-Id: <9104251158.AA14165@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. 
van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: codeset conversion by gateways Date: Thu, 25 Apr 91 20:58:32 +0900 Sender: erik@sran8.sra.co.jp > Some of these translations must be done (EBCDIC->ASCII, > ISO10646->LATIN1, etc) for anything to be done with the information > on the other side of the gateway. These gateways clearly *only care > about the character set*, and couldn't care less about the type > of the attachment. And these EBCDIC<->ASCII gateways probably don't care if the Content-Type says MAILASCII too, right? In the EBCDIC world, we may find EBCDIC messages that nevertheless say Content-Type: MAILASCII. Is this simply a question of updating the gateways at some point in time after RFC-XXXX becomes a standard? Erik From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 09:10:31 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA06853; Thu, 25 Apr 91 08:50:55 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA06847; Thu, 25 Apr 91 08:50:52 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa04157; 25 Apr 91 8:43 EDT To: erik@sra.co.jp Cc: ietf-822@dimacs.rutgers.edu Subject: Re: support for all content types In-Reply-To: Your message of "Thu, 25 Apr 91 21:03:56 +0900." <9104251204.AA14176@sran8.sra.co.jp> Date: Thu, 25 Apr 91 08:43:37 -0400 From: Greg Vaudreuil Message-Id: <9104250843.aa04157@NRI.NRI.Reston.VA.US> From Erik: > What about all the other Content-Types in the draft? Should all > implementations support all types, including SGML, audio, and what > have you? I have only concern for text. I see no way around the necessity to ask if you are capable of "advanced" functionality. I simply do not see plain text (even in other character sets) as advanced functionality. > So, I think you should relax the displayability requirement, and just > say that the UAs must be able to interconvert between the listed > codesets. 
Of course, there will be messages that cannot be converted > to one of the desired codesets. In these cases, I would suggest not > converting at all, perhaps telling the user that it couldn't be > converted. Well, if you can convert from ISO 10646 to national 646-n, and then display 646-n on your dumb terminal, then I'd say you can handle 10646. You can do something with it, even if you must suffer info loss for Kanji. > On the other hand, I can fully understand your desire for > interoperability. But I don't think we can really expect everyone to > have all fonts. There will always be at least a small number of people > with dumb terminals that can only display e.g. ASCII. Or people with > 1.5 Megabyte RAM X terminals, that can't display Japanese. This is precisely what I'm getting at. If I pick a series of codesets, like MAILASCII, Latin-1 and ISO 10646, they are all upwardly compatible. If I send Japanese, I must use ISO 10646. I have no option. If I send English in ASCII, I can use 10646, but I can subset it to ASCII. If I send French, I can use either 10646, or subset it to Latin-1. There is a big difference between implementing this series of character sets and asking that I implement (or be able to convert to and from) Unicode, 646-n, and iso10646. With the former series, I implement the level of functionality I need. By allowing any arbitrary character set, I must implement all sets that can possibly give me the functionality I need, because I must expect any one of them. > Is this really the province of an 822-like RFC? 822 seems to me to be > more like something that specifies the format of messages, including > the header names, how to write headers, etc. This is very much a message format issue. RFC 822 explicitly addressed this issue. It said, "you must use ASCII". If it is not an 822 issue, then is it a topic for a separate implementor agreement? 
NETF specified use of latin-1 on their networks, NSFnet specifies ISO 10646 for the US scientific community, and Japan specifies 2022-n for internal use and ISO 10646 for external use.... This is a nightmare that the IETF/IAB/Internet community has never had to deal with. Telnet is a good example of a protocol with lots of options. If I implement the required functionality, I can get info back and forth. I may not be able to do fancy stuff, but I can negotiate for advanced features. There is no negotiation at the 822-UA level, so we must specify required functionality. Greg V. From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 09:40:31 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA08257; Thu, 25 Apr 91 09:32:17 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA08252; Thu, 25 Apr 91 09:32:14 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 09:32:09 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 09:35:40 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 25 Apr 1991 09:35:37 -0400 (EDT) Message-Id: Date: Thu, 25 Apr 1991 09:35:37 -0400 (EDT) From: Nathaniel Borenstein To: Stef@ics.uci.edu, Craig_Everhart@transarc.com Subject: Re: TEXT version of Draft RFC Cc: James M Galvin , ietf-822@dimacs.rutgers.edu In-Reply-To: References: <4585.672512017@nma.com> , Excerpts from mail: 24-Apr-91 Re: TEXT version of Draft RFC Craig_Everhart@transarc. (1860) > Do MUAs really have to understand *all* those content-types? Do MTAs? I certainly wouldn't think so; all you need to know is not to muck with their insides. 
If you have to reject them, you should package them up as a Content-type: 822message or a Content-type: multipart including an 822message, so that the data is still retrievable on the inside by a viewer that does understand the format. I'd characterize the AMDS behavior in rejecting out-of-hand "foreign" content-types to be a bug, pure and simple. Gee, I wonder what those idiots who wrote Andrew were thinking, eh? From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 10:10:31 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA08650; Thu, 25 Apr 91 09:39:53 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA08646; Thu, 25 Apr 91 09:39:51 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 09:39:48 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 09:43:18 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 25 Apr 1991 09:43:15 -0400 (EDT) Message-Id: Date: Thu, 25 Apr 1991 09:43:15 -0400 (EDT) From: Nathaniel Borenstein To: Stef@ics.uci.edu, Bill Janssen Subject: Re: compressed as Content-Encoding, rather than Content-Type Cc: ietf-822@dimacs.rutgers.edu In-Reply-To: References: <4597.672512776@nma.com> , Excerpts from mail: 24-Apr-91 Re: compressed as Content-E.. Bill Janssen@parc.xerox. (857) > But if we take this mechanism to extremes, we really don't need > Content-Encoding at all, as we can keep on recursively applying decoding > mechanisms. I feel that "compressed" should be another > Content-Encoding, rather than part of the Content-Type. I could sooner agree to this, actually, than to a genuine proliferation of content-encodings, although I don't think that's what Bill's proposing so it isn't really a fair comparison. 
Here's the way I see it: Content-type, together with encapsulated messages & multipart content-types, really is a powerful enough mechanism to do ANY of the things we've discussed on this list. Content-Encoding isn't strictly necessary at all. HOWEVER, the 7-bit and line-length limitations of SMTP pose a set of problems that affect LOTS of different content-types, notably including 8-bit text. Having a very standard mechanism of turning any arbitrary content-type into an 8-bit-safe content type is, in my opinion, highly desirable, and worth the extra mechanism of a Content-Encoding header. At least, it is so long as it is simple enough to admit of a "standard" implementation, because this is never going to be true of content-types, which are going to proliferate. If Content-Encodings are not going to be standard -- that is, if they're going to proliferate just as Content-types do -- then I see no reason at all for the separate header. I think there's a very strong argument for two content-encodings, the base64 and the quoted-printable, because they correspond to very different situations (binary content types vs. near-ascii). The case for hexadecimal is much weaker, and I think the cases get weaker still from there. I can see the argument for a compressed encoding, because it offers a significant advantage, but I'm not sure the world is ready to agree on a standard compression algorithm. Beyond that, I think that multiple encodings opens up a Pandora's box of complication. If we're going to go down that road, I for one would advocate stepping back and taking the approach that Bill mentions, though I don't think he really advocated it: giving up Content-Encoding entirely, in favor of a few additional content-types such as "base64-encoded-message". That alternative is sounding better to me all the time! 
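The two situations named above (binary content versus near-ASCII text) can be illustrated with a small sketch that picks one of the two encodings for a body. Everything specific here is an assumption for illustration: the function names, the 0.85 threshold, and the use of Python's base64 and quopri modules, which implement the later MIME forms of these encodings and may differ in detail from the draft under discussion.

```python
import base64
import quopri
from typing import Tuple


def choose_encoding(body: bytes, threshold: float = 0.85) -> str:
    """Pick quoted-printable for mostly-ASCII bodies, base64 otherwise.

    The threshold is an arbitrary illustrative value, not from the draft.
    """
    if not body:
        return "quoted-printable"
    # Count printable ASCII plus tab, LF and CR as "plain" bytes.
    plain = sum(1 for b in body if 32 <= b < 127 or b in (9, 10, 13))
    return "quoted-printable" if plain / len(body) >= threshold else "base64"


def encode_body(body: bytes) -> Tuple[str, bytes]:
    """Return the chosen encoding name and the 7-bit-safe encoded body."""
    name = choose_encoding(body)
    if name == "base64":
        return name, base64.encodebytes(body)
    return name, quopri.encodestring(body)


if __name__ == "__main__":
    text = "Tack f\u00f6r brevet, Dan".encode("latin-1")
    name, wire = encode_body(text)
    print(name)  # mostly readable text stays readable on the wire
```

Near-ASCII text stays largely readable in quoted-printable even in a UA that knows nothing about the encoding, while arbitrary binary data is carried more compactly in base64.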
-- Nathaniel From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 10:40:31 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09584; Thu, 25 Apr 91 10:17:25 EDT Received: from enet-gw.pa.dec.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09580; Thu, 25 Apr 91 10:17:18 EDT Received: by enet-gw.pa.dec.com; id AA09296; Thu, 25 Apr 91 07:16:12 -0700 Message-Id: <9104251416.AA09296@enet-gw.pa.dec.com> Received: from casee.enet; by decwrl.enet; Thu, 25 Apr 91 07:16:48 PDT Date: Thu, 25 Apr 91 07:16:48 PDT From: "Tom Morris - Digital Equipment - Valbonne, France 25-Apr-1991 1542" To: ietf-822@dimacs.rutgers.edu Subject: Re: Comments on Draft RFC >I like the "822-message" and "audio" content-type ideas. Please don't include an "audio" content type unless there is a specific standard referenced. There are lots of different types of audio codings and just saying "audio" will get us the same place we are today with "text". Tom Morris From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 10:58:25 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09191; Thu, 25 Apr 91 10:00:22 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09187; Thu, 25 Apr 91 10:00:20 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 10:00:17 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 10:03:46 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 25 Apr 1991 10:03:43 -0400 (EDT) Message-Id: Date: Thu, 25 Apr 1991 10:03:43 -0400 (EDT) From: Nathaniel Borenstein To: Neil.Katin@eng.sun.com (Neil Katin) Subject: Re: Comments on Draft RFC Cc: ietf-822@dimacs.rutgers.edu In-Reply-To: <9104250154.AA20249@skylark.Eng.Sun.COM> References: <9104250154.AA20249@skylark.Eng.Sun.COM> I think I'm in the process of being persuaded on 
the "charset" issue. At least, Neil's belief that it is needed seems to be a lot stronger (currently) than my belief that it isn't. What would people (Neil & others) think of augmenting the Content-type header to include character set info? Right now we have Content-type: type [; ver-num [; resource-ref] ] We could easily add the charset information into this header, instead of a charset header. For example, Content-type: type [; ver-num [; resource-ref [; charset ] ] ] or, if you believe (as I do) that charsets are probably more frequently used than some of these other things: Content-type: type [/ charset] [; ver-num [; resource-ref] ] If we went down this route, we could restore "text" as the default content-type, with something like "USASCII" (let's NOT argue over this string just yet) as the default charset. A common type of European message might have Content-type: text/iso-646 A "typical" troff message might have Content-type: troff ; null; mm And a European troff message might have Content-type: troff/iso-646 ; null; mm Is there an advantage to this scheme over a separate "Character-Set" header field? Only in that it preserves the idea that a single header describes the entire content of the message. (Oh yes, and it saves a few bytes :-)) Are there any disadvantages? More to the point, are there people out there whose opposition to a character set header is stronger than mine, and who would like to take up the argument against it? 
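To make the second proposed syntax concrete, here is a minimal sketch in C of how a user agent might split a header value of the form type [/ charset] [; ver-num [; resource-ref] ]. The struct, the function names and the 64-byte field limit are hypothetical, chosen only for illustration; the "USASCII" default is the placeholder string from the message above, not a settled name.

```c
#include <ctype.h>
#include <string.h>

#define FIELD_MAX 64   /* hypothetical per-field size limit */

/* Hypothetical container for the four proposed fields. */
struct content_type {
    char type[FIELD_MAX];
    char charset[FIELD_MAX];       /* "USASCII" when no "/charset" is given */
    char ver_num[FIELD_MAX];
    char resource_ref[FIELD_MAX];
};

/* Copy src[0..len) into dst, dropping leading and trailing blanks. */
static void copy_trimmed(char *dst, const char *src, size_t len)
{
    while (len > 0 && isspace((unsigned char)*src)) { src++; len--; }
    while (len > 0 && isspace((unsigned char)src[len - 1])) len--;
    if (len >= FIELD_MAX) len = FIELD_MAX - 1;   /* truncate oversized fields */
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse the value part of a Content-type header (the text after the colon).
 * Returns 0 on success, -1 if the type field is empty. */
int parse_content_type(const char *value, struct content_type *ct)
{
    const char *semi1, *semi2, *slash, *end_first;

    memset(ct, 0, sizeof *ct);
    strcpy(ct->charset, "USASCII");              /* assumed default charset */

    semi1 = strchr(value, ';');
    end_first = semi1 ? semi1 : value + strlen(value);

    /* A '/' only introduces a charset if it appears before the first ';'. */
    slash = memchr(value, '/', (size_t)(end_first - value));
    if (slash) {
        copy_trimmed(ct->type, value, (size_t)(slash - value));
        copy_trimmed(ct->charset, slash + 1, (size_t)(end_first - slash - 1));
    } else {
        copy_trimmed(ct->type, value, (size_t)(end_first - value));
    }

    if (semi1) {
        semi2 = strchr(semi1 + 1, ';');
        if (semi2) {
            copy_trimmed(ct->ver_num, semi1 + 1, (size_t)(semi2 - semi1 - 1));
            copy_trimmed(ct->resource_ref, semi2 + 1, strlen(semi2 + 1));
        } else {
            copy_trimmed(ct->ver_num, semi1 + 1, strlen(semi1 + 1));
        }
    }
    return ct->type[0] ? 0 : -1;
}
```

Under these assumptions, "troff/iso-646 ; null; mm" yields type "troff", charset "iso-646", ver-num "null" and resource-ref "mm", while a bare "text" yields type "text" with the default charset.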
From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 11:10:33 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09841; Thu, 25 Apr 91 10:28:51 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09837; Thu, 25 Apr 91 10:28:49 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 10:28:15 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for morris@casee.enet.dec.com; Thu, 25 Apr 91 10:31:44 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 25 Apr 1991 10:31:40 -0400 (EDT) Message-Id: Date: Thu, 25 Apr 1991 10:31:40 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu, "Tom Morris - Digital Equipment - Valbonne, France 25-Apr-1991 1542" Subject: Re: Comments on Draft RFC In-Reply-To: <9104251416.AA09296@enet-gw.pa.dec.com> References: <9104251416.AA09296@enet-gw.pa.dec.com> If you look at the draft, it currently defines "u-law" and "a-law", and will reference the appropriate audio telephony standards. The proposal I was agreeing with was to make a content-type "audio" with the phrase "u-law" or "a-law" relegated to the ver-num or resource-reference fields. Is that not appropriate? I think we really could use some more guidance from people who are better versed in audio standards... 
From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 11:27:40 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09065; Thu, 25 Apr 91 09:52:35 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09061; Thu, 25 Apr 91 09:52:32 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 09:49:57 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for janssen@parc.xerox.com; Thu, 25 Apr 91 09:53:27 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 25 Apr 1991 09:53:23 -0400 (EDT) Message-Id: <8c5hxHm0M2YtA2gYZL@thumper.bellcore.com> Date: Thu, 25 Apr 1991 09:53:23 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu, Bill Janssen Subject: Re: Comments on Draft RFC In-Reply-To: References: <9104240546.AA06781@vision.Eng.Sun.COM> , Excerpts from mail: 24-Apr-91 Re: Comments on Draft RFC Bill Janssen@parc.xerox. (1710) > I also like Vincent's notion of dropping HEXADECIMAL in favor of BASE64 > or BASE85. Do we then assume that basically every UA is able to deal > with BASE64 (via free code, etc.)? I believe that this would then > quickly make UUENCODED obsolete. I think this is completely plausible. I wrote a quick implementation of all 3 encodings (base64, hex, and quoted-printable) and it took about 300 lines of code. I can imagine a much smaller implementation, too. (Unfortunately, Bellcore is VERY reluctant to release such things, so don't bother asking for the present, but the ease of implementation convinces me that public domain versions will be quickly forthcoming.) > I also feel that there should be the standard Content-Encoding > "COMPRESSED", for which I do not have a rigorous definition, but which > would mean something like "run through UNIX 'compress' (and then encoded > with BASE64?)". 
I appreciate the legal tangles with compress, but also > know there's a lot of copies out there. Of course, perhaps a standard > compression routine could also be donated, and made freely available, so > that it is not necessary to use UNIX compress. I guess that I'm pretty sympathetic in principle, but I see the "legal" tangles as harder than you imply. One possibility is using Ullmann and Jung's "LZJU90" compression algorithm, for which Robert recently sent me a description including an implementation. I fear I'm not really technically competent to evaluate its suitability, however. Perhaps, Robert, you should post it to this list? (You might want to leave out the implementation for the purposes of discussion, though it is certainly good to know that a public-domain implementation exists.) As I've worked to develop a "consensus" RFC, I've found that lots of things I used to feel strongly about have become less important, in my eyes, than reaching a consensus. For example, I no longer feel strongly about whether or not we have a compressed encoding -- I'll happily go along with it if a good spec can be made available. In general, I think that reaching consensus is more important than most of the remaining open details. But I still feel VERY strongly that Content-Encoding should be a very simple mechanism. In particular, the idea of nested encodings strikes me as overly and unnecessarily complicated, and offering little or nothing in the way of payoff. Sure, let's trade hexadecimal for compressed, if we can define "compressed" properly. But that's still a small enough number of encodings that I see no reason we should ever need to nest or cascade them.
From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 11:40:35 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09942; Thu, 25 Apr 91 10:35:29 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09938; Thu, 25 Apr 91 10:35:28 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 10:35:26 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 10:38:55 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 25 Apr 1991 10:38:51 -0400 (EDT) Message-Id: Date: Thu, 25 Apr 1991 10:38:51 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: UUencode In-Reply-To: <9104241419.aa07746@Obelix.TWG.COM> References: <9104241419.aa07746@Obelix.TWG.COM> (Let's try to get away from "Re: Comments on Draft RFC, shall we? Our subject lines have really deteriorated!) Uuencode implementations also differ widely in their (in)ability to work in a pipe, which can be a problem if you want to write an implementation that simply pipes a body through uudecode. That's another way in which uuencode/uudecode are "non-standard". I agree that the SYNTAX of a cascaded content-encoding header is not problematic. The semantics are even pretty clear, too. The implementation, though more complex than the simple content-encoding field, is also not beyond most programmers' abilities. What I don't see, however, is why the cascading is necessary. If it isn't necessary, the fact that it isn't too terribly complex is irrelevant -- it's still unnecessary! 
From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 11:59:20 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09915; Thu, 25 Apr 91 10:32:04 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA09911; Thu, 25 Apr 91 10:32:02 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 10:31:59 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Thu, 25 Apr 91 10:35:29 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Thu, 25 Apr 1991 10:35:25 -0400 (EDT) Message-Id: Date: Thu, 25 Apr 1991 10:35:25 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: Comments on Draft RFC In-Reply-To: <9104242044.AA07589@vision.Eng.Sun.COM> References: <9104242044.AA07589@vision.Eng.Sun.COM> Excerpts from internet.ietf-822: 24-Apr-91 Re: Comments on Draft RFC Vincent Lau@eng.sun.com (1611) > I would like to see a stronger statement that implementations *should* > discard the "prefix" and "postfix" areas. Don't recommend the use of them. > I am afraid that if a sending UA puts an important (judgment call) message > in the prefix area, but the receiving UA discards it. In my opinion, these > 2 UA's are *not* interoperable. I'm happy to say they should be discarded. But are you also arguing against the short textual message saying, in effect, "this is a multipart message; if you're seeing this, you've got a problem"? That still seems like a good idea to me. Excerpts from internet.ietf-822: 24-Apr-91 Re: Comments on Draft RFC Vincent Lau@eng.sun.com (1611) > Content-Label is set by the *sending* UA (e.g. the sender assigns a label > "Phone_Message" to an audio typed body part) and it is more user-friendlier > than Message-ID when used in a reply message. This field exists in each > body part, but it is optional. 
Message-ID is different from Content-Label > that Message-ID refers to the whole message and Content-Label refers to a > specific body part in a message. Ah, but in a multipart message, each "body part" is an encapsulated message, and can therefore have its own message-id header. Therefore they are, I believe, functionally identical. I'm also not sold by the argument to "user-friendliness" because I think that, in either case, the header field is intended for software use rather than for human reading. Or is this assumption incorrect? From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 12:10:32 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10113; Thu, 25 Apr 91 10:46:00 EDT Received: from uvaarpa.Virginia.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10104; Thu, 25 Apr 91 10:45:57 EDT Received: from uvacs.cs.Virginia.EDU by uvaarpa.Virginia.EDU id aa01999; 25 Apr 91 10:45 EDT Received: from station6.cs.Virginia.EDU by uvacs.cs.Virginia.EDU (4.1/5.1.UVA) id AA09051; Thu, 25 Apr 91 10:45:02 EDT Posted-Date: Thu, 25 Apr 91 10:45:14 EDT Return-Path: Received: by station6.cs.Virginia.EDU (4.1/SMI-2.0) id AA04084; Thu, 25 Apr 91 10:45:14 EDT Date: Thu, 25 Apr 91 10:45:14 EDT From: rja7m@uvacs.cs.virginia.edu Message-Id: <9104251445.AA04084@station6.cs.Virginia.EDU> In-Reply-To: Dan Oscarsson "Some comments to new draft" (Apr 25, 10:44am) X-Mailer: Mail User's Shell (7.2.0 10/31/90) To: IETF 822 WG Subject: Re: Some comments to new draft % Part 2: Content-Type header % There ought to be a few iso character set standards defined. % ISO 8859-1 and ISO 10646. % And MAILASCII, ISO 8859-1 and ISO 10646 should be the recommended % types for text bodies. US-ASCII should be the default type for messages. No other recommendations need be made since anything else will be labelled as to how the characters are encoded and the MUA and mail gateways can then handle them appropriately. 
The commonly used character set encodings need unique defined types so that mail gateways can make appropriate translations. A mail gateway needs to know how the bits map to glyphs so that it can make appropriate translations (EBCDIC <--> ISO 8859-1, for example). For this reason, I would urge that ALL of the ISO 8859-N character set standards be formally supported by the RFC (including only ISO 8859-1 is a kind of European parochialism). Also, EBCDIC should be included as a predefined type (There are such machines on the Internet running TCP/IP, SMTP, & friends). Eventually we will need to also support 16bit and 32bit character sets as well (UNICODE is 16bit and DIS 10646 is 32bit). It seems to me that marking text as TeX or Troff source is a secondary consideration and not one that a mail gateway should have to be aware of or handle specially. To make it a consideration that has to be handled by a mail gateway makes it harder to implement a mail gateway that will work correctly and interoperate cleanly. 
Randall Atkinson randall@Virginia.EDU From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 12:28:26 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11045; Thu, 25 Apr 91 11:07:30 EDT Received: from uvaarpa.Virginia.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA11041; Thu, 25 Apr 91 11:07:27 EDT Received: from uvacs.cs.Virginia.EDU by uvaarpa.Virginia.EDU id aa03121; 25 Apr 91 11:07 EDT Received: from station6.cs.Virginia.EDU by uvacs.cs.Virginia.EDU (4.1/5.1.UVA) id AA09885; Thu, 25 Apr 91 11:06:35 EDT Posted-Date: Thu, 25 Apr 91 11:06:49 EDT Return-Path: Received: by station6.cs.Virginia.EDU (4.1/SMI-2.0) id AA04111; Thu, 25 Apr 91 11:06:49 EDT Date: Thu, 25 Apr 91 11:06:49 EDT From: rja7m@uvacs.cs.virginia.edu Message-Id: <9104251506.AA04111@station6.cs.Virginia.EDU> In-Reply-To: Greg Vaudreuil "Re: support for all content types" (Apr 25, 8:43am) X-Mailer: Mail User's Shell (7.2.0 10/31/90) To: IETF 822 WG Subject: Re: support for all content types On Apr 25, 8:43am, Greg Vaudreuil wrote: Subject: Re: support for all content types % I have only concern for text. I see no way around the necessity to % ask if you are capable of "advanced" functionality. I simply do not % see plain text (even in other character sets) as advanced % functionality. I don't think that there is any practical way to force any MUA to support all the conceivable types. If the commonly used text types are defined then the MUAs and gateways have the opportunity of doing things correctly. I would hope that getting the "basic" functionality correct is the primary concern as Greg suggests. % Well, if you can convert from ISO 10646 to national 646-n, and then % display 646-n on your dumb terminal, then I'd say you can handle % 10646. You can do something with it, even if you must suffer info % loss for Kanji. This isn't possible even for all of the Romanised languages. 
For example, Vietnamese is (mostly) supported by DIS 10646 but cannot be represented _correctly_ in any of the ISO 8859 or any of the ISO 646 character set standards. It can be approximated in ASCII or ISO 8859-1 though I don't think that kind of conversion ability should be mandated. % This is precisely what I'm getting at. If I pick a series of % codesets, like MAILASCII, Latin-1 and ISO 10646, they are all upwardly % compatable. If I send Japanese, I must use ISO 10646. I have no % option. If I send English in ASCII, I can use 10646, but I can subset % it to ASCII. If I send French, I can use either 10646, or subset it % to Latin-1. (Latin-1 is ISO 8859-1 ) % There is a big difference between implementing this series of % character sets and asking that I implement (or be able to convert to % and from) Unicode, 646-n, and iso10646. With the former series, I % implement the level of functionality I need. By allowing any arbitrary % character set, I must implement all sets that can possibly give me % the functionality I need, because must expect any one of them. This seems to present a case for restricting the draft RFC's type definitions to only the US-ASCII (X3.4-1986), ISO 646-N, ISO 8859-N, and ISO DIS 10646 standards. This would be easier to implement as well as addressing the problems of Internationalisation. % This is very much a message format issue. RFC explicity addressed % this issue. It said, "you must use ASCII". If it is not a 822 issue, % then is it an topic for a separate implementor agreement? NETF % specified use of latin-1 on there networks, NSFnet specifies ISO 10646 % for the US scientific community, and Japan specifies 2022-n for % internal use and ISO 10646 for external use.... This is a nightmare % that the IETF/IAB/Internet community has never had to deal with. This seems to indicate that the RFC should at least include type definitions for the sets in my paragraph above simply for reason of inter-operability with the above networks. 
It still isn't clear to me how it is practical to enforce support for any character set encoding for all hosts. After all, there are a lot of systems still running that don't use the DNS. Randall Atkinson randall@Virginia.EDU From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 14:10:33 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14738; Thu, 25 Apr 91 13:46:53 EDT Received: from TWG.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14726; Thu, 25 Apr 91 13:46:43 EDT Received: from Obelix.twg.com by twg.com with SMTP ; Thu, 25 Apr 91 10:25:29 PST Received: from obelix.twg.com by Obelix.TWG.COM id aa12109; 25 Apr 91 10:24 PDT To: ietf-822@dimacs.rutgers.edu Subject: Re: Some comments to new draft In-Reply-To: Your message of Thu, 25 Apr 91 10:44:06 +0200. <9104250844.AA15141@dna.lth.se> Date: Thu, 25 Apr 91 10:24:50 -0700 From: David Herron Message-Id: <9104251024.aa12109@Obelix.TWG.COM> > Part 2: Content-Type header > > There are to many of them. > There should be a few general ones, internation standards if possible. > > Why have scribe,tex,troff,dvi,pbm,pgm,ppm? They are very special. > Why not then FrameMaker? Yesterday we had a meeting to discuss the functional specification for the UA we're writing. Marketing here wants to include things like Wingz & 1/2/3 spreadsheets, dBASE files, etc! As I see it the end-user is going to want to know DIRECTLY that the following thing is some specific sort of file. Things like what encodings used to make it mailable are going to be irrelevant to (my concept of) the end-user. Fortunately defining something like X-dBASE-IV is quite possible. 
David From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 14:40:34 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14769; Thu, 25 Apr 91 13:47:23 EDT Received: from enet-gw.pa.dec.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA14761; Thu, 25 Apr 91 13:47:16 EDT Received: by enet-gw.pa.dec.com; id AA03044; Thu, 25 Apr 91 10:46:53 -0700 Received: by fork-city.pa.dec.com; id AA28086; Thu, 25 Apr 91 10:46:43 -0700 Message-Id: <9104251746.AA28086@fork-city.pa.dec.com> To: ietf-822@dimacs.rutgers.edu Subject: multiple body parts of different encodings Date: Thu, 25 Apr 91 10:46:43 -0700 From: Paul A Vixie X-Mts: smtp what we're looking for here seems to be complex messages, where a message contains multiple parts and each part can be encoded using a different method. some folks want the encoding methods to be nestable, so that a uuencode'd GIF can be represented directly. everybody wants this to be more or less compatible with the "------" method that we use for digests now. most people want to be able to represent non-textual bodyparts in 8-bit format; some people want to be able to represent even textual body parts in 8-bit format. this is a hellacious hairball and we need to start by acknowledging that everybody can't get what they want. more than that, probably no single person will get everything they want. so listen up, all: prepare to be disappointed. prepare for a final solution that is a rotten compromise that makes your stomach turn. to that end :-), let me spec a little of this out: message :: headers blank-line body headers :: header [...] body :: line [...] i'd like to keep the name part of the header (everything to the left of the colon) spec'd to 7-bit ASCII. i'd like all the reserved words in complex headers like "received:" to 7-bit ASCII. actual text (like the subject field or the full-name/comment parts of to/cc/from) should somehow be allowed to be 8-bit. i don't know what to suggest for the stuff inside of in to/cc/from. 
the restrictions on domain names (anything to the right of an @) should be whatever DNS spec's, which is probably 7-bit ASCII. without saying anything about body parts or encodings, i'm already into an 8-bit transport. if a message that has 8-bit data in its Subject: field needs to be sent over a 7-bit transport, it has to be encoded. this encoding needs to be something that a user or user-agent can make sense of, since it may never reach another 8-bit transport and even if it does i'm not sure it should be decoded back into 8-bit data. we can argue that one. an encoding like \NNN where NNN is an octal number would suit those of us in the UNIX(tm) community pretty well but we aren't the whole world and no doubt a Norwegian recipient would rather not see a \221 where an accented character would normally appear. we may want to consider a new encoding enumeration which is painful to generate or tear down but which has equivalences which are chosen to be readable in their encoded form. like i said, this can be argued. the headers will have to have some kind of magic cookie added to them when a message (headers and/or body) has 8-bit data in it. this cookie can be munched slightly when/if a transport or user-agent needs to encode it into the "readable 7-bit" notation i mentioned earlier. i believe that the presence of this magic cookie (really "a new header") should tell anyone who cares about the distinction, that this message conforms to the newer mail RFC's and all else that that may imply. let's try to agree on a framework with arguable variables, and then argue about the variables. does anyone think that what i've said above will work? (it's a restatement of what several other people have said, so i know that *somebody* thinks it's reasonable). does anyone know a reason why we cannot or should not start with the above framework?
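For illustration only, the \NNN idea mentioned above could look roughly like the following C sketch. The function name and the choice to also escape the backslash itself (so that the encoding stays reversible) are assumptions of this sketch, not something the message specifies.

```c
#include <stddef.h>
#include <stdio.h>

/* Encode n bytes of header text into out as 7-bit text: bytes with the
 * high bit set, and the backslash itself, become \NNN (three octal digits).
 * Returns the number of characters written (excluding the NUL),
 * or -1 if out is too small. */
int encode_octal(const unsigned char *in, size_t n, char *out, size_t outsz)
{
    size_t i, o = 0;

    for (i = 0; i < n; i++) {
        unsigned char c = in[i];

        if (c >= 0x80 || c == '\\') {
            if (o + 4 >= outsz)          /* 4 characters plus the final NUL */
                return -1;
            sprintf(out + o, "\\%03o", (unsigned)c);
            o += 4;
        } else {
            if (o + 1 >= outsz)          /* 1 character plus the final NUL */
                return -1;
            out[o++] = (char)c;
        }
    }
    if (o >= outsz)                      /* only reachable when outsz == 0 */
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

With this sketch the three bytes 'a', 0x91, 'b' become the seven characters a\221b, which is the kind of output the message worries a non-UNIX reader would find unfriendly.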
cheers, Paul Vixie DEC Western Research Lab Palo Alto, California, USA ...!decwrl!vixie ...!vixie!paul From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 14:21:06 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12977; Thu, 25 Apr 91 12:50:02 EDT Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA12973; Thu, 25 Apr 91 12:49:55 EDT Received: from Relay.Prime.COM by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA06553; Thu, 25 Apr 91 12:48:42 EDT Message-Id: <9104251648.AA06553@rutgers.edu> Received: (from user ARIEL) by Relay.Prime.COM; 25 Apr 91 12:23:37 EDT To: ietf-822@dimacs.rutgers.edu From: Robert Ullmann Subject: the LZJU90 doc Nathaniel referred to Encoding: 45 text, 1140 text Date: 25 Apr 91 12:23:38 EDT Hi, Following is the LZJU90 document Nathaniel referred to; I posted it once before, but no one was interested then. It does single-pass conversion of an "arbitrary binary object" to a mailable (malleable? :-) object. (I say "arbitrary" in quotes because there is the assumption that the object is "flat": a single ordered sequence of octets). It is an LZ77-class algorithm. This has several interesting characteristics:

1) [most important] LZ77 algorithms are (presently) considered to be in the public domain, and do not infringe upon (e.g.) the Unisys patent.

2) most of the work is done by the encoder. Since objects are usually compressed once and uncompressed either once or many times (e.g. something sent to a mailing list) this is a Good Thing.

3) many encoding algorithms are possible, with the SAME DECODER. (think about that!) In particular, this means that the encoder can make whatever time/space/efficiency tradeoffs it likes, the decoder still works. It also means that a proprietary (or even patentable) algorithm might be used for encoding, without preventing anyone from decoding with public domain software.

4) the decoder runs in _bounded_ space (32KB) and _linear_ time.
The code given in the RFC draft is explicitly placed in the public domain. It is written for maximum portability and a certain degree of clarity; production software can be more efficient. (In particular, the encoder runs in O(n log n) time, with a high proportionality constant; methods are known that run nearly in O(n), with a very small constant. Code for these methods is not in the public domain at present :-)

Best Regards, Robert Ullmann Prime Computer, Inc. +1 508 620 2800 x1736

Network Working Group                                R. Jung, R. Ullmann
Request for Comments: DRAFT                         Prime Computer, Inc.
                                                            January 1991

LZJU90: Compressed Encoding for Binary Mail

1. Status of this Memo

This memo describes an encoding [1] for a binary object to be sent in an Internet mail message. The encoding provides both compression and representation in a text format that will successfully survive transmission through the many different mailers and gateways that comprise the Internet and connected mail networks. Distribution of this memo is unlimited.

2. Introduction

The encoding first compresses the binary object, using a modified LZ77 algorithm, called LZJU90. It then encodes each 6 bits of the output of the compression as a text character, using a character set chosen to survive any translations between codes, such as ASCII to EBCDIC. The 64 six-bit strings 000000 through 111111 are represented by the characters "+", "-", "0" to "9", "A" to "Z", and "a" to "z".

The output text begins with a line identifying the encoding. This is for visual reference only; the Encoding: field in the header identifies the section to the user program. It also names the object that was encoded, usually by a file name. The format of this line is:

   * LZJU90 <name>

where <name> is optional. For example:

   * LZJU90 vmunix

This is followed by the compressed and encoded data, broken into lines where convenient. It is recommended that lines be broken every 78 characters, to survive mailers that restrict line length.
The decoder must accept lines with 1 to 1000 characters on each line. After this, there is one final line that gives the number of bytes in the original data and a CRC of the original data. This should match the byte count and CRC found during decompression. This line has the format:

   * <count> <CRC>

Jung, Ullmann [Page 1]
RFC DRAFT LZJU90: Compressed Encoding for Binary Mail January 1991

where <count> is a decimal number, and CRC is 8 hexadecimal digits. For example:

   * 4128076 5AC2D50E

The count used in the Encoding: field in the message header is the total number of lines, including the start and end lines that begin with *. A complete example is given in section 6.

3. Specification of the LZJU90 compression

This data compression specification uses the Lempel-Ziv-Storer-Szymanski model of mixing pointers and literal characters. The data compression is defined by the decoding algorithm. Any encoder that emits symbols which cause the decoder to produce the original input is defined to be valid. There are many possible strategies for the maximal-string matching that the encoder does; section 5 gives the code for one such algorithm. Regardless of which algorithm is used, and what tradeoffs are made between compression ratio and execution speed or space, the result can always be decoded by the simple decoder.

The compressed data consists of a mixture of unencoded literal characters and copy pointers which point to an earlier occurrence of the string to be encoded. Compressed data contains two types of codewords:

   LITERAL               pass the literal directly to the uncompressed output

   COPY length, offset   go back offset characters in the output and copy length characters forward to the current position.

To distinguish between codewords, the copy length is used. A copy length of zero indicates that the following codeword is a literal codeword. A copy length greater than zero indicates that the following codeword is a copy codeword.
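The effect of the two codeword types on the output can be sketched as follows. This is a simplified illustration in C, not the decoder from the draft: the function names are hypothetical, the output is a plain growing buffer rather than the 32255-byte circular window, and the caller is assumed to guarantee 1 <= offset <= len and enough room in the buffer.

```c
#include <stddef.h>

/* LITERAL: append one byte to the output. Returns the new output length. */
size_t put_literal(unsigned char *out, size_t len, unsigned char c)
{
    out[len] = c;
    return len + 1;
}

/* COPY length, offset: go back `offset` bytes from the current end of the
 * output and copy `length` bytes forward. The copy is done byte by byte on
 * purpose, so that a source range overlapping the bytes being written
 * (length > offset) repeats the pattern instead of reading stale data. */
size_t put_copy(unsigned char *out, size_t len, size_t length, size_t offset)
{
    size_t i;

    for (i = 0; i < length; i++)
        out[len + i] = out[len - offset + i];
    return len + length;
}
```

For example, three literals "abc" followed by a COPY with length 6 and offset 3 leave "abcabcabc" in the buffer; the overlapping case is what lets a short pointer describe a long run.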
To improve copy length encoding, a threshold value of 2 has been subtracted from the original copy length for copy codewords, because the minimum copy length is 3 in this compression scheme. The maximum offset value is set at 32255. Larger offsets offer extremely low improvements in compression (less than 1 percent, typically). No special encoding is done on the LITERAL characters. However, unary encoding is used for the copy length and copy offset values to improve compression. A start-step-stop unary code is used. A (start, step, stop) unary code of the integers is defined as follows: The Nth codeword has N ones followed by a zero followed by a field of size START + (N * STEP). If the field width is equal to STOP then the preceding zero can be omitted. The integers are laid out sequentially through these codewords. For example, (0, 1, 4) would look like:

   Codeword     Range
   0            0
   10x          1-2
   110xx        3-6
   1110xxx      7-14
   1111xxxx     15-30

Below are the actual values used for copy length and copy offset:

The copy length is encoded with a (0, 1, 7) code leading to a maximum copy length of 256 by including the THRESHOLD value of 2.

   Codeword          Range
   0                 0
   10x               3-4
   110xx             5-8
   1110xxx           9-16
   11110xxxx         17-32
   111110xxxxx       33-64
   1111110xxxxxx     65-128
   1111111xxxxxxx    129-256

The copy offset is encoded with a (9, 1, 14) code leading to a maximum copy offset of 32255. Offset 0 is reserved as an end of compressed data flag.

   Codeword               Range
   0xxxxxxxxx             0-511
   10xxxxxxxxxx           512-1535
   110xxxxxxxxxxx         1536-3583
   1110xxxxxxxxxxxx       3584-7679
   11110xxxxxxxxxxxxx     7680-15871
   11111xxxxxxxxxxxxxx    15872-32255

The 0 has been chosen to signal the start of the field for ease of encoding. The stop values are useful in the encoding to prevent out of range values for the lengths and offsets, as well as shortening some codes by one bit.
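As a cross-check on the tables above, here is a small sketch in C of the encoding direction of a (start, step, stop) code. It writes the codeword as a string of '0' and '1' characters purely for readability; the function name and that string representation are assumptions of this sketch, and it further assumes that step is positive and that stop - start is a multiple of step, as is the case for both codes used in the memo.

```c
#include <stddef.h>

/* Write the (start, step, stop) codeword for `value` into bits as a
 * NUL-terminated string of '0'/'1' characters.
 * Returns the codeword length in bits, or -1 if the value is out of range
 * or the buffer is too small. */
int sss_encode(unsigned long value, int start, int step, int stop,
               char *bits, size_t bitsz)
{
    unsigned long base = 0;   /* first integer covered by the current codeword class */
    int width, ones = 0, i;
    size_t o = 0;

    for (width = start; width <= stop; width += step, ones++) {
        unsigned long span = 1UL << width;   /* integers covered by this class */

        if (value - base < span) {           /* value >= base holds by construction */
            size_t need = (size_t)ones + (width != stop) + (size_t)width + 1;

            if (need > bitsz)
                return -1;
            for (i = 0; i < ones; i++)       /* N ones select the Nth class */
                bits[o++] = '1';
            if (width != stop)               /* terminating zero, omitted in the last class */
                bits[o++] = '0';
            for (i = width - 1; i >= 0; i--) /* field, most significant bit first */
                bits[o++] = (((value - base) >> i) & 1) ? '1' : '0';
            bits[o] = '\0';
            return (int)o;
        }
        base += span;
    }
    return -1;                               /* value beyond the last class */
}
```

With the (0, 1, 4) example, 15 encodes as 11110000 and 31 is rejected as out of range. With the (9, 1, 14) offset code, 3584 is the first value that needs the 1110 prefix, which agrees with the 3584-7679 row of the offset table.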
The worst case compression using this scheme is a 1/8 increase in size of the encoded data. (One zero bit followed by 8 character bits). After the character encoding, the worst case ratio is 3/2 to the original data. The minimum copy length of 3 has been chosen because the worst case copy length and offset is 3 bits (3) and 19 bits (32255) for a total of 22 bits to encode a 3 character string (24 bits). Jung, Ullmann [Page 4] RFC DRAFT LZJU90: Compressed Encoding for Binary Mail January 1991 4. The Decoder As mentioned previously, the compression is defined by the decoder. Any encoder that produced output that is correctly decoded is by definition correct. The following is an implementation of the decoder, written more for clarity and as much portability as possible, rather than for maximum speed. When optimized for a specific environment, it will run significantly faster. /* LZJU 90 Decoding program */ #include typedef unsigned char uchar; typedef unsigned int uint; #define N 32255 #define THRESHOLD 3 #define STRTP 9 #define STEPP 1 #define STOPP 14 #define STRTL 0 #define STEPL 1 #define STOPL 7 static FILE *in; static FILE *out; static int getbuf; static int getlen; static long in_count; static long out_count; static long crc; static long crctable[256]; static uchar xxcodes[] = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\ abcdefghijklmnopqrstuvwxyz"; static uchar ddcodes[256]; static uchar text[N]; Jung, Ullmann [Page 5] RFC DRAFT LZJU90: Compressed Encoding for Binary Mail January 1991 #define CRCPOLY 0xEDB88320 #define CRC_MASK 0xFFFFFFFF #define UPDATE_CRC(crc, c) \ crc = crctable[((uchar)(crc) ^ (uchar)(c)) & 0xFF] \ ^ (crc >> 8) #define START_RECD "* LZJU90" void MakeCrctable() /* Initialize CRC-32 table */ { uint i, j; long r; for (i = 0; i <= 255; i++) { r = i; for (j = 8; j > 0; j--) { if (r & 1) r = (r >> 1) ^ CRCPOLY; else r >>= 1; } crctable[i] = r; } } int GetXX() /* Get xxcode and translate */ { int c; do { if ((c = fgetc(in)) == EOF) c = 0; } while (c 
                 == '\n');
        in_count++;
        return ddcodes[c];
    }

    int GetBit()            /* Get one bit from input buffer */
    {
        int c;

        while (getlen <= 0) {
            c = GetXX();
            getbuf |= c << (10-getlen);
            getlen += 6;
        }
        c = (getbuf & 0x8000) != 0;
        getbuf <<= 1;
        getbuf &= 0xFFFF;
        getlen--;
        return(c);
    }

    int GetBits(len)        /* Get len bits */
    int len;
    {
        int c;

        while (getlen <= 10) {
            c = GetXX();
            getbuf |= c << (10-getlen);
            getlen += 6;
        }
        if (getlen < len) {
            c = (uint)getbuf >> (16-len);
            getbuf = GetXX();
            c |= getbuf >> (6+getlen-len);
            getbuf <<= (10+len-getlen);
            getbuf &= 0xFFFF;
            getlen -= len - 6;
        }
        else {
            c = (uint)getbuf >> (16-len);
            getbuf <<= len;
            getbuf &= 0xFFFF;
            getlen -= len;
        }
        return(c);
    }

    int DecodePosition()    /* Decode offset position pointer */
    {
        int c;
        int width;
        int plus;
        int pwr;

        plus = 0;
        pwr = 1 << STRTP;
        for (width = STRTP; width < STOPP; width += STEPP) {
            c = GetBit();
            if (c == 0)
                break;
            plus += pwr;
            pwr <<= 1;
        }
        if (width != 0)
            c = GetBits(width);
        c += plus;
        return(c);
    }

    int DecodeLength()      /* Decode code length */
    {
        int c;
        int width;
        int plus;
        int pwr;

        plus = 0;
        pwr = 1 << STRTL;
        for (width = STRTL; width < STOPL; width += STEPL) {
            c = GetBit();
            if (c == 0)
                break;
            plus += pwr;
            pwr <<= 1;
        }
        if (width != 0)
            c = GetBits(width);
        c += plus;
        return(c);
    }

    void InitCodes()        /* Initialize decode table */
    {
        int i;

        for (i = 0; i < 256; i++) ddcodes[i] = 0;
        for (i = 0; i < 64; i++) ddcodes[xxcodes[i]] = i;
        return;
    }

    main(ac, av)            /* main program */
    int ac;
    char **av;
    {
        int r;
        int j, k;
        int c;
        int pos;
        char buf[80];
        char name[3];
        long num, bytes;

        if (ac < 3) {
            fprintf(stderr, "usage: judecode in out\n");
            exit(1);
        }
        in = fopen(av[1], "r");
        if (!in){
            fprintf(stderr, "Can't open %s\n", av[1]);
            exit(1);
        }
        out = fopen(av[2], "w");
        if (!out) {
            fprintf(stderr, "Can't open %s\n", av[2]);
            fclose(in);
            exit(1);
        }
        while (1) {
            if (fgets(buf, sizeof(buf), in) == NULL) {
                fprintf(stderr, "Unexpected EOF\n");
                exit(1);
            }
            if (strncmp(buf, START_RECD, strlen(START_RECD)) == 0)
                break;
        }

        in_count = 0;
        out_count = 0;
        getbuf = 0;
        getlen = 0;

        InitCodes();
        MakeCrctable();

        crc = CRC_MASK;
        r = 0;

        while (feof(in) == 0) {
            c = DecodeLength();
            if (c == 0) {
                c = GetBits(8);
                UPDATE_CRC(crc, c);
                out_count++;
                text[r] = c;
                fputc(c, out);
                if (++r >= N)
                    r = 0;
            }
            else {
                pos = DecodePosition();
                if (pos == 0)
                    break;
                pos--;
                j = c + THRESHOLD - 1;
                pos = r - pos - 1;
                if (pos < 0)
                    pos += N;
                for (k = 0; k < j; k++) {
                    c = text[pos];
                    text[r] = c;
                    UPDATE_CRC(crc, c);
                    out_count++;
                    fputc(c, out);
                    if (++r >= N)
                        r = 0;
                    if (++pos >= N)
                        pos = 0;
                }
            }
        }

        fgetc(in); /* skip newline */

        if (fscanf(in, "* %ld %lX", &bytes, &num) != 2) {
            fprintf(stderr, "CRC record not found\n");
            exit(1);
        }
        else if (crc != num) {
            fprintf(stderr, "CRC error, expected %lX, found %lX\n",
                    crc, num);
            exit(1);
        }
        else if (bytes != out_count) {
            fprintf(stderr, "File size error, expected %lu, found %lu\n",
                    bytes, out_count);
            exit(1);
        }
        else
            fprintf(stderr, "File decoded to %lu bytes correctly\n",
                    out_count);

        fclose(in);
        fclose(out);
        return;
    }

5. An example of an Encoder

Many algorithms are possible for the encoder, with different
tradeoffs between speed, size, and complexity. The following is a
simple example program which is fairly efficient; more sophisticated
implementations will run much faster, and in some cases produce
somewhat better compression.

This example also shows that the encoder need not use the entire
window available. Not using the full window costs a small amount of
compression, but can greatly increase the speed of some algorithms.
    /* LZJU 90 Encoding program */

    #include <stdio.h>
    #include <stdlib.h>     /* exit(), abort() */

    typedef unsigned char uchar;
    typedef unsigned int uint;

    #define N 8192          /* Size of window buffer */
    #define F 256           /* Size of look-ahead buffer */
    #define THRESHOLD 3
    #define NIL N           /* End of tree's node */

    #define STRTP 9
    #define STEPP 1
    #define STOPP 14

    #define STRTL 0
    #define STEPL 1
    #define STOPL 7

    #define CHARSLINE 78

    static FILE *in;
    static FILE *out;

    static int putlen;
    static int putbuf;
    static int char_ct;
    static long in_count;
    static long out_count;
    static long crc;
    static long crctable[256];
    static uchar xxcodes[] =
    "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\
    abcdefghijklmnopqrstuvwxyz";
    static uchar text[N + F - 1];
    static int match_position;
    static int match_length;
    static int lson[N + 1];
    static int rson[N + 257];
    static int dad[N + 1];

    #define CRCPOLY 0xEDB88320
    #define CRC_MASK 0xFFFFFFFF
    #define UPDATE_CRC(crc, c) \
        crc = crctable[((uchar)(crc) ^ (uchar)(c)) & 0xFF] \
              ^ (crc >> 8)

    void MakeCrctable()     /* Initialize CRC-32 table */
    {
        uint i, j;
        long r;

        for (i = 0; i <= 255; i++) {
            r = i;
            for (j = 8; j > 0; j--) {
                if (r & 1)
                    r = (r >> 1) ^ CRCPOLY;
                else
                    r >>= 1;
            }
            crctable[i] = r;
        }
    }

    void PutXX(c)           /* Translate and put xxcode */
    int c;
    {
        c = xxcodes[c & 0x3F];
        if (++char_ct > CHARSLINE) {
            char_ct = 1;
            fputc('\n', out);
        }
        fputc(c, out);
        out_count++;
    }

    void PutBits(c, len)    /* Put rightmost "len" bits of "c" */
    int c, len;
    {
        c <<= 16 - len;
        c &= 0xFFFF;
        putbuf |= (uint) c >> putlen;
        c <<= 16 - putlen;
        c &= 0xFFFF;
        putlen += len;
        while (putlen >= 6) {
            PutXX(putbuf >> 10);
            putlen -= 6;
            putbuf <<= 6;
            putbuf &= 0xFFFF;
            putbuf |= (uint) c >> 10;
            c = 0;
        }
    }

    void EncodePosition(ch) /* Encode offset position pointer */
    int ch;
    {
        int width;
        int prefix;
        int pwr;

        pwr = 1 << STRTP;
        for (width = STRTP; ch >= pwr; width += STEPP, pwr <<= 1)
            ch -= pwr;
        if
           ((prefix = width - STRTP) != 0)
            PutBits(0xffff, prefix);
        if (width < STOPP)
            width++;
        else if (width > STOPP)
            abort();
        PutBits(ch, width);
    }

    void EncodeLength(ch)   /* Encode code length */
    int ch;
    {
        int width;
        int prefix;
        int pwr;

        pwr = 1 << STRTL;
        for (width = STRTL; ch >= pwr; width += STEPL, pwr <<= 1)
            ch -= pwr;
        if ((prefix = width - STRTL) != 0)
            PutBits(0xffff, prefix);
        if (width < STOPL)
            width++;
        else if (width > STOPL)
            abort();
        PutBits(ch, width);
    }

    void InitTree()
    {
        int i;

        for (i = N + 1; i <= N + 256; i++)
            rson[i] = NIL;
        for (i = 0; i < N; i++)
            dad[i] = NIL;
    }

    /*
       Insert string of length F, text[r..r+F-1], into one of the trees
       (text[r]'th tree) and return the longest-match position and length
       via the global variables match_position and match_length.
       If match_length = F, then remove the old node in favor of the new
       one, because the old one will be deleted sooner.
       Note r plays double role, as tree node and position in buffer.
    */
    void InsertNode(r)
    int r;
    {
        int i, cmp, p, c;
        uchar *key, *keyp, *txtp;

        cmp = 1;
        key = &text[r];
        p = N + 1 + key[0];
        rson[r] = lson[r] = NIL;
        match_length = 0;
        for ( ; ; ) {
            if (cmp >= 0) {
                if (rson[p] != NIL)
                    p = rson[p];
                else {
                    rson[p] = r;
                    dad[r] = p;
                    return;
                }
            }
            else {
                if (lson[p] != NIL)
                    p = lson[p];
                else {
                    lson[p] = r;
                    dad[r] = p;
                    return;
                }
            }
            txtp = &text[p];
            keyp = key;
            if ((cmp = *++keyp - *++txtp) != 0)
                continue;
            for (i = 2; i < F; i++)
                if ((cmp = *++keyp - *++txtp) != 0)
                    break;
            if (i > match_length) {
                match_position = ((r - p) & (N - 1));
                if ((match_length = i) >= F)
                    break;
            }
            else if (i == match_length) {
                if ((c = ((r - p) & (N - 1))) < match_position)
                    match_position = c;
            }
        }
        dad[r] = dad[p];
        lson[r] = lson[p];
        rson[r] = rson[p];
        dad[lson[p]] = r;
        dad[rson[p]] = r;
        if (rson[dad[p]] == p)
            rson[dad[p]] = r;
        else
            lson[dad[p]] = r;
        dad[p] = NIL;
    }

    void DeleteNode(p)      /* Delete node p from tree */
    int p;
    {
        int q;

        if (dad[p] == NIL)
            return;
        if (rson[p] == NIL)
            q = lson[p];
        else if (lson[p] == NIL)
            q = rson[p];
        else {
            q = lson[p];
            if (rson[q] != NIL) {
                do {
                    q = rson[q];
                } while (rson[q] != NIL);
                rson[dad[q]] = lson[q];
                dad[lson[q]] = dad[q];
                lson[q] = lson[p];
                dad[lson[p]] = q;
            }
            rson[q] = rson[p];
            dad[rson[p]] = q;
        }
        dad[q] = dad[p];
        if (rson[dad[p]] == p)
            rson[dad[p]] = q;
        else
            lson[dad[p]] = q;
        dad[p] = NIL;
    }

    main(ac, av)            /* main program */
    int ac;
    char **av;
    {
        int r, s, i, c;
        int last_match_length;
        int len;

        if (ac < 3) {
            fprintf(stderr, "usage: juencode in out\n");
            exit(1);
        }
        in = fopen(av[1], "r");
        if (!in) {
            fprintf(stderr, "Can't open %s\n", av[1]);
            exit(1);
        }
        out = fopen(av[2], "w");
        if (!out) {
            fprintf(stderr, "Can't open %s\n", av[2]);
            fclose(in);
            exit(1);
        }

        char_ct = 0;
        in_count =
                   0;
        out_count = 0;
        putbuf = 0;
        putlen = 0;

        MakeCrctable();
        crc = CRC_MASK;

        fprintf(out, "* LZJU90 %s\n", av[1]);

        InitTree();
        r = 0;
        s = 0;

        /* Fill lookahead buffer */
        for (len = 0; len < F && (c = fgetc(in)) != EOF; len++) {
            UPDATE_CRC(crc, c);
            in_count++;
            text[s++] = c;
        }

        while (len > 0) {
            InsertNode(r);
            if (match_length > len)
                match_length = len;
            if (match_length < THRESHOLD) {
                EncodeLength(0);
                PutBits(text[r], 8);
                match_length = 1;
            }
            else {
                EncodeLength(match_length - THRESHOLD + 1);
                EncodePosition(match_position);
            }
            last_match_length = match_length;
            for (i = 0; i < last_match_length &&
                    (c = fgetc(in)) != EOF; i++) {
                UPDATE_CRC(crc, c);
                in_count++;
                DeleteNode(s);
                text[s] = c;
                if (s < F - 1)
                    text[s + N] = c;
                s = (s + 1) & (N - 1);
            }
            while (i++ < last_match_length) {
                DeleteNode(s);
                s = (s + 1) & (N - 1);
                len--;
            }
            r = (r + last_match_length) & (N - 1);
        }

        /* end compression indicator */
        EncodeLength(1);
        EncodePosition(0);
        PutBits(0, 7);

        fprintf(out, "\n* %lu %08lX\n", in_count, crc);
        fprintf(stderr, "Encoded %lu bytes to %lu symbols\n",
                in_count, out_count);

        fclose(in);
        fclose(out);
    }

6. LZJU90 example

The following is an example of an LZJU90 compressed object. Using
this as source for the program in section 4 will reveal what it is.

    * LZJU90 example
    8-mBtWA7WBVZ3dEBtnCNdU2WkE4owW+l4kkaApW+o4Ir0k33Ao4IE4kk
    bYtk1XY618NnCQl+OHQ61d+J8FZBVVCVdClZ2-LUI0v+I4EraItasHbG
    VVg7c8tdk2lCBtr3U86FZANVCdnAcUCNcAcbCMUCdicx0+u4wEETHcRM
    7tZ2-6Btr268-Eh3cUAlmBth2-IUo3As42laIE2Ao4Yq4G-cHHT-wCEU
    6tjBtnAci-I++
    * 190 081E2601

References

[1] David Robinson, Robert L. Ullmann. Encoding Header Field for
    Internet Messages. RFC 1154, Prime Computer, April, 1990.
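
The worst-case figures quoted just before section 4 (9 bits per literal, 22 bits for the shortest copy at the largest offset) can be checked with a small helper. The following is only a sketch and not part of the draft: `code_bits` is a hypothetical function that mirrors the loops in EncodeLength() and EncodePosition(), and it assumes a step of 1, which is how STEPL and STEPP are defined above.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the draft.
 * Returns the number of bits used to send `value` with a start/stop
 * code whose first field is `start` bits wide and whose last field is
 * `stop` bits wide (step assumed to be 1), or -1 if `value` is too
 * large to be represented.
 */
int code_bits(long value, int start, int stop)
{
    int width = start;
    int prefix = 0;
    long pwr = 1L << start;

    assert(value >= 0 && start >= 0 && start <= stop);

    /* Each range that is skipped costs one leading 1 bit. */
    while (value >= pwr) {
        value -= pwr;
        pwr <<= 1;
        width++;
        prefix++;
        if (width > stop)
            return -1;
    }
    /* Every field but the widest one is preceded by a 0 stop bit. */
    if (width < stop)
        return prefix + 1 + width;
    return prefix + width;
}
```

Under these assumptions a literal costs code_bits(0, 0, 7) + 8 = 9 bits, and a 3 character copy at offset 32255 costs code_bits(1, 0, 7) + code_bits(32255, 9, 14) = 3 + 19 = 22 bits, matching the text.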
Jung, Ullmann [Page 19] RFC DRAFT LZJU90: Compressed Encoding for Binary Mail January 1991 Author's Address Robert Jung 2606 Village Road West Norwood, MA 02062 USA Phone: +1 617 769 5999 Email: robjung@world.std.com Robert Ullmann 10-30 Prime Computer, Inc. 500 Old Connecticut Path Framingham, MA 01701 USA Phone: +1 508 620 2800 x1736 Email: Ariel@Relay.Prime.COM Jung, Ullmann [Page 20] From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 16:10:35 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22181; Thu, 25 Apr 91 15:57:34 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22172; Thu, 25 Apr 91 15:57:22 EDT Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA19891; Thu, 25 Apr 91 12:57:18 PDT Received: from vision.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA08997; Thu, 25 Apr 91 12:57:34 PDT Received: by vision.Eng.Sun.COM (4.1/SMI-4.1) id AA08467; Thu, 25 Apr 91 12:57:19 PDT Date: Thu, 25 Apr 91 12:57:19 PDT From: Vincent.Lau@eng.sun.com (Vincent Lau) Message-Id: <9104251957.AA08467@vision.Eng.Sun.COM> To: ietf-822@dimacs.rutgers.edu Subject: Re: Comments on Draft RFC > > I would like to see a stronger statement that implementations *should* > > discard the "prefix" and "postfix" areas. Don't recommend the use of > > them. I am afraid that if a sending UA puts an important (judgment > > call) message in the prefix area, but the receiving UA discards it. > > In my opinion, these 2 UA's are *not* interoperable. > > I'm happy to say they should be discarded. But are you also arguing > against the short textual message saying, in effect, "this is a > multipart message; if you're seeing this, you've got a problem"? That > still seems like a good idea to me. > All I am asking for: RFC-XXXX must state clearly that all UA's and (822<->X.400) gateways should discard them and must not convert these areas into textual body parts (it is IA5text in X.400.) 
These areas should not be exposed to the users. > > Content-Label is set by the *sending* UA (e.g. the sender assigns a > > label "Phone_Message" to an audio typed body part) and it is more > > user-friendlier than Message-ID when used in a reply message. This > > field exists in each body part, but it is optional. Message-ID is > > different from Content-Label that Message-ID refers to the whole > > message and Content-Label refers to a specific body part in a message. > > Ah, but in a multipart message, each "body part" is an encapsulated > message, and can therefore have its own message-id header. Therefore > they are, I believe, functionally identical. I'm also not sold by the > argument to "user-friendliness" because I think that, in either case, > the header field is intended for software use rather than for human > reading. Or is this assumption incorrect? > The value in Content-Label is controlled by the user, but Message-ID is not. The information in Content-Label is intended for human being. -Vincent From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 17:40:36 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26026; Thu, 25 Apr 91 17:11:22 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26019; Thu, 25 Apr 91 17:11:17 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa14081; 25 Apr 91 16:57 EDT Cc: ietf-822@dimacs.rutgers.edu Subject: Re: Comments on Draft RFC In-Reply-To: Your message of "Thu, 25 Apr 91 12:57:19 PDT." <9104251957.AA08467@vision.Eng.Sun.COM> Date: Thu, 25 Apr 91 16:57:02 -0400 From: Greg Vaudreuil Message-Id: <9104251657.aa14081@NRI.NRI.Reston.VA.US> I'm confused. Is there a message-ID header in each body part, even if the body part is not an encapsulated message? This seems to be assumed for the cross-referencing and parallel display we are working on. By my reading, the only headers are content-type and content-encoding in the body part. With questions, Greg V. 
From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 18:40:35 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28121; Thu, 25 Apr 91 18:03:59 EDT Received: from Sun.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28117; Thu, 25 Apr 91 18:03:56 EDT Received: from Eng.Sun.COM (exodus-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1) id AA09208; Thu, 25 Apr 91 15:03:49 PDT Received: from skylark.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1) id AA20301; Thu, 25 Apr 91 15:02:56 PDT Received: by skylark.Eng.Sun.COM (4.1/SMI-4.1) id AA02373; Thu, 25 Apr 91 15:02:27 PDT Date: Thu, 25 Apr 91 15:02:27 PDT From: Neil.Katin@eng.sun.com (Neil Katin) Message-Id: <9104252202.AA02373@skylark.Eng.Sun.COM> To: ietf-822@dimacs.rutgers.edu, nsb@thumper.bellcore.com Subject: Re: Comments on Draft RFC > Date: Thu, 25 Apr 1991 10:35:25 -0400 (EDT) > From: Nathaniel Borenstein > Subject: Re: Comments on Draft RFC ... stuff deleted... > Ah, but in a multipart message, each "body part" is an encapsulated > message, and can therefore have its own message-id header. I've been having trouble agreeing to some of the proposals being floated with respect to just encapsulating everything recursively, and I think that this sentence from Nathaniel captures where I part ways with his conceptualization. I think of body parts in the X.400 sense -- a body part is *not* just a message, it is an entirely different data-type. The set of headers that are "legal" in a message is much broader than in a body part. In fact, "body part headers" is just a typographical convenience to encode information; they really come from a different name space and are interpreted on their own. I will freely admit that, if we were starting with a clean sheet of paper we could define body parts in either way, but since X.400 interoperability is (should be?) a major goal, it makes sense to define the architecture of the message structure to be compatible with X.400 body parts. 
Does anyone actively want to do something that is not compatible with X.400 in this area? Are the gains worth it? Neil From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 18:55:26 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28210; Thu, 25 Apr 91 18:09:27 EDT Received: from TWG.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28202; Thu, 25 Apr 91 18:09:12 EDT Received: from Obelix.twg.com by twg.com with SMTP ; Thu, 25 Apr 91 15:05:57 PST Received: from obelix.twg.com by Obelix.TWG.COM id aa15308; 25 Apr 91 15:05 PDT To: Nathaniel Borenstein Cc: ietf-822@dimacs.rutgers.edu Subject: Re: UUencode In-Reply-To: Your message of Thu, 25 Apr 91 10:38:51 -0400. Date: Thu, 25 Apr 91 15:05:42 -0700 From: David Herron Message-Id: <9104251505.aa15308@Obelix.TWG.COM> > Uuencode implementations also differ widely in their (in)ability to work > in a pipe, which can be a problem if you want to write an implementation > that simply pipes a body through uudecode. That's another way in which > uuencode/uudecode are "non-standard". Ok.. that's a little problem. It's just as easy to embed uu{en,de}code in a program as it is to embed Base64. In fact it might be the same actual code, but different encoding tables. At any rate the user agent we're developing includes uu{en,de}code embedded in the program. It was no trouble to do since it just slid right in. (Hadda do this since the UA is also portable to PCs and Macs). I have no qualms with supporting Base{64,85}. Technically they are better than uuencode. However uuencode is very very strongly entrenched and widely implemented. One time I read a story from Dave Crocker about some social engineering he tried putting into RFC-822. Generating addresses like To: Nathaniel Borenstein were encouraged over To: (Nathaniel Borenstein) nsb@thumper.bellcore.com or To: nsb@thumper.bellcore.com Since the first looks much nicer. But it didn't work. 
I mention this because your stance on uuencode smacks of this sort of
social engineering.

Hmm.. I haven't noticed. Am I the only person supporting uuencode?
If so I'll shut up now..

David

From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 19:10:35 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28978; Thu, 25 Apr 91 18:26:11 EDT
Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA28974; Thu, 25 Apr 91 18:26:08 EDT
Received: from NRI by NRI.NRI.Reston.VA.US id aa15328; 25 Apr 91 18:17 EDT
Org: Corp. for National Research Initiatives
Phone: (703) 620-8990 ; Fax: (703) 620-0913
To: ietf-smtp@dimacs.rutgers.edu, ietf-822@dimacs.rutgers.edu
Cc: gvaudre@nri.reston.va.us
Subject: INTERIM WORKING GROUP MEETINGS
Date: Thu, 25 Apr 91 18:17:34 -0400
From: Greg Vaudreuil
Message-Id: <9104251817.aa15328@NRI.NRI.Reston.VA.US>

Folks,

I would like to discharge my action item to schedule a meeting of the
mail extension working group(s). I plan to hold two meetings, each
focusing primarily on the topics of one of the two mailing lists.

The first meeting will be a Video Teleconference to discuss and
advance the message format document. A goal of this meeting is to
resolve enough outstanding issues to be able to post an Internet
Draft before the July IETF Meeting in Atlanta. There are conferencing
facilities available in the following cities: London, Boston,
Washington, Los Angeles, and the San Francisco Bay Area.

The second meeting proposed will be a face to face meeting in
Copenhagen, Denmark in conjunction with INET 91. This meeting will
focus primarily on the SMTP extensions and the interaction and
interoperation of 7 bit and 8 bit mail transport. This meeting is
planned to allow a large community of 8 bit mail users to give direct
input to this effort. This meeting in Copenhagen presents an
opportunity to increase the participation of the many European
participants.
This is made possible by the relatively large attendance at this
conference by individuals who have interests in the work of this
group.

Agendas will be distributed in future mail messages. If you are
interested in attending a video teleconference, please let me know,
and send me the open dates between May 13-25th. If you are able to
attend a face to face meeting in conjunction with INET 91, probably
June 20th and 21st, please let me know.

Greg Vaudreuil
Chair, Internet Mail Extensions Working Group(s?)

From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 19:40:36 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00330; Thu, 25 Apr 91 18:55:30 EDT
Received: from TWG.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA00304; Thu, 25 Apr 91 18:54:57 EDT
Received: from Obelix.twg.com by twg.com with SMTP ; Thu, 25 Apr 91 15:34:00 PST
Received: from obelix.twg.com by Obelix.TWG.COM id aa16019; 25 Apr 91 15:33 PDT
To: Nathaniel Borenstein
Cc: ietf-822@dimacs.rutgers.edu
Subject: multiple Content-Encoding:'s
In-Reply-To: Your message of Thu, 25 Apr 91 10:38:51 -0400.
Date: Thu, 25 Apr 91 15:33:35 -0700
From: David Herron
Message-Id: <9104251533.aa16019@Obelix.TWG.COM>

> I agree that the SYNTAX of a cascaded content-encoding header is not
> problematic. The semantics are even pretty clear, too. The
> implementation, though more complex than the simple content-encoding
> field, is also not beyond most programmers' abilities.

Ok, good.. ;-)

> What I don't
> see, however, is why the cascading is necessary. If it isn't necessary,
> the fact that it isn't too terribly complex is irrelevant -- it's still
> unnecessary!

My general policy is to not cause things to be limited. But you have
to be careful to keep things from being complicated, or hard to
implement, etc.

As for *needing* the multiple encodings .. I admit to not having any
particular plan (right now) for use of multiple encodings.
It may well be the sort of thing like "Nobody will ever need more
than 64K of memory". That is, it may burn us later on.

There is a convention used for file naming where things like

    file.tar
    file.tar.Z
    file.tar.Z.uu
    file.tar.Z.uu.xaa
    file.tar.Z.uu.xab
    file.tar.Z.uu.xac

all make sense. This is what I intend to do with multiple
content-encodings.

I did come up with a couple after a few moments thought. First, of
course, is

    Content-Type: audio u-law
    Content-Encoding: uuencode, compress

Proposal: List the most recent encoding first in the list and the
first encoding last in the list.

Suppose you're on a system with record-oriented files. Since I'm at
TWG that means: VMS. RMS files aren't streams of bytes. The draft-RFC
talks about encodings as if they are encoding byte-streams. So to
successfully encode an RMS file the record attributes, and all sorts
of what-not which I don't care to understand, must be turned into a
byte stream.

    Content-Type: X-DEC-RMS
    Content-Encoding: uuencode, compress, X-DEC-RMS-TO-STREAM

Suppose I'm a nervous nelly and don't trust that the MTA's will in
fact reliably deliver bytes.

    Content-Type: text/MAILASCII
    Content-Encoding: uuencode, compress, X-Checksum-Encapsulation

Or

    Content-Type: X-DEC-RMS
    Content-Encoding: uuencode, compress, X-Checksum-Encapsulate,
        X-DEC-RMS-TO-STREAM

Suppose the file is larger than is convenient or conventional to send
through mail. (Sendmail tends to bounce >100,000 byte msgs)

    Content-Type: X-shar
    Content-Encoding: split part-001, uuencode, compress

    Content-Type: X-shar
    Content-Encoding: split part-002, uuencode, compress

    Content-Type: X-shar
    Content-Encoding: split part-003, uuencode, compress

(X-shar meaning "shell archive")

The UA could ask

    This message is a bit large, would you like to split it between
    multiple messages?

And on reception

    Hmm.. we seem to be missing part-002 of this message.
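
With the ordering proposed above (most recent encoding listed first), a receiver can undo the encodings by walking the list from left to right. The following is only a sketch under that assumption; `split_encodings`, `MAX_ENC` and `MAX_NAME` are hypothetical names and do not come from any draft, and the header value is assumed to be a plain comma-separated list.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENC  8      /* hypothetical limit on cascaded encodings */
#define MAX_NAME 64     /* hypothetical limit on one encoding name */

/*
 * Hypothetical helper: split the value of a cascaded Content-Encoding
 * header into its comma-separated items, trimming surrounding blanks.
 * names[0] is the encoding to undo first, names[n-1] the one to undo
 * last. Returns the number of items, or -1 if a limit is exceeded.
 */
int split_encodings(const char *value, char names[][MAX_NAME], int max)
{
    int n = 0;

    assert(value != NULL && names != NULL);

    while (*value != '\0') {
        const char *end;
        size_t len;

        /* Skip separators and leading blanks. */
        while (*value == ',' || isspace((unsigned char)*value))
            value++;
        if (*value == '\0')
            break;

        /* The item runs up to the next comma or the end of string. */
        end = strchr(value, ',');
        if (end == NULL)
            end = value + strlen(value);
        len = (size_t)(end - value);

        /* Drop trailing blanks. */
        while (len > 0 && isspace((unsigned char)value[len - 1]))
            len--;

        if (n >= max || len >= MAX_NAME)
            return -1;
        memcpy(names[n], value, len);
        names[n][len] = '\0';
        n++;
        value = end;
    }
    return n;
}
```

For "uuencode, compress, X-DEC-RMS-TO-STREAM" this yields three items, and a decoder would apply uudecode, then uncompress, then the stream-to-RMS conversion, in that order. An item such as "split part-001" is returned whole; separating the parameter from the name is left out of this sketch.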
David From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 20:51:29 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02652; Thu, 25 Apr 91 20:06:11 EDT Received: from FRIGGA.CLAREMONT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA02638; Thu, 25 Apr 91 20:05:56 EDT Date: Thu, 25 Apr 1991 17:05 PDT From: "Ned Freed, Postmaster" Subject: Re: multiple body parts of different encodings To: vixie@pa.dec.com Cc: ietf-822@dimacs.rutgers.edu Message-Id: X-Envelope-To: ietf-822@dimacs.rutgers.edu X-Vms-To: IN%"vixie@pa.dec.com" X-Vms-Cc: IN%"ietf-822@dimacs.rutgers.edu" Paul A Vixie writes: > to that end :-), let me spec a little of this out: > message :: headers blank-line body > headers :: header [...] > body :: line [...] Sounds like RFC822 to me. I don't think anyone is proposing that RFC822 be changed in an incompatible way, only extended. Since RFC822 covers this spec as well as a lot of other stuff, why not just say we're sticking with the framework defined there? > i'd like to keep the name part of the header (everything to the left of the > colon) spec'd to 7-bit ASCII. i'd like all the reserved words in complex > headers like "received:" to 7-bit ASCII. actual text (like the subject > field or the full-name/comment parts of to/cc/from) should somehow be > allowed to be 8-bit. i don't know what to suggest for the stuff inside of > in to/cc/from. the restrictions on domain names (anything to > the right of an @) should be whatever DNS spec's, which is probably 7-bit > ASCII. While the business about holding header labels to 7 bit is not explicitly mentioned in RFC-XXXX, it is implicitly there, since the existing standards say they have to be 7 bit and RFC-XXXX does not extend this. > without saying anything about body parts or encodings, i'm already into > an 8-bit transport. if a message that has 8-bit data in its Subject: field > needs to be sent over a 7-bit transport, it has to be encoded. 
this > encoding needs to be something that a user or user-agent can make sense > of, since it may never reach another 8-bit transport and even if it does > i'm not sure it should be decoded back into 8-bit data. we can argue that > one. an encoding like \NNN where NNN is an octal number would suit those > of us in the UNIX(tm) community pretty well but we aren't the whole world > and no doubt a Norwegian recipient would rather not see a \221 where an > accented character would normally appear. This has been hashed out endlessly; to separate the requirements from the implied implementation, we obviously want these capabilities, and this sort of approach seems generally like the right direction to be heading. > we may want to consider a new > encoding enumeration which is painful to generate or tear down but which > has equivilences which are chosen to be readable in their encoded form. > like i said, this can be argued. See RFC-XXXX's scheme for doing this. > the headers will have to have some kind of magic cookie added to them > when a message (headers and/or body) has 8-bit data in it. this cookie > can be munched slightly when/if a transport or user-agent needs to > encode it into the "readable 7-bit" notation i mentioned earlier. i > believe that the presence of this magic cookie (really "a new header") > should tell anyone who cares about the distinction, that this message > conforms to the newer mail RFC's and all else that that may imply. RFC-XXXX uses one approach that meets these criteria; without exception all the counter-proposals for alternate approaches I've seen differ only in the details (Charset: header, charset in first Content-Type field, charset as a part of first Content-Type field, etc.) > let's try to agree on a framework with arguable variables, and then > argue about the variables. does anyone think that what i've said above > will work? 
(it's a restatement of what several other people have said,
> so i know that *somebody* thinks it's reasonable). does anyone know a
> reason why we cannot or should not start with the above framework?

Not only is it a restatement, I think almost all the proposals and
amendments currently being debated meet these criteria as well. In
other words, I think there's tacit acceptance of your set of criteria
already, and in fact there's general acceptance of considerably more
than what you've laid out, and we're well into the discussion of how
the various details get resolved.

If anyone disagrees with this broad outline, I'd like to hear about
it. (This assumes that the SMTP extensions discussion is separate,
which presently it is -- there's no general acceptance of much of
anything at that level.)

> Paul Vixie
> DEC Western Research Lab
> Palo Alto, California, USA ...!decwrl!vixie ...!vixie!paul

Ned

From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 21:10:36 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04206; Thu, 25 Apr 91 20:57:32 EDT
Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA04202; Thu, 25 Apr 91 20:57:24 EDT
Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA25353; Fri, 26 Apr 91 09:57:42 +0900
Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA00916; Fri, 26 Apr 91 09:56:38 +0900
Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA14960; Fri, 26 Apr 91 09:53:09 JST
Return-Path:
Message-Id: <9104260053.AA14960@sran8.sra.co.jp>
Reply-To: erik@sra.co.jp
From: erik@sra.co.jp (Erik M. van der Poel)
To: ietf-822@dimacs.rutgers.edu
Subject: Re: UUencode
Date: Fri, 26 Apr 91 09:53:07 +0900
Sender: erik@sran8.sra.co.jp

David Herron writes:
> Am I the only person supporting uuencode?

Nope. I would also like to support uuencode. As you say, it is very
widely known, implemented, and installed.
It is very easy for a receiver to apply uudecode even if he/she
doesn't have a new super-duper UA.

As far as pipable uuencoders are concerned, I would suggest that we
specify in the RFC that only non-pipable uuencode format may be used.
(And maybe add some other details about end-of-line spaces, etc.)
This way, implementors of new UAs will be forced to use a portable
type of uuencode, and receivers with old UAs will be able to use
their ordinary uuencode program by hand.

However, if there is a consensus that the new RFC should attempt to
force users to get new UAs, maybe we should not include uuencode.

Erik

From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 22:10:37 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05658; Thu, 25 Apr 91 21:55:06 EDT
Received: from FRIGGA.CLAREMONT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA05652; Thu, 25 Apr 91 21:55:00 EDT
Date: Thu, 25 Apr 1991 18:54 PDT
From: "Ned Freed, Postmaster"
Subject: Re: Some comments to new draft
To: david@twg.com
Cc: ietf-822@dimacs.rutgers.edu
Message-Id:
X-Envelope-To: ietf-822@dimacs.rutgers.edu
X-Vms-To: IN%"david@twg.com"
X-Vms-Cc: IN%"ietf-822@dimacs.rutgers.edu"

An important philosophical point needs to be made explicit here.
There are two reasons for standardizing a given content type:

(1) Functionality. Certain content-types clearly need to be supported
    by as many UAs as possible (note that "support" of a content-type
    by an MTA is something of a non sequitur; even support in
    gateways for conversions is somewhat problematic). Minimal
    interoperability is a requirement, therefore support for some
    subset of all the content-types should also be a requirement.

(2) Compatibility. A large class of content-types exists for which
    universal support cannot be provided, but universal compatibility
    can be. In other words, if I decide to send something produced by
    MicroSoft Word, it would be nice if someone else with MicroSoft
    Word could receive it and interpret it correctly.
We're getting confused here about which of these reasons applies to which content-types. This is our (Nathaniel's and my) fault; we need to make this much clearer in the RFC. However, our position (speaking for Nathaniel here without his consent, I admit) is that if either one of these reasons applies to a given content-type, it warrants inclusion (perhaps not in this RFC, but in some RFC somewhere). Thus, the conclusion that we're requiring support for all of the content-types mentioned in the RFC is invalid. We aren't. What we are saying is "if you export material of this type, do it this way". We do need to indicate whether or not support for various content-types is required. What about a required/recommended/optional indicator? I also need to tighten up the specification of how X.400 material gets translated. I thought that since X.400 immediately implies a representation format (via the Basic Encoding Rules), it was obvious that the various chunks of ASN.1 referenced in the RFC get encoded using the BER and then treated as a byte stream to be further encoded in whatever printable form the Content-encoding: header says to use. (Incidentally, does anyone have a feeling for how well various compression schemes work on ASN.1?) We've obviously restricted ourselves to a very limited subset of the possible content-types. There's a simple reason for doing this -- finite time and finite expertise. I'm not going to standardize the encoding for MicroSoft Word, for instance. For all I know there's a printable representation of MicroSoft Word documents in some markup format that should be used here. (I doubt it, but I'm not sure.) And is the representation the same on PCs, Macs, and whatever else can run it? I sure don't know. This didn't stop us from including various content-types that we're not real authorities on (at least I'm not an authority on X.400, although I'm involved in the standards process and I've worked with X.400 fairly heavily in various forms). 
There are certain types of data that we feel really need to be in there, and we covered them as best we could. We're working with the ODA gurus to figure out how ODA will fit into all this, for example, since neither of us has any ODA expertise. And hopefully our mistakes will get corrected as the document continues to evolve. Ned P.S. My use of MicroSoft Word is entirely an example; I don't have strong feelings about this product one way or the other. From owner-ietf-822@dimacs.rutgers.edu Thu Apr 25 23:40:40 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07196; Thu, 25 Apr 91 22:45:10 EDT Received: from FRIGGA.CLAREMONT.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA07190; Thu, 25 Apr 91 22:45:05 EDT Date: Thu, 25 Apr 1991 19:44 PDT From: "Ned Freed, Postmaster" Subject: Re: Some comments to new draft To: Dan.Oscarsson@dna.lth.se Cc: ietf-822@dimacs.rutgers.edu Message-Id: X-Envelope-To: ietf-822@dimacs.rutgers.edu X-Vms-To: IN%"Dan.Oscarsson@dna.lth.se" X-Vms-Cc: IN%"ietf-822@dimacs.rutgers.edu" Dan.Oscarsson writes: > Part 2: Content-Type header > There are to many of them. > There should be a few general ones, internation standards if possible. > Why have scribe,tex,troff,dvi,pbm,pgm,ppm? They are very special. > Why not then FrameMaker? > I am not sure that the pbm, p... are the best standard for images. > We should use an international standard (no fax standard, please) for > images. > There ought to be a few iso character set standards defined. > ISO 8859-1 and ISO 10646. > And MAILASCII, ISO 8859-1 and ISO 10646 should be the recommended > types for text bodies. I think anything that makes sense to interchange should be eligible for its own content-type (see my recent posting on this issue). There's no reason not to do FrameMaker. But I'm not competent to write down the details of the representation. If you are and you want to do it, by all means write it up in a companion RFC. 
I'd love to see a bunch of these RFCs in the future. > About the postscript type: > Why talk about laserpreps? This is something Apple is using? > Each postscript body should be self-containd. It may not depend on some > separate code having been loaded into the printer. Like it or not, laserpreps are a fact of life, and it is hard to ignore the fact that a majority of PostScript use is NOT self-contained because of laserprep (Macs accounted for the majority of PostScript use the last time I researched it -- it may not be true these days). I don't like it. In fact, I hate it, since it is a real mess to deal with in printer software. But my not liking it does not make the problem go away, and the bottom line is that this information is often vital to the correct interpretation of a document. Omitting it because of its inelegance merely forces people to put it back in using a nonstandard mechanism, and now our purity has caused a loss of interoperability. If you think the problem can be solved by specifying that PostScript must be self-contained and cannot depend on laserpreps, I encourage you to espouse this view in the Mac community. You ain't seen hostile till you do something like this... > Part 3.1: Quoted-Printable > Why use hexadecimal in Style #1? > If style #1 is used for readability I prefer octal codes. > If style #1 is for efficiency each 4 bits could be coded as "A"+value, > this gives faster encoding/decoding. Yes, and if I fill out my tax return in base -36, I have to write a lot less (using a negative base lets you write any integer without a sign) ;-) But then nobody can read it, and I get in a lot of trouble. There are real advantages to sticking with a format that's readily comprehensible. I could have figured out style #1 without ever having seen the RFC. And who cannot afford a 16-byte lookup table and the time to consult it? 
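To make the lookup-table point concrete, here is a minimal sketch of a hexadecimal escape in the spirit of style #1. The draft's exact syntax is not quoted in this thread, so the "=" escape character, the printable range, and the function names below are assumptions made purely for illustration.

```python
# Sketch of a hex-escape encoding in the spirit of "style #1".
# ASSUMPTIONS (not taken from the draft): "=" is the escape character,
# and bytes 32..126 other than the escape itself pass through unchanged.

HEX_DIGITS = "0123456789ABCDEF"  # the 16-entry lookup table


def encode_hex_escape(data: bytes, escape: str = "=") -> str:
    """Replace every non-printable or escape byte with escape + two hex digits."""
    out = []
    for byte in data:
        if 32 <= byte < 127 and chr(byte) != escape:
            out.append(chr(byte))
        else:
            # High nibble, then low nibble, each looked up in the table.
            out.append(escape + HEX_DIGITS[byte >> 4] + HEX_DIGITS[byte & 0x0F])
    return "".join(out)


def decode_hex_escape(text: str, escape: str = "=") -> bytes:
    """Inverse of encode_hex_escape; expects two hex digits after each escape."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == escape:
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)
```

A reader who has never seen the specification can still guess that "caf=E9" hides a single byte with value E9, which is the readability argument made above.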
> Dan Ned From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 00:40:40 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10158; Fri, 26 Apr 91 00:13:46 EDT Received: from TWG.COM by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA10141; Fri, 26 Apr 91 00:13:02 EDT Received: from Obelix.twg.com by twg.com with SMTP ; Thu, 25 Apr 91 20:54:21 PST Received: from obelix.twg.com by Obelix.TWG.COM id aa17224; 25 Apr 91 17:09 PDT To: ietf-822@dimacs.rutgers.edu Subject: Re: UUencode In-Reply-To: Your message of Thu, 25 Apr 91 15:05:42 -0700. <9104251505.aa15308@Obelix.TWG.COM> Date: Thu, 25 Apr 91 17:09:41 -0700 From: David Herron Message-Id: <9104251709.aa17224@Obelix.TWG.COM> > I mention this because your stance on uuencode smacks of this > sort of social engineering. Just occurred to me that this might be read to sound as if it were a flame. It is not.. From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 06:10:45 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16025; Fri, 26 Apr 91 06:07:07 EDT Received: from srawgw.sra.co.jp by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA16021; Fri, 26 Apr 91 06:06:56 EDT Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4) id AA27984; Fri, 26 Apr 91 19:07:12 +0900 Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW) id AA06155; Fri, 26 Apr 91 19:06:08 +0900 Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ) id AA15461; Fri, 26 Apr 91 19:02:38 JST Return-Path: Message-Id: <9104261002.AA15461@sran8.sra.co.jp> Reply-To: erik@sra.co.jp From: erik@sra.co.jp (Erik M. van der Poel) To: ietf-822@dimacs.rutgers.edu Subject: Re: support for all content types Date: Fri, 26 Apr 91 19:02:35 +0900 Sender: erik@sran8.sra.co.jp Randall Atkinson writes: > Greg Vaudreuil writes: > % This is very much a message format issue. RFC explicity addressed > % this issue. It said, "you must use ASCII". If it is not a 822 issue, > % then is it an topic for a separate implementor agreement? 
NETF > % specified use of latin-1 on there networks, NSFnet specifies ISO 10646 > % for the US scientific community, and Japan specifies 2022-n for > % internal use and ISO 10646 for external use.... This is a nightmare > % that the IETF/IAB/Internet community has never had to deal with. > > This seems to indicate that the RFC should at least include type > definitions for the sets in my paragraph [below] simply for reason of > inter-operability with the above networks. Wait a second. You aren't assuming that what Greg said about NETF, NSFnet and Japan is true, are you? Greg was just hypothesizing. Japan has not specified 2022 for internal use and 10646 for external use. Japan has not specified anything for internal use (i.e. within organizations), but it has specified a form of ISO 2022 for external use. > This seems to present a case for restricting the draft RFC's type > definitions to only the US-ASCII (X3.4-1986), ISO 646-N, ISO 8859-N, > and ISO DIS 10646 standards. This would be easier to implement as > well as addressing the problems of Internationalisation. Why not include the Japanese form of ISO 2022 in your list? It's certainly not any harder to `implement' (whatever *that* means) than 10646, and moreover, there's a large installed base of software in Japan that understands this encoding. Well, perhaps some of the non-Japanese on this list would object to an RFC that says that ISO 2022 *must* be supported, but I think it is important to remember what Ned wrote about this: # Thus, the conclusion that we're requiring support for all of the # content-types mentioned in the RFC is invalid. We aren't. What we are # saying is "if you export material of this type, do it this way". 
Erik From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 08:10:51 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA18996; Fri, 26 Apr 91 08:04:14 EDT Received: from dkuug.dk by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA18981; Fri, 26 Apr 91 08:03:47 EDT Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8) id AA14361; Fri, 26 Apr 91 14:03:38 +0200 Date: Fri, 26 Apr 91 14:03:38 +0200 From: Keld J|rn Simonsen Message-Id: <9104261203.AA14361@dkuug.dk> To: erik@sra.co.jp, ietf-822@dimacs.rutgers.edu Subject: Re: support for all content types X-Charset: ASCII X-Char-Esc: 29 Erik van der Poel writes: > Randall Atkinson writes: > > Greg Vaudreuil writes: > > % This is very much a message format issue. RFC explicity addressed > > % this issue. It said, "you must use ASCII". If it is not a 822 issue, > > % then is it an topic for a separate implementor agreement? NETF > > % specified use of latin-1 on there networks, NSFnet specifies ISO 10646 > > % for the US scientific community, and Japan specifies 2022-n for > > % internal use and ISO 10646 for external use.... This is a nightmare > > % that the IETF/IAB/Internet community has never had to deal with. > > > > This seems to indicate that the RFC should at least include type > > definitions for the sets in my paragraph [below] simply for reason of > > inter-operability with the above networks. > > Wait a second. You aren't assuming that what Greg said about NETF, > NSFnet and Japan is true, are you? Greg was just hypothesizing. Japan > has not specified 2022 for internal use and 10646 for external use. > Japan has not specified anything for internal use (i.e. within > organizations), but it has specified a form of ISO 2022 for external > use. What Greg wrote is also not true for what NETF decided. NETF decided to use 10646 compaction method 5, not latin-1. I wonder if the third example (NSFnet) was hypothetical too. 
> > > > This seems to present a case for restricting the draft RFC's type > > definitions to only the US-ASCII (X3.4-1986), ISO 646-N, ISO 8859-N, > > and ISO DIS 10646 standards. This would be easier to implement as > > well as addressing the problems of Internationalisation. > > Why not include the Japanese form of ISO 2022 in your list? It's > certainly not any harder to `implement' (whatever *that* means) than > 10646, and moreover, there's a large installed base of software in > Japan that understands this encoding. I second the requirement for Japanese 2022 support. Keld From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 09:40:44 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20706; Fri, 26 Apr 91 09:13:19 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA20702; Fri, 26 Apr 91 09:13:16 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa05661; 26 Apr 91 9:05 EDT To: Keld J|rn Simonsen Cc: erik@sra.co.jp, ietf-822@dimacs.rutgers.edu, gvaudre@nri.reston.va.us Subject: Re: support for all content types In-Reply-To: Your message of "Fri, 26 Apr 91 14:03:38 +0200." <9104261203.AA14361@dkuug.dk> Date: Fri, 26 Apr 91 09:05:39 -0400 From: Greg Vaudreuil Message-Id: <9104260905.aa05661@NRI.NRI.Reston.VA.US> > What Greg wrote is also not true for what NETF decided. > NETF decided to use 10646 compaction method 5, not latin-1. > I wonder if the third example (NSFnet) was hypothetical too. Just for clarification, these were all hypotheticals! Yikes, I should use made-up communities of users in the future to go with my made-up policies..... :-) Is ISO 2022 for Japanese a subset of ISO 10646? If so, I have no problem listing is as a "regional subset" of the "full" character set. If you send japanese, you can send it in ISO 10646, and ISO 2022, but the same implementation will show both. This is the interoperability problem I'm trying to address. Greg V. 
From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 10:10:47 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21249; Fri, 26 Apr 91 09:41:54 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21245; Fri, 26 Apr 91 09:41:52 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 09:41:49 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 09:45:18 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 26 Apr 1991 09:45:16 -0400 (EDT) Message-Id: <4c62vgO0M2Yt0BE4w=@thumper.bellcore.com> Date: Fri, 26 Apr 1991 09:45:16 -0400 (EDT) From: Nathaniel Borenstein To: David Herron Subject: Re: UUencode Cc: ietf-822@dimacs.rutgers.edu In-Reply-To: <9104251505.aa15308@Obelix.TWG.COM> References: <9104251505.aa15308@Obelix.TWG.COM> Actually, I'd cite this (the story of the "From" field format) as a positive example. My impression is that Dave's effort is (still in the process of) gradually succeeding. Newer mail agents seem more and more often to be using Dave's preferred form. Of course, I'm prejudiced since we followed his advice with Andrew... 
From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 10:18:51 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21147; Fri, 26 Apr 91 09:39:01 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21143; Fri, 26 Apr 91 09:38:59 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 09:38:51 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for Neil.Katin@Eng.Sun.COM; Fri, 26 Apr 91 09:42:18 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 26 Apr 1991 09:42:14 -0400 (EDT) Message-Id: Date: Fri, 26 Apr 1991 09:42:14 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu, Neil.Katin@eng.sun.com (Neil Katin) Subject: Re: Comments on Draft RFC In-Reply-To: <9104252202.AA02373@skylark.Eng.Sun.COM> References: <9104252202.AA02373@skylark.Eng.Sun.COM> Excerpts from mail: 25-Apr-91 Re: Comments on Draft RFC Neil Katin@Eng.Sun.COM (1315) > I think of body parts in the X.400 sense -- a body part is *not* > just a message, it is an entirely different data-type. The set > of headers that are "legal" in a message is much broader than > in a body part. In fact, "body part headers" is just a typographical > convenience to encode information; they really come from a different > name space and are interpreted on their own. Well, that's certainly not the way we've defined it up to now. Everything that has been said about the parts in a multipart message has been, in effect, that it is an 822 message in miniature. I see no problem with this. > I will freely admit that, if we were starting with a clean sheet of > paper we could define body parts in either way, but since X.400 > interoperability is (should be?) a major goal, it makes sense to > define the architecture of the message structure to be compatible > with X.400 body parts. 
> Does anyone actively want to do something that is not compatible with > X.400 in this area? Are the gains worth it? Absolutely. The power that comes from recursive encapsulation is, in my estimation, enormous. If X.400 can't support nested encapsulated messages, then the world needs something better than X.400. My understanding, however, is that X.400 could indeed support nested encapsulation, in which case a translator should be possible. Am I wrong? From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 10:40:45 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21389; Fri, 26 Apr 91 09:48:14 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA21385; Fri, 26 Apr 91 09:48:10 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 09:48:04 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 09:51:35 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 26 Apr 1991 09:51:32 -0400 (EDT) Message-Id: Date: Fri, 26 Apr 1991 09:51:32 -0400 (EDT) From: Nathaniel Borenstein To: David Herron Subject: Re: multiple Content-Encoding:'s Cc: ietf-822@dimacs.rutgers.edu In-Reply-To: <9104251533.aa16019@Obelix.TWG.COM> References: <9104251533.aa16019@Obelix.TWG.COM> David Herron's message exemplifies exactly what I don't want to see happening with the Content-Encoding header. We have a fundamental difference of philosophy here: some of us see Content-Encoding as a very simple mechanism to solve the 7-bit-limited-line restrictions of SMTP. Others see it as an open-ended mechanism that might eventually solve, for example, the problem of splitting large mail into multiple pieces. The problem of sending 8-bit and binary data through the mail is a VERY important one. That's why this whole can of worms was opened up. 
To solve it properly, and to see that solution widely implemented, we need a simple solution. The simpler the solution, I believe, the more likely it is to see widespread implementation. David's notion of Content-Encoding might or might not be viewed as far more elegant, and it's certainly more powerful, but I'd hate to see it become the sticking point that prevents binary mail from becoming a "standard" part of the internet infrastructure. Incidentally, all of David's examples can, I believe, be handled with Content-types, at the cost of the proliferation of lots more content-types. But this is much better, in my opinion, because content-types don't need to be universal. Content-types are useful as long as they are understood by a cooperating set of User Agents. I don't believe that this is as true of content-encodings, primarily because I think of content-encodings as something that users will often be totally unaware of. That is, I'll be aware that I'm sending you, say, an image, but I won't be aware of its base64 encoding. Therefore I, as a user, might reasonably be expected only to send you an image if I know you can receive it, but I can't reasonably be expected to worry about what encodings you can handle. That mechanism should be standard. 
From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 11:10:45 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22062; Fri, 26 Apr 91 10:12:53 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22058; Fri, 26 Apr 91 10:12:50 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 10:12:46 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 10:16:19 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 26 Apr 1991 10:16:16 -0400 (EDT) Message-Id: <4c63MkK0M2Yt0BEA4z@thumper.bellcore.com> Date: Fri, 26 Apr 1991 10:16:16 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu Subject: Re: Content-Label In-Reply-To: <9104251957.AA08467@vision.Eng.Sun.COM> References: <9104251957.AA08467@vision.Eng.Sun.COM> Excerpts from internet.ietf-822: 25-Apr-91 Re: Comments on Draft RFC Vincent Lau@eng.sun.com (1843) > The value in Content-Label is controlled by the user, but Message-ID > is not. The information in Content-Label is intended for human being. Now I'm really confused. What, in this case, is Content-Label actually for? I had assumed its primary purpose was to allow certain parts to reference each other to form complex structured multipart mail, but this statement implies that I've entirely missed your motivation. 
From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 11:30:30 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22339; Fri, 26 Apr 91 10:23:05 EDT Received: from dkuug.dk by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA22320; Fri, 26 Apr 91 10:22:54 EDT Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8) id AA17866; Fri, 26 Apr 91 16:18:35 +0200 Date: Fri, 26 Apr 91 16:18:35 +0200 From: Keld J|rn Simonsen Message-Id: <9104261418.AA17866@dkuug.dk> To: gvaudre@nri.reston.va.us Subject: Re: support for all content types Cc: erik@sra.co.jp, ietf-822@dimacs.rutgers.edu X-Charset: ASCII X-Char-Esc: 29 Thanks Greg for your clarifications. I think your examples were OK in pointing ot the problems of interoperability. > Is ISO 2022 for Japanese a subset of ISO 10646? If so, I have no > problem listing is as a "regional subset" of the "full" character set. > If you send japanese, you can send it in ISO 10646, and ISO 2022, but > the same implementation will show both. This is the interoperability > problem I'm trying to address. The ISO 2022 encoding is not a subset of 10646. Actually 10646 can be seen as the new 2022 scheme, where different character sets can be chosen. 2022 is the definition of how to switch between character sets registered with the ECMA registry. 2022 define the C0 C1 G0 G1 G2 G3 mechanisms , and each of the ECMA registered character sets then have a code to be used with one or more of these switches. The ECMA registered character sets then comprises a lot of national ISO 646 7-bit variants, the ISO 8859 series, some ISO 6937 family character sets with non-spacing diacritics, some 16 bit character sets like the Japanese, the Chinese and the Korean, and also control character sets. 10646 is one big character set, canonically defined as a 4-octet character set, but with compaction forms to 1,2,or 3 octets and with also a dynamic compaction form (1-4 octets). 
I suppose that 10646 will have a designation code in the 2022 scheme, so they are able to interoperate in this way. Keld From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 11:40:45 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23407; Fri, 26 Apr 91 10:54:20 EDT Received: from rutvm1.rutgers.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23403; Fri, 26 Apr 91 10:54:18 EDT Received: from RUTVM1.RUTGERS.EDU by RutVM1.Rutgers.Edu (IBM VM SMTP R1.2.1MX) with BSMTP id 5572; Fri, 26 Apr 91 10:56:17 EDT Received: from VM1.calc.ucl.ac.be by RUTVM1.RUTGERS.EDU (Mailer R2.07) with BSMTP id 8795; Fri, 26 Apr 91 10:56:17 EDT Received: by BUCLLN11 (Mailer R2.07) id 3840; Fri, 26 Apr 91 16:53:49 +0200 Date: Fri, 26 Apr 91 16:46:00 +0200 From: "Alain FONTAINE (Postmaster - NAD)" Message-Id: <910426.164600.+0200.af@sei.ucl.ac.be> Subject: RFC-XXXX, section 3.1, minor clarification needed. To: Nathaniel Borenstein , ietf-822@dimacs.rutgers.edu # Style #2: An 8 bit value from 160 through 255 may, alternately, be # represented by an ampersand character followed by the character This seems to tell that Style #2 is defined as handling 160 to 255 only. # Note that these two styles may be freely intermixed. Style #1 is # preferred for characters 128 through 159, because style #2 might # include control characters (e.g. TAB) that are altered by some MTA Therefore, 'preferred' does not seem to be the right word. Style #1 is the only solution, since Style #2 does not handle those values by definition. # It is also recommended #that the persistence of character codes less than 32 should not be #relied on, particularly the TAB, CR, and LF characters. Where such #characters would be required for representation in style #2, it is #recommended that style #1 be used. Again, Style #2 cannot be used at all, by definition. The word 'recommended' does not seem quite right. Unless, of course, my weak command of english puts me on the wrong track. 
Alain FONTAINE +--------------------------------+ Universite Catholique de Louvain | If your mail software barks at | Service d'Etudes Informatiques | my address, you may try : | Batiment Pythagore | | Place des Sciences, 4 | FNTA80@BUCLLN11.BITNET | B-1348 Louvain-la-Neuve, BELGIUM +--------------------------------+ phone +32 (10) 47-2625 From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 11:54:55 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23724; Fri, 26 Apr 91 11:08:45 EDT Received: from rutvm1.rutgers.edu by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23720; Fri, 26 Apr 91 11:08:43 EDT Received: from RUTVM1.RUTGERS.EDU by RutVM1.Rutgers.Edu (IBM VM SMTP R1.2.1MX) with BSMTP id 5588; Fri, 26 Apr 91 11:10:41 EDT Received: from VM1.calc.ucl.ac.be by RUTVM1.RUTGERS.EDU (Mailer R2.07) with BSMTP id 8833; Fri, 26 Apr 91 11:10:40 EDT Received: by BUCLLN11 (Mailer R2.07) id 3983; Fri, 26 Apr 91 17:06:47 +0200 Date: Fri, 26 Apr 91 16:54:16 +0200 From: "Alain FONTAINE (Postmaster - NAD)" Message-Id: <910426.165416.+0200.af@sei.ucl.ac.be> Subject: RFC-XXXX, 'content-encoding'. To: Nathaniel Borenstein , ietf-822@dimacs.rutgers.edu Given the fact that this encoding is done just to avoid mutilation in transport, I don't see any reason to have many options. HEX is wasteful and should be dropped. UUencode should not be reintroduced - how many times did I have to help poor users trying to recover binary objects mangled by mail gateways while in their uue suit... I do understand that Base64 has been retrieved from RFC 1113, but does it really need to be so restrictive by specifying exactly 64 characters in each line? One can of course specify that an encoder should put that many characters on each line, but a decoder could be more liberal in what it accepts. Suppose a line has been split for some reason; should the decoder be allowed to refuse to decode because lines do not contain exactly the right number of characters? 
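The "strict encoder, liberal decoder" idea can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the policy of discarding all whitespace before decoding is one possible reading of "more liberal", not something specified in the draft or in RFC 1113.

```python
import base64


def decode_base64_lenient(text: str) -> bytes:
    """Decode base64 text without caring how it was broken into lines.

    ASSUMPTION: line breaks and other whitespace carry no information,
    so they are dropped before decoding. Characters outside the base64
    alphabet are still rejected (validate=True raises binascii.Error).
    """
    compact = "".join(text.split())  # remove CR, LF, spaces, tabs
    return base64.b64decode(compact, validate=True)
```

With this policy a body whose 64-character lines were re-split by a gateway still decodes to the original bytes, while genuinely corrupted characters are reported as errors rather than silently skipped.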
On the other hand, if I were to design such a thing, I would certainly include a checksum. Of course, this checksum would be computed on the values of the bytes being encoded, and not on the binary values of the codes of the characters used for encoding. Don't forget that the characters used for the encoding are themselves encoded, and that this code can be changed in gateways. /AF From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 12:10:45 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23928; Fri, 26 Apr 91 11:11:44 EDT Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA23924; Fri, 26 Apr 91 11:11:39 EDT Received: from NRI by NRI.NRI.Reston.VA.US id aa09230; 26 Apr 91 11:04 EDT To: Nathaniel Borenstein Cc: David Herron , ietf-822@dimacs.rutgers.edu, gvaudre@nri.reston.va.us Subject: Re: multiple Content-Encoding:'s In-Reply-To: Your message of "Fri, 26 Apr 91 09:51:32 EDT." Date: Fri, 26 Apr 91 11:04:37 -0400 From: Greg Vaudreuil Message-Id: <9104261104.aa09230@NRI.NRI.Reston.VA.US> I think I understand, after all these notes..... Content-type: What this thing is... Content-Encoding: What you did to it to get it here.... and Content-conversion: What I did to make the transport convenient. I proposed the third header as a compromise between those who want to load up the content-encoding (I oppose because it makes automated UAs hard) and recursive content-types (I'm not sure this solves the problem, and it is a real pain for old-fashioned users). Example: Content-type: WordPerfect ; 5.1 Content-encoding: Base 64 Content-conversion: Privacy-Enhanced, compressed Time to shoot at a new target :-) Greg V. 
From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 12:40:46 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25205; Fri, 26 Apr 91 11:46:45 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25201; Fri, 26 Apr 91 11:46:43 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 11:46:36 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for af%sei.ucl.ac.be@CUNYVM.CUNY.EDU; Fri, 26 Apr 91 11:50:07 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 26 Apr 1991 11:50:01 -0400 (EDT) Message-Id: Date: Fri, 26 Apr 1991 11:50:01 -0400 (EDT) From: Nathaniel Borenstein To: ietf-822@dimacs.rutgers.edu, "Alain FONTAINE (Postmaster - NAD)" Subject: Re: RFC-XXXX, section 3.1, minor clarification needed. In-Reply-To: <910426.164600.+0200.af@sei.ucl.ac.be> References: <910426.164600.+0200.af@sei.ucl.ac.be> Excerpts from mail: 26-Apr-91 RFC-XXXX, section 3.1, mino.. "Alain FONTAINE @CUNYVM. (1526) > Unless, of course, my weak command of english puts me on the wrong track. No, more likely you read the English more carefully than anyone. Your points are well-taken. Your points about base64 are also good. I agree that the decoder should be more liberal in not requiring 64 characters per line, and I'll add prose to that effect. A checksum might have been a good idea, but it wasn't included in RFC 1113. If we settle on a compressed encoding, though, I assume it will have a checksum... 
From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 13:10:46 1991 Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25587; Fri, 26 Apr 91 11:57:27 EDT Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25583; Fri, 26 Apr 91 11:57:23 EDT Received: from greenbush.bellcore.com by thumper.bellcore.com (4.1/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 11:57:19 EDT Received: by greenbush.bellcore.com (4.12/4.7) id for ietf-822@dimacs.rutgers.edu; Fri, 26 Apr 91 12:00:50 edt Received: from Messages.7.14.N.CUILIB.3.45.SNAP.NOT.LINKED.greenbush.mouseclub.sun4.40 via MS.5.6.greenbush.mouseclub.sun4_40; Fri, 26 Apr 1991 12:00:46 -0400 (EDT) Message-Id: Date: Fri, 26 Apr 1991 12:00:46 -0400 (EDT) From: Nathaniel Borenstein To: Greg Vaudreuil Subject: Re: multiple Content-Encoding:'s Cc: David Herron , ietf-822@dimacs.rutgers.edu, gvaudre@nri.reston.va.us In-Reply-To: <9104261104.aa09230@NRI.NRI.Reston.VA.US> References: <9104261104.aa09230@NRI.NRI.Reston.VA.US> Excerpts from mail: 26-Apr-91 Re: multiple Content-Encodi.. Greg Vaudreuil@NRI.Resto (647) > I think I understand, After all these notes..... > Content-type: What this thing is... Yes... > Content-Encoding: What you did to it to get it here.... Yes... ("here" = "in this format") > Content-conversion: What I did to make the transport conveniet. No... :-) > I proposed the Third header as a compromise between those who want to > load up the content-encoding (I oppose because it make automated UA's > hard) and recursive content-types (I'm not sure this solves the problem, > and it is a real pain for old fashioned users). Now you've managed to press ALL my buttons :-) I'm pretty darned sure that recursive content-types do solve the problem: I have yet to hear any examples of how they don't. And I can't even *begin* to understand the "old-fashioned users" comment. 
I mean, if what you're talking about is what happens with existing UA's, let's face it, NONE of these schemes are going to produce something readable. If you compress the message, it doesn't matter whether the compression is done by

1. Content-type: compressed-encapsulated-message
2. Content-Encoding: compressed
3. Content-Encoding: base64, compressed
4. Content-conversion: compressed

It just doesn't matter -- NO UA will be able to make it look nice unless it understands the standard we're defining.

If recursive content-types don't solve the problem, tell me what they don't solve. If they cause a problem for people who still like listening to Beatles' albums (a reasonable definition of "old-fashioned people"), tell me why. I just don't see the problem!

I have no particular objection to Content-Conversion beyond the fact that it strikes me as totally unnecessary! That's what I need to be convinced about.

Feeling Just a Tad Frustrated,
A Still Old-Fashioned Nathaniel
with the "Help" CD playing at this moment

From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 13:20:15 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26726; Fri, 26 Apr 91 12:35:18 EDT
Received: from RUTGERS.EDU by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA26719; Fri, 26 Apr 91 12:35:11 EDT
Received: from nrtc.northrop.com by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA25890; Fri, 26 Apr 91 12:35:04 EDT
Received: from nma.com by nrtc.nrtc.northrop.com id aa23365; 26 Apr 91 8:34 PST
Received: from odin.nma.com by nma.com id aa22916; 26 Nov 91 9:18 PST
To: Nathaniel Borenstein
Cc: David Herron, ietf-822@dimacs.rutgers.edu
Subject: Re: multiple Content-Encoding:'s
In-Reply-To: Your message of Fri, 26 Apr 91 09:51:32 -0500.
Reply-To: Stef@ics.uci.edu
From: Einar Stefferud
Date: Fri, 26 Apr 91 09:17:30 MDT
Message-Id: <6396.672682650@nma.com>
Sender: stef@nma.com

May I suggest that you really want to establish a small set of "Transfer-Encoding" attribute values, and not a large set of "Content-Encoding" attribute values, so you should simply rename the "Content-Encoding" attribute to be "Transfer-Encoding"! Thus you will be naming it to be what it means, instead of misleading us all with a false name. Just say what you mean, and mean what you say.

Voila'...\Stef

From owner-ietf-822@dimacs.rutgers.edu Fri Apr 26 13:40:48 1991
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA27849; Fri, 26 Apr 91 13:07:42 EDT
Received: from thumper.bellcore.com by dimacs.rutgers.edu (5.59/SMI4.0/R