Comments on draft-ietf-smtpext-transition-01.txt
Here are some comments, most of them editorial, on the Internet
Draft "Transition of Internet Mail from Just-Send-8 to
8Bit-SMTP/MIME" (December 16, 1992).
> Abstract
>
> Protocols for extending SMTP to pass 8 bit characters have been
> defined. [3] [4] The messages transported by the extended SMTP are
> required to be encoded in MIME. [1] [2] Several SMTP implementations
+++++++++++++
Actually "MIME" is what is described in [1] only. How to handle
8bit characters in the RFC 822 header of an incoming message is
an important problem, but it is not treated in this Internet
Draft. See also my last comment. (The 8-bit extension
document [4] has document [2] included in its list of
references, but doesn't refer to it in the main text, which
is somewhat peculiar.)
Consequently I think the reference to [2] in this abstract is
misleading and should be removed. Item [2] should be retained
in the list of references though, since the discussion of trace
information refers to it.
> adopted an ad-hoc mechanism for sending 8 bit data prior to these
+++++++++++++++++++
Change to "ad-hoc mechanisms". There is more than one. In
Sweden, for example, an SMTP extension with negotiated 8-bit
transport has been in use for a couple of years, alongside the
"just-send-8bit" implementations from some vendors.
> standards and with which the extended SMTP mail system must
++++
I would prefer the less categorical wording "should interoperate
if possible" here. Full interoperability is impossible to
achieve in some cases and may not be worth the effort in some
other cases.
> interoperate. This document outlines the problems in this
> environment and an approach to minimizing the cost of transition.
Add "as regards upgrading of non-MIME 8 bit messages" at the end
of this sentence. This draft doesn't suggest a solution for
incoming 7-bit messages in character sets other than US-ASCII;
it only describes the problem. Neither does it provide a
solution for the interoperability problems in case (5) in the
sender-receiver matrix (downgrading of MIME messages). Both of
these are important problems but may be left to a future RFC.
> 1. Terminology
>
> RFC 821 defines a 7 bit transport. A implementation which does not
+
> clear this bit upon receipt of octet with the high order bit set
++++ ++++++++
> and passes a message unaltered to the user is called 8 bit
+++++++++++
> transparent in this document. An implementation of the general
This definition is somewhat unclear. "this" is a forward
reference. The definition should also apply to intermediate MTAs
that don't pass the message to the receiving UA. I would suggest
this definition:
"A transport agent which does not clear the high order bit upon
receipt of octets with this bit set in SMTP messages is called
8 bit transparent in this document."
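To illustrate the suggested definition, here is a minimal Python sketch. The function names are hypothetical and a relay is modelled simply as a function from octets to octets, which is an assumption made for illustration only.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Model of a strict 7 bit transport: clear the high order bit of every octet."""
    return bytes(b & 0x7F for b in data)


def pass_through(data: bytes) -> bytes:
    """Model of a transport agent that forwards octets unaltered."""
    return data


def is_8bit_transparent(relay, sample: bytes = bytes(range(256))) -> bool:
    """True if the relay returns every possible octet value unchanged."""
    return relay(sample) == sample


if __name__ == "__main__":
    print(is_8bit_transparent(strip_high_bit))  # False
    print(is_8bit_transparent(pass_through))    # True
```

Note that the check looks only at what the agent does with the octets, not at whether a UA ever sees them, which is the point of the rewording.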
> SMTP Extensions document and the 8bit extensions protocols [3], [4]
++++++++++++++++++
The marker "[3]" should be placed directly after "SMTP
Extensions document". There is only one 8bit extension
protocol so this word should be in the singular form.
> which passes MIME messages using all 8 bits of an octet is called
> 8bit ESMTP. An implementation of extended SMTP which does not
> accept 8bit characters is called 7bit ESMTP.
>
> 2. The Problem
>
> - - -
>
> extend SMTP or RFC 822 to use non-ASCII character sets. The two
> common approaches are to send a 7 bit character set over current
+++++++++++++++++
For the benefit of the innocent reader it should be stated already
here which 7 bit character sets are used in this way. Change
this to "a national variant of the ISO 646 7-bit character set".
> RFC822/SMTP or to extend SMTP and RFC822 to use 8bit ISO 8859
> character sets. ...
For pedagogical reasons the third approach used in JUNET and
mentioned later in the discussion of case (1) should be
introduced here. Insert the text:
"A third approach is used for Japanese mail. Japanese
characters are represented by pairs of octets with the high
order bit cleared. Switching between 14 bit character sets
and 7 bit character sets is indicated within the message by
ISO 2022 escape sequences."
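The following Python sketch shows the effect described in the proposed text. It assumes that Python's standard "iso2022_jp" codec is an acceptable stand-in for the encoding used in Japanese mail; the function name is hypothetical.

```python
ESC = 0x1B  # ISO 2022 escape character


def encode_for_7bit_mail(text: str) -> bytes:
    """Encode text so that Japanese characters travel as 7 bit octet pairs."""
    return text.encode("iso2022_jp")


if __name__ == "__main__":
    encoded = encode_for_7bit_mail("Hello 日本")
    # Every octet has the high order bit cleared ...
    print(all(b < 0x80 for b in encoded))
    # ... and the switch between character sets is marked by escape sequences.
    print(ESC in encoded)
    print(encoded)
```

Such a message passes a strict 7 bit transport undamaged, but a receiver that does not interpret the escape sequences sees only pairs of ASCII characters.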
> ... So long as these implementations can directly
> communicate and have a private agreement on the use of a specific
> character set, without benefit of tagging, basic mail service can
> be provided.
Actually it's not quite as bad as this text indicates. Basic
mail interoperability can be achieved in the absence of both
private agreement and explicit tagging, when US-ASCII is mixed
with one national ISO 646 variant, as the full-scale practical
experience of the Nordic countries shows. (As a matter of fact
this is one reason why it was possible at all for SMTP/RFC822
instead of X.400 to become the de facto email standard in
Northern Europe.) But this aspect is probably not necessary to
cover in this overview of the problem.
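As an illustration of why this works in practice, here is a small Python sketch of how a reader or a display program can interpret untagged 7 bit text under the Swedish national variant. Only the six letter positions are mapped; the table is a partial sketch, not the complete national standard, and the names are hypothetical.

```python
# Six code positions where the Swedish ISO 646 variant differs from US-ASCII.
# Partial mapping for illustration only.
SWEDISH_646_LETTERS = str.maketrans("[\\]{|}", "ÄÖÅäöå")


def read_as_swedish(ascii_text: str) -> str:
    """Show 7 bit text as a Swedish terminal or reader would interpret it."""
    return ascii_text.translate(SWEDISH_646_LETTERS)


if __name__ == "__main__":
    print(read_as_swedish("Malm| och G|teborg"))
```

Since the affected positions are punctuation that is rare in ordinary prose, a human reader can usually tell from context which interpretation is intended, even when US-ASCII and the national variant are mixed.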
> In transitioning to the negotiated 8bit system with MIME messages,
> it is important that mail sent by a currently non-conforming user
> can be read by another such user. This functionality is reduced by
+++++++++
It's unclear to me whether "such user" here means "a currently
non-conforming user" or something else. In my opinion it's important
that mail can be read _both_ by ESMTP-conforming users and
by non-conforming users (after having passed through
intermediate ESMTP-conforming transport agents). I suggest
the word "such" is dropped.
> (1) Will work acceptably well with ISO 646 national variant ASCII
> or ISO 2022 character set shifting if an external "out of band"
> agreement to use a particular character set without tagging exists
++++++++++++++++++++++++++++++++++++++++++
> between the sender and the receiver.
In the case of ISO 2022 character set shifting there are of
course several character sets used and they are tagged
internally by escape sequences. It might be better to remove
the indicated part of this paragraph.
> (4) Will work if a reasonable upgrade path is provided via gateways
> and the indicated character set tag inserted by the gateway is
> correct and the receiver supports the character set chosen by the
> sender.
Since this document describes interoperability problems in
all the cases (1) - (5) but only provides a solution in one
case, (4), I suggest this addition:
"Such an upgrade path is described in section 3."
> (5) Because the ESMTP/MIME sender cannot know that the receiver
> will understand 8 bits, the sender will encode the text into
> base-64 or quoted-printable which may be considered "garbled" by
> the receiver.
This is the converse problem to case (4). Something should be
said about the fact that it is not attacked in this document,
maybe:
"To provide a useful downgrade path the gateway must have some
knowledge about the capabilities of the receiver. Possible
solutions fall outside the scope of this document."
> 3. Upgrade Path
>
> - - -
>
> A site may "Upgrade" to MIME en-masse by implementing MIME
> conversion for all messages leaving the site. The conversion can
> be done by adding a mime-version header and a content-type/text
> header with the character set in use in the site.
++++++++++++++++++++++++++++++++++++
Nowadays this simple case is becoming rare. Add at the end:
", in case use of other character sets at the site is negligible"
> Example:
>
> MIME-Version: 1.0
> Content-Type: Text/Plain; Charset = "ISO-8859-1"
Why quotes around ISO-8859-1? Since they are unnecessary, this
may be confusing for a naive reader.
> Content-Transfer-Encoding: 8bit
> Content-Description: Untagged text converted to MIME.
I don't think it's a good idea to include the
Content-Description: line in this example. This information
can be deduced by the receiving program from the trace
information. Content-Description: fields should, in my opinion,
be used to provide information that the human receiver can
understand and use. The information in this example would be
totally incomprehensible to most receivers.
> If no information is available, the gateway should upgrade the
+++++++
This statement is true also for a site whose MTA upgrades
outgoing messages from non-MIME-compliant UAs to MIME en-masse.
Also, it's not clear what kind of information is lacking in this
case. I suggest this rewording:
"If no information about the probable content type or character
set of an outgoing message is available, the transport agent
should ..."
> content by using the character set "unknown-8bit". Unknown-8bit
I never understood why the value is "unknown-8bit" instead of
plain "unknown". Will an "unknown-7bit" be defined by a future
RFC? The fact that the upgraded message contains octets with
the high order bit set is already reflected by the value of the
Content-Transfer-Encoding: field.
> states that the character set is only understandable by with external
+++++++
This sentence needs some improvement. I suggest:
"The unknown-8bit value of the charset parameter indicates only
that no reliable information about the character set(s) used in
the message was available."
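The upgrade rule discussed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the optional site_charset parameter are hypothetical, and the caller is assumed to have already established that the body contains octets with the high order bit set.

```python
from typing import List, Optional


def upgrade_headers(site_charset: Optional[str] = None) -> List[str]:
    """Return MIME header lines for an untagged 8 bit message.

    site_charset is whatever the transport agent knows about the probable
    character set; None means that no reliable information is available.
    """
    charset = site_charset if site_charset else "unknown-8bit"
    return [
        "MIME-Version: 1.0",
        "Content-Type: text/plain; charset=" + charset,
        "Content-Transfer-Encoding: 8bit",
    ]


if __name__ == "__main__":
    print("\n".join(upgrade_headers("ISO-8859-1")))
    print()
    print("\n".join(upgrade_headers()))
```

The charset value is written without quotes and no Content-Description: line is generated, in line with the comments on the example above.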
> information. MIME specifies that a MIME message with no character
> set specified is defined to be US-ASCII.
This last sentence may be puzzling to a reader without an
intimate knowledge of MIME. It isn't even certain that it
is completely true. RFC 1341, section 7.1.1, says:
: Unlike some other parameter values, the values of the
: charset parameter are NOT case sensitive. The default
: character set, which must be assumed in the absence of a
: charset parameter, is US-ASCII.
So it doesn't _define_ these MIME-compliant messages to use
any particular character set, it only states what is to be
_assumed_ about the character set (lacking other information).
But I don't think it is necessary to tackle this complicated
question in this document. Very little would be lost by
omitting the last sentence.
> Appendix - The "unknown-8bit" Character Set
>
> - - -
>
> The interpretation of the "unknown-8bit" is up to the mail reader.
> It is assumed that the human user will be able to interpret the
> information and choose an appropriate character set or
> pre-processor.
The last sentence is unnecessarily strong. We can't assume
that _any_ human receiver will _always_ be able to choose an
appropriate character set. Better write: "It is assumed that
in many cases the human user will be ..."
> References
>
> - - -
>
> [4] M.T. Rose, E.A. Stefferud, D.H. Crocker. SMTP Service
> Extensions for 8bit cleanliness. Internet-Draft,
++++++++++++++++
The title has been changed to "SMTP Service Extension for
8bit-MIMEtransport".
Finally I want to say that it's a pity that the handling of
8-bit data in message header fields isn't treated in this
document. Of course octets with the high order bit set are
as frequent on the Subject: line as on the message body lines
in today's "8-bit SMTP" mail, so a gateway that wants to upgrade
them to 8-bit ESMTP must do something with them. The obvious
thing to do is to use the encoding of RFC 1342 when such
characters appear in places in header fields where RFC 822
specifies "text", "ctext", or "phrase". (If they appear in
other places, the message should probably be bounced.)
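As a rough illustration of what such a gateway could do with an unstructured field like Subject:, here is a Python sketch of the "Q" encoding of RFC 1342. It is a simplification under stated assumptions: the function name is hypothetical, the character set is assumed to be known, the whole field body is turned into a single encoded word, and the length limit on encoded words is ignored.

```python
def encode_header_text(text: str, charset: str = "ISO-8859-1") -> str:
    """Encode header text as one RFC 1342 "Q" encoded word if it needs it."""
    raw = text.encode(charset)
    if all(b < 0x80 for b in raw):
        return text  # pure 7 bit text needs no encoding in this sketch
    pieces = []
    for b in raw:
        ch = chr(b)
        if ch == " ":
            pieces.append("_")           # space is represented by underscore
        elif ch.isascii() and ch.isalnum():
            pieces.append(ch)            # ASCII letters and digits pass as-is
        else:
            pieces.append("=%02X" % b)   # everything else as =XX in hex
    return "=?%s?Q?%s?=" % (charset, "".join(pieces))


if __name__ == "__main__":
    print("Subject: " + encode_header_text("Hälsningar från Stockholm"))
```

A real gateway would also have to split long text into several encoded words and treat "ctext" and "phrase" contexts according to their own restrictions.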
--
Olle Jarnefors, Royal Institute of Technology, Stockholm <ojarnef@admin.kth.se>