Comments on draft-ietf-smtpext-transition-01.txt



Here are some comments, most of them editorial, on the Internet 
Draft "Transition of Internet Mail from Just-Send-8 to 
8Bit-SMTP/MIME" (December 16, 1992).

> Abstract
> 
>     Protocols for extending SMTP to pass 8 bit characters have been
>     defined. [3] [4] The messages transported by the extended SMTP are
>     required to be encoded in MIME. [1] [2]  Several SMTP implementations
                                +++++++++++++
Actually "MIME" is what is described in [1] only.  How to handle 
8bit characters in the RFC 822 header of an incoming message is 
an important problem, but it is not treated in this Internet 
Draft.  See also my last comment.  (The 8-bit extension 
document [4] has document [2] included in its list of 
references, but doesn't refer to it in the main text, which 
is somewhat peculiar.)

Consequently I think the reference to [2] in this abstract is 
misleading and should be removed.  Item [2] should be retained 
in the list of references though, since the discussion of trace 
information refers to it.

>     adopted an ad-hoc mechanism for sending 8 bit data prior to these
              +++++++++++++++++++
Change to "ad-hoc mechanisms".  There is more than one.  In 
Sweden, for example, an SMTP extension with negotiated 8-bit 
transport has been in use for a couple of years, besides the 
"just-send-8bit" implementations from some vendors.

>     standards and with which the extended SMTP mail system must
                                                             ++++
I would prefer the less categorical wording "should interoperate 
if possible" here.  Full interoperability is impossible to 
achieve in some cases and may not be worth the effort in some 
other cases.

>     interoperate.  This document outlines the problems in this
>     environment and an approach to minimizing the cost of transition.

Add "as regards upgrading of non-MIME 8 bit messages" at the end 
of this sentence.  This draft doesn't suggest a solution for 
incoming 7-bit messages in other character sets than US-ASCII, 
it only describes the problem.  Neither does it provide a 
solution for the interoperability problems in case (5) in the 
sender-receiver matrix (downgrading of MIME messages).  Both of 
these are important problems but may be left to a future RFC.

> 1. Terminology
> 
>     RFC 821 defines a 7 bit transport.  A implementation which does not
                                          +
>     clear this bit upon receipt of octet with the high order bit set
            ++++                  ++++++++
>     and passes a message unaltered to the user is called 8 bit
                                     +++++++++++
>     transparent in this document.  An implementation of the general

This definition is somewhat unclear.  "this" is a forward 
reference.  It should apply also to intermediate MTAs that 
don't pass the message to the receiving UA.  I would suggest 
this definition:

"A transport agent which does not clear the high order bit upon 
receipt of octets with this bit set in SMTP messages is called 
8 bit transparent in this document."
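
The difference between the two behaviours can be shown with a 
minimal Python sketch.  The function names and the probe string 
are my own illustration; nothing here comes from the draft or 
from RFC 821.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Model of a transport that clears the high-order bit of every octet."""
    return bytes(octet & 0x7F for octet in data)


def pass_through(data: bytes) -> bytes:
    """Model of a transport that forwards every octet unchanged."""
    return data


def is_8bit_transparent(relay, probe: bytes = b"Sm\xf6rg\xe5s") -> bool:
    """A relay is 8 bit transparent (in the sense suggested above) if
    octets with the high-order bit set come out exactly as they went in."""
    return relay(probe) == probe


print(is_8bit_transparent(strip_high_bit))  # False
print(is_8bit_transparent(pass_through))    # True
```

Note that the test applies to any transport agent on the path, 
not only to the one that delivers to the receiving UA.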

>     SMTP Extensions document and the 8bit extensions protocols [3], [4]
                                                       ++++++++++++++++++
The marker "[3]" should be placed directly after "SMTP 
Extensions document".  There is only one 8bit extension 
protocol so this word should be in the singular form.

>     which passes MIME messages using all 8 bits of an octet is called
>     8bit ESMTP.  An implementation of extended SMTP which does not
>     accept 8bit characters is called 7bit ESMTP.
> 
> 2. The Problem
> 
>     - - -
> 
>     extend SMTP or RFC 822 to use non-ASCII character sets.  The two
>     common approaches are to send a 7 bit character set over current
                                    +++++++++++++++++
To the benefit of the innocent reader it should be stated already 
here which 7 bit character sets are used in this way.  Change 
this to "a national variant of the ISO 646 7-bit character set".

>     RFC822/SMTP or to extend SMTP and RFC822 to use 8bit ISO 8859
>     character sets. ...

For pedagogical reasons the third approach used in JUNET and 
mentioned later in the discussion of case (1) should be 
introduced here.  Insert the text:

"A third approach is used for Japanese mail.  Japanese 
characters are represented by pairs of octets with the high 
order bit cleared. Switching between 14 bit character sets
and 7 bit character sets is indicated within the message by 
ISO 2022 escape sequences."
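
As a rough illustration of this third approach, the sketch 
below uses Python's "iso2022_jp" codec.  I am assuming that this 
codec is close enough to JUNET practice for the purpose; the 
helper name and the sample text are hypothetical.

```python
ESC = b"\x1b"


def encode_with_iso2022_shifts(text: str) -> bytes:
    """Encode mixed ASCII/Japanese text using ISO 2022 escape sequences.

    Assumption: Python's 'iso2022_jp' codec stands in for the
    encoding used in Japanese mail."""
    return text.encode("iso2022_jp")


sample = "Hello \u65e5\u672c\u8a9e"  # "Hello " followed by three kanji
wire = encode_with_iso2022_shifts(sample)

# Every octet on the wire has the high-order bit cleared.
print(all(octet < 0x80 for octet in wire))
# The switch into the two-octet set and back to ASCII is marked
# inside the message itself by escape sequences.
print(ESC + b"$B" in wire, wire.endswith(ESC + b"(B"))
```

The point is that such a message passes an ordinary 7 bit 
transport untouched, and that the tagging is internal to the 
message rather than agreed out of band.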

>     ... So long as these implementations can directly
>     communicate and have a private agreement on the use of a specific
>     character set, without benefit of tagging, basic mail service can
>     be provided.

Actually it's not quite as bad as this text indicates.  Basic 
mail interoperability can be achieved in the absence of both 
private agreement and explicit tagging, when US-ASCII is mixed 
with one national ISO 646 variant, as the full-scale practical 
experience of the Nordic countries shows.  (As a matter of fact 
this is one reason why it was possible at all for SMTP/RFC822 
instead of X.400 to become the de facto email standard in 
Northern Europe.)  But this aspect is probably not necessary to 
cover in this overview of the problem.
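
For readers unfamiliar with this practice, here is a small 
Python sketch of how the same 7 bit octets read under US-ASCII 
and under a national variant.  It assumes the common Swedish 
variant, in which the positions of [ \ ] { | } are displayed as 
the six national letters; the table and function names are 
mine.

```python
# Assumed mapping for the Swedish ISO 646 variant: only the six
# national letter positions are covered in this sketch.
SWEDISH_646 = str.maketrans(
    "[\\]{|}",
    "\u00c4\u00d6\u00c5\u00e4\u00f6\u00e5",  # A-umlaut, O-umlaut, A-ring (upper, then lower case)
)


def read_as_us_ascii(octets: bytes) -> str:
    """What a receiver with a US-ASCII terminal sees."""
    return octets.decode("ascii")


def read_as_swedish(octets: bytes) -> str:
    """What a receiver with a Swedish ISO 646 terminal sees."""
    return octets.decode("ascii").translate(SWEDISH_646)


wire = b"R{ksm|rg}s till lunch"
print(read_as_us_ascii(wire))
print(read_as_swedish(wire))
```

Both readings are legible to a human who knows the convention, 
which is why the untagged mixture works tolerably in practice.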

>     In transitioning to the negotiated 8bit system with MIME messages,
>     it is important that mail sent by a currently non-conforming user
>     can be read by another such user.  This functionality is reduced by
                             +++++++++
It's unclear to me if "such user" here means "a currently 
non-conforming user" or what.  In my opinion it's important 
that mail can be read _both_ by ESMTP-conforming users and 
by non-conforming users (after having passed through 
intermediate ESMTP-conforming transport agents).  I suggest 
the word "such" is dropped.

>     (1) Will work acceptably well with ISO 646 national variant ASCII
>     or ISO 2022 character set shifting if an external "out of band"
>     agreement to use a particular character set without tagging exists
                       ++++++++++++++++++++++++++++++++++++++++++
>     between the sender and the receiver.

In the case of ISO 2022 character set shifting there are of 
course several character sets used and they are tagged 
internally by escape sequences.  It might be better to remove 
the indicated part of this paragraph.

>     (4) Will work if a reasonable upgrade path is provided via gateways
>     and the indicated character set tag inserted by the gateway is
>     correct and the receiver supports the character set chosen by the 
>     sender. 

Since this document describes interoperability problems in 
all the cases (1) - (5) but only provides a solution in one 
case, (4), I suggest this addition:
"Such an upgrade path is described in section 3."

>     (5) Because the ESMTP/MIME sender cannot know that the receiver
>     will understand 8 bits, the sender will encode the text into
>     base-64 or quoted-printable which may be considered "garbled" by
>     the receiver. 

This is the converse problem to case (4).  Something should be 
said about the fact that it is not attacked in this document, 
maybe:
"To provide a useful downgrade path the gateway must have some 
knowledge about the capabilities of the receiver.  Possible 
solutions fall outside the scope of this document."
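
To show what "garbled" means for a receiver without MIME 
support, the following sketch encodes one ISO 8859-1 word with 
Python's standard quopri and base64 modules.  The sample word 
is my own choice.

```python
import base64
import quopri

# One word of ISO 8859-1 text containing two octets with the
# high-order bit set.
body = "Sm\u00f6rg\u00e5sbord".encode("iso-8859-1")

as_quoted_printable = quopri.encodestring(body)
as_base64 = base64.b64encode(body)

# A non-MIME receiver displays these octets literally.
print(as_quoted_printable.decode("ascii"))
print(as_base64.decode("ascii"))
```

With quoted-printable the ASCII letters survive and only the 
national letters turn into "=XX" sequences; with base-64 
nothing is readable without decoding.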

> 3. Upgrade Path
> 
>     - - -
> 
>     A site may "Upgrade" to MIME en-masse by implementing MIME
>     conversion for all messages leaving the site.  The conversion can
>     be done by adding a mime-version header and a content-type/text
>     header with the character set in use in the site. 
                  ++++++++++++++++++++++++++++++++++++
Nowadays this simple case is becoming rare.  Add at the end:
", in case use of other character sets at the site is negligible"

>     Example:
> 
>        MIME-Version: 1.0
>        Content-Type: Text/Plain; Charset = "ISO-8859-1"  

Why quotes around ISO-8859-1?  Since they are unnecessary, this 
may be confusing for a naive reader.

>        Content-Transfer-Encoding: 8bit
>        Content-Description: Untagged text converted to MIME.

I don't think it's a good idea to include the 
Content-Description: line in this example.  This information 
can be deduced by the receiving program from the trace 
information.  Content-Description: fields should, in my opinion, 
be used to provide information that the human receiver can 
understand and use.  The information in this example would be 
totally incomprehensible for most receivers.

>     If no information is available, the gateway should upgrade the
                                          +++++++
This statement is true also for a site whose MTA upgrades 
outgoing messages from non-MIME-compliant UAs to MIME en-masse. 
Also, it's not clear what kind of information is lacking in this 
case.  I suggest this rewording:
"If no information about the probable content type or character 
set of an outgoing message is available, the transport agent 
should ..."

>     content by using the character set "unknown-8bit".  Unknown-8bit

I never understood why the value is "unknown-8bit" instead of 
plain "unknown".  Will an "unknown-7bit" be defined by a future 
RFC?  The fact that the upgraded message contains octets with 
the high order bit set is already reflected by the value of the 
Content-Transfer-Encoding: field.

>     states that the character set is only understandable by with external
                                                           +++++++
This sentence needs some improvement.  I suggest:
"The unknown-8bit value of the charset parameter indicates only 
that no reliable information about the character set(s) used in 
the message was available."
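
The upgrade rule discussed in this section could be sketched as 
follows.  This is only my reading of the draft, not a reference 
implementation: the function name, the header representation 
and the "site_charset" parameter are hypothetical, and 7 bit 
messages are deliberately left alone since the draft offers no 
solution for them.

```python
from typing import List, Optional, Tuple

Header = Tuple[str, str]


def upgrade_to_mime(headers: List[Header], body: bytes,
                    site_charset: Optional[str] = None) -> List[Header]:
    """Return the header list for an outgoing message, adding MIME
    fields when an untagged message contains 8 bit octets."""
    names = {name.lower() for name, _ in headers}
    if "mime-version" in names:
        return list(headers)  # already MIME, nothing to upgrade
    if not any(octet >= 0x80 for octet in body):
        return list(headers)  # 7 bit content: outside this sketch
    # No reliable information about the character set: say so.
    charset = site_charset if site_charset else "unknown-8bit"
    return list(headers) + [
        ("MIME-Version", "1.0"),
        ("Content-Type", "Text/Plain; charset=" + charset),
        ("Content-Transfer-Encoding", "8bit"),
    ]


original = [("From", "user@example.org"), ("Subject", "lunch")]
for name, value in upgrade_to_mime(original, b"Sm\xf6rg\xe5s"):
    print(name + ": " + value)
```

A site where one character set dominates would pass it as 
"site_charset"; a gateway without such knowledge would fall 
back to "unknown-8bit".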

>     information. MIME specifies that a MIME message with no character
>     set specified is defined to be US-ASCII.

This last sentence may be puzzling to a reader without an 
intimate knowledge about MIME.  It isn't even certain that it 
is completely true.  RFC 1341, section 7.1.1, says:

:             Unlike some  other  parameter  values,  the  values  of  the
:             charset  parameter  are  NOT  case  sensitive.   The default
:             character set, which must be assumed in  the  absence  of  a
:             charset parameter, is US-ASCII.

So it doesn't _define_ these MIME-compliant messages to use 
any particular character set, it only states what is to be 
_assumed_ about the character set (lacking other information). 
But I don't think it is necessary to tackle this complicated 
question in this document.  Very little would be lost by 
omitting the last sentence.

> Appendix - The "unknown-8bit" Character Set
> 
>    - - -
> 
>    The interpretation of the "unknown-8bit" is up to the mail reader.
>    It is assumed that the human user will be able to interpret the
>    information and choose an appropriate character set or
>    pre-processor.

The last sentence is unnecessarily strong.  We can't assume 
that _any_ human receiver will _always_ be able to choose an 
appropriate character set.  Better write: "It is assumed that 
in many cases the human user will be ..."

> References
> 
> - - -
> 
>     [4]  M.T. Rose, E.A. Stefferud, D.H. Crocker.  SMTP Service
>          Extensions for 8bit cleanliness.  Internet-Draft, 
		          ++++++++++++++++
The title has been changed to "SMTP Service Extension for 
8bit-MIMEtransport".

Finally I want to say that it's a pity that the handling of 
8-bit data in message header fields isn't treated in this 
document.  Of course octets with the high order bit set are 
as frequent on the Subject: line as on the message body lines 
in today's "8-bit SMTP" mail, so a gateway that wants to upgrade 
them to 8-bit ESMTP must do something with them.  The obvious 
thing to do is to use the encoding of RFC 1342 when such 
characters appear in places in header fields where RFC 822 
specifies "text", "ctext", or "phrase".  (If they appear in 
other places, the message should probably be bounced.)
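
A gateway could handle the Subject: case roughly as in the 
sketch below.  It relies on Python's email.header module, which 
implements the later revision of the RFC 1342 encoded-word 
syntax, so I am assuming the output is equivalent for this 
purpose.  The function name and the requirement that the 
gateway knows the character set are my own assumptions.

```python
from email.header import Header, decode_header


def encode_subject(raw: bytes, assumed_charset: str) -> str:
    """Turn the octets of an unstructured header field into a form
    that is safe for 7 bit header transport.

    Assumption: the gateway knows (or guesses) the character set."""
    if all(octet < 0x80 for octet in raw):
        return raw.decode("ascii")  # nothing to encode
    text = raw.decode(assumed_charset)
    return Header(text, charset=assumed_charset).encode()


raw_subject = b"H\xe4lsningar fr\xe5n Stockholm"
encoded = encode_subject(raw_subject, "iso-8859-1")
print(encoded)

# A MIME-aware receiver can recover the original octets.
decoded_octets, charset = decode_header(encoded)[0]
print(decoded_octets == raw_subject, charset)
```

The sketch only covers "text" in an unstructured field; 
"ctext" and "phrase" inside structured fields need a parser 
that knows where such tokens begin and end.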

--
Olle Jarnefors, Royal Institute of Technology, Stockholm <ojarnef@admin.kth.se>