The Next Step
Folks,
In this message I will attempt to distill the discussions on this
list, and focus them a bit.
1) I would like to follow up on Jan Michael Rynning's comment that
8bit/no line length restrictions are not binary. It is becoming
clear that one possible goal, sending binary via SMTP, is not as
simple as it at first appeared to many of us. Besides the CRLF
translations, existing implementations written assuming a maximum
line length present real obstacles to deploying modified mailers.
We may wish to define a second new transmission mode, BINARY, to
open the door to multi-media mail. At this point, someone needs to
convince me and this list that this is worth the rather extensive
overhaul of the system. For me, this is becoming the 821/822 <=>
X.400 breaking point.
In all circumstances, any change to the standards that violates
current specifications must be negotiated. This may
be by version number, or by explicit feature negotiation. This is
just plain good sense, and is likely to be enforced by the IESG or
the IAB.
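
As a rough illustration of explicit feature negotiation, here is a
minimal sketch in Python. The mode names ("BINARY", "8BIT", "7BIT")
and the function name are hypothetical labels chosen for this
example; they are not protocol keywords from any specification. The
only idea shown is that a sender uses an extended mode only when the
receiver has announced it, and otherwise falls back to what current
specifications already allow.

```python
# Hypothetical mode labels for illustration only; not protocol keywords.
FALLBACK_MODE = "7BIT"  # behaviour permitted by the current specifications


def negotiate_mode(preferred: list[str], receiver_supports: set[str]) -> str:
    """Return the first mode the sender prefers that the receiver announced.

    If the receiver announced none of them (for example, an unmodified
    mailer that announces nothing), fall back to the existing 7 bit mode.
    """
    for mode in preferred:
        if mode in receiver_supports:
            return mode
    return FALLBACK_MODE


if __name__ == "__main__":
    # A receiver that only announced 8 bit support.
    print(negotiate_mode(["BINARY", "8BIT"], {"8BIT"}))
    # An old receiver that announced nothing.
    print(negotiate_mode(["BINARY", "8BIT"], set()))
```

Negotiation by version number would look similar, with the set of
supported features implied by the version rather than listed.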
2) We need to define a new standard character set to replace US ASCII.
To answer John Klensin, it is my own belief that using SMTP to
shift character sets in and out is inappropriate. That is much
more a message format level function. Now, I'm not opposed to
using a character set that itself escape-shifts pages in and out, or a
multi-byte character set that retains ASCII compatibility (whatever
that means).
a) My personal preference is to have a character set that is not limited to
western characters (Latin-1). However, if someone makes a good
argument, I may be led to believe that multi-byte character sets
can (and should?) be encoded in the standard 7 or 8 bit character set.
b) With the 8 bit systems, a standard mechanism should be defined
in SMTP transport to convert into a 7 bit representation without data
loss. This is important. Information loss will prevent 8 bit
systems from ever being used for efficient transport,
multi-media, and encoded data, because 7 bit systems will continue to
exist.
I have not heard much discussion on how to do this. From
an "old" UA point of view, most of the conversions I've heard
of that cause no info loss will be totally unintelligible. At
least I can guess at missing letters in text with info loss.
Current ideas are Rynning's TEX-HEX and ISO 2022 (??). I would
welcome a summary of available encoding technology and ideas.
Implicit in these ideas is a determination on the primary type of use
this system will have, and whether it is optimized for
human-readability, or for efficiency of data transport. Specific
ideas on these trade-offs and encodings are solicited.
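
To make the "no data loss" requirement in b) concrete, here is a
minimal sketch in Python of one possible 8 bit to 7 bit conversion.
It is a hypothetical hex-escape scheme invented for this example, not
any of the proposals named above: bytes with the high bit set (and
the escape character itself) become "=" followed by two hex digits,
and everything else passes through unchanged. It leans toward
human-readability, since plain ASCII text stays legible on an "old"
UA, at the cost of tripling the size of each escaped byte.

```python
ESCAPE = ord("=")  # hypothetical escape character, chosen for illustration


def encode_7bit(data: bytes) -> bytes:
    """Encode arbitrary 8 bit data so every output byte is below 0x80."""
    out = bytearray()
    for b in data:
        if b >= 0x80 or b == ESCAPE:
            # Escape high-bit bytes and the escape character itself.
            out += b"=%02X" % b
        else:
            out.append(b)
    return bytes(out)


def decode_7bit(text: bytes) -> bytes:
    """Invert encode_7bit, recovering the original bytes exactly."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == ESCAPE:
            # The two characters after the escape are hex digits.
            out.append(int(text[i + 1:i + 3].decode("ascii"), 16))
            i += 3
        else:
            out.append(text[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    original = "na\u00efve caf\u00e9".encode("latin-1")
    encoded = encode_7bit(original)
    print(encoded)
    print(decode_7bit(encoded) == original)
```

A scheme aimed at efficiency of data transport rather than
readability would make a different choice here, for example packing
bits densely so that the result is unintelligible but smaller.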
3) I would like to put up the one paragraph strawman. If this is
acceptable, I will put it in the tentatively decided pile.
"Current changes to SMTP will include the elimination of the 7 bit
restriction in text, and the specification of a character set to
represent the 8 bit information. The specification of a Binary
mode for SMTP is the subject for possible future work. Binary data,
and character sets unsupported by the standard character set will
be handled by encoding defined in the message format documents."
Again, I'm looking for specific ideas:
1) Is a decision to change the SMTP specification to use 8 bit
w/ no line length changes acceptable?
2) One or more summaries of available character sets, as well as an
evaluation of how (if at all) they handle non-western characters.
3) An encoding mechanism to convert between 8 bit and 7 bit systems
with no data loss.
Thanks,
Greg Vaudreuil
Internet Mail Extensions Chair.