
Re: Whether 8-bit SMTP? And how?



> I don't want extended-SMTP to worry about character sets.  First of all,
> there's no need for this information to be kept in the message envelope unless
> SMTP implementations are going to try to convert between character sets.  This
> is a bad idea.  Consider that a message frequently passes through several
> SMTPs before reaching its destination.  Now consider that SMTP-1 might support
> character sets A and B; SMTP-2, B and C; and SMTP-3, A and C.  Now a
> message in character set A sent through this path is converted from character
> set A to B, then from B to C, and finally, perhaps, from C to A (in a
> desperate attempt to restore the original).  Each of these conversions is
> likely to result in a loss of information.  It's far better if the
> intermediate SMTPs simply *copy bits*, and if any necessary conversions take
> place on the ends (during posting/delivery or by the UAs).

There are surely ways of doing character set conversions
which are guaranteed to not lose information. I think your
arguments are building on false assumptions.
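
As a minimal sketch of what I mean by a conversion that cannot lose
information: convert through a character set that is a superset of
both ends, and refuse (rather than guess) when the target cannot
represent a character. The use of Unicode/UTF-8 as the superset and
the helper name convert() are my assumptions for illustration only;
nothing in this thread prescribes them.

```python
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Decode from src and re-encode in dst.

    Error handling is strict by default, so a character that dst cannot
    represent raises an exception instead of being silently replaced.
    """
    return data.decode(src).encode(dst)


# A Latin-1 text containing three non-ASCII letters (a-ring, ae, o-slash).
original = "Bl\u00e5b\u00e6rgr\u00f8d".encode("iso-8859-1")

# A -> superset -> A: the round trip restores every byte.
via_superset = convert(original, "iso-8859-1", "utf-8")
restored = convert(via_superset, "utf-8", "iso-8859-1")
assert restored == original

# Converting to a smaller set is detected, not silently lossy.
try:
    convert(original, "iso-8859-1", "ascii")
    lossy_conversion_detected = False
except UnicodeEncodeError:
    lossy_conversion_detected = True
assert lossy_conversion_detected
```

The point is only that loss is a property of the chosen target set,
not of conversion as such.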

I think your simple solution "simply" leads to simple chaos.
A lot of people would simply not be able to read messages coming
from other places. We simply lose the current interoperability
of internet mail, IMHO.

> * How to convert from 8-bit to 7-bit?
> 
> Most of the discussions I've seen on this list with respect to 8-to-7 bit
> conversion have assumed that the conversion should take place with no loss of
> information, usually by having the sender-SMTP encode the entire message as
> 7-bit characters, to be decoded somewhere down the line.  Once again, this
> makes SMTP complicated.  Instead of having to deal with a single type of
> message, SMTPs would have to keep track of what kind of message is being sent.
> It has to know, for instance, whether a message being sent is already in 7-bit
> format, so it won't try to specially encode an already 8-bit clean message. 
> Furthermore, it should probably try to distinguish between a message that is
> 8-bits encoded in 7-bits, and another that is plain 7-bits, so it can perform
> the reverse encoding when receiving a message from a 7-bit-only system.  This
> requires that the receiving-SMTP parse the message header of an incoming
> message to determine whether the message is encoded -- it cannot do the
> conversion "on-the-fly"  (otherwise, how does a present-day SMTP
> implementation, that knows nothing about message encodings, tell the receiver
> SMTP how this message is encoded?).

Still some false assumptions: This can be handled quite easily.
You could decode the 7-bit code into some intermediate form
and then encode it into 7-bit (or 8-bit) again. This could be done
on the fly.

A present-day dumb SMTP implementation could just forward the
header that identifies the encoding to the receiver SMTP; passing
headers through unchanged is the default.
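
To illustrate the "on the fly" claim, here is a rough sketch that
decodes a 7-bit encoded body one line at a time into raw octets (the
intermediate form) and re-encodes it, holding at most one output line
in memory. Base64 is used purely as a stand-in for whatever 7-bit
encoding is eventually chosen, and the function names are hypothetical.
It assumes each input line is independently decodable, which holds for
base64 lines whose length is a multiple of four.

```python
import base64
from typing import Iterable, Iterator


def decode_lines(lines: Iterable[bytes]) -> Iterator[bytes]:
    """Decode base64 text line by line into raw 8-bit chunks."""
    for line in lines:
        line = line.strip()
        if line:
            yield base64.b64decode(line)


def encode_chunks(chunks: Iterable[bytes], width: int = 57) -> Iterator[bytes]:
    """Re-encode raw chunks as base64 lines of `width` input bytes each.

    Only the bytes of the line currently being filled are buffered.
    """
    buf = b""
    for chunk in chunks:
        buf += chunk
        while len(buf) >= width:
            yield base64.b64encode(buf[:width]) + b"\r\n"
            buf = buf[width:]
    if buf:
        yield base64.b64encode(buf) + b"\r\n"


# Every possible octet value, encoded as 7-bit text by the sender.
body = bytes(range(256))
incoming = base64.encodebytes(body).splitlines()

# 7-bit -> 8-bit: hand the raw octets to an 8-bit capable peer.
as_8bit = b"".join(decode_lines(incoming))
assert as_8bit == body

# 7-bit -> intermediate form -> 7-bit again, with a shorter line width.
reencoded = b"".join(encode_chunks(decode_lines(incoming), width=30))
assert base64.b64decode(reencoded) == body
```

Neither direction needs the whole message in memory.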

> My proposal is as follows:  if a sender-SMTP finds that the receiver-SMTP
> cannot accept 8-bit SMTP, it should (a) zero the 0x80 bit on all bytes sent,
> and (b) add a header something like:
> 
> Data-Conversion-Warning:  0x80 bit stripped while sending message from
>     smarthost.com to stupid-7-bit-only-host.edu.  Some information may be lost.

Can't we really do something better? (sigh)
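
A small sketch of what zeroing the 0x80 bit does to ISO 8859-1 text
shows why. The function name strip_high_bit() is hypothetical, and the
sample word is my own:

```python
def strip_high_bit(data: bytes) -> bytes:
    """Zero the 0x80 bit of every byte, as in the quoted proposal."""
    return bytes(b & 0x7F for b in data)


# ISO 8859-1 text with a-ring (0xE5), ae (0xE6) and o-slash (0xF8).
original = "Bl\u00e5b\u00e6rgr\u00f8d".encode("iso-8859-1")
stripped = strip_high_bit(original)

# Each national letter turns into an unrelated ASCII letter.
assert stripped == b"Blebfrgrxd"

# The mapping is not reversible: 0xE6 and a genuine "f" become identical.
assert strip_high_bit(b"\xe6") == strip_high_bit(b"f") == b"f"
```

The receiver gets plausible-looking but wrong text and cannot tell
which letters were altered, so the warning header does not help it
recover anything.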

Keld Simonsen