Re: Unknown character sets
On Sun, 22 Nov 1992, Jonathan Laventhol wrote:
> There are these cases where the receiver gateway gets a message
> and has to convert to MIME:
>
> 1) Receiver knows nothing, except 7-bit
> If the sender is in the Internet (whatever that
> might mean) it is forced by RFC-1341 to regard this as
> "Content-type: text/plain; Charset=US-ASCII",
> and it specifically notes that it might be
> some horrible uuencoded thing: we are still
> going to regard it as text -- because it
> *is* text to us (albeit funny text describing
> some binary file).
>
> If the sender isn't in the Internet, then
> the gateway is also a not-necessarily-822 to
> MIME gateway, in which case it might want
> (according to Greg's suggestion) to send
> "Content-type: text/plain; Charset=unknown".
> This seems fine by me; see stuff below about
> the MUA.
You are right that RFC-1341 requires the sender to send only
"Content-type: text/plain; Charset=US-ASCII", but that is not what
happens in reality. I think the only way to label such a message is to
always use "Content-type: text/plain; Charset=unknown". It has nothing
to do with the type of connection the sender has to the Internet.
The problem is all of the ISO 646 codes that are used for email in
Swedish, plus US-ASCII when the letter is in English. The mail gateway
would have to distinguish between email in Swedish and email in English,
and that cannot be done except by looking for certain words and
"guessing" what the mail contains.
In fact, Peter and Olle are working on a program like that.
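To make the "guessing" idea concrete, here is a rough sketch of a
word-based heuristic. It is not Peter and Olle's program, which I have
not described here; the word lists, the function name guess_7bit_charset
and the returned labels are all assumptions made for illustration. The
Swedish hints include words as they look when a 646 variant is read as
US-ASCII (for example "{r" and "f|r").

```python
# Hypothetical word lists; a real program would need far larger ones.
SWEDISH_HINTS = {"och", "att", "det", "inte", "f|r", "{r"}
ENGLISH_HINTS = {"the", "and", "that", "with", "not", "is"}


def guess_7bit_charset(text):
    """Guess a charset label for 7-bit text by counting hint words.

    Returns "SEN_850200_B" (assumed label for the Swedish 646 variant),
    "US-ASCII", or "unknown-7bit" when the evidence is tied or absent.
    """
    # Strip common punctuation, but keep { | } since they may be letters.
    words = [w.strip('.,;:!?"()') for w in text.lower().split()]
    swedish = sum(w in SWEDISH_HINTS for w in words)
    english = sum(w in ENGLISH_HINTS for w in words)
    if swedish > english:
        return "SEN_850200_B"
    if english > swedish:
        return "US-ASCII"
    return "unknown-7bit"
```

A guess like this can easily be wrong on short or mixed-language
messages, which is why the tie case falls back to an "unknown" label
rather than claiming US-ASCII.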
If all unknown email is labelled Charset=US-ASCII, you might be lying
about the contents. But if you write Charset=unknown (or even better:
Charset=unknown-7bit) and then have your MIME client treat that just like
Charset=US-ASCII, you'll get the same result, but in a more gentle way.
In the same way you can have Charset=unknown-8bit and treat that as
Charset=ISO-8859-1.
This applies when you have no idea which character set the sender
has used.
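As a minimal sketch of the client-side handling suggested above: the
"unknown" labels are mapped to a fallback charset before decoding, and
any other label is passed through unchanged. The names
FALLBACK_CHARSETS, effective_charset and decode_body are made up for
this example, and the "unknown" labels are only a proposal, not
something RFC-1341 defines.

```python
# Proposed "unknown" labels and the charset a client would treat them as.
FALLBACK_CHARSETS = {
    "unknown": "us-ascii",
    "unknown-7bit": "us-ascii",
    "unknown-8bit": "iso-8859-1",
}


def effective_charset(label):
    """Return the charset to decode with for a given Charset= label."""
    key = label.strip().lower()
    # Labels that are not "unknown" variants are used as they are.
    return FALLBACK_CHARSETS.get(key, key)


def decode_body(raw, label):
    """Decode a message body, replacing bytes the charset cannot map."""
    return raw.decode(effective_charset(label), errors="replace")
```

With this, decode_body(b"F\xe4ltstr\xf6m", "unknown-8bit") is decoded as
ISO-8859-1, while the header still tells the truth about what the
gateway actually knew.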
Patrik F{ltstr|m
NADA, KTH
Stockholm, Sweden