Re: Prohibition of EBCDIC in text/plain



At 2:34 PM 6/8/95, Patrik Faltstrom wrote:
>At 09.35 95-06-08, Harald.T.Alvestrand@uninett.no wrote:
>>This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
>>character sets.
>
>I assumed that what is called UCS-2 is the same thing as what
>in "The Unicode Standard, Version 1.1, Appendix F", is called
>FSS-UTF, i.e. Filesystem Safe UCS Transformation Format.
>
>If this is true, I read the encoding rules to mean that all
>characters 0x00 to 0x7F are encoded as themselves (as one-byte
>characters) and that all other characters are encoded as
>two-, three-, four-, five-, or six-byte sequences. All the bytes
>in the multibyte sequences have their 8th bit set.
>
>It seems to me that text in this encoding can actually be sent
>as a text/plain message, because CR, LF, and NUL are encoded
>as themselves and those bit patterns do not occur in the
>multibyte encodings of the other characters.
>
>If I am wrong, please let me know.
>

No, UCS-2 is the 16-bit form of Unicode. FSS-UTF is now called UTF-8, and
it's an official annex to ISO 10646. You are correct: UTF-8 is compatible
with the MIME text/plain content type. So is UTF-7 (see RFC 1642). However,
straight Unicode (UCS-2), EBCDIC, and other character sets that do not
contain US-ASCII as a subset are not compatible, unfortunately.
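
The difference can be checked with a short Python sketch. It is only an illustration under stated assumptions: the sample string is made up, "utf-16-be" stands in for UCS-2 (the two agree for characters in the Basic Multilingual Plane), and "cp500" is just one of several EBCDIC code pages.

```python
# Illustrative only: TEXT is an arbitrary sample, "utf-16-be" stands in for
# UCS-2, and "cp500" is one EBCDIC code page among several.
TEXT = "Hi \u00e4\r\n"  # ASCII letters, a space, one non-ASCII letter, CR LF

utf8 = TEXT.encode("utf-8")
ucs2 = TEXT.encode("utf-16-be")
ebcdic = TEXT.encode("cp500")

def high_bit_only(ch: str, codec: str) -> bool:
    """True if every byte in the encoding of ch has its 8th bit set."""
    return all(b >= 0x80 for b in ch.encode(codec))

# UTF-8: ASCII characters (including CR and LF) keep their ASCII byte values,
# and the bytes of a multibyte sequence never collide with them.
print("UTF-8 :", utf8.hex(" "))
print("  multibyte bytes all >= 0x80:", high_bit_only("\u00e4", "utf-8"))

# UCS-2: every ASCII character is paired with a 0x00 byte, so NUL octets
# appear throughout the body and CR LF is no longer the two-byte 0D 0A.
print("UCS-2 :", ucs2.hex(" "))
print("  contains NUL bytes:", b"\x00" in ucs2)

# EBCDIC: letters sit at different byte values than in US-ASCII.
print("EBCDIC:", ebcdic.hex(" "))
print("  'H' byte matches ASCII:", ebcdic[0] == ord("H"))
```

Running it shows the UTF-8 bytes for "Hi", the space, and CR LF unchanged from US-ASCII, while the other two encodings change every one of them.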

----------------------------
David Goldsmith
david_goldsmith@taligent.com
Senior Scientist
Taligent, Inc.
10201 N. DeAnza Blvd.
Cupertino, CA  95014-2233