Re: Reviewing philosophies and assumptions
Some notes on Akira Kato's message about the Japanese use of ISO
2022 code extension for mail:
* That Japanese mail only uses the G0 character set means that
the only way to switch between the three different character
codes is to use the three-character escape sequences
ESC 2/8 4/2 to ASCII
ESC 2/8 4/10 to JIS-Roman
ESC 2/4 4/2 to JIS-Kanji
(In hex the escape sequences are 1B 28 42, 1B 28 4A, 1B 24 42.)
Other ways of switching between character codes in ISO 2022,
e.g. the use of SO and SI, are not utilized.
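To make the switching concrete, here is a minimal Python sketch that cuts a byte string into runs, each labelled with the character code in effect. The function name split_segments and the table name DESIGNATIONS are my own, and the default starting state of ASCII is an assumption of the sketch, not something taken from a standard. Only the three escape sequences listed above are recognised; anything else after ESC is rejected.

```python
ESC = 0x1B

# The three escape sequences listed above, mapped to the code they select.
DESIGNATIONS = {
    b"\x1b(B": "ASCII",      # ESC 2/8 4/2   (1B 28 42)
    b"\x1b(J": "JIS-Roman",  # ESC 2/8 4/10  (1B 28 4A)
    b"\x1b$B": "JIS-Kanji",  # ESC 2/4 4/2   (1B 24 42)
}


def split_segments(data: bytes, initial: str = "ASCII"):
    """Split data into (character code, payload bytes) runs.

    The initial state is an assumption of this sketch.
    """
    segments = []
    current = initial
    start = 0
    i = 0
    while i < len(data):
        if data[i] == ESC:
            seq = data[i:i + 3]
            if seq not in DESIGNATIONS:
                raise ValueError(f"unsupported escape sequence at offset {i}")
            # Close the run that was in progress, if it holds any bytes.
            if i > start:
                segments.append((current, data[start:i]))
            current = DESIGNATIONS[seq]
            i += 3
            start = i
        else:
            i += 1
    if start < len(data):
        segments.append((current, data[start:]))
    return segments
```

For example, split_segments(b"abc\x1b$B\x30\x21\x1b(Bxyz") yields an ASCII run b"abc", a JIS-Kanji run holding the two bytes 30 21, and a final ASCII run b"xyz".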
* JIS-Roman is an ordinary ISO 646-conformant 7-bit code. It
has only one important difference from 7-bit ASCII: The yen
sign is available (at the expense of backslash).
* JIS-Kanji is a 7+7-bit character code. After the three-byte
escape sequence ESC 2/4 4/2 each pair of bytes is interpreted
as one of 6877 characters according to the Japanese standard
JIS C 6226-1983.
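As an illustration of the 7+7-bit structure, the sketch below turns the bytes that follow ESC 2/4 4/2 into pairs of small integers. It relies on the usual layout of such a two-byte graphic set, in which each byte lies in the range 2/1 to 7/14 (hex 21 to 7E) and the character is identified by a row and a cell number, each counted from 1. The function name kanji_pairs is hypothetical, and the sketch only checks the byte ranges; it does not check whether a given row and cell is actually one of the 6877 assigned characters.

```python
def kanji_pairs(payload: bytes):
    """Convert JIS-Kanji payload bytes into (row, cell) pairs, each 1..94."""
    if len(payload) % 2 != 0:
        raise ValueError("JIS-Kanji payload must contain whole byte pairs")
    pairs = []
    for i in range(0, len(payload), 2):
        first, second = payload[i], payload[i + 1]
        # Both bytes must be graphic positions 2/1..7/14, never controls.
        if not (0x21 <= first <= 0x7E and 0x21 <= second <= 0x7E):
            raise ValueError(f"byte pair at offset {i} is outside 21..7E")
        pairs.append((first - 0x20, second - 0x20))
    return pairs
```

With this, kanji_pairs(b"\x30\x21") gives [(16, 1)], i.e. row 16, cell 1.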
* Since bytes corresponding to control characters in ASCII are
not used after switching to JIS-Kanji by ESC 2/4 4/2, one has
to switch back to ASCII by ESC 2/8 4/2 before ending the
line with CR LF. This means that each line starts in the state