Re: Reviewing philosophies and assumptions



Some notes on Akira Kato's message about the Japanese use of ISO 
2022 code extension for mail:

*  Because Japanese mail uses only the G0 character set, the 
   only way to switch between the three character codes is to 
   use the three-character escape sequences
   ESC 2/8 4/2    to ASCII
   ESC 2/8 4/10   to JIS-Roman
   ESC 2/4 4/2    to JIS-Kanji
   (In hex the escape sequences are 1B 28 42, 1B 28 4A, 1B 24 42.)
   Other ways of switching between character codes in ISO 2022, 
   e.g. the use of SO and SI, are not utilized.

*  JIS-Roman is an ordinary ISO 646-conformant 7-bit code.  It 
   has only one important difference from 7-bit ASCII:  The yen 
   sign is available (at the expense of backslash).

*  JIS-Kanji is a 7+7-bit character code.  After the three-byte 
   escape sequence ESC 2/4 4/2 each pair of bytes is interpreted 
   as one of 6877 characters according to the Japanese standard 
   JIS C 6226-1983.

*  Since bytes corresponding to control characters in ASCII are 
   not used after switching to JIS-Kanji with ESC 2/4 4/2, one 
   has to switch back to ASCII with ESC 2/8 4/2 before ending 
   the line with CR LF.  This means that each line starts in the 
   state "ASCII".
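
The notes above can be sketched in a few lines of Python.  This is 
only an illustration, not a complete ISO 2022 decoder.  The names 
code, ESCAPES, split_line and kuten are made up for this example.  
Two things are assumptions of mine rather than statements from the 
notes: that a line ending in any state other than "ASCII" is 
rejected, and that each byte of a JIS-Kanji pair lies in the 
range 2/1 to 7/14.

```python
ESC = 0x1B


def code(position: str) -> int:
    """Convert column/row notation such as '2/8' to a byte value."""
    column, row = position.split("/")
    return int(column) * 16 + int(row)


# The three escape sequences listed in the notes (hypothetical table name).
ESCAPES = {
    bytes([ESC, code("2/8"), code("4/2")]): "ASCII",       # 1B 28 42
    bytes([ESC, code("2/8"), code("4/10")]): "JIS-Roman",  # 1B 28 4A
    bytes([ESC, code("2/4"), code("4/2")]): "JIS-Kanji",   # 1B 24 42
}


def split_line(line: bytes):
    """Split one line (without CR LF) into (state, payload) segments.

    Every line is taken to start in the state "ASCII".  Assumption:
    a line that does not end in "ASCII" is treated as an error.
    """
    state = "ASCII"
    segments = []
    start = i = 0
    while i < len(line):
        if line[i] == ESC:
            name = ESCAPES.get(line[i:i + 3])
            if name is None:
                raise ValueError("unsupported escape sequence")
            if i > start:
                segments.append((state, line[start:i]))
            state = name
            i += 3
            start = i
        else:
            i += 1
    if start < len(line):
        segments.append((state, line[start:]))
    if state != "ASCII":
        raise ValueError("line must switch back to ASCII before CR LF")
    return segments


def kuten(pair: bytes):
    """Row and cell numbers (1..94) of one two-byte JIS-Kanji character."""
    first, second = pair
    if not (0x21 <= first <= 0x7E and 0x21 <= second <= 0x7E):
        raise ValueError("byte outside the range 2/1 to 7/14")
    return first - 0x20, second - 0x20
```

For example, split_line(b"Hi \x1b$B$3$s\x1b(B!") yields an ASCII 
segment b"Hi ", a JIS-Kanji segment b"$3$s" and an ASCII segment 
b"!".  The pair b"$3" is row 4, cell 19 of the JIS-Kanji table.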