
[ISSUE] UTF-8 CRLF



The last sentence of section 5.9 reads:

  Text data is stored with <CR><LF> text endings (i.e. network-normal
  line endings).  These should be converted to native line endings by
  the receiving software.

I suggest adding:

  For the 'u' UTF-8 literal packet, the minimal (shortest) UTF-8
  encoding of the <CR><LF> line endings SHOULD be used.  That is,
  0x0D 0x0A and not 0xC0 0x8D 0xC0 0x8A or any other multibyte encoding.
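
To illustrate why 0xC0 0x8D and 0xC0 0x8A are alternative spellings of
<CR> and <LF>, here is a rough Python sketch.  The helper names
decode_two_byte and strict_rejects are made up for this example and
come from no specification or library.  The first helper decodes a
two-byte sequence permissively, without checking for the shortest
form; the second shows that Python's built-in strict decoder refuses
such bytes.

```python
def decode_two_byte(b0: int, b1: int) -> int:
    """Permissively decode a 2-byte sequence of the form 110xxxxx 10yyyyyy.

    Hypothetical helper: it deliberately does NOT reject overlong forms,
    which is what a naive decoder might do.
    """
    if (b0 & 0xE0) != 0xC0 or (b1 & 0xC0) != 0x80:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Five payload bits from the lead byte, six from the continuation byte.
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


def strict_rejects(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder refuses the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# The overlong forms carry the same code points as plain <CR> and <LF>.
overlong_cr = decode_two_byte(0xC0, 0x8D)
overlong_lf = decode_two_byte(0xC0, 0x8A)

# The minimal encoding of <CR><LF> is just the two ASCII bytes.
minimal_crlf = "\r\n".encode("utf-8")
```

Under these assumptions overlong_cr and overlong_lf come out as 0x0D
and 0x0A, while minimal_crlf is the two bytes 0x0D 0x0A.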

Rationale:

This would be a kindness to implementations that do not perform
UTF-8 -> local conversions.  Lacking a UTF-8 decoder, those
implementations cannot tell that 0xC0 0x8D, or any other multibyte
encoding, represents the same character as 0x0D.  If senders are
careful in this regard, then the non-UTF-8 implementations can at
least get the line endings right.
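
As a minimal sketch of the receiving side, assume an implementation
that treats the literal data as opaque bytes and only rewrites line
endings.  The function name to_native_line_endings and the sample
strings are invented for this example.  Plain byte replacement is safe
on well-formed UTF-8 because the bytes 0x0D and 0x0A never occur inside
a multibyte sequence.

```python
import os


def to_native_line_endings(data: bytes,
                           native: bytes = os.linesep.encode("ascii")) -> bytes:
    """Replace <CR><LF> with the native line ending, byte for byte.

    Hypothetical helper: no UTF-8 decoding is done, so only the minimal
    encoding 0x0D 0x0A is recognized as a line ending.
    """
    return data.replace(b"\r\n", native)


# Sender used the minimal encoding: the line endings are found even
# though the text contains other multibyte characters.
minimal = "héllo\r\nwörld\r\n".encode("utf-8")

# Sender used the overlong forms 0xC0 0x8D 0xC0 0x8A: a byte-level
# scanner finds no line ending here at all.
overlong = b"hello\xc0\x8d\xc0\x8aworld"

converted_minimal = to_native_line_endings(minimal, b"\n")
converted_overlong = to_native_line_endings(overlong, b"\n")
```

With the minimal encoding the conversion works without any knowledge
of UTF-8; with the overlong form the data passes through unchanged and
the line ending is lost to the receiver.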

David