Re: Some text that may be useful for the update of RFC 2376
On Thu, 16 Mar 2000, Martin J. Duerst wrote:
> Please be careful. There are no multiple flavors of Unicode.
> There are multiple flavors of conversion tables between
> Japanese legacy encodings and Unicode.
I take your point, but it sidesteps the issue of
what someone is supposed to do when they type in a yen sign and
their "UTF-*"-labelled data comes out with the code point
for "\" (U+005C) being used. In that case, strictly speaking,
they have used the wrong mapping table and have corrupted their
data; but we can give them a way to escape into the
bliss of standard Unicode by labelling the variant encoding
they have effectively used.
The proposed Japanese Profile for XML, for which Murata-san
has been the leading light, says that there need to be
extra IANA-registered character sets to cover this problem.
I don't think anyone is proposing the variant encoding
as a thing to be recommended. But, in the context of
XML, an IANA-registered name gives a way to uncorrupt
the data: rather than legitimizing the variant encoding,
I think it makes explicit that the data has a particular
problem that requires a particular remedy (i.e., transcoding
the couple of characters that are at issue).
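
As a minimal sketch of that remedy (not anything taken from the
Japanese Profile itself): the label name below is hypothetical, not an
IANA-registered charset, and the repair table lists only the
yen/backslash pair discussed in this thread. A receiver that sees the
variant label decodes as usual and then transcodes the affected
character.

```python
# Hypothetical label for UTF-8 data in which U+005C was meant as a yen sign.
# This name is an illustration only; it is not a registered charset.
VARIANT_LABEL = "x-example-utf-8-yen-at-5c"

# Characters at issue. Only the pair discussed in this thread is listed;
# a real profile might name others.
REPAIR_TABLE = str.maketrans({"\u005C": "\u00A5"})  # REVERSE SOLIDUS -> YEN SIGN


def decode_labelled(data: bytes, label: str) -> str:
    """Decode UTF-8 bytes, repairing them if they carry the variant label."""
    text = data.decode("utf-8")
    if label.lower() == VARIANT_LABEL:
        # The label states that every U+005C in this data stands for yen.
        return text.translate(REPAIR_TABLE)
    # Plain "utf-8": trust the code points as they are.
    return text


if __name__ == "__main__":
    raw = b"\\100"
    print(decode_labelled(raw, "utf-8"))
    print(decode_labelled(raw, VARIANT_LABEL))
```

The repair is only safe because the label asserts that every U+005C
in the data was intended as a yen sign; without the label there is no
way to tell a corrupted yen from a genuine backslash.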
The other alternative is for the Unicode Consortium
to make the code position occupied by "\" also be
occupied by the yen sign, as a strange kind of variant
which only applies in Japanese-sourced data, I suppose.
Not very attractive!