
Re: XML Guidelines -04




Just a short comment about UTF-8 vs. UTF-16:

In my personal opinion, it would be unfortunate 
not to support UTF-16.

I can see these problems:

1) Let us say there is a communication where the end points are 
UTF-8-restricted processors. There is no guarantee that there is no 
intermediary on the way which uses a standard XML processor, accepting 
the message and re-emitting it, maybe enhanced with some other 
information. If there is no XML declaration specifying the encoding, the 
processor can choose which one to use for the re-emitted result, so it may 
pick UTF-16, which a UTF-8-only end point could not read.
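
To illustrate point 1, here is a minimal sketch in Python, using the 
standard xml.etree.ElementTree module as a stand-in for the intermediary. 
The function name re_emit, the "via" attribute and the sample message are 
my own assumptions for illustration only; a real intermediary may behave 
differently.

```python
import xml.etree.ElementTree as ET


def re_emit(message: bytes, encoding: str) -> bytes:
    """Parse an incoming XML message and serialize it again.

    Hypothetical intermediary: it adds one attribute and then writes
    the result in whatever encoding it prefers.
    """
    root = ET.fromstring(message)
    root.set("via", "intermediary")  # hypothetical "enhancement"
    return ET.tostring(root, encoding=encoding)


# Incoming message: UTF-8, no XML declaration.
incoming = "<msg>text</msg>".encode("utf-8")

# The intermediary is free to choose UTF-16 for its output.
outgoing = re_emit(incoming, "utf-16")

# A UTF-8-only end point cannot decode the re-emitted bytes.
try:
    outgoing.decode("utf-8")
    readable_as_utf8 = True
except UnicodeDecodeError:
    readable_as_utf8 = False
```

A general XML processor can still read the re-emitted message, because 
the UTF-16 output starts with a byte order mark; only the UTF-8-restricted 
end point is locked out.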

2) There can be a substantial penalty for Asian and other 
communities not using ASCII-related character sets. I have seen an 
estimate that average Chinese text uses about 3 bytes per character in 
UTF-8, compared with 2 bytes in UTF-16, so the size of the data to be 
transmitted can rise by 50% just by using UTF-8 instead of UTF-16, and 
I suppose that this penalty may be much worse for some other language 
groups. As I expect that XML protocols will often be used for transfer 
of textual data, which can be quite large, this can be a very important 
criterion. 
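
The arithmetic behind point 2 can be checked with a small Python sketch. 
The helper name encoded_sizes and the sample string are assumptions of 
mine, not taken from any estimate I have seen; "utf-16-le" is used so 
that the 2-byte byte order mark is not counted.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 size, UTF-16 size) in bytes for the given text."""
    utf8_size = len(text.encode("utf-8"))
    # "utf-16-le" emits no byte order mark, so only the text is counted.
    utf16_size = len(text.encode("utf-16-le"))
    return utf8_size, utf16_size


# Four Chinese characters, chosen only as an illustration.
sample = "中文文本"
utf8_size, utf16_size = encoded_sizes(sample)

# Relative growth when UTF-8 is used instead of UTF-16.
growth = (utf8_size - utf16_size) / utf16_size
```

For this sample the sizes are 12 bytes in UTF-8 and 8 bytes in UTF-16, 
a growth of 50%. Real messages also contain ASCII markup, which takes 
1 byte in UTF-8 and 2 bytes in UTF-16, so the overall ratio depends on 
how much of the message is text.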






-- 
******************************************
<firstName> Miloslav </firstName>
<surname>   Nic      </surname>

<mail>    nicmila@xxxxxxxxxxxx    </mail>
<support> http://www.zvon.org  </support>