
Re: Some text that may be useful for the update of RFC 2376



In message "Re: Some text that may be useful for the update of RFC 2376",
Rick Jelliffe wrote...

 >That is not the way I remember it. Application/xml was taken out at
 >one stage, and we had to lobby hard to get it put in. The reason
 >users want it is not to prevent xml from being spuriously displayed:
 >it is to ensure end-to-end integrity because the out-of-band approach
 >has so far failed to provide that integrity. 

I remember that you have claimed this several times, but I have never 
agreed.  

XML people thought that application/xml was required because text/* does 
not allow UTF-16.  Since we started this mailing list, we have learned 
more from Ned.

 >And what should the type be if it has parts that are readable and
 >parts that are unreadable?  If the document is encoded with a lot
 >of numeric character references, it is unreadable as text/plain
 >but readable as text/xml: should we send documents that use 
 >many numeric character entities as application/xml?
 >
 >We need a way to ensure end-to-end integrity. 

I do not agree.  Why?

 >Out-of-band signalling of the encoding of a file is to some extent a hack to
 >cope with formats that are not adequately self-describing.  I would have
 >no problem with removing text/xml entirely: we don't need to negotiate
 >encoding since everything can be resolved into Unicode 

Although XML is based on Unicode, we certainly require negotiation and 
transcoding.  Some XML processors can handle only US-ASCII, ISO 8859-1, UTF-8, 
and UTF-16, yet there are many legacy encodings in the world.  Negotiation 
and on-the-fly transcoding are certainly good things for I18N.  If we hide 
encoding information from the protocol, such transcoding becomes hard or 
even impossible.
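
To illustrate why the protocol needs to know the encoding, here is a rough
sketch in Python of what an intermediary might do when it transcodes an XML
entity on the fly.  The function name transcode_xml is hypothetical, and the
sketch assumes the charset names are ones Python's codecs recognise, that
the source charset was supplied by the protocol (for example, a charset
parameter), and that the entity has no leading byte order mark left after
decoding.  It is not a complete or authoritative implementation.

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_ENCODING_DECL = re.compile(
    r'^(<\?xml[^>]*?encoding=)(["\'])[A-Za-z][A-Za-z0-9._-]*\2'
)


def transcode_xml(body: bytes, source_charset: str, target_charset: str) -> bytes:
    """Re-encode an XML entity and keep its encoding declaration in step."""
    # Decode using the charset the protocol reported for the entity.
    text = body.decode(source_charset)
    # Rewrite the declared encoding, if there is an XML declaration at all.
    text = _ENCODING_DECL.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{target_charset}{m.group(2)}",
        text,
        count=1,
    )
    return text.encode(target_charset)


original = '<?xml version="1.0" encoding="Shift_JIS"?><doc>日本語</doc>'.encode(
    "shift_jis"
)
converted = transcode_xml(original, "shift_jis", "utf-8")
print(converted.decode("utf-8"))
```

Without a charset supplied out of band, the intermediary would first have to
parse the entity itself to learn source_charset, which is exactly the work
that hiding the encoding from the protocol forces on it.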

 >(and, in any case,
 >there is no mechanism currently for an XML parser to feed information
 >about which encodings it accepts to the HTTP system to set up the 
 >preferences in the first place.)

HTTP already has the Accept-Charset request header field.  I do not 
understand your claim.
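
As a rough sketch of how a server could use that field, the Python below
parses an Accept-Charset value and picks one of the encodings the server is
able to produce.  The function names parse_accept_charset and choose_charset
are hypothetical, and the sketch deliberately ignores the HTTP/1.1 special
case that gives ISO-8859-1 an implicit quality of 1 when it is not listed.

```python
def parse_accept_charset(header: str) -> list[tuple[str, float]]:
    """Split an Accept-Charset value into (charset, quality) pairs."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        quality = 1.0  # a charset listed without q= has quality 1
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as unacceptable
        prefs.append((name.strip().lower(), quality))
    return prefs


def choose_charset(header: str, available: list[str]):
    """Return the available charset the client prefers most, or None."""
    prefs = dict(parse_accept_charset(header))
    wildcard = prefs.get("*", 0.0)  # "*" covers every charset not listed
    best, best_quality = None, 0.0
    for charset in available:
        quality = prefs.get(charset.lower(), wildcard)
        if quality > best_quality:
            best, best_quality = charset, quality
    return best


print(choose_charset("iso-8859-5, utf-8;q=0.8, *;q=0.1", ["UTF-8", "Shift_JIS"]))
```

An XML processor that accepts only a few encodings would have its HTTP
client send those names in this field, and the server (or a proxy) could
then transcode to the chosen one.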

Cheers,

----
MURATA Makoto  muraw3c@xxxxxxxxxxxxx