From: Jean-Marc Desperrier (jean-marc.desperrier@certplus.com)
Date: Thu Jul 25 2002 - 06:37:40 CDT
Charles Lindsey wrote:
>+ attempt to interpet the header according to whatever other character
>+ set can be deduced, or has been configued as a default by the reader.
>
configured.
>! NOTE: It is possible to determine, with a high degree of
>! accuracy, when a given text containing octets with the 8th bit
>! set was not encoded using UTF-8, and using this test to recover
>! such non-compliant texts is therefore commended where no other
>! harm could arise.
>
A verdict that the text was not encoded as UTF-8 is 100% accurate: text
that fails UTF-8 validation cannot be UTF-8. The important point is the
converse: when the given text is not encoded as UTF-8, the test detects
it with more than 99.9% accuracy. So the test properly separates UTF-8
from non-UTF-8 text, even in situations where non-UTF-8 text is much
more frequent than UTF-8 text.
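
To illustrate the test, here is a minimal sketch in Python. It assumes
a strict UTF-8 decode is an acceptable validity check; the function
names and the ISO-8859-1 fallback are hypothetical choices made for the
example, not something the draft specifies.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the octets form a well-formed UTF-8 sequence."""
    try:
        # A strict decode rejects any malformed sequence, so a False
        # result here is never wrong about the text not being UTF-8.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def decode_header(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode header octets, falling back to a configured default.

    The fallback charset is an assumption for illustration; a real
    reader would use whatever it can deduce or has been configured with.
    """
    if looks_like_utf8(data):
        return data.decode("utf-8")
    return data.decode(fallback, errors="replace")
```

For example, the octets b"caf\xe9" (ISO-8859-1) fail the test and go
through the fallback, while the UTF-8 encoding of the same word is
decoded as UTF-8. Pure ASCII passes the test trivially, which is
harmless since it decodes identically either way.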