From: Bruce Lilly (blilly@erols.com)
Date: Sun Feb 23 2003 - 10:43:08 CST
J.B. Moreno wrote:
> On 2/22/03 1:49 AM, Bruce Lilly at <blilly@erols.com> wrote:
>>Detailed in another message recently posted; in summary, one expects
>>about a 50% error rate with iso-8859-x in the mix; a bit
>>more with some other charsets as well.
>
>
> Everything in use on usenet (which definitely includes iso-8859-x), was part
> of "the mix", and as you've been told before, the test *was* done on data
> with short text.
And as expected, about 50% of the sequences which validated as
utf-8 were in fact not utf-8 at all.
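
To illustrate why validity alone proves little, here is a minimal
sketch in Python. It is not the test discussed above and says nothing
about the 50% figure; the helper name is_valid_utf8 and the sample
bytes are assumptions chosen for illustration. It shows a short
iso-8859-1 string whose bytes also happen to form a well-formed
utf-8 sequence.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes are well-formed utf-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Two iso-8859-1 characters: A-tilde (0xC3) followed by copyright sign (0xA9).
latin1_text = "\u00c3\u00a9"
raw = latin1_text.encode("iso-8859-1")  # the two bytes 0xC3 0xA9

# The same two bytes are also a well-formed utf-8 encoding of e-acute,
# so a validity check cannot tell which charset the sender meant.
ambiguous = is_valid_utf8(raw)
as_utf8 = raw.decode("utf-8")

# Not every iso-8859-1 string collides: a lone 0xE9 at the end of the
# data is a truncated utf-8 sequence and is rejected.
lone_high_byte = "caf\u00e9".encode("iso-8859-1")
rejected = not is_valid_utf8(lone_high_byte)
```

With short text, a handful of bytes like these is all a detector has
to go on, which is where the ambiguity bites hardest.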