Re: Transformation of Non-ASCII headers


From: J.B. Moreno (planb@newsreaders.com)
Date: Sat Feb 22 2003 - 08:58:48 CST


On 2/22/03 1:49 AM, Bruce Lilly at <blilly@erols.com> wrote:

> Andrew Gierth wrote:
>
>> Failing to pay attention when your statistical error is explained to
>> you is a good sign that you're not really interested in the truth, and
>> you only want to extract figures that support your preselected
>> position.
>
> I have in fact paid quite close attention; I simply disagree.
>
>> Bruce> the ratio of false positives is due to the fact that coded
>> Bruce> utf-8 generates octet sequences which are not markedly
>> Bruce> different from other 8-bit charsets, especially on short
>> Bruce> texts.
>
> Detailed in another message recently posted; in summary, one expects
> about a 50% error rate with iso-8859-x in the mix, and a bit
> more with some other charsets as well.

Everything in use on Usenet (which definitely includes iso-8859-x) was part
of "the mix", and, as you've been told before, the test *was* done on data
with short text.
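
For illustration only, and not the test discussed in this thread: a minimal
Python sketch of the kind of check at issue, namely whether a short run of
8-bit octets is well-formed utf-8. The function name looks_like_utf8 and the
sample strings are hypothetical, chosen just for this example.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data contains 8-bit octets and is well-formed UTF-8."""
    if all(b < 0x80 for b in data):
        return False  # pure ASCII: nothing to classify either way
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return True


# Genuine utf-8 text is accepted.
utf8_text = "café".encode("utf-8")

# The same word in iso-8859-1 ends in a lone 0xE9 octet, which is a
# utf-8 lead byte with no continuation octets, so it is rejected.
latin1_text = "café".encode("latin-1")

# A false positive needs iso-8859-1 text whose high octets happen to form
# a valid lead/continuation pair, such as the two characters "Ã©".
latin1_lookalike = "Ã©".encode("latin-1")
```

Running such a check over a sample of real headers, and counting how often
non-utf-8 text passes, is one way the false-positive rate could be measured
rather than estimated.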

-- 
J.B. Moreno



