
Re: IMAP extensions needed for SPAM/HAM and WHITE/BLACK listing




On Mon Jul  6 09:37:06 2009, Iljitsch van Beijnum wrote:

On 6 jul 2009, at 10:24, Dave Cridland wrote:

But if you want this, I'd say that it needs to be a fractional thing, not a binary spam/no spam indication. For instance, the server could give something a spam score of 2 and the client also 2, and together that would be 4, so the message is presumed to be spam (assuming the spam threshold is 3). In a binary system, though, no spam OR no spam = no spam.
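The arithmetic in that example can be sketched as follows. This is only an illustration of the two combination rules being contrasted, not a proposed design: the function names are made up here, and treating "above the threshold" as a strict comparison is an assumption.

```python
SPAM_THRESHOLD = 3  # threshold taken from the example above


def combined_by_score(server_score, client_score, threshold=SPAM_THRESHOLD):
    """Fractional scheme: add the two scores, then compare once."""
    return server_score + client_score > threshold


def combined_by_verdict(server_score, client_score, threshold=SPAM_THRESHOLD):
    """Binary scheme: each filter decides alone, then the verdicts are ORed."""
    server_says_spam = server_score > threshold
    client_says_spam = client_score > threshold
    return server_says_spam or client_says_spam
```

With a score of 2 from each side, the summed score of 4 crosses the threshold, while neither filter flags the message on its own, so the binary OR lets it through.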

The notion of using two spam filters in concert to attempt to make an overall decision is basically flawed, because it fits into one or other of the cases above - if you do go to the effort of having a range-based spamminess from one, you'd have to use it as mere input into the other's decision process, since combining them naïvely would produce poorly weighted results.

Right. The filtering would have to be complementary. So perhaps there should be an indication of the type of filtering done, too. The reason to combine server and client side approaches is that servers can easily do blacklists, and running any type of filtering on the server gives it the opportunity to reject the message during the SMTP session, creating a good error message for real senders but no useless bounces to innocent third parties in the case of spoofed senders. Filtering on the client is useful because the client can typically better afford to run CPU-intensive stuff like Bayesian filters, and the client probably has a better list of previous correspondents that it can use to whitelist.

Whitelists should, in principle, be easily communicable to the server (and easily processed).

Bayesian filters on the client are only really useful if the same client is always used - this is often not the case (and, in my experience, is becoming rarer). Otherwise the quality of your spam filtering radically changes depending on the client you happen to be using at the time.

And basically, any filtering at all that can be done on the server saves a vast amount of bandwidth, and makes use of email on smaller devices (like mobile phones) instantly more useful.

The one massive advantage that a client has is direct contact with the user, and the eyeball remains the best spam filtering technology we have.

I'm not suggesting that clients must not do spam filtering - obviously they do, and it's often useful - but I think we should aim to make server-side filtering the best we can. A very simple uniform feedback mechanism of whether the user thinks a message is spam or not makes this very much easier.
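As a rough illustration of what such a feedback mechanism might look like from the client side, here is a minimal sketch. It assumes the verdict is recorded as the IMAP keywords `$Junk` and `$NotJunk` on the message; that choice is an assumption for illustration, not something this thread has settled on, and whether a server feeds those keywords back into its filter is entirely server-specific. `report_spam_feedback` is a hypothetical helper, and `conn` is assumed to be an authenticated `imaplib.IMAP4` connection with a mailbox already selected read-write.

```python
def report_spam_feedback(conn, uid, is_spam):
    """Record the user's spam/not-spam verdict on one message.

    Assumption: the verdict is carried by the $Junk / $NotJunk keywords.
    conn is expected to behave like imaplib.IMAP4 (only .uid() is used).
    """
    if is_spam:
        add, remove = "$Junk", "$NotJunk"
    else:
        add, remove = "$NotJunk", "$Junk"
    # Set the keyword matching the user's verdict...
    conn.uid("STORE", str(uid), "+FLAGS", "(%s)" % add)
    # ...and clear the opposite one so the two never coexist.
    conn.uid("STORE", str(uid), "-FLAGS", "(%s)" % remove)
```

Because the verdict lives on the server with the message, every client the user picks up sees the same state, which is the point made earlier about per-client filters.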

Dave.
--
Dave Cridland - mailto:dave@xxxxxxxxxxxx - xmpp:dwd@xxxxxxxxxxxxxxxxx
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade