
Re: IMAP extensions needed for SPAM/HAM and WHITE/BLACK listing




On Sun Jul  5 20:03:30 2009, Iljitsch van Beijnum wrote:
On 3 jul 2009, at 1:05, George Michaelson wrote:

As an example, at the moment if I wish to inform Google that I have known spam in local folders, I have to go to the web interface and tag it manually. If there were an IMAP extension, I could review my local Bayesian junk folder, remove all non-spam (and flag the senders as white-listed if need be), and request that the rest be flagged as spam back on the IMAP-backed MS.

Doesn't moving the spam messages to the spam folder accomplish this already? That's what my client does.


There's also a proposal (expired?) to use keywords to signal spamminess.

But if you want this, I'd say that it needs to be a fractional thing, not a binary spam/no spam indication. For instance, the server could give something a spam score of 2 and the client also 2 and together that would be 4 so the message is presumed to be spam (assuming the spam threshold is 3), but in a binary system no spam OR no spam = no spam.

Of course, if both client and server use precisely the same criteria, you've simply halved your threshold.
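To make the arithmetic above concrete, here is a minimal sketch of the two ways of combining verdicts. The function names are made up for illustration, and treating "at or above the threshold" as spam is an assumption; only the scores (2 and 2) and the threshold (3) come from the example.

```python
SPAM_THRESHOLD = 3  # threshold used in the example above


def combined_fractional(server_score, client_score, threshold=SPAM_THRESHOLD):
    """Add the two scores, then compare the sum against the threshold."""
    return server_score + client_score >= threshold


def combined_binary(server_score, client_score, threshold=SPAM_THRESHOLD):
    """Each side decides on its own; the verdicts are then OR-ed together."""
    return server_score >= threshold or client_score >= threshold


print(combined_fractional(2, 2))  # True: 2 + 2 = 4 reaches the threshold
print(combined_binary(2, 2))      # False: neither 2 reaches it alone
```

With identical criteria on both sides the sum is just twice one score, which is the "halved threshold" effect mentioned above.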

There are two cases:

1) The server has some feedback-based spam detection mechanism, like a Bayesian filter. You want to teach the server's filter about your explicit decisions.

2) Your client has a spam filter (of some sort), and you want the server to tell your client about its preliminary findings.

The notion of using two spam filters in concert to reach an overall decision is basically flawed, because any such setup falls into one or other of the cases above. If you do go to the effort of getting a range-based spamminess score from one filter, you'd have to use it as mere input into the other's decision process, since combining the two naïvely would produce poorly weighted results.
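As a rough illustration of "mere input", the client filter might treat the server's score as one more weighted feature rather than adding it straight onto its own. This is only a sketch under assumptions: the logistic form, the function name, and the weights and bias are all hypothetical, and a real filter would learn its weights from feedback rather than hard-code them.

```python
import math


def client_spam_probability(local_score, server_score,
                            w_local=1.0, w_server=0.4, bias=-3.0):
    """Hypothetical client-side decision: the server's score is one feature.

    The weights and bias are placeholders, not values from any real filter.
    """
    z = bias + w_local * local_score + w_server * server_score
    return 1.0 / (1.0 + math.exp(-z))


# The server's opinion nudges the result, but the client decides how much
# it counts for; setting w_server to 0 ignores the server entirely.
print(client_spam_probability(2.0, 2.0))
print(client_spam_probability(2.0, 2.0, w_server=0.0))
```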

A simple keyword approach isn't quite as good as ranges, but it does have one useful property: nearly every service already offers arbitrary keywords, so the mechanism is quite likely to be available already.
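For what it's worth, the keyword approach covers both cases above with nothing beyond ordinary STORE and FETCH. Below is a minimal sketch using Python's imaplib. The keyword names $Junk and $NotJunk are an assumption (nothing in this thread fixes the names), the helper functions are made up for illustration, and conn is assumed to be a logged-in imaplib connection with a mailbox already selected.

```python
import imaplib

# Assumed keyword names; this thread does not settle on any.
JUNK = "$Junk"
NOT_JUNK = "$NotJunk"


def mark_spam(conn, uid, is_spam):
    """Case 1: record the user's explicit decision as a keyword on the server."""
    add, remove = (JUNK, NOT_JUNK) if is_spam else (NOT_JUNK, JUNK)
    # Clear the opposite keyword first so the two never coexist.
    conn.uid("STORE", uid, "-FLAGS", f"({remove})")
    return conn.uid("STORE", uid, "+FLAGS", f"({add})")


def server_verdict(conn, uid):
    """Case 2: read the server's preliminary finding.

    Returns True (junk), False (not junk), or None (no opinion recorded).
    """
    typ, data = conn.uid("FETCH", uid, "(FLAGS)")
    if typ != "OK" or not data or not isinstance(data[0], bytes):
        return None
    flags = imaplib.ParseFlags(data[0])  # tuple of bytes, e.g. (b'\\Seen',)
    if JUNK.encode() in flags:
        return True
    if NOT_JUNK.encode() in flags:
        return False
    return None
```

Whether the server actually feeds these keywords into its filter is a separate question; the sketch only shows that the signalling itself needs no new extension.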

Ranges, on the other hand, require the use of annotations, which is a rather more complex area.

Dave.
--
Dave Cridland - mailto:dave@xxxxxxxxxxxx - xmpp:dwd@xxxxxxxxxxxxxxxxx
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade