PerlMonks  

Re^2: Most of the email spam I get is:

by MarkusLaker (Beadle)
on Jan 05, 2005 at 00:47 UTC (#419476)


in reply to Re: Most of the email spam I get is:
in thread Most of the email spam I get is:

My experiments with Bayesian filtering were a wash; after training ifile on my entire very large corpus of mail, I found that I had to continually go through the whole spam bin for false positives.

I did the same thing when I first came to Bayesian filtering, but that's not the way to get the best results out of it. Filtering is more accurate if you simply correct its mistakes as they occur than if you preload it with an existing corpus.
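To make the "correct its mistakes as they occur" approach concrete, here is a minimal sketch of train-on-error. It assumes a toy naive Bayes classifier; `TinyBayes` and `train_on_error` are hypothetical names made up for illustration, and this is not how ifile or any particular filter is implemented.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes filter for illustration only (hypothetical, not ifile)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Record the message's words under the given label.
        self.words[label].update(text.lower().split())
        self.msgs[label] += 1

    def classify(self, text):
        vocab = len(set(self.words["spam"]) | set(self.words["ham"])) or 1
        total_msgs = sum(self.msgs.values())
        scores = {}
        for label in ("ham", "spam"):
            # Laplace-smoothed log prior plus per-word log likelihoods.
            score = math.log((self.msgs[label] + 1) / (total_msgs + 2))
            n = sum(self.words[label].values())
            for word in text.lower().split():
                score += math.log((self.words[label][word] + 1) / (n + vocab))
            scores[label] = score
        # Ties go to "ham", so an untrained filter never discards mail.
        return "spam" if scores["spam"] > scores["ham"] else "ham"


def train_on_error(bayes, text, true_label):
    """Classify first; train only when the guess was wrong."""
    guess = bayes.classify(text)
    if guess != true_label:
        bayes.train(text, true_label)
    return guess
```

The filter starts empty and learns only from messages it got wrong, rather than from a preloaded corpus.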

There's much more information about Bayesian filtering at Paul Graham's site.

Markus


Re: Most of the email spam I get is:
by jonadab (Parson) on Jan 05, 2005 at 22:14 UTC
    Filtering is more accurate if you simply correct its mistakes as they occur

    If I have to correct false positives as they occur, this so-called "filtering" is no good to me at all, because it means I have to go through all the spam. Worse than useless. My existing filtering system is significantly better, because I am confident that 100.000% of everything filtered into the spam folders is, in fact, worthless junk. Additionally, *most* of my legitimate mail is filtered into various spam-free folders based on topic, list, sender or whatever. The only mail I have to sort by hand is the stuff that lands in my inbox (because none of my filters pick it up).

    I don't want to correct my filter's errors continually. If I have to do that, it's not doing its job at ALL; *I* would be doing 100% of the filter's job, then.
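For comparison, the rule-based setup described above might look roughly like the following sketch. The `sort_message` function, the rule format, and the example headers, addresses and folder names are all assumptions made up for illustration; the post does not say which mail software or rules are actually in use.

```python
def sort_message(headers, rules, default="inbox"):
    """Return the folder of the first matching rule, or the default folder.

    headers: dict mapping header name to value.
    rules:   list of (header, substring, folder) tuples, checked in order.
    """
    for header, needle, folder in rules:
        if needle.lower() in headers.get(header, "").lower():
            return folder
    # No rule matched: leave the message in the inbox for hand sorting.
    return default


# Hypothetical rules: sort by list and sender, plus one known-junk pattern.
RULES = [
    ("List-Id", "perl-list.example.org", "lists/perl"),
    ("From", "boss@example.com", "work"),
    ("Subject", "cheap pills", "spam"),
]
```

Because a message is only moved when a hand-written rule matches it, nothing reaches the spam folder by statistical guesswork, and anything unmatched stays in the inbox.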
