in reply to Re: From a SpamAssassin developer
in thread Bayesian Filtering for Spam
On mixed types of messages, the results are (unsurprisingly to me) not as accurate as Paul describes.
The most important thing to remember when doing anything with probabilities is to keep your training and validation data sets separate. I get the feeling that Paul isn't doing that when calculating his statistics. I get zero false positives too when I validate against the training data set.
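As a rough illustration of keeping the two sets apart, here is a minimal Python sketch. It is not Paul's algorithm, nor anything taken from SpamAssassin or AI::Categorize: the function names (split_corpus, train, looks_like_spam, false_positive_rate) and the crude token-vote classifier are made up for the example, and a real filter would combine per-token probabilities instead.

```python
import random

def split_corpus(messages, holdout_fraction=0.25, seed=0):
    """Shuffle labelled (text, is_spam) pairs and split them into
    disjoint training and validation lists."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]

def train(messages):
    """Count, per token, how many spam and ham messages contain it."""
    counts = {}
    for text, is_spam in messages:
        for token in set(text.lower().split()):
            spam_n, ham_n = counts.get(token, (0, 0))
            if is_spam:
                counts[token] = (spam_n + 1, ham_n)
            else:
                counts[token] = (spam_n, ham_n + 1)
    return counts

def looks_like_spam(counts, text):
    """Hypothetical stand-in classifier: each known token votes +1 if it
    was seen more often in spam, -1 if more often in ham. Unknown tokens
    are ignored. This is a toy rule, not a Bayesian combination."""
    score = 0
    for token in set(text.lower().split()):
        spam_n, ham_n = counts.get(token, (0, 0))
        score += (spam_n > ham_n) - (spam_n < ham_n)
    return score > 0

def false_positive_rate(counts, messages):
    """Fraction of ham messages that are wrongly flagged as spam."""
    ham = [text for text, is_spam in messages if not is_spam]
    if not ham:
        return 0.0
    return sum(looks_like_spam(counts, text) for text in ham) / len(ham)
```

To use it, call split_corpus once on the labelled corpus, pass only the training half to train, and report false_positive_rate on the validation half. Running false_positive_rate on the training half as well shows how much rosier the numbers look when the classifier has already seen the messages it is being scored on.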
However, on the plus side, his system stores significantly less data than the pure Bayesian one used in AI::Categorize. So I'll probably switch over to using his instead.
I'll probably post some of the code to the SpamAssassin list later today, in case someone wants to play with it some more.