We do Bayesian stats work as part of a different project and have a filter based on that (proprietary, I'm afraid), although there is POPFile on SourceForge, which is OK.
Bayesian statistical analysis is probably one step past SpamAssassin but still has the following inherent problems, which apply to all forms of spam filter. First, if the filter is publicly available (as it must effectively be to be used) then you can craft spam and test it against the filter(s). Regardless of what they are looking for and how they rate spam, messages in the form:
Dear Name
RE: Your recent blah blah blah
Thanks for your enquiry. Blah blah blah. Please take the time to have a look at:
http://blah.com/cgi-bin/special_offer?name=Name&code=AGERSDGFTGER
I wish you all the best in your endeavour.
Kind Regards
John Smith
Director
Blah.com
Street Address
Phone Number
Fax Number
Mobile Number
BLAH Making it happen
http://blah.com
foo@blah.com
The information transmitted may be confidential, is intended only for the
person to which it is addressed, and may not be reviewed, retransmitted,
disseminated or relied upon by any other persons. If you received this
message in error, please contact the sender and destroy any paper or
electronic copies of this message. Any views expressed in this email
communication are those of the individual sender, except where the
sender specifically states otherwise. Blah does not represent,
warrant or guarantee that the communication is free of errors, virus or
interference.
are statistically next to impossible to pick. The problem with the basic mail protocol is that you can forge headers, i.e. there is no way to validate the sending server. Given this, you can more or less craft your emails so they will pass any spam filter.
Messages like this are the new face of spam. They are still spam, but crafted to look like a standard, valid (perhaps corporate) reply. It will be next to impossible to stop mail in this form.
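To make the statistical point concrete, here is a minimal sketch of a toy Bayesian token filter. It is not our filter, nor POPFile's or anyone else's algorithm: the class name, the six training messages and the equal-priors assumption are all invented for illustration. It just shows how a message built from ordinary business-reply vocabulary ends up looking like ham.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class ToyBayesFilter:
    """Hypothetical toy filter: per-token counts with Laplace smoothing."""

    def __init__(self):
        self.spam = Counter()   # token -> number of spam messages containing it
        self.ham = Counter()    # token -> number of ham messages containing it
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        tokens = set(tokenize(text))  # count each token once per message
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def spam_probability(self, text):
        # Sum per-token log likelihood ratios, assuming equal priors.
        log_odds = 0.0
        for tok in set(tokenize(text)):
            p_spam = (self.spam[tok] + 1) / (self.n_spam + 2)
            p_ham = (self.ham[tok] + 1) / (self.n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up training data, purely for illustration.
f = ToyBayesFilter()
for msg in ("buy cheap pills now", "cheap loans click now", "win free money now"):
    f.train(msg, is_spam=True)
for msg in ("thanks for your enquiry regards",
            "please have a look at the report",
            "kind regards and all the best"):
    f.train(msg, is_spam=False)

obvious = "buy cheap pills now"
crafted = ("Thanks for your enquiry. Please take the time to have a look at "
           "our special offer. Kind Regards")
print(f.spam_probability(obvious) > 0.5)   # classic spam scores high
print(f.spam_probability(crafted) < 0.5)   # the crafted reply scores like ham
```

The crafted message contains none of the tokens the filter learned from spam, so every token either counts towards ham or is neutral. A spammer who can run the filter locally can keep rewording until that is the case.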
As a result, challenge-response with whitelist passthrough is probably where this will end up in the medium term. Then of course the spammers will implement respond bots and the cycle will continue.
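A rough sketch of what a challenge-response gate with whitelist passthrough could look like follows. The class and method names are hypothetical, everything is held in memory, and the actual sending of the challenge email is left out.

```python
import secrets


class ChallengeResponseGate:
    """Hypothetical gate: whitelisted senders pass, everyone else is challenged."""

    def __init__(self, whitelist=()):
        self.whitelist = {addr.lower() for addr in whitelist}
        self.pending = {}  # challenge token -> (sender, held message)

    def receive(self, sender, message):
        sender = sender.lower()
        if sender in self.whitelist:
            return ("delivered", None)
        # Unknown sender: hold the message and issue a one-off token.
        token = secrets.token_hex(8)
        self.pending[token] = (sender, message)
        return ("challenged", token)

    def confirm(self, token):
        """Release the held message if the token is valid, else return None."""
        held = self.pending.pop(token, None)
        if held is None:
            return None
        sender, message = held
        self.whitelist.add(sender)  # future mail from this sender passes through
        return message


gate = ChallengeResponseGate(whitelist=["friend@example.org"])
print(gate.receive("friend@example.org", "lunch?"))   # passes straight through
status, token = gate.receive("foo@blah.com", "special offer")
print(status)                                         # held pending a reply
print(gate.confirm(token))                            # released once answered
```

Note that `confirm` only proves that something at the sending address can read the challenge and echo the token back. A respond bot does exactly that, which is why the cycle continues.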
What is needed is a modification to the underlying protocol so that there is an inbuilt challenge-response or security key of some form, letting the recipient server query the supposed sending server to see if it was really the source of the message. If you can do that, you can work blacklists of spam servers far more effectively.
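As a sketch of the idea only: below, the supposed sending server keeps a record of what it sent, and the recipient asks it whether a given message really came from there. Nothing like this exists in the basic mail protocol. The `directory` dict is a made-up stand-in for however a real recipient would locate and query the sending server, and all class and function names are hypothetical.

```python
import hashlib


def digest(body):
    """Fingerprint of a message body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class SendingServer:
    """Hypothetical server that remembers which messages it actually sent."""

    def __init__(self, domain):
        self.domain = domain
        self.sent = set()  # (message id, body digest) pairs

    def send(self, message_id, body):
        self.sent.add((message_id, digest(body)))
        return {"from_domain": self.domain, "id": message_id, "body": body}

    def did_send(self, message_id, body_digest):
        return (message_id, body_digest) in self.sent


def verify_origin(message, directory):
    """Ask the claimed sending server whether it was really the source."""
    server = directory.get(message["from_domain"])
    if server is None:
        return False  # nobody answers for that domain
    return server.did_send(message["id"], digest(message["body"]))


blah = SendingServer("blah.com")
directory = {"blah.com": blah}  # assumption: stands in for a real lookup

genuine = blah.send("msg-1", "Thanks for your enquiry.")
forged = {"from_domain": "blah.com", "id": "msg-2", "body": "Special offer!"}

print(verify_origin(genuine, directory))  # True
print(verify_origin(forged, directory))   # False
```

Once a forged "from" no longer verifies, spam has to come from servers that admit to sending it, and those are the ones a blacklist can actually target.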
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print