Re: Re: Re: Re: Enough is Enough - Taking the fight back to the Internet scammers

Actually, I find it interesting to note that the Bayesian spam filter I use catches these types of emails All Day Long (tm). I was curious how it was picking these out from legitimate email, so I started analyzing the messages and realized the highest-scoring spam tokens were being grabbed from the headers. Sure, headers can easily be modified, but most spammers apparently aren't that sophisticated. They use common tools with standard headers (usually advertising the tool they are using), which are very easy for the filter to catch.
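For anyone wondering what "grabbing tokens from the headers" might look like, here is a minimal sketch in Python. It is not the filter described above: the tagging scheme (`headername*word`), the function names, the 0.4 score for unseen tokens, and the 0.01/0.99 clamps are all assumptions chosen for illustration, and `BulkBlaster` in the usage note is a made-up mailer name.

```python
import re
from collections import Counter
from email import message_from_string

WORD = re.compile(r"[A-Za-z0-9.$'!-]+")

def header_tokens(raw_message):
    """Return tokens taken from the headers only, tagged with the header name.

    Tagging keeps 'x-mailer*foo' distinct from the same word in the Subject.
    """
    msg = message_from_string(raw_message)
    tokens = []
    for name, value in msg.items():
        for word in WORD.findall(str(value)):
            tokens.append(f"{name.lower()}*{word.lower()}")
    return tokens

def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Rough estimate of P(spam | token) from per-corpus token counts.

    spam_counts / ham_counts are Counters of token occurrences;
    n_spam / n_ham are the number of messages in each corpus.
    """
    s = spam_counts[token] / max(n_spam, 1)
    h = ham_counts[token] / max(n_ham, 1)
    if s + h == 0:
        return 0.4  # assumed neutral-ish score for a token never seen before
    # Clamp so that no single token is treated as absolute proof.
    return min(0.99, max(0.01, s / (s + h)))
```

A header such as `X-Mailer: BulkBlaster 2.0` would yield the tokens `x-mailer*bulkblaster` and `x-mailer*2.0`. If the first of those shows up in most of the spam corpus and never in the ham corpus, it scores at the 0.99 clamp, which is the "tool advertising itself" effect.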

As the spammers realize this and start using tools that are harder to catch, there will still be things like MTA versions and hostnames added to the emails along the path. Perhaps certain MTAs with bad default options will begin to stand out as likely spam relays. Perhaps certain IP blocks will begin to stand out, too. Who knows? The great thing is I won't have to think about this. The filter will figure it out automatically.
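As a rough illustration of how path information could become filter tokens, the sketch below pulls the "from" hostname, each IPv4 address, and a coarse /24-style network prefix out of the Received headers. The token prefixes (`received*from:`, `received*ip:`, `received*net:`) and the function name are assumptions made up for this example, and real Received lines vary far more than this simple pattern handles.

```python
import re
from email import message_from_string

IPV4 = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")
FROM_HOST = re.compile(r"\bfrom\s+(\S+)")

def received_tokens(raw_message):
    """Turn each Received header (one per hop) into tokens a filter could score."""
    msg = message_from_string(raw_message)
    tokens = []
    for hop in msg.get_all("Received", []):
        match = FROM_HOST.search(hop)
        if match:
            tokens.append(f"received*from:{match.group(1).lower()}")
        for a, b, c, d in IPV4.findall(hop):
            # Full address, plus the first three octets so that a whole
            # block of addresses can accumulate evidence together.
            tokens.append(f"received*ip:{a}.{b}.{c}.{d}")
            tokens.append(f"received*net:{a}.{b}.{c}")
    return tokens
```

Fed into the same counting scheme as any other token, a network prefix that keeps appearing in spam would drift toward a high score without anyone writing a rule for it.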


Re: Re: Re: Re: Re: Enough is Enough - Taking the fight back to the Internet scammers
by tachyon (Chancellor) on Oct 29, 2003 at 00:27 UTC

As you say, the headers are extremely valuable, but only for the moment. Because the protocol LETS you forge them, forging will eventually become the norm, and then with a suitably crafted body even Bayes won't cut it anymore.

When we look at some of the tokens that our Bayes widgets work with and find significant, we often go 'huh?' The fact that we don't really understand WHY some of these tokens are significant does not matter one whit. They are statistically significant and thus, at the end of the day, Just Work.
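To make the "Just Work" point concrete, here is a minimal sketch of the usual naive Bayes combining step, assuming each token already has a spam probability strictly between 0 and 1. The function name and the `top_n=15` cutoff are assumptions for illustration, not any particular filter's settings. Note that the code never looks at what a token is, only at how far its score sits from 0.5.

```python
import math

def combine(probabilities, top_n=15):
    """Combine per-token spam probabilities into one message score.

    Only the top_n tokens farthest from 0.5 (the most 'interesting' ones)
    are used. Every probability must be strictly between 0 and 1.
    """
    interesting = sorted(probabilities, key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    # Work in log space: prod(p) / (prod(p) + prod(1 - p)).
    log_spam = sum(math.log(p) for p in interesting)
    log_ham = sum(math.log(1.0 - p) for p in interesting)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))
```

Two strongly spammy tokens outweigh one mildly hammy token here, whether or not a human can explain why those tokens correlate with spam.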

The problem is that as Bayes gets more popular, the spammers will employ people to analyse how the filters work, and it is fairly easy to find chinks in the armour to slip a knife through, to put it in medieval terms. At that stage I suspect we will be up for a new protocol.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
