PerlMonks  

Re: Re: Enough is Enough - Taking the fight back to the Internet scammers

by castaway (Parson)
on Oct 28, 2003 at 08:28 UTC ( [id://302657]=note )


in reply to Re: Enough is Enough - Taking the fight back to the Internet scammers
in thread Enough is Enough - Taking the fight back to the Internet scammers

I'm slowly getting the feeling that things like SpamAssassin aren't enough. Recently I've been getting several bits of spam that only score between 1.2 and 3.0 on the SA scale, and so still get delivered. (I bet I get real mail that scores worse than that.)

So much so that I'm considering making a list of addresses from which I will accept mail, and having everything else directed to a delete box, where it will be deleted if I don't add the sender's address to my list.
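The accept-list plus delete box described above could be sketched roughly as follows. This is only an illustration: the `HoldBox` class, its method names and the 14-day retention period are assumptions made for the example, not part of any existing tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class HoldBox:
    """Accept mail from listed senders; park everything else until it expires."""
    whitelist: set = field(default_factory=set)
    hold_days: int = 14                       # assumed retention period
    held: list = field(default_factory=list)  # (sender, message, arrived) tuples

    def deliver(self, sender, message, now):
        """Return 'inbox' for listed senders, otherwise hold the message."""
        sender = sender.strip().lower()
        if sender in self.whitelist:
            return "inbox"
        self.held.append((sender, message, now))
        return "held"

    def approve(self, sender):
        """Add a sender to the list and release whatever they have waiting."""
        sender = sender.strip().lower()
        self.whitelist.add(sender)
        released = [m for s, m, _ in self.held if s == sender]
        self.held = [h for h in self.held if h[0] != sender]
        return released

    def purge(self, now):
        """Delete held mail older than hold_days; return how many were dropped."""
        cutoff = now - timedelta(days=self.hold_days)
        before = len(self.held)
        self.held = [h for h in self.held if h[2] >= cutoff]
        return before - len(self.held)
```

Mail from an unknown sender sits in `held` until either `approve()` moves the sender onto the list or `purge()` throws the message away.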

(Hmm, I wonder if anyone has done this already...)

C.


Replies are listed 'Best First'.
Re: Re: Re: Enough is Enough - Taking the fight back to the Internet scammers
by tachyon (Chancellor) on Oct 28, 2003 at 11:29 UTC

    We do Bayesian stats work as part of a different project and have a filter based on that (proprietary, I'm afraid), although there is popmail Popfile on SourceForge, which is OK.

    Bayesian stat analysis is probably one step past SpamAssassin but still has the following inherent problems, which apply to all forms of spam filter. First, if the filter is publicly available (as it effectively must be in order to be used), then you can craft spam and test it against the filter(s). Regardless of what the filters are looking for and how they rate spam, messages in the form:

        Dear Name

        RE: Your recent blah blah blah

        Thanks for your enquiry. Blah blah blah.

        Please take the time to have a look at:

        http://blah.com/cgi-bin/special_offer?name=Name&code=AGERSDGFTGER

        I wish you all the best in your endeavour.

        Kind Regards

        John Smith
        Director
        Blah.com
        Street Address
        Phone Number
        Fax Number
        Mobile Number

        BLAH
        Making it happen
        http://blah.com
        foo@blah.com

        The information transmitted may be confidential, is intended only for
        the person to which it is addressed, and may not be reviewed,
        retransmitted, disseminated or relied upon by any other persons. If you
        received this message in error, please contact the sender and destroy
        any paper or electronic copies of this message. Any views expressed in
        this email communication are those of the individual sender, except
        where the sender specifically states otherwise. Blah does not represent,
        warrant or guarantee that the communication is free of errors, virus or
        interference.

    are statistically next to impossible to pick. The problem with the basic mail protocol is that you can forge headers, i.e. there is no way to validate the sending server. Given this, you can more or less craft your emails so they will pass any spam filter.

    Messages like this are the new face of spam. Still spam but crafted to look like a standard valid (perhaps corporate) reply. It will be next to impossible to stop mail in this form.

    As a result the challenge response/whitelist passthrough is probably the way it will end up in the medium term. Then of course the spammers will implement respond bots and the cycle will continue.

    What is needed is a modification to the underlying protocol so that there is an inbuilt challenge response or security key of some form so that the recipient server can query the supposed sending server to see if it was really the source of the message. If you can do that you can work blacklists of spam servers far more effectively.
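To make the idea of querying the supposed sending server concrete, here is a toy, in-process sketch. Everything in it is hypothetical: `SendingServer`, `confirm`, `accept` and the `directory` lookup are invented for illustration and do not correspond to any existing mail protocol extension. The sender tags each message with a keyed hash that only its own server can reproduce, and the recipient asks that server whether the tag is genuine.

```python
import hashlib
import hmac
import secrets


class SendingServer:
    """Stand-in for a mail server that can vouch for mail it really sent."""

    def __init__(self, domain):
        self.domain = domain
        self._key = secrets.token_bytes(32)  # secret that never leaves this server

    def _tag(self, message_id, recipient):
        data = f"{message_id}\n{recipient}".encode()
        return hmac.new(self._key, data, hashlib.sha256).hexdigest()

    def send(self, message_id, recipient, body):
        """Build an outgoing message carrying a verification tag."""
        return {
            "from_domain": self.domain,
            "message_id": message_id,
            "recipient": recipient,
            "body": body,
            "tag": self._tag(message_id, recipient),
        }

    def confirm(self, message_id, recipient, tag):
        """Answer the recipient's query: did this server send that message?"""
        return hmac.compare_digest(self._tag(message_id, recipient), tag)


def accept(message, directory):
    """Recipient side: look up the claimed sending server and ask it to confirm."""
    server = directory.get(message["from_domain"])
    if server is None:
        return False
    return server.confirm(message["message_id"], message["recipient"], message["tag"])
```

A forger who claims to be `blah.com` cannot produce a tag that `blah.com`'s server will confirm, so the forged headers stop being free. Mail the server does confirm is tied to a source that can be blacklisted.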

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Actually, I find it interesting to note that the Bayesian spam filter I use catches these types of emails All Day Long (tm). It seemed a curious thing to me, wondering how it was picking these out from more legitimate email. I started analyzing the emails and realized the highest spam words were being grabbed from the headers. Sure, headers can easily be modified, but most spammers apparently aren't that sophisticated. They use common tools with standard headers (usually advertising the tool they are using), which are very easy for the filter to catch.

      As the spammers realize this and start using tools that are harder to catch, there will still be things like MTA versions and hostnames added to the emails along the path. Perhaps certain MTAs with bad default options will begin to stand out as likely spam targets. Perhaps certain IP blocks will begin to stand out, also. Who knows? The great thing is I won't have to think about this. The filter will figure it out automatically.

        As you say, the headers are extremely valuable (but only at the moment). Because the protocol LETS you forge them, forging will eventually become the norm; then, with a suitably crafted body, even Bayes won't cut it anymore.

        When we look at some of the tokens that our Bayes widgets work with and find significant, we often go 'huh?' The fact that we don't really understand WHY some of these tokens exist does not matter one whit. They are statistically significant and thus, at the end of the day, Just Work.
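A minimal sketch of why header tokens can decide the outcome even when the body looks innocent. This is a generic naive Bayes token scorer with add-one smoothing, assumed for illustration only; it is not the proprietary filter mentioned above, nor POPFile's actual algorithm, and the `TokenFilter` name and the sample messages are made up.

```python
import math
import re
from collections import Counter


class TokenFilter:
    """Tiny naive Bayes scorer over tokens from headers and body alike."""

    def __init__(self):
        self.spam = Counter()  # token -> number of spam messages containing it
        self.ham = Counter()   # token -> number of ham messages containing it
        self.n_spam = 0
        self.n_ham = 0

    @staticmethod
    def tokens(text):
        # Each distinct token counts once per message.
        return set(re.findall(r"[a-z0-9$'-]+", text.lower()))

    def train(self, text, is_spam):
        if is_spam:
            self.spam.update(self.tokens(text))
            self.n_spam += 1
        else:
            self.ham.update(self.tokens(text))
            self.n_ham += 1

    def token_prob(self, tok):
        """Smoothed probability that a message containing tok is spam."""
        s = (self.spam[tok] + 1) / (self.n_spam + 2)
        h = (self.ham[tok] + 1) / (self.n_ham + 2)
        return s / (s + h)

    def score(self, text):
        """Combine per-token probabilities in log-odds space; 1.0 means spam."""
        log_odds = 0.0
        for tok in self.tokens(text):
            p = self.token_prob(tok)
            log_odds += math.log(p) - math.log(1 - p)
        return 1 / (1 + math.exp(-log_odds))


f = TokenFilter()
f.train("X-Mailer: BulkBlaster\n\nThanks for your enquiry. Please have a look at our offer.", True)
f.train("X-Mailer: BulkBlaster\n\nKind regards, please see the special offer.", True)
f.train("X-Mailer: Mutt\n\nThanks for your enquiry about the Perl module.", False)
f.train("X-Mailer: Mutt\n\nKind regards, see you at the meeting.", False)
```

With this toy training set, the body words "thanks", "enquiry", "kind" and "regards" appear equally often in spam and ham, so they contribute nothing. The only token that moves the score of a bland message is the mailer name in the header, which is the token nobody would have thought to look for by hand.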

        The problem is that as Bayes gets more popular, the spammers will employ people to analyse how the filters work, and it is fairly easy to find chinks in the armour to slip a knife through, to put it in medieval terms. At that stage I suspect we will be up for a new protocol.

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Tachyon said:
      As a result the challenge response/whitelist passthrough is probably the way it will end up in the medium term. Then of course the spammers will implement respond bots and the cycle will continue.

      But the beauty of that is that they can no longer hide their mail address. It has to be valid. Then you can blacklist it. Setting up numerous real respondbots is much more onerous than just formulating fake return addresses.

      The thing that gets me is: what are they thinking? If someone is trying to filter out their offers, how likely is it that that person will decide to become a customer when their efforts are thwarted?

      Earthlink has a rather ingenious system: they set up some fake accounts expressly to attract spam. When those accounts receive it, they analyze it and filter it out of clients' mailboxes. It works very well. In addition to that, there is whitelisting.
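The trap-account approach could look something like the sketch below. How Earthlink actually analyzes and matches messages is not stated here, so the exact-match fingerprint, the `TrapFilter` class and the trap addresses are all assumptions made for illustration.

```python
import hashlib
import re

# Hypothetical decoy addresses that no real person ever gives out.
TRAP_ADDRESSES = {"trap1@example.net", "trap2@example.net"}


def fingerprint(body):
    """Hash the body after dropping case, digits and punctuation."""
    normalized = re.sub(r"[^a-z]+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode()).hexdigest()


class TrapFilter:
    def __init__(self, traps):
        self.traps = set(traps)
        self.known_spam = set()  # fingerprints learned from trap accounts

    def receive(self, recipient, body):
        """Return True if the message is delivered, False if it is dropped."""
        fp = fingerprint(body)
        if recipient in self.traps:
            # Anything sent to a trap account is spam by definition.
            self.known_spam.add(fp)
            return False
        return fp not in self.known_spam
```

A real system would need a fuzzier match than an exact hash, since spammers vary the wording per recipient, but the flow is the same: whatever lands in a trap account is treated as spam and used to filter the same message out of real mailboxes.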

        Roy asked: The thing that gets me is: what are they thinking? If someone is trying to filter out their offers, how likely is it that that person will decide to become a customer when their efforts are thwarted?
        1. Most of the filtering is done by clueful sysadmins; the spammers are after the clueless users.
        2. Apparently the success rate needed to keep genuine advertising spammers in business is just over one in a million.
        3. Most spam nowadays has nothing to do with advertising dubious products, even if it pretends to. It is a big pyramid scheme in which spammers sell each other lists of addresses (and hopefully the whole thing will implode real soon). In these emails, all they want is for the sucker to view the message in an HTML-aware mail client, which pulls in a web bug and confirms that the address is live. It doesn't matter if the message is immediately deleted. In fact, I'm increasingly seeing ones where the "click here" links don't even resolve.
      tachyon

      I think that you are talking about PopFile, not PopMail, and I am definitely a convert. I was recently slammed by the latest Windows bug, and PopFile was able to capture every email that came in. Due to a system upgrade I lost my training, but I was able to retrain PopFile to >95% accuracy in less than a week.

      As a side note, you can also use popfile to organize your email. For example I have folders for family, Spam, newsgroups.

      One other thing to be aware of concerning popfile. It classifies emails based on the email body as well as the headers, so altering the header does not fool it.

      I would recommend it to anyone interested in filtering email. It is also one of the non-CGI Perl programs I use to show people that Perl is good for more than CGI.
Re: Re: Re: Enough is Enough - Taking the fight back to the Internet scammers
by davis (Vicar) on Oct 28, 2003 at 10:03 UTC

    (Hmm, I wonder if anyone has done this already...)

    Yes, they have. Browse the Email Filters section on freshmeat. If you've got a user account there you should be able to search in that category for "whitelist" (you may be able to do this without an account, it's been a while).


    davis
    It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.

      Spam Assassin already has whitelist functionality.


      If the information in this post is inaccurate, or just plain wrong, don't just downvote - please post explaining what's wrong.
      That way everyone learns.

        So it does. I was under the impression that the whitelist could only be used to prevent mail from "known" people getting marked as spam. It appears that lowering the required score (the threshold at which mail gets marked as spam), or using "add all addresses to blacklist" together with the whitelist, could achieve the desired results.


        davis
        It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.
        Oops, it does? Then I just need to figure out how to use it...

        C.

Re: Re: Re: Enough is Enough - Taking the fight back to the Internet scammers
by Anonymous Monk on Oct 31, 2003 at 16:25 UTC
    There is one thing I've found that DOES effectively prevent spam. Mailblocks uses a "verify" technique that works pretty much 100% of the time. There are only two drawbacks: 1) it costs money every year, and 2) while they have ways of letting things like orders through, sometimes you just use your e-mail account normally, and mail from senders that are NOT going to reply to the verification message ends up in the pending box, so you still have to keep an eye on it. But only sometimes. It has been the only way I've found to effectively combat spam. Maybe one day spammers will get past it, but for now, it works.
