PerlMonks  

Just Another Discussion of Spam

by CloneArmyCommander (Friar)
on Mar 17, 2005 at 18:03 UTC

My e-mail account has recently been flooded with wave after wave of spam, and while I was in the process of deleting it all and reporting as spam, I got an idea.

Usually the sender addresses on spam are a long string of seemingly random numbers and letters @somecompany.com, but recently they have been simple addresses, even at netscape.com. I have noticed that even then, when I write an e-mail in reply, it bounces back (I expected that, but this is where my idea comes in). I am not familiar with how most spam filters work, but I figure it does not hurt to reflect on this idea and think about it.

If replies to spam never fail to bounce back, would it be possible to somehow ping the address a few minutes after receiving a message to check whether the address still exists, and then, if it returns a big "NO", delete the e-mail :)? Seems like something that could be done with Perl. Use mechanize to log into my account, then do all of the dirty work of deleting my spam.

What are your thoughts (or is this already done :)?

Replies are listed 'Best First'.
Re: Just Another Discussion of Spam
by brian_d_foy (Abbot) on Mar 17, 2005 at 18:53 UTC

    You can't trust the From address. Spammers have already figured out that they can put anything in there, even real addresses (including mine, this week). I've noticed quite a number of people sending me mail this week complaining about some spam they received. Those messages aren't bouncing.

    Responding to spam, even to see if it bounces, just makes the mess worse. Get a good filter, put your friends and family in a white list, and don't worry about the rest.

    --
    brian d foy <bdfoy@cpan.org>
      If you'd used SPF and your friends had paid attention to it, then those spam messages would have bounced.

      If you'd used SPF and your friends did not, you'd at least have something constructive to tell them that they should do to cut out some spams.

        It's not about me or my friends. It's about the thousands of people who don't know me. I don't get to decide how that plays, so it doesn't matter.

        --
        brian d foy <bdfoy@cpan.org>
Re: Just Another Discussion of Spam
by hardburn (Abbot) on Mar 17, 2005 at 18:50 UTC

    That would make a great attack vector for a DDoS. Send out mails with a faked From address, all pointing back to a single mail host. Then all the anti-spam gateways dutifully try to connect to the mail host at once.

    Some of the "solutions" to spam are worse than the actual problem.

    "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re: Just Another Discussion of Spam
by perrin (Chancellor) on Mar 17, 2005 at 18:23 UTC
      Sending email should be free
      I can't second this. If there were a worldwide micropayment system then I would happily pay a cent or two for an email to save me from spam. (Hey, you pay ten times that for sending an SMS.)

      If one had to pay for an email, that would definitely rule out spammers. They just couldn't afford it anymore.

      Two big "if"s, I know, but I am allowed to dream ;-)


      holli, /regexed monk/

        I don't know how many people subscribe to the Perl.com newsletter I send out every other week. I don't want to think about micropayments for that. Don't fall into the mental trap of optimizing the next generation of SMTP based only on your experiences.

        I already pay for unlimited SMS messages in my recurring phone bill, just like I pay to send internet traffic in my digital cable bill and hosting bill. I pay for my gmail account by agreeing to let Google show me ads. I'm already paying to send email.

        Micropayments won't rule out spammers. I still get lots of bulk mail in my physical mailbox, and it costs money to send that stuff: more money than a couple of cents, and it often has a lower response rate than spam. Payments just make the margins smaller, and they'll make up for it by selling better targeted lists.

        --
        brian d foy <bdfoy@cpan.org>

        How do you implement that? Where does the money go?

        If you have it go to an authority, you're relying on a central authority to handle e-mail. It'll be abused faster than you can say "ICANN".

        If you have it go to the receiver, then you need a way of getting the payment to them. That requires a central authority (same problem as above), or having everyone accept credit card payments (impractical with the current credit card processing framework), or a cryptographic cash mechanism. But once you bring cryptography into play, you can solve the spam problem via cryptographic signatures, which makes the whole payment system moot.

        "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

        I can't second this. If there were a worldwide micropayment system then I would happily pay a cent or two for an email to save me from spam. (Hey, you pay ten times that for sending an SMS.)
        Yes, but for it to work for you, the people sending you email have to be willing to pay for email.

        Considering the huge success of "sharing" cracked games, DVDs and CDs to avoid paying for them, or of VoIP to avoid costs, I don't think having to pay for one of the killer applications of the internet, a service that has been free for over three decades, is going to be a success. Especially since it would be trivial to have a legal, cost-free alternative (everyone has it already).

        If you don't want spam in your inbox, use your choice of SMTP blacklists (AKA "RBLs" and "RHSBLs") and a good filter -- I use DSPAM for the messages that get past my long list of 17 SMTP blacklists. I see one or two spam messages per month in my inbox, and my total cost was my time in configuration.

        If I didn't run my own server, I'd still use those blacklists, but via a homegrown bit of perl, then pipe the remaining messages into DSPAM. I recently set my dear old mom up like this, and it seems to work just fine.

        Nuthin' to it but to do it, and the price is right.
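
        A minimal sketch of what a blacklist lookup looks like, assuming the common DNSBL convention of reversing the IPv4 octets and appending the list's zone. The zone name dnsbl.example.org and both function names are placeholders for illustration, not a real list or anyone's actual setup.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers with an address record for this IP.

    Needs network access; a failed lookup is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


# 192.0.2.99 is a documentation address; the zone is hypothetical.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

        The lookup is done against the IP of the connecting mail server, not against the From: address, which is why it is not defeated by forged headers.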

Re: Just Another Discussion of Spam
by BazB (Priest) on Mar 17, 2005 at 21:18 UTC

    There are a number of problems with your idea: the first one being that spammers regularly forge From: addresses.

    SMTP does support the VRFY (verify) and EXPN (expand) commands for checking that a given account exists on a machine; however, most administrators (myself included) have long since disabled them, ever since spammers started abusing them to identify and harvest valid accounts. Have a read of RFC 2821 (SMTP).
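
    For the curious, a rough sketch of what a VRFY probe could look like with Python's standard smtplib, and how the usual reply codes read. The helper names are made up for this example, and on most servers today the answer will be 252 or 502 regardless of whether the mailbox exists, which is exactly why the idea does not work in practice.

```python
import smtplib

# Reply codes a server may give to VRFY, as described in RFC 2821.
VRFY_MEANINGS = {
    250: "address accepted",
    251: "not local; server will forward",
    252: "cannot verify, but will attempt delivery",
    502: "command not implemented",
    550: "no such mailbox",
}


def interpret_vrfy(code: int) -> str:
    """Translate a VRFY reply code into a short description."""
    return VRFY_MEANINGS.get(code, "unrecognised reply")


def vrfy(host: str, address: str, timeout: float = 10.0) -> str:
    """Ask one mail host about one address (needs network access)."""
    with smtplib.SMTP(host, timeout=timeout) as server:
        server.ehlo_or_helo_if_needed()  # greet the server before VRFY
        code, _message = server.verify(address)
    return interpret_vrfy(code)


print(interpret_vrfy(252))
```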

    An alternative would be to try and send a test message to the accounts, but you'd be effectively spamming too.

    Even if you could verify the existence of an address, would you really want to delete all messages from an email address that no longer exists?
    I've got emails going back quite a few years, many of which were sent from email addresses I know are defunct after friends have changed ISP or webmail providers and so on.

    SpamAssassin does a pretty good job of filtering spam - I'd recommend using that (or an alternative tool), rather than your fairly coarse criterion.

    Cheers,

    BazB


    If the information in this post is inaccurate, or just plain wrong, don't just downvote - please post explaining what's wrong.
    That way everyone learns.

Re: Just Another Discussion of Spam
by Anonymous Monk on Mar 17, 2005 at 21:55 UTC
    Use Perl, specifically Mail::SpamAssassin. Do not reply to spam and do not bounce spam; do nothing but tag and delete it. I get hundreds of bounce/refuse messages a day from mail servers that don't bother to distinguish between the machine sending spam and the address listed in the forged From: header. I don't send spam, so sending me something just clogs my mailbox. I'm guessing that pinging a server would have the same problem: you're likely to impact innocent third parties more than you'll eliminate spam.

    The two most effective ways of dealing with spam so far are testing individual mails for keys such as certain phrases, addresses, and URLs, and estimating the probability that an individual mail is spam using Bayesian-like statistical methods. SpamAssassin puts both together in one package, and it will probably catch a huge percentage of the spam before you have to read it.
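
    To give a feel for the Bayesian-like statistical approach mentioned above, here is a toy word-count classifier. It is a plain naive Bayes sketch with add-one smoothing, not how SpamAssassin itself is implemented; the class name and the training messages are invented for the example, and it assumes at least one message of each kind has been trained.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class TinyBayes:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, label, text):
        self.counts[label].update(tokens(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total = sum(self.messages.values())
        scores = {}
        for label in ("spam", "ham"):
            words_in_class = sum(self.counts[label].values())
            # Start from the class prior, then add one log term per token.
            score = math.log(self.messages[label] / total)
            for tok in tokens(text):
                seen = self.counts[label][tok] + 1  # add-one smoothing
                score += math.log(seen / (words_in_class + len(vocab)))
            scores[label] = score
        # Turn the two log scores into a probability of "spam".
        return 1 / (1 + math.exp(scores["ham"] - scores["spam"]))


clf = TinyBayes()
clf.train("spam", "cheap pills buy now")
clf.train("spam", "buy cheap watches now")
clf.train("ham", "meeting notes attached")
clf.train("ham", "lunch tomorrow at noon")
print(clf.spam_probability("buy cheap pills") > 0.5)
```

    A real filter trains on thousands of your own messages, which is what lets it adapt to the mail you actually receive.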
Re: Just Another Discussion of Spam
by 5mi11er (Deacon) on Mar 17, 2005 at 23:04 UTC
    There are more problems: you can't 'ping' an email address, just the email host, and some email hosts lie behind firewalls, which may mean you can't ping the machine even though it accepts SMTP traffic. Even connecting to the host and trying to verify the email address leads you into many of the problems discussed above, mainly the forging of addresses.

    One of my old email addresses has been used for over a year as a forged From address. Because the old email address was supposed to forward to my new address, I was forced to begin using spambayes for Outlook. I've been pretty happy with it, but need to upgrade. The version I'm currently using will very occasionally cause Outlook to lock up (well, it won't let me open any emails, and won't fully shut down).

    -Scott

Re: Just Another Discussion of Spam
by bageler (Hermit) on Mar 17, 2005 at 20:04 UTC
    A combination of a couple of Bayesian filters works well for me.
Re: Just Another Discussion of Spam
by jhourcle (Prior) on Mar 20, 2005 at 05:09 UTC

    I'd suggest that, if you're interested in spending time to help rid systems of spam, you look into the work that's already being done rather than just try to jump in on your own. You'd probably be interested in the spamtools mailing list, as well as the SPAM-L mailing list and CAUCE (the Coalition Against Unsolicited Commercial Email).

    As for your suggestion, it might've been useful if you were just doing a VRFY or EXPN on the address, but most SMTP servers have shut down those services because they were being used to harvest addresses from the systems. Therefore, the only way to verify an address is to send an email, and generating more mail for every message that goes through is not a good idea.

Re: Just Another Discussion of Spam
by chas (Priest) on Mar 18, 2005 at 05:36 UTC
    There's an interesting idea at http://members.hostedscripts.com/antispam.html. This page contains a hundred or so randomly generated bogus email addresses and a link to a new such page - the idea is to fill spammers' databases with the fake addresses. The script to produce the page is available elsewhere, but it isn't difficult to reproduce. I'm not sure how effective this really is.
    chas
      Well, you first have to make sure you aren't putting any valid domains on the page (or else the mailer for that domain has to deal with the spam coming in). Second, it's not difficult for the spammer to check, for each harvested address, whether the domain exists. Sure, it will cost them some resources. But those resources will be cheap and, compared to the sheer number of addresses they deal with, not significant.
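
      As a rough illustration of how little code such a page generator needs, here is a sketch that is not the script from the linked site. It sidesteps the first objection by using the reserved .invalid top-level domain, so no real mailer is ever hit, though that also makes the second objection bite harder, since the domain trivially fails an existence check. The function name and parameters are invented.

```python
import random
import string


def bogus_addresses(count, rng, domain="example.invalid"):
    """Return `count` random addresses at a domain that can never resolve."""
    letters = string.ascii_lowercase + string.digits
    result = []
    for _ in range(count):
        length = rng.randint(6, 12)
        local = "".join(rng.choice(letters) for _ in range(length))
        result.append(f"{local}@{domain}")
    return result


# A seeded generator keeps the output repeatable for testing.
for address in bogus_addresses(3, random.Random(440464)):
    print(address)
```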
Re: Just Another Discussion of Spam
by husker (Chaplain) on Mar 18, 2005 at 21:15 UTC
    SPF is an optional system that somewhat acts like you propose. SPF is an add-on to the DNS system, wherein email administrators say "if you get an email from my domain, then verify that it came from one of the following servers. If it didn't, it's a fake ... treat accordingly". It's not perfect, and it's not ubiquitous yet, but I think it is one component of a successful anti-spam strategy.
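
    A heavily simplified sketch of the check a receiving server performs, assuming the domain's policy text has already been fetched from DNS. It only understands ip4: terms and a closing -all or ~all; real SPF has more mechanisms (a, mx, include and so on) that are ignored here, and the function name is made up.

```python
import ipaddress


def spf_check(record: str, sender_ip: str) -> str:
    """Evaluate a sender IP against a (simplified) SPF policy string."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == 4 and ip in network:
                return "pass"
        elif term == "-all":
            return "fail"  # domain says: anything else is forged
        elif term == "~all":
            return "softfail"
    return "neutral"


# Documentation address ranges, used purely as an example policy.
policy = "v=spf1 ip4:192.0.2.0/24 -all"
print(spf_check(policy, "192.0.2.10"), spf_check(policy, "198.51.100.7"))
```

    What the receiver does with a "fail" (reject, tag, or score) is its own decision, which is the "treat accordingly" part.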

    Another component that interests me is greylisting, which takes advantage of a mechanism already in place in the SMTP protocol. It, too, is not a panacea by itself, but in conjunction with other methods it may prove to be very effective with a small amount of effort on the administrator's part.

    Between these two methods, one might be able to weed out some percentage of the "laziest" spam, meaning your Bayesian filters and other layers of defense have fewer messages to process. SPF and greylisting can be the bouncers at the door, merely checking for valid IDs. Many will be rejected at the door. Those who are able to get past this check and into the club will be frisked by Bayesian filters and whatnot...
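
    The mechanism in question is the temporary failure: the server answers the first delivery attempt from an unknown (client IP, sender, recipient) triplet with a 4xx "try again later", and a proper mail server retries while much spamware does not. A bare-bones sketch of that bookkeeping, with an invented class name and an arbitrary five-minute delay:

```python
class Greylist:
    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds a new triplet must wait
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now):
        """Return "accept" or "tempfail" for one delivery attempt."""
        key = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"  # answer with an SMTP 4xx and wait for a retry


g = Greylist()
print(g.check("192.0.2.10", "a@example.org", "me@example.net", now=0))
print(g.check("192.0.2.10", "a@example.org", "me@example.net", now=400))
```

    A production greylist would also expire old entries and whitelist known senders; both are left out here.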

Re: Just Another Discussion of Spam
by naChoZ (Curate) on Mar 18, 2005 at 03:48 UTC

    I use spambayes in proxy (pop) mode as well as sylpheed-claws' built-in support for spam assassin. I get very very little spam. spambayes has a nifty web interface that I like, too.

    --
    "This alcoholism thing, I think it's just clever propaganda produced by people who want you to buy more bottled water." -- pedestrianwolf

Re: Just Another Discussion of Spam
by fraktalisman (Hermit) on Mar 19, 2005 at 22:57 UTC

    I know, there is a great temptation to react to spam. Don't!

    I really answered spam messages several times, only to find out that the From: addresses were bogus, and that I was sending (sometimes quite insulting) answers to real people that never did me any harm.

    Spam is an issue that is really unnerving. But for the spam I currently get, there might be a new way of filtering: spell checking. Now, that would have to be some kind of smart spell checking ... most spam uses deliberate spelling mistakes in the Subject: so that SpamAssassin and the like will not filter it out easily. Anyway, maybe those spelling mistakes are not the same ones that real people make? We could increase the spam score of a message for some special kinds of spelling mistakes. I am not a linguist, but I am definitely annoyed by spam mails that don't even spell their Subject properly.

    But maybe my idea is no good after all, because in the end I really prefer getting spam over losing one single genuine mail. That's also the reason why I don't let any scripts delete spam mails automatically.
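
    One way the spelling idea could be tried out, purely as a sketch: undo the typical digit-for-letter and dotted-letter tricks, and raise the score only when a word matches a watch list after cleaning but not before. The substitution table, the watch list and the function name are all assumptions made up for this example.

```python
# Characters spammers commonly swap in for letters (an assumed, partial table).
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)
WATCHLIST = {"viagra", "mortgage", "pills"}  # hypothetical trigger words


def obfuscation_score(subject):
    """Count words that only match the watch list after de-obfuscation."""
    score = 0
    for raw in subject.lower().split():
        word = raw.strip(".,;:!?\"'()")  # ignore ordinary punctuation
        plain = word.translate(LOOKALIKES)
        plain = "".join(ch for ch in plain if ch.isalpha())
        if plain in WATCHLIST and plain != word:
            score += 1
    return score


print(obfuscation_score("Cheap p.i.l.l.s and m0rtgage"))
```

    A correctly spelled "pills" scores nothing here; that is left to the ordinary keyword and Bayesian tests, so only the deliberate disguise is penalised.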
