PerlMonks  

(jcwren) Re: A Beginner's Guide to Using Mail::Audit and Mail::SpamAssassin

by jcwren (Prior)
on Dec 19, 2001 at 10:11 UTC (#133027)


in reply to A Beginner's Guide to Using Mail::Audit and Mail::SpamAssassin

Excellent tutorial.

I would like to add a couple of things:

  • Mail::Audit 2.0 is broken. Sooner or later, your inbox will become corrupted. 1.11 is stable and has given me no problems; two friends and I have all had to back down from 2.0 to 1.11 to solve the inbox corruption problem. You can find Mail-Audit-1.11.tar.gz here (directory) or here (tarball).
  • The tutorial doesn't cover installing the Razor clients. These are necessary if you wish to make use of the Vipul database. This is the coolest part of SpamAssassin, IMHO. An MD5 checksum of the mail is compared against a database of known spam. If it matches, it's automatically tossed. More importantly, as you get spam, you can have it added to the database, which means other people never have to see it. The Razor::Clients package is not on CPAN, but is available here. SpamAssassin automatically makes use of the clients if they are installed; if they aren't, it silently skips the check.
  • It is worth noting that when you are writing filters, once $item->accept() is called, the program ends. No further tests are run. The documentation says this, but it's not obvious at first glance. As such, the subs in the example never actually return, which makes the code look a little funky once you know this.
  • You can use the .procmailrc file, or you can use the .forward file with the format | ~/mailscanner.pl. Note that on certain systems, such as Red Hat, sendmail runs programs under smrsh, the sendmail restricted shell. To make this work, you have to put a symlink in /etc/smrsh to 'mailscanner.pl', or whatever you called your client. If you get a lot of mail, the .forward approach avoids the small additional overhead of spinning up procmail only to have it pass the message on.
  • This is a Perl script. As such, when you make a change, you HAVE to run 'perl -c mailscanner.pl' before walking away. If the script croaks, the MTA will send a reply to the originator of the email saying that the mail was undeliverable. When I was using procmail, a borked recipe was annoying, but not a problem. With SpamAssassin, it's much more important to get it right.
  • It's important to put spam in a folder, and not drop it completely. SpamAssassin isn't perfect, nor will your rules be. Mine are tuned pretty well and rarely let real spam through, but sometimes they kick out good messages because someone set a priority flag in Outlook and had a few caps in the title. I get mail from a guy in Romania, for product support on a C compiler, that causes exactly this problem. I frequently run tail -f ~/.audit_log in a window somewhere and keep an eye on what's being rejected as spam. As I see mail from people that I know I'll get again, I adjust the script or, easier, tune the .spamassassin.cf whitelist and blacklist (this file gets created automatically in your home directory the first time SpamAssassin is run).
  • There is an unstated implication that the Vipul database will catch viruses. This may be the case for some, but it passed a Sircam-laden message right on through. I scan the headers for the standard 'Snow White - The Real Story!' and a couple of others. Don't count on SpamAssassin to protect you. Add your own countermeasures, and use standard anti-viral techniques, especially if you're going to be POP3/IMAP'ing the mail down to a Windows box.
  • procmail has a facility to check if the mail is of a certain size. This is something that's lacking in this package. Each line of the message is an array entry. If you want to know how long the message is, you have to iterate over the entire array, summing the lengths. This ought to be something the package provides as a method. I'm not sure what the implications of binary messages, attachments, etc. are, so unlike in my procmail recipes, I don't check for messages of certain sizes.
  • After SpamAssassin defangs mail (or rewrites the headers with the word SPAM everywhere), it is not at all clear whether a message modified this way can or should be submitted to the Vipul database. I have found no clear answer on this, although I have not pursued it aggressively. My personal policy is to forward only raw, un-rewritten mails to Vipul, to make sure the MD5 checksum is for something people will actually get, and not a post-processed version. If someone knows the real answer, I'd like to know.
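
For what it's worth, the "iterate over the array and sum the lengths" approach from the size point above could be sketched roughly as follows. This is only an illustration: message_size, is_oversized, and the $MAX_BYTES threshold are made-up names and values, and the sample message is a stand-in so the sketch runs without any mail modules installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum the length of every line in a message.
# $lines is a reference to an array with one entry per line
# (newline included), the shape described in the size point above.
# length() counts characters; for raw, undecoded mail that is bytes.
sub message_size {
    my ($lines) = @_;
    my $total = 0;
    $total += length $_ for @$lines;
    return $total;
}

# Hypothetical threshold, chosen only for illustration.
my $MAX_BYTES = 100_000;

# Return 1 if the message is larger than $limit, else 0.
sub is_oversized {
    my ($lines, $limit) = @_;
    $limit = $MAX_BYTES unless defined $limit;
    return message_size($lines) > $limit ? 1 : 0;
}

# Stand-in message so the sketch can be run on its own.
my @sample = ("Subject: hello\n", "\n", "Short body.\n");
printf "%d bytes, oversized: %s\n",
    message_size(\@sample), is_oversized(\@sample) ? 'yes' : 'no';
```

In a real filter you would pass in whatever array reference your version of Mail::Audit hands back for the message. I believe $item->body returns one for the body lines, but treat that as an assumption and check the documentation for the version you run. Headers would have to be counted separately.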

I think that's all the major points of running this. It's a great system, and it has seriously cut back on the crap I see.

--Chris

(shockme) Re: (jcwren) Re: A Beginner's Guide to Using Mail::Audit and Mail::SpamAssassin
by shockme (Chaplain) on Dec 19, 2001 at 11:19 UTC
    Excellent response. To address some of your points:
    • When I downloaded Mail::Audit approximately 2 weeks ago, v1.11 was the only downloadable version. I did not know v2.0 existed, nor was I aware of the added "inbox corruption" feature. Searching CPAN now shows only v2.0 for download, so props for the heads-up.
    • The Razor::Clients package really needs to be on CPAN. The documentation is extremely sparse on this. Again, props for the URL to get this material. I just installed it and it seems to be working well.
    • My original draft mentioned that, once you $item->accept(), all processing stops. However, on proof-reading and fact-checking, I could not locate this fact in the documentation. (Of course, now that it's too late, it's screaming itself from the page...) But, yes, this makes for very efficient processing, because once you've filed the email, you can forget about it.
    • I agree that all email should be filed and nothing dropped. Email filtering will never be "perfect" because the patterns are always in flux. Sooner or later, something would be lost in the void. That's why I emphasized the default of (in my case) the Bulk folder. If nothing else fits, it defaults to ~/mail/Bulk.
    • SpamAssassin, implications aside, is not a virus scanner. It's a spam filter. While it may catch some viruses, I think it foolhardy to rely upon it for anything other than filtering spam.
    • Again, the documentation is sparse concerning what should be submitted to the Vipul database. I don't know whether they have filters to "un-filter" the SPAM messages. Hopefully someone else will have a more definitive word on this. Until then, I think your suggestion is the right way to go.
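
    Since SpamAssassin won't do it for you, the kind of header check jcwren describes for known virus subjects might look something like the sketch below. The function name and the list are hypothetical; only the one subject line comes from his post, and a subject match is no substitute for a real virus scanner.

    ```perl
    use strict;
    use warnings;

    # Subjects known to accompany mail-borne viruses. Only this one entry
    # is taken from the thread; add others as you run into them.
    my @VIRUS_SUBJECTS = (
        'Snow White - The Real Story!',
    );

    # Return 1 if the subject contains a known virus subject, else 0.
    # The match is case-insensitive and works on substrings, so
    # forwarded copies ("Fw: ...") are caught as well.
    sub is_known_virus_subject {
        my ($subject) = @_;
        return 0 unless defined $subject;
        my $haystack = lc $subject;
        for my $known (@VIRUS_SUBJECTS) {
            return 1 if index($haystack, lc $known) >= 0;
        }
        return 0;
    }

    print is_known_virus_subject("Fw: Snow White - The Real Story!\n")
        ? "suspect\n"
        : "ok\n";
    ```

    A filter would call this with the message's subject and file anything suspect into a quarantine folder rather than dropping it, in keeping with the "file everything" point above.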

    Thanks for the great info, jcwren.

    If things get any worse, I'll have to ask you to stop helping me.

Re: (jcwren) Re: A Beginner's Guide to Using Mail::Audit and Mail::SpamAssassin
by $code or die (Deacon) on Dec 21, 2001 at 22:36 UTC

    How does version 2.0 corrupt the Inbox?

    I can't see any reported bugs for version two on rt.cpan.org. It might be worthwhile reporting the bug.

    simon

    Update (Jan-7-02): A bug report and patch were submitted yesterday; I am not sure whether it is the same bug that jcwren speaks of. I'd imagine that the next version will fix this now that Simon is aware of it.
