PerlMonks  

Handling Incoming Mail

by Ian the Terrible (Beadle)
on Dec 11, 2001 at 10:38 UTC ( #130876=perlquestion )
Ian the Terrible has asked for the wisdom of the Perl Monks concerning the following question:

When I first learned that sendmail could pipe incoming messages to a program, a number of interesting possibilities occurred to me. One that I'm especially interested in exploring is the idea of stuffing messages from a mailing list into a database, thus giving me ways to search by sender, subject, full-text...

So as a first step, I whipped up a simple script that just prints STDIN to a temp file, so I could see exactly what was coming in and get a handle on how to parse it.
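For reference, a minimal sketch of that kind of dump script, using only core modules. The sub name dump_message, the target directory, and the "mail-XXXXXX" file name pattern are illustrative assumptions, not the script actually used here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy everything readable from $in into a new, uniquely named file under
# $dir and return the file's path. Both the sub name and the name pattern
# are hypothetical choices made for this sketch.
sub dump_message {
    my ($in, $dir) = @_;
    my ($out, $path) = tempfile('mail-XXXXXX', DIR => $dir, SUFFIX => '.txt', UNLINK => 0);
    binmode $in;
    binmode $out;
    # Copy in fixed-size chunks so a large attachment is never held in memory whole.
    while (read($in, my $chunk, 8192)) {
        print {$out} $chunk or die "write to $path failed: $!";
    }
    close $out or die "close of $path failed: $!";
    return $path;
}

# As a pipe target, the rest of the script would be a single call:
#   dump_message(\*STDIN, '/tmp');
```

Each incoming message lands in its own file, so nothing is overwritten when several arrive close together. Since the mailer is the caller, a die here is likely to be reported back as a delivery failure.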

This led to exploration of MIME types, base64 encoding, and various other things.

My question: Is this a reasonable way to be doing this, or would it be easier (from the standpoint of processing the mail) to let the mail flow into a mailbox, and then parse it from there?

One way gives real time processing; the other would obviously have to be a periodic cron job.

I poked around briefly on CPAN, and was overwhelmed by the sheer number of modules that deal with email in one fashion or another. Anybody got favorites?

--Ian

Re: Handling Incoming Mail
by dws (Chancellor) on Dec 11, 2001 at 10:45 UTC
    I poked around briefly on CPAN, and was overwhelmed by the sheer number of modules that deal with email in one fashion or another. Anybody got favorites?

    I like Mail::Audit. It's a flexible starting point if you're looking to write a procmail replacement.

Re: Handling Incoming Mail
by blakem (Monsignor) on Dec 11, 2001 at 13:15 UTC
Re: Handling Incoming Mail
by BazB (Priest) on Dec 11, 2001 at 15:00 UTC

    I'm not an expert when it comes to Perl by any means, but I'd personally let the mail go into a mailbox and parse it from there, rather than grab it from STDIN.

    Let the mail delivery system worry about getting the messages into the right place, and more importantly, written to the filesystem - so that you've got a chance of recovering some information if something goes wrong (e.g. power outage, OS crash).

    Once you're sure that you successfully inserted the information into your database, then you can delete the message from the mail spool.

    As far as I can make out, your current setup is

    • STDIN
    • Temp file(s)
    • email parser (which will create its own temp files - outputting to core might be a bad idea if your incoming emails might be large).
    • The database.

    In my head you could let the mailsystem handle the first 2 steps - I'd start by slurping the mailspool into a parser.
    Baz.
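A rough sketch of the spool-first approach, assuming the spool is a classic mbox file (messages separated by "From " envelope lines) that has already been read into a string. The sub names and the $store callback, which stands in for the database insert, are hypothetical. No locking is shown, so this should only ever see a copy of the spool or one the delivery agent is not writing to.

```perl
use strict;
use warnings;

# Split mbox-format text into individual messages. A message starts at a
# line beginning with "From " at the start of the file or after a blank
# line; that envelope line itself is dropped.
sub split_mbox {
    my ($text) = @_;
    my (@messages, $current);
    my $prev_blank = 1;    # start of file counts as "after a blank line"
    for my $line (split /^/m, $text) {
        if ($prev_blank && $line =~ /^From /) {
            push @messages, $current if defined $current;
            $current = '';
        }
        elsif (defined $current) {
            $current .= $line;
        }
        $prev_blank = ($line =~ /^\s*$/) ? 1 : 0;
    }
    push @messages, $current if defined $current;
    return @messages;
}

# Parse the header block (everything before the first blank line) into a
# hash keyed by lower-cased field name. Continuation lines are unfolded;
# if a field repeats (e.g. Received), only the last value is kept.
sub parse_headers {
    my ($message) = @_;
    my ($head) = split /\r?\n\r?\n/, $message, 2;
    $head = '' unless defined $head;
    $head =~ s/\r?\n[ \t]+/ /g;
    my %h;
    for my $line (split /\r?\n/, $head) {
        $h{lc $1} = $2 if $line =~ /^([^:\s]+):\s*(.*)$/;
    }
    return \%h;
}

# Hand every message to $store. Returns true only if no call died, which
# is the point at which removing the messages from the spool becomes safe.
sub process_spool {
    my ($text, $store) = @_;
    my $ok = 1;
    for my $msg (split_mbox($text)) {
        $ok = 0 unless eval { $store->(parse_headers($msg), $msg); 1 };
    }
    return $ok;
}
```

The return value of process_spool is what makes the "delete only after a successful insert" rule enforceable: if any insert fails, the spool is left alone and the next run can retry. This is header parsing only; MIME decoding of the bodies is better left to one of the CPAN parsers.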

      Yep. I second this. You *really* don't want a simple error in your script resulting in lots of bounce messages going back to list admins.

      Of course, what you are describing isn't a million miles away from using procmail/rules on the server to filter different lists into different folders and then using a protocol like IMAP to run queries against the folders to find certain messages. Some IMAP servers (Cyrus?) use a database for message storage, so you should get better-than-grep performance.

      There are a million ways to skin this cat. The tricky bit of the feline is the bit before the mail has been delivered, since errors then will result in lost mail and/or bounces going back (if your MTA considers that it hasn't discharged its duties it feels honour bound to confess to others).

Re: Handling Incoming Mail
by hatter (Pilgrim) on Dec 11, 2001 at 19:16 UTC
    Mail::Audit and Mail::ListDetector will almost certainly save you a lot of work in the long run. However, if you'd rather use Perl that you already understand, then you can always investigate procmail and just pipe the necessary mails through your program, just like you're already doing from STDIN. That's going to be a bit more efficient and a bit less prone to newbie errors than extracting it from a mailbox once the system has delivered it, and has the advantage of also being realtime.

    the hatter
Re: Handling Incoming Mail
by Fastolfe (Vicar) on Dec 12, 2001 at 02:03 UTC
    Mail::Audit looks pretty slick. In the past, though, whenever I was doing automated mail handling, I would use MIME::Parser to parse the message, and MIME::Entity to construct a reply.
