Parse a Unix Mailbox for unknown users or hosts

by maksl (Pilgrim)
on Apr 08, 2003 at 10:37 UTC ( [id://248875]=CUFP )

Dear fellow monks,

After sending out a newsletter to subscribed readers, I get bounces because some email addresses or their hosts no longer exist. I therefore needed a quick parser for these bounce messages, one that ignores "mailbox is full" and other autoreply noise.

Limbic~Region pointed me to Mail::MboxParser. It gives fast, read-only access to a Unix mailbox, and the module is easy to understand!

$msg->header->{from} and $msg->header->{to} are no use for this job, because both only show the server the newsletter was sent from. The body has to be parsed for unknown users or hosts. The print statements output the failing email address, preceded by the corresponding message: user unknown or host unknown.

My own use was slightly different: I connected to the database to flag the unknown users' emails as unsubscribed, and left the unknown hosts for manual handling, as shown here.
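The database step is not shown in this post. As a rough sketch only, flagging an address with DBI could look like this; the table name `subscribers`, its columns `email` and `subscribed`, and the helper `flag_unsubscribed` are assumptions for illustration, not the schema actually used.

```perl
use strict;
use warnings;

# Hypothetical helper: mark one address as unsubscribed.
# $dbh is a connected DBI handle, e.g. from
#   DBI->connect($dsn, $user, $pass, { RaiseError => 1 });
# Table and column names are assumptions.
sub flag_unsubscribed {
    my ($dbh, $email) = @_;
    # a placeholder keeps odd characters in addresses out of the SQL text
    return $dbh->do(
        'UPDATE subscribers SET subscribed = 0 WHERE email = ?',
        undef, $email,
    );
}
```

It would be called at the spot marked `# ...` in the "User unknown" branches, with the extracted address as the second argument.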

So here is the code :)

#!/usr/bin/perl -w
use strict;
use Mail::MboxParser;

my $mbox = '/var/spool/mail/www37';
my $mb   = Mail::MboxParser->new($mbox, decode => 'ALL');

print "Total messages: ", $mb->nmsgs, "\n";

# iterating through the mailbox
while (my $msg = $mb->next_message) {
    my $body = $msg->body($msg->find_body);

    foreach my $line (split /\n/, $body) {

        # 550 ... Host unknown: the address is the third space-separated field
        if ($line =~ /^550/ && $line =~ /Host unknown/i) {
            my (undef, undef, $addr) = split / /, $line;
            if (defined $addr) {
                $addr =~ s/^<(.+)>\.\.\.$/$1/;
                print "Host unknown: ", $addr, "\n";
                # ...
            }
        }

        # 550 ... User unknown: strip everything around the address
        if ($line =~ /^550/ && $line =~ /User unknown/i) {
            my $addr = $line;
            $addr =~ s/550 5\.1\.1 (.+) \(.+/$1/;
            $addr =~ s/550 5\.1\.1 <(.+)>.+/$1/;
            $addr =~ s/550 <(.+)>.+/$1/;
            print "User unknown 550: ", $addr, "\n";
            # ...
        }

        # 554 replies in the "<<< 554 ..." form
        if ($line =~ /^<<< 554/) {
            my $addr = $line;
            $addr =~ s/.+to (.+) cannot.+/$1/;
            $addr =~ s/.+account \((.+)\) .+/$1/;
            print "User unknown 554: ", $addr, "\n";
            # ...
        }
    }
}
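To check the 550 patterns without a real mailbox, the matching can be pulled into a small function and run against sample lines. This is only a sketch: `parse_550` is a hypothetical helper name, and the sample bounce lines are made up in the sendmail-like style the regexes above seem to expect, not taken from a real mailbox.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (kind, address) for a 550 bounce line,
# or an empty list if the line is not one we recognise.
sub parse_550 {
    my ($line) = @_;
    return unless $line =~ /^550/;

    if ($line =~ /Host unknown/i) {
        # third space-separated field, e.g. "<user@host>..."
        my (undef, undef, $addr) = split / /, $line;
        return unless defined $addr;
        $addr =~ s/^<(.+)>\.\.\.$/$1/;
        return ('host', $addr);
    }
    if ($line =~ /User unknown/i && $line =~ /<([^<>\s]+)>/) {
        return ('user', $1);
    }
    return;
}

# Made-up sample lines (assumption: sendmail-style wording)
for my $line (
    '550 5.1.2 <joe@no-such-host.example>... Host unknown (Name server: no-such-host.example: host not found)',
    '550 5.1.1 <jane@example.com>... User unknown',
    '552 5.2.2 <full@example.com>... Mailbox full',
) {
    my ($kind, $addr) = parse_550($line);
    print defined $kind ? "$kind unknown: $addr\n" : "ignored: $line\n";
}
```

Returning an empty list for anything unrecognised means "mailbox is full" and similar replies simply fall through instead of being printed as addresses.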

Once again CPAN was a great help, and this job was fast and fun.
Please point out any inaccuracies! Thanks, maksl
Perhaps I was a bit hasty in looking only at the 550 error and leaving out 551 and others.
Treat the regexes that match only 550 as a proposal. There were also some obscure qmail replies which, even as a human, I didn't understand ;)
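One way to avoid hard-coding 550 is to look at the reply class instead: in SMTP any 5xx reply is a permanent failure and any 4xx reply a temporary one, and where the server also sends an enhanced status code, 5.1.1 means a bad mailbox, 5.1.2 a bad host and 5.2.2 a full mailbox (RFC 3463). The sketch below assumes the bounce line starts with the reply code, optionally after a `<<< ` prefix; `classify_reply` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a bounce line by SMTP reply code and,
# if present, the enhanced status code that follows it.
sub classify_reply {
    my ($line) = @_;
    return 'other'
        unless $line =~ /^(?:<<< )?([45])\d\d[ -](?:(\d\.\d{1,3}\.\d{1,3}) )?/;
    my ($class, $enhanced) = ($1, $2);

    return 'temporary'    if $class eq '4';
    return 'user unknown' if defined $enhanced && $enhanced eq '5.1.1';
    return 'host unknown' if defined $enhanced && $enhanced eq '5.1.2';
    return 'mailbox full' if defined $enhanced && $enhanced eq '5.2.2';
    return 'permanent';   # 5xx without a code we map explicitly
}

# Made-up sample lines for illustration
for my $line (
    '550 5.1.1 <jane@example.com>... User unknown',
    '551 User not local',
    '452 4.2.2 Mailbox full',
) {
    print classify_reply($line), ": $line\n";
}
```

Lines that come back as plain `permanent` still need a human look or a server-specific regex, which is probably where the obscure qmail replies would end up.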

Update:
Important, because of the many existing Yahoo email accounts: I added the lines that handle the 554 error, since Yahoo likes to answer with it. Sample: <<< 554 delivery error: dd Sorry your message to foo@yahoo.com cannot be delivered. This account has been disabled or discontinued [#102]. - mta210.mail.scd.yahoo.com
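The 554 handling can be checked in isolation against the sample line above. A small sketch: `parse_554` is a hypothetical helper, and it uses a capturing match rather than the in-place substitution from the script, so that a 554 line with no recognisable address yields nothing instead of the whole line.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the address out of a "<<< 554 ..." line.
sub parse_554 {
    my ($line) = @_;
    return unless $line =~ /^<<< 554/;
    return $1 if $line =~ /\bto (\S+) cannot\b/;      # "message to ADDR cannot"
    return $1 if $line =~ /\baccount \((\S+)\) /;     # "account (ADDR) ..."
    return;
}

my $sample = '<<< 554 delivery error: dd Sorry your message to foo@yahoo.com '
           . 'cannot be delivered. This account has been disabled or discontinued [#102]. '
           . '- mta210.mail.scd.yahoo.com';

my $addr = parse_554($sample);
print "User unknown 554: $addr\n" if defined $addr;
```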

Approved by broquaint