Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

Valid Email Filter

by neilwatson (Curate)
on Mar 25, 2002 at 14:10 UTC ( #154093=sourcecode: print w/ replies, xml ) Need Help??

Category: E-Mail Programs
Author/Contact Info Neil Watson
Description: The marketing department wants to send emails to people who sign up on the company website. But who actually gives their real address? You try to explain this to them but, you're outranked. At least save yourself many of the annoying bounces and deferred messages that can clog your mail queue. This script will check a list of email address (one per line) and see if they are valid by RFC compliance and by performing an MX DNS query. This scirpt is NOT perfect. No email validation script can be. For most purposes it will suffice.

#checks for valid email address
#usage validemail <file containing email addresses>

use warnings;
use strict;
use Mail::CheckUser;
use Parallel::ForkManager;
use Fcntl qw/:flock :seek/;
use FileHandle;

my $pm=new Parallel::ForkManager(20);
my $addrfile = $ARGV[0] || die
"Usage validemail <file containing email addresses>
Will return two files: goodmails.csv and badmails.csv.
If these files exits already they will be deleted.";
my ($is_valid, $host, $x, @mx, $add, @adds);

#custom words that make emails invalid to you
my @custom = qw/

my $regex = join "|", @custom;
$regex = qr/$regex/;

open (EMAILS, "$addrfile");

#remove troublesome windows /r characters
#and leading whitespace
 while (<EMAILS>){
        $_ =~ s/\015//;
    $_ =~ s/^\s*//;
        chomp $_;
    push @adds, $_;
close (EMAILS);

#warning, I will delete existing files
open (BADADDR, ">badmails.csv") || die;
open (GOODADDR, ">goodmails.csv") || die;

#to prevent strange writing problems
#when running threads we must set the
#write buffers see
#to prevent strange writing problems
#when running threads we must set the
#write buffers see
autoflush BADADDR  1;
autoflush GOODADDR 1;

#remove custom regexes
$x = 0;
while ($x <= $#adds){
    if ($adds[$x] =~ m/$regex/){
        splice @adds, $x, 1;
#when using Mail::UserCheck set
#these variables

#timeout on DNS and SMTP network checks
$Mail::CheckUser::Timeout = 10;

foreach $add (@adds){ 
    $pm->start and next;

    if (Mail::CheckUser::check_email($add)){
        writeaddr(*GOODADDR, $add); #address is good
        writeaddr(*BADADDR, $add); #address is bad


close (BADADDR);
close (GOODADDR);

sub writeaddr{
    my $FH = $_[0];
    my $address = $_[1];
    flock $FH, LOCK_EX or die "Flock failed: $!\n";
    seek  $FH, 0, 2 or die "Seek failed: $!\n";
    print $FH "$address\n";
    flock $FH, LOCK_UN or die "unFlock failed: $!\n";

Comment on Valid Email Filter
Download Code
Replies are listed 'Best First'.
Re: Valid Email Filter
by miyagawa (Chaplain) on Mar 25, 2002 at 17:17 UTC
    Wow, glad to see people use Email::Valid::Loose :) Email::Valid::Loose is a subclass of Email::Valid which allows .(dot) before @ (at mark).

    Tatsuhiko Miyagawa

      One suggestion for the code .... change open (EMAILS, "$addrfile"); to open (EMAILS, $addrfile) or die "Can't open $addrfile: $!\n"; ALWAYS check to see if 'open' worked - print some descriptive message if it doesn't.
Re: Valid Email Filter
by gav^ (Curate) on Mar 26, 2002 at 04:48 UTC
    I find something like this tends to help. It seems AOL users just can't get it right:
    for ($email) { s/@\./@/; s/@aol\.con/; s/aol$/; }


Back to Code Catacombs

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: sourcecode [id://154093]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (3)
As of 2015-11-29 07:12 GMT
Find Nodes?
    Voting Booth?

    What would be the most significant thing to happen if a rope (or wire) tied the Earth and the Moon together?

    Results (748 votes), past polls