
Benchmarking email address validation methods

by rob_au (Abbot)
on Feb 02, 2004 at 11:36 UTC
rob_au has asked for the wisdom of the Perl Monks concerning the following question:

During code review for an internal project, the question arose of which method is better for validating email addresses against RFC822. Two techniques from separate projects were examined and benchmarked: one uses Email::Valid::Loose and the other uses Mail::RFC822::Address. The results follow.

My question, however, is whether this benchmark is valid. Is the benchmark code accurately measuring the relative performance of the two validation techniques, or is the comparison flawed in some way? I am most interested in this because the Benchmark results differ from what I would have expected, particularly given the size and complexity of the $Addr_spec_re regular expression from Email::Valid::Loose.

The benchmark code:

use Benchmark;
use Email::Valid::Loose;
use Mail::RFC822::Address qw( valid );

my $iter = 100000;
my @results = ();

timethese($iter, {
    'Mail::RFC822::Address' => <<EOS,
    \$result[0] = valid('per\@p');
    \$result[1] = valid('');
    \$result[2] = valid('email\');
    \$result[3] = valid('email\@email\');
EOS
    'Email::Valid::Loose' => <<EOS
} );
    (\$result[4]) = 'per\@p' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
    (\$result[5]) = '' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
    (\$result[6]) = 'email\' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
    (\$result[7]) = 'email\@email\' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
EOS

And the benchmark results:

Benchmark: timing 100000 iterations of Email::Valid::Loose, Mail::RFC822::Address...
Email::Valid::Loose:  6 wallclock secs ( 7.26 usr +  0.00 sys =  7.26 CPU) @ 13774.10/s (n=100000)
Mail::RFC822::Address: 28 wallclock secs (27.03 usr +  0.00 sys = 27.03 CPU) @ 3699.59/s (n=100000)


perl -le "print unpack'N', pack'B32', '00000000000000000000001010111101'"

Replies are listed 'Best First'.
Re: Benchmarking address validation methods
by Abigail-II (Bishop) on Feb 02, 2004 at 12:37 UTC
    Well, first of all, your benchmark is extremely limited. It only tests four simple strings: no long addresses, no comments (let alone nested comments), no routing addresses, no IP addresses, no special characters. The least you could do is benchmark against the examples given in RFC822.

    Second, there's another module for email validation, one that uses Parse::RecDescent instead of regular expressions. It's called RFC::RFC822::Address.
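
A minimal sketch of the first point in the reply above: check what each routine returns on a wider set of inputs, then time them with code references. The two validators here (simple_regex and index_check) are hypothetical stand-ins so that the sketch runs with core Perl only, and the test addresses are illustrative rather than taken from RFC822. To compare the modules from the question, swap in a call to valid() from Mail::RFC822::Address and a match against $Email::Valid::Loose::Addr_spec_re.

```perl
use strict;
use warnings;
use Benchmark qw( timethese cmpthese );

# Illustrative inputs (assumed, not copied from RFC822): short, empty,
# missing domain, plain, quoted local part, domain literal, comment, long.
my @addresses = (
    'per@p',
    '',
    'email',
    'user@example.com',
    '"quoted local"@example.com',
    'user@[10.0.0.1]',
    'user(comment)@example.com',
    'long.local.part@' . join( '.', ('sub') x 20 ) . '.example.com',
);

# Hypothetical stand-in validators; replace with the routines under test,
# for example: sub { valid( $_[0] ) } for Mail::RFC822::Address.
my $simple_re  = qr/^[^\s\@]+\@[^\s\@]+$/;
my %validators = (
    simple_regex => sub { return $_[0] =~ $simple_re ? 1 : 0 },
    index_check  => sub {
        my $at = index( $_[0], '@' );
        return ( $at > 0 && $at < length( $_[0] ) - 1 ) ? 1 : 0;
    },
);

# Step 1: record each validator's verdict for every address.
my %verdicts;
for my $name ( sort keys %validators ) {
    my $check = $validators{$name};
    $verdicts{$name} = [ map { $check->($_) } @addresses ];
}

# Report inputs on which the validators give different answers.
my @disagreements;
for my $i ( 0 .. $#addresses ) {
    my %seen;
    $seen{ $verdicts{$_}[$i] }++ for keys %verdicts;
    push @disagreements, $addresses[$i] if keys(%seen) > 1;
}
print "validators disagree on: $_\n" for @disagreements;

# Step 2: time code references, which avoids the quoting and escaping
# that string snippets need inside a heredoc.
my %tests;
for my $name ( keys %validators ) {
    my $check = $validators{$name};
    $tests{$name} = sub { $check->($_) for @addresses; return };
}

# A negative count runs each test for at least that many CPU seconds;
# 'none' suppresses the per-test lines so cmpthese prints a single table.
my $results = timethese( -1, \%tests, 'none' );
cmpthese($results);
```

Each timed iteration here validates the whole address list, so the rates are per pass over the list, not per address. If the disagreement report is not empty, the routines are not doing the same job, and a speed comparison between them should be read with that in mind.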


Node Type: perlquestion [id://325830]
Approved by broquaint