PerlMonks  

Benchmarking email address validation methods

by rob_au (Abbot)
on Feb 02, 2004 at 11:36 UTC ( #325830=perlquestion )
rob_au has asked for the wisdom of the Perl Monks concerning the following question:

While performing code review for an internal project, a question was raised as to the better method of validating email addresses against RFC822. Two techniques from separate projects were examined and benchmarked: one uses Email::Valid::Loose and the other Mail::RFC822::Address. The results follow.

The question I have, however, is whether this benchmark is valid. Is the benchmark code accurately measuring the relative performance of the two validation techniques, or is the comparison flawed in some way? I am most interested in this because the benchmark results differ from what I would have expected, particularly given the size and complexity of the $Addr_spec_re regular expression from Email::Valid::Loose.

The benchmark code:

use Benchmark;
use Email::Valid::Loose;
use Mail::RFC822::Address qw( valid );

my $iter = 100000;
my @results = ();

timethese($iter, {
    'Mail::RFC822::Address' => <<EOS,
\$result[0] = valid('per\@p');
\$result[1] = valid('');
\$result[2] = valid('email\@domain.com');
\$result[3] = valid('email\@email\@domain.com');
EOS
    'Email::Valid::Loose' => <<EOS } );
(\$result[4]) = 'per\@p' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
(\$result[5]) = '' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
(\$result[6]) = 'email\@domain.com' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
(\$result[7]) = 'email\@email\@domain.com' =~ /\^($Email::Valid::Loose::Addr_spec_re)\$/;
EOS

And the benchmark results:

Benchmark: timing 100000 iterations of Email::Valid::Loose, Mail::RFC822::Address...
Email::Valid::Loose:  6 wallclock secs ( 7.26 usr +  0.00 sys =  7.26 CPU) @ 13774.10/s (n=100000)
Mail::RFC822::Address: 28 wallclock secs (27.03 usr +  0.00 sys = 27.03 CPU) @ 3699.59/s (n=100000)

 

perl -le "print unpack'N', pack'B32', '00000000000000000000001010111101'"

Replies are listed 'Best First'.
Re: Benchmarking address validation methods
by Abigail-II (Bishop) on Feb 02, 2004 at 12:37 UTC
    Well, first of all, your benchmark is extremely limited. It's only testing 4 simple strings. No long addresses, no comments (let alone nested comments), no routing addresses, no IP addresses, no special characters. The least you could do is actually benchmark against the examples mentioned in RFC822.

    Second, there's another module for email validation, one that uses Parse::RecDescent instead of regular expressions. It's called RFC::RFC822::Address.

    Abigail
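
One way to act on the suggestion above is to time both validators over a broader list of inputs. The following is only a rough sketch: the sample addresses are illustrative (they are not taken from RFC822 itself, and no claim is made about which of them either module accepts), and the loop structure is an assumption rather than anything proposed in the thread. It passes code references to Benchmark instead of strings, and it only loads each module if it is installed.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Illustrative inputs only; whether each one is accepted is up to the
# validator under test. The first four are the strings from the original post.
my @addresses = (
    'per@p',
    '',
    'email@domain.com',
    'email@email@domain.com',
    'user(a comment)@example.com',
    'user(outer (nested) comment)@example.com',
    '"quoted local part"@example.com',
    'user@[192.0.2.1]',
    '<@relay.example:user@example.com>',
    ('a' x 60) . '@' . join('.', ('sub') x 20) . '.example.com',
);

my %candidates;

# Load each module only if it is available, so the script still runs without it.
if (eval { require Mail::RFC822::Address; 1 }) {
    $candidates{'Mail::RFC822::Address'} = sub {
        my @r = map { scalar( Mail::RFC822::Address::valid($_) ) } @addresses;
        return scalar @r;
    };
}

if (eval { require Email::Valid::Loose; 1 }) {
    no warnings 'once';
    # Build the anchored pattern once, outside the timed code.
    my $re = qr/^(?:$Email::Valid::Loose::Addr_spec_re)$/;
    $candidates{'Email::Valid::Loose'} = sub {
        my @r = map { /$re/ ? 1 : 0 } @addresses;
        return scalar @r;
    };
}

if (%candidates) {
    # A negative count asks Benchmark to run each candidate for at least
    # that many CPU seconds instead of a fixed number of iterations.
    cmpthese(-1, \%candidates);
}
else {
    print "Neither module is installed; nothing to compare.\n";
}
```

Before trusting the timings, it is probably worth printing each validator's verdict for every address, since a speed comparison only means something if both are answering the same question for the same inputs.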

Node Type: perlquestion [id://325830]
Approved by broquaint