Re^2: Slowdown when using Moose::Util::TypeConstraints with Regexp::Common

by kevbot (Hermit)
on Sep 02, 2008 at 20:38 UTC (#708592)


in reply to Re: Slowdown when using Moose::Util::TypeConstraints with Regexp::Common
in thread Slowdown when using Moose::Util::TypeConstraints with Regexp::Common

I added a couple of other tests (AddressC and AddressD) to the mix. I wanted to take Regexp::Common out of the code to see whether a plain regexp showed the same behavior, so I grabbed the regular expression that Regexp::Common generates and placed it in the code directly. In AddressC, I put it directly in the subtype (similar to AddressA):

subtype USZipCodeC
    => as Value
    => where {
        $_ =~ /(?-xism:^(?:(?:(?:USA?)-){0,1}(?:(?:(?:[0-9]{3})(?:[0-9]{2}))(?:(?:-)(?:(?:[0-9]{2})(?:[0-9]{2}))){0,1}))$)/;
    };

has 'zip_code' => (is => 'rw', isa => 'USZipCodeC');

For AddressD, I compile the regexp (similar to AddressB):

my $zip_re = qr/(?-xism:^(?:(?:(?:USA?)-){0,1}(?:(?:(?:[0-9]{3})(?:[0-9]{2}))(?:(?:-)(?:(?:[0-9]{2})(?:[0-9]{2}))){0,1}))$)/;

subtype USZipCodeD
    => as Value
    => where { $_ =~ $zip_re; };

has 'zip_code' => (is => 'rw', isa => 'USZipCodeD');

So, I would expect AddressB and AddressD to be the fastest. Here is the result of a benchmark on my WinXP/Strawberry Perl 5.10 setup:
      Rate    A    D    C    B
A 2960/s   -- -42% -42% -42%
D 5061/s  71%   --  -1%  -1%
C 5099/s  72%   1%   --  -0%
B 5108/s  73%   1%   0%   --

AddressA, which uses Regexp::Common directly in the subtype, is the slowest of the four at 2960/s. The other three are essentially tied at roughly 5100/s, within 1% of each other. So the compiled regexp doesn't seem to make a difference as long as I avoid Regexp::Common.
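For anyone who wants to check the C-versus-D comparison without Moose in the way, here is a minimal sketch using only the core Benchmark module. It is not the code I benchmarked above: the sample strings, the sub names (match_inline, match_qr) and the shortened pattern are all made up for illustration, so the absolute rates will not match my table. It only compares a literal pattern written inline against the same pattern precompiled with qr//.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample data; three of the four strings should match.
my @samples = ('12345', '12345-6789', 'USA-12345', 'not a zip');

# Literal pattern written inline. It contains no variables,
# so perl compiles it once when the program is compiled.
sub match_inline {
    return scalar grep {
        $_ =~ /^(?:(?:USA?)-){0,1}[0-9]{5}(?:-[0-9]{4}){0,1}$/
    } @samples;
}

# The same (simplified) pattern precompiled with qr// and
# matched through a variable.
my $zip_re = qr/^(?:(?:USA?)-){0,1}[0-9]{5}(?:-[0-9]{4}){0,1}$/;

sub match_qr {
    return scalar grep { $_ =~ $zip_re } @samples;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, { inline => \&match_inline, qr => \&match_qr });
```

If the two rates come out close to each other, that is consistent with what I saw for AddressC and AddressD, though the numbers will depend on the machine and the perl version.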


Re^3: Slowdown when using Moose::Util::TypeConstraints with Regexp::Common
by CountZero (Bishop) on Sep 02, 2008 at 22:33 UTC
    Basically, all your versions except AddressA now use a single static regex, so each one is compiled once at compile time, hence little or no difference between them. The Regexp::Common version cannot be optimized in that way unless you force it to become static by using qr//. Such is the price you pay for flexibility.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      Thank you for your replies. They have been helpful. It was not obvious to me that the AddressC regexp would be compiled once at compile time...but it makes sense to me now.

      Actually, using Regexp::Common w/o qr// should result in the regex being compiled the first time that it is used. Each time it is used after that, a string comparison should be used to see if the regex needs to be recompiled. So it should be only compiled once and only an extra string compare will be done each time it is run.

      - tye        

        Yes, but the problem is creating said string. The $RE{...} expression dives down in a multi-level hash, which is tied on each level, and at the bottom, it calls a function.

        It may not be so slow as compiling a regexp, but it is the most likely cause of AddressA being the slowest of the four.
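To make the cost of building the string visible, here is a rough sketch with a hand-rolled two-level tied hash. It is a hypothetical stand-in and not Regexp::Common's actual code: the CountingHash class, the %RE and %zip hashes defined here, and the simplified zip pattern are all assumptions for illustration. It counts FETCH calls when the pattern is looked up on every match versus looked up once and frozen with qr//.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a multi-level tied hash.
# This is NOT how Regexp::Common is implemented internally.
package CountingHash;

our $fetches = 0;    # total FETCH calls across all levels

sub TIEHASH {
    my ($class, $data) = @_;
    return bless { data => $data }, $class;
}

sub FETCH {
    my ($self, $key) = @_;
    $fetches++;
    my $value = $self->{data}{$key};
    # At the bottom level the stored value is a function that
    # builds the pattern string, so call it on every access.
    return $value->() if ref($value) eq 'CODE';
    return $value;
}

package main;

# Inner level: the leaf is a generator function (simplified pattern).
tie my %zip, 'CountingHash', { US => sub { '^[0-9]{5}(?:-[0-9]{4})?$' } };
# Outer level: hands back a reference to the tied inner hash.
tie my %RE, 'CountingHash', { zip => \%zip };

my @codes = ('12345', '12345-6789', 'abcde');

# Variant 1: walk the tied hash on every single match.
my $hits_lookup    = grep { $_ =~ /$RE{zip}{US}/ } @codes;
my $fetches_lookup = $CountingHash::fetches;

# Variant 2: walk it once, compile with qr//, then reuse the result.
$CountingHash::fetches = 0;
my $zip_re         = qr/$RE{zip}{US}/;
my $hits_cached    = grep { $_ =~ $zip_re } @codes;
my $fetches_cached = $CountingHash::fetches;

print "lookup per match: $fetches_lookup FETCH calls, $hits_lookup matches\n";
print "cached with qr//: $fetches_cached FETCH calls, $hits_cached matches\n";
```

Both variants find the same matches. The difference is that the first one pays for the tied lookups and the generator call on every match, while the second pays once, which is the kind of overhead that would explain AddressA trailing the others.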
