
Re^7: Speeding up named capture buffer access

by ikegami (Pope)
on Dec 01, 2009 at 17:56 UTC

in reply to Re^6: Speeding up named capture buffer access
in thread Speeding up named capture buffer access

I don't know how it compares for speed (probably slower due to the sub calls), but here's an alternative.

use strict;
use warnings;

use re 'eval';  # Should be scoped better.

sub rc($) {
   my $ofs = @- + shift;
   return substr($_, $-[$ofs], $+[$ofs] - $-[$ofs])
}

sub compile_pat { qr/$_[0]/ }

my @s_months = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
my $s_months = compile_pat join '|', @s_months;
local our %s_months = map { $s_months[$_] => $_+1 } 0..$#s_months;

my @pats = (
   qr/ (\d{4})-(\d{2})-(\d{2})     (?{[ rc-3, rc-2, rc-1 ]})/x,
   qr/ (\d{2})($s_months)(\d{4})   (?{[ rc-1, $s_months{rc-2}, rc-3 ]})/x,
);

my $pat = compile_pat join '|', @pats;

for (qw( 2009-12-01 01Dec2009 01-12-2009 )) {
   local our ($y,$m,$d);
   if (/$pat(?{ ($y,$m,$d) = @{$^R} })/) {
      printf("%s => %04d-%02d-%02d\n", $_,$y,$m,$d);
   } else {
      printf("%s => [No match]\n", $_);
   }
}

Bonus: $pat can be calculated once and stored in a file.
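
To illustrate the bonus, here is a minimal sketch of saving a pattern's text to a file and compiling it again later. It is not the code from the post: the names %month_num, @ymd, save_pattern and load_pattern are made up for this example, and the pattern is a simplified single-format one. It assumes the code block only touches package variables, since a pattern compiled from text cannot see the lexicals that were in scope where it was written, and it needs use re 'eval' because the text being interpolated contains (?{ ... }).

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Package variables (hypothetical names), so that a code block compiled
# from plain text can still reach them.
our %month_num = (
    Jan => 1, Feb => 2,  Mar => 3,  Apr => 4,
    May => 5, Jun => 6,  Jul => 7,  Aug => 8,
    Sep => 9, Oct => 10, Nov => 11, Dec => 12,
);
our @ymd;

# Pattern source with an embedded code block. Single quotes keep \d, $1,
# $2 and $3 as literal text until the pattern is compiled.
my $source = '(\d{2})(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d{4})'
           . '(?{ @main::ymd = ($3, $main::month_num{$2}, $1) })';

# Write the pattern text to a file.
sub save_pattern {
    my ($file, $text) = @_;
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} $text;
    close $out or die "Cannot close $file: $!";
}

# Read the pattern text back and compile it.
sub load_pattern {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot read $file: $!";
    my $text = do { local $/; <$in> };
    close $in;
    use re 'eval';    # required: the interpolated text contains (?{ ... })
    return qr/$text/;
}

my ($tmp_fh, $file) = tempfile( UNLINK => 1 );
close $tmp_fh;

save_pattern($file, $source);
my $re = load_pattern($file);

if ('01Dec2009' =~ $re) {
    printf "%04d-%02d-%02d\n", @ymd;
}
```

Since use re 'eval' lets whatever is in the file run as Perl code, this only makes sense when the file is as trusted as the program itself.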

Re^8: Speeding up named capture buffer access
by SBECK (Chaplain) on Dec 01, 2009 at 18:08 UTC

    This will take a bit of work to put in, and I'm not sure what the performance will be like, but it's worth at least trying.

      Just code the equivalent of mine in your method, and benchmark that. It should give a pretty accurate idea.

        I created a very simple example that matched some simple times with a few different formats, and benchmarked it using named capture buffers, alternative capture group numbering, and your method with embedded code. The alternative capture method is about 40% faster than the named buffers (so I'm going to be implementing it for at least some of the regular expression matching where the order of the matches stays the same). The embedded code was significantly slower (7x) than the named buffers. It's possible I wasn't doing it as efficiently as possible, but given that:

            it appears to be significantly slower
            it's listed as an experimental feature
            the regular expressions are more complicated
        I think that I'll pass on it for the time being. But it was fun to experiment with something new!
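
The benchmark code itself was not posted. A rough sketch of how such a comparison might be set up follows, assuming that "alternative capture group numbering" means the branch-reset construct (?|...) and using made-up time formats (HH:MM:SS and HH:MM). The sub names and sample inputs are hypothetical, and timings depend on the perl version and hardware, so no figures are given here.

```perl
use 5.010;
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Hypothetical sample inputs: HH:MM:SS and HH:MM.
my @inputs = qw( 12:30:45 07:05 );

# 1. Named capture buffers; both alternatives reuse the same names.
my $named = qr/^(?: (?<h>\d\d):(?<m>\d\d):(?<s>\d\d)
                  | (?<h>\d\d):(?<m>\d\d) )$/x;

# 2. Branch reset: group numbering restarts at 1 in each alternative.
my $reset = qr/^(?| (\d\d):(\d\d):(\d\d)
                  | (\d\d):(\d\d) )$/x;

# 3. Embedded code: each alternative leaves an array ref in $^R,
#    and the final block copies it into a package variable.
our @hms;
my $code = qr/^(?: (\d\d):(\d\d):(\d\d) (?{ [ $1, $2, $3 ] })
                 | (\d\d):(\d\d)        (?{ [ $4, $5, 0  ] }) )$ (?{ @hms = @{$^R} })/x;

# Each parser returns (hours, minutes, seconds), with seconds defaulting to 0.
sub parse_named { return $_[0] =~ $named ? ( $+{h}, $+{m}, $+{s} // 0 ) : () }
sub parse_reset { return $_[0] =~ $reset ? ( $1, $2, $3 // 0 )          : () }
sub parse_code  { return $_[0] =~ $code  ? @hms                         : () }

# Check that all three agree before timing them.
for my $s (@inputs) {
    my @want = parse_named($s);
    for my $f ( \&parse_reset, \&parse_code ) {
        my @got = $f->($s);
        die "mismatch for $s" unless "@got" eq "@want";
    }
}

# Run each variant for about one CPU second and print a comparison table.
cmpthese( -1, {
    named => sub { parse_named($_) for @inputs },
    reset => sub { parse_reset($_) for @inputs },
    code  => sub { parse_code($_)  for @inputs },
} );
```

Branch reset only helps when every alternative captures its fields in the same order, which matches the caveat above about the order of the matches staying the same.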
