PerlMonks  

Re: Re: Returning regexp pattern that was used to match

by crabbdean (Pilgrim)
on May 03, 2004 at 14:52 UTC ( #350029=note )


in reply to Re: Returning regexp pattern that was used to match
in thread Returning regexp pattern that was used to match

Well I need to know what clause matched, unless there is some other way to do this?

UPDATE: My other thought was to do this (which also works and gives me what I want):

for ( keys %{$self->{actions}} ) {
    process($self, $file, $dir, $_) if ($fullfile =~ /$_/);
    next;
}

.. but I'm concerned about speed. If it's doing this for every file on a terabyte server, I'm worried about the time consumption. What do you think?

Dean
The Funkster of Mirth
Programming these days takes more than a lone avenger with a compiler. - sam
RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers


Re: Returning regexp pattern that was used to match
by Abigail-II (Bishop) on May 03, 2004 at 15:01 UTC
    Well, you could of course always not construct a big regexp, but just loop over the keys and apply each regexp, doing something when it matches.
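A minimal sketch of that loop, assuming a hash that maps pattern source strings to handlers. The names `%actions` and `matching_patterns` and the sample paths are made up for illustration and are not from the original code:

```perl
use strict;
use warnings;

# Hypothetical dispatch table: pattern source => handler.
my %actions = (
    '\.jpg$' => sub { "image: $_[0]" },
    '\.log$' => sub { "log: $_[0]" },
);

# Return every pattern (hash key) that matches $path, in sorted key order.
# In scalar context this returns the number of matching patterns.
sub matching_patterns {
    my ($path, $actions) = @_;
    return grep { $path =~ /$_/ } sort keys %$actions;
}

for my $path ('photo.jpg', 'build.log', 'notes.txt') {
    for my $pattern (matching_patterns($path, \%actions)) {
        # The matching key is known here, so it can be passed on or logged.
        print $actions{$pattern}->($path), " (matched $pattern)\n";
    }
}
```

Because each pattern is tried on its own, the key that matched is available directly and nothing has to be recovered from a combined regexp.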

    Abigail

Re: Re: Re: Returning regexp pattern that was used to match
by diotalevi (Canon) on May 03, 2004 at 15:36 UTC

    If you are concerned about speed, then arrange to precompile those patterns before testing them. I had you directly testing against the keys in the pattern. Another idea might be to do this.

    use vars qw' %CACHED_RX ';

    sub do_this {
        my $self = shift;
        my %rx   = %{ shift() };

        for ( keys %rx ) {
            # Compile each pattern once and reuse the compiled form on later calls.
            my $rx = $CACHED_RX{$_} ||= qr/$_/;
            if ( $self->{'find'} =~ $rx ) {
            }
        }
    }

Re: Returning regexp pattern that was used to match
by Abigail-II (Bishop) on May 03, 2004 at 15:38 UTC
    but I'm concerned about speed. If it's doing this for every file on a terabyte server, I'm worried about the time consumption. What do you think?
    Just the fact that you hide a loop as regexp alternatives doesn't mean it's suddenly orders of magnitude faster. In fact, it may well be that splitting the regexp into smaller chunks is faster, because the optimizer kicks in.

    Here's a benchmark:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use Benchmark qw /cmpthese/;

    our @regexes = (
        '.*\.jpg$',
        '.*\.png$',
        'Perl',
        '\.mozilla/abigail',
    );
    our @words = `find /home/abigail`;   # 38517 files.
    our ($c1, $c2);

    cmpthese -60 => {
        single => 'my $regex = join "|" => @regexes;
                   $c1 = 0;
                   for my $w (@words) {
                       $c1 ++ if $w =~ /$regex/
                   }',
        many   => '$c2 = 0;
                   WORD:
                   for my $w (@words) {
                       for my $r (@regexes) {
                           $c2 ++, next WORD if $w =~ /$r/
                       }
                   }',
    };

    die "Unequal\n" unless $c1 == $c2;

    __END__
             s/iter single   many
    single     4.86     --   -74%
    many       1.28   281%     --

    Now, for your particular data set results might be different. But don't assume alternatives are necessarily slower.
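To try the comparison without a directory tree to scan, here is a rough self-contained variant. It is only a sketch: the paths are invented, the helper names `count_single` and `count_many` are hypothetical, and unlike the benchmark above it precompiles the patterns with `qr//`, so its timings are not comparable to the figures quoted there.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same pattern sources as the benchmark above.
my @patterns = ('.*\.jpg$', '.*\.png$', 'Perl', '\.mozilla/abigail');

# Assumption: patterns are compiled once up front, not per match.
my @compiled = map { qr/$_/ } @patterns;
my $combined = do { my $alt = join '|', @patterns; qr/$alt/ };

# Invented sample data: 3000 paths, two of every three should match.
my @paths = map { ("dir$_/photo.jpg", "dir$_/notes.txt", "dir$_/Perl/lib.pm") } 1 .. 1000;

# Count paths matched by the single combined alternation.
sub count_single {
    my $n = 0;
    for my $p (@paths) { $n++ if $p =~ $combined }
    return $n;
}

# Count paths matched by at least one of the separate patterns.
sub count_many {
    my $n = 0;
    PATH: for my $p (@paths) {
        for my $rx (@compiled) {
            if ($p =~ $rx) { $n++; next PATH }
        }
    }
    return $n;
}

# Both approaches must agree before the timings mean anything.
die "Unequal\n" unless count_single() == count_many();

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, { single => \&count_single, many => \&count_many });
```

Which variant comes out ahead depends on the patterns, the data, and the Perl version, so it is worth measuring on a sample of the real file list.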

    Abigail
