but I'm concerned about speed. If it's doing this for every file on a terabyte server, I'm worried about the time consumption. What do you think?
Just because you hide a loop as regexp alternatives doesn't mean it's suddenly orders of magnitude faster. In fact, it may well be that splitting the regexp into smaller chunks is faster, because the optimizer kicks in.

Here's a benchmark:

    #!/usr/bin/perl

    use strict;
    use warnings;
    use Benchmark qw /cmpthese/;

    our @regexes = (
        '.*\.jpg$',
        '.*\.png$',
        'Perl',
        '\.mozilla/abigail',
    );
    our @words = `find /home/abigail`;   # 38517 files.
    our ($c1, $c2);

    cmpthese -60 => {
        single => 'my $regex = join "|" => @regexes;
                   $c1 = 0;
                   for my $w (@words) {
                       $c1 ++ if $w =~ /$regex/
                   }',
        many   => '$c2 = 0;
                   WORD:
                   for my $w (@words) {
                       for my $r (@regexes) {
                           $c2 ++, next WORD if $w =~ /$r/
                       }
                   }',
    };

    die "Unequal\n" unless $c1 == $c2;

    __END__
             s/iter single   many
    single     4.86     --   -74%
    many       1.28   281%     --
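To read the result table: as far as I understand Benchmark, cmpthese compares rates (iterations per second), so the percentages can be roughly recomputed from the s/iter column. A small sketch, using only the two rounded timings printed above; the helper name pct_faster is made up for illustration. Because the inputs are rounded, it gives about 280% rather than the reported 281%.

```perl
use strict;
use warnings;

# Rounded timings from the table above, in seconds per iteration.
my %s_per_iter = (single => 4.86, many => 1.28);

# Hypothetical helper: how much faster (in percent) $x runs than $y,
# computed on rates (iterations per second) rather than on times.
sub pct_faster {
    my ($x, $y) = @_;
    my $rate_x = 1 / $s_per_iter{$x};
    my $rate_y = 1 / $s_per_iter{$y};
    return 100 * ($rate_x / $rate_y - 1);
}

printf "many vs single: %.0f%%\n", pct_faster('many', 'single');
printf "single vs many: %.0f%%\n", pct_faster('single', 'many');
```

In other words, on this data set the loop over separate regexes ran close to four times as fast as the single joined regex.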
Now, for your particular data set the results might be different. But don't assume that the alternative, looping over separate regexes, is necessarily slower than one big alternation.
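Since the thread is about finding out which pattern matched, the loop variant gives you that for free. A minimal sketch, assuming the same four patterns as in the benchmark; the function name first_matching_pattern and the sample paths are invented for illustration, and the patterns are compiled once with qr// rather than interpolated on every match.

```perl
use strict;
use warnings;

# Same patterns as in the benchmark, compiled once with qr//.
my @regexes = map { qr/$_/ } (
    '.*\.jpg$',
    '.*\.png$',
    'Perl',
    '\.mozilla/abigail',
);

# Hypothetical helper: return the first pattern that matches $path,
# or undef if none of them does.
sub first_matching_pattern {
    my ($path) = @_;
    for my $re (@regexes) {
        return $re if $path =~ $re;
    }
    return;
}

# Made-up sample paths, just to exercise both outcomes.
for my $path ('/home/abigail/pics/cat.jpg', '/home/abigail/notes.txt') {
    my $hit = first_matching_pattern($path);
    print "$path: ", (defined $hit ? "matched $hit" : "no match"), "\n";
}
```

Whether this beats a single alternation on your server is something only a benchmark on your own file list can tell.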


In reply to Re: Returning regexp pattern that was used to match by Abigail-II
in thread Returning regexp pattern that was used to match by crabbdean
