Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??

Tachyon,
I have a similar problem, over six hundred complex regexen to match against a busy logfile and related messages to issue depending on which regexps were matched. much the same problem as the OP

If I understand the regexp engine caching the compiled version of a regex if it is not going to change then I think this should be a reasonably efficient approach. Am I on the right tracks ? and is the /o unrequired as I have already interpolated the variable when the regex is first called ?

#!/usr/local/bin/perl -w use strict; my ($i, $compile_me, @names)=(1, "{my \@matches;", "no match"); while (<DATA>) { next if /^\s*$/; last if /END CONFIG/; chomp; my ($name, $reg)=split; push @names, $name; $compile_me.="push \@matches, $i if /$reg/o;"; $i++; } $compile_me.="\@matches}"; while (<DATA>) { chomp; print "\nmatches found for $_\n"; my @matches=eval $compile_me; foreach (@matches) { print $names[$_] , "\n" } } __DATA__ Fred_and_Friends fr.d Paul_and_co paul some_numbers \d{2} freud_likes_fred fr END CONFIG freud fred NaNa pauline 12312sdfsdf 2
Update with speed test

I have now run a comparative test over 300^H^H^H, sorry 416 lines of log, with my 672 pattern matches. First using the eval of a string containing all the regexen and returning match index numbers as above. Second is my old naive code holding an array with the regexen and doing a foreach through it against each line. I did not use the /o for the reasons given above it works fine without it

>time ./Monitor.fast real 0m1.49s user 0m0.68s sys 0m0.58s >time ./Monitor real 0m19.47s user 0m14.69s sys 0m0.50s >

I think the numbers speak for themselves

Cheers,
R.


In reply to Re^2: Matching against list of patterns by Random_Walk
in thread Matching against list of patterns by Eyck

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others taking refuge in the Monastery: (8)
    As of 2015-07-08 07:01 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









      Results (96 votes), past polls