PerlMonks  

Re: convert string into a match pattern

by dave_the_m (Monsignor)
on Dec 20, 2017 at 14:10 UTC


in reply to convert string into a match pattern

In addition to quotemeta, note also that if you want to match many strings against each line in a file, it's much more efficient to combine all the strings into a single pattern, so something like the following:
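    my @strings = (
        'nnxx.yy2 = 234 abc',
        'foo bar = 39 baz *',
        # ....
    );

    my $pattern = join '|', map {
        my $s = quotemeta;
        # quotemeta also backslashes '=' and spaces, so allow for
        # escaped whitespace between the '=' and the digits
        $s =~ s/(=(?:\\\s)*)\d+/$1\\d+/g;
        $s;
    } @strings;

    my $qr = qr/^($pattern)$/;

    while (<>) {
        print if /$qr/;
    }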
In the code above I only convert numbers into \d+ if preceded by '='. I've also added ^ and $ anchors to the pattern, which you may or may not want.
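To make the conversion step concrete, here is a minimal sketch of what happens to a single string. The helper name string_to_pattern is hypothetical and not part of the original post. Because quotemeta backslashes every non-word ASCII character, including spaces and '=', the substitution in this sketch allows for escaped whitespace between the '=' and the digits.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): turn one literal
# string into a regex fragment, replacing the number after '=' with \d+.
sub string_to_pattern {
    my ($s) = @_;
    my $p = quotemeta $s;
    # After quotemeta the text reads '\=\ 234', so skip over any
    # backslash-escaped whitespace that follows the '='.
    $p =~ s/(=(?:\\\s)*)\d+/$1\\d+/g;
    return $p;
}

my $frag = string_to_pattern('nnxx.yy2 = 234 abc');
print "$frag\n";    # prints nnxx\.yy2\ \=\ \d+\ abc

my $qr = qr/^($frag)$/;
print "matches\n"  if     'nnxx.yy2 = 7 abc' =~ $qr;
print "rejected\n" unless 'nnxx.yy2 = x abc' =~ $qr;
```

Note that the digit in 'yy2' is left alone, since only digits following '=' are generalised.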

Dave.

Re^2: convert string into a match pattern
by Anonymous Monk on Dec 22, 2017 at 07:30 UTC

    Hello Dave

    Many thanks for your comment on making it more efficient. One additional question:

    If I do it your way and combine the search for all strings, is there a simple way to also get the string index in case of a match?

    e.g. 0 for a match with 'nnxx.yy2 = 234 abc', 1 for a match with 'foo bar = 39 baz *',....

    Thanks

      is there a simple way to also get the string index in case of a match
      Not that I can immediately think of. Less simple techniques depend on whether you expect most lines to match at least one of the strings, or most lines to be rejected. If the latter, you can use my suggested join '|' technique to quickly reject most lines, then use a slower technique only on the matching lines to find which string matched. For example, you could generate a second pattern which includes captures, e.g. /(string1)|(string2)|..../, apply it to the matched lines, and then see which is the first non-undef value out of $-[1], $-[2], .., $-[N]. This pattern is less efficient than /string1|string2|.../ as it isn't internally compiled into a trie and thus has to check every string in turn, which is slow for many strings.
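The two-stage idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name match_index and the sample lines are hypothetical, and the fragments are built with a substitution that allows for the backslash-escaped spaces quotemeta produces.

```perl
use strict;
use warnings;

my @strings = ('nnxx.yy2 = 234 abc', 'foo bar = 39 baz *');

# Build one regex fragment per string; digits after '=' become \d+.
my @frags = map {
    my $s = quotemeta;
    $s =~ s/(=(?:\\\s)*)\d+/$1\\d+/g;
    $s;
} @strings;

# Stage 1: plain alternation, used only to reject non-matching lines.
my $alt  = join '|', @frags;
my $fast = qr/^(?:$alt)$/;

# Stage 2: one capture group per string, so @- reveals which one matched.
my $caps = join '|', map { "($_)" } @frags;
my $slow = qr/^(?:$caps)$/;

# Hypothetical helper: 0-based index of the matching string, or undef
# (an empty list in list context) when no string matches.
sub match_index {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ $fast;    # quick rejection
    return unless $line =~ $slow;    # slower match with captures
    for my $n (1 .. @frags) {
        # $-[$n] is defined only for the group that took part in the match
        return $n - 1 if defined $-[$n];
    }
    return;
}

for my $line ('foo bar = 5 baz *', 'nnxx.yy2 = 1 abc', 'no match here') {
    my $i = match_index($line);
    print defined $i ? "$i: $line\n" : "-: $line\n";
}
```

The capture pattern only runs on lines that already passed the first stage, so its extra cost matters little when most lines are rejected.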

      Dave.
