PerlMonks  

realizing AND in regex?

by LanX (Canon)
on Sep 13, 2012 at 09:59 UTC ( #993441=perlquestion )
LanX has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I'm using the regex engine to identify delimited fields matching certain conditions.

Thanks to Perl's internal trie optimization of OR conditions¹, it's far faster than using LIKE in MySQL, especially with hundreds of patterns to check.

$text = 'A0 peter Z0 ... A42 peter, paul and mary Z42 ... A99 mary Z9 +9';

my @or_matches = ( $text =~ m/A(\d+)[^Z]*(peter|mary)[^Z]*Z/g );

print "@or_matches \n";

__END__
0 peter 42 mary 99 mary
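
With hundreds of patterns, the alternation would presumably be assembled from a list rather than typed by hand. A minimal sketch, assuming the patterns are literal strings (the names `@names` and `$or_re` are made up for illustration and are not from the post):

```perl
use strict;
use warnings;

# Hypothetical word list; in practice it might hold hundreds of entries.
my @names = qw(peter mary);

# quotemeta escapes regex metacharacters so every entry is matched literally.
my $alternation = join '|', map { quotemeta } @names;

# Same shape as the hand-written pattern: field id, then any listed word.
my $or_re = qr/A(\d+)[^Z]*($alternation)[^Z]*Z/;

my $text = 'A0 peter Z0 ... A42 peter, paul and mary Z42 ... A99 mary Z9 +9';
my @or_matches = ( $text =~ /$or_re/g );

print "@or_matches\n";    # 0 peter 42 mary 99 mary
```

If the entries are meant to be regexes themselves rather than plain words, the `quotemeta` step would have to go.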

But now I got the requirement to find fields which match multiple regex at the same time ... and AFAIK the regex grammar doesn't have an AND operator

The best guess I have is using zero-width look-ahead assertions:

$text = 'A0 peter Z0 ... A42 peter, paul and mary Z42 ... A99 mary Z9 +9';

my @and_matches = (
    $text =~ m/
        A(\d+)[^Z]*
        (
            (?=mary)  [^Z]* peter
          | (?=peter) [^Z]* mary
        )
        [^Z]*Z
    /xg
);

print "@and_matches \n";

__END__
42 peter, paul and mary

Well, already rather complicated for just two patterns ... and I doubt that it's fast ... any better suggestions?

Cheers Rolf

UPDATE:

OK, the following is already much better, since it avoids OR-chaining all possible orders of the patterns simply by anchoring the look-aheads at the start of the field.

print @and_matches = (
    $text =~ m/
        A(\d+)
        (
            (?= [^Z]* mary  )
            (?= [^Z]* peter )
            [^Z]*
        )
        Z\1
    /xg
);

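
The anchored look-ahead form should scale to any number of patterns, because each one only adds another `(?= ... )` group. A rough sketch of building it from a list, again assuming literal strings (`@required`, `$lookaheads` and `$and_re` are hypothetical names, not part of the original post):

```perl
use strict;
use warnings;

# Hypothetical list of words that must ALL occur inside one field.
my @required = qw(mary peter);

# One look-ahead per word, each scanning from the start of the field body.
my $lookaheads = join '',
    map { '(?=[^Z]*' . quotemeta($_) . ')' } @required;

# The (?: ... ) wrapper keeps $lookaheads away from the following "[",
# which Perl could otherwise take for an array subscript.
my $and_re = qr/A(\d+)((?:$lookaheads)[^Z]*)Z\1/;

my $text = 'A0 peter Z0 ... A42 peter, paul and mary Z42 ... A99 mary Z9 +9';
my @and_matches = ( $text =~ /$and_re/g );

# Captures come back as (id, body) pairs.
print "id=$and_matches[0] body='$and_matches[1]'\n" if @and_matches;
```

I haven't benchmarked this, so whether it beats running separate matches is an open question.
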
Footnotes:

¹) >= 5.10 IIRC

Re: realizing AND in regex?
by BillKSmith (Chaplain) on Sep 13, 2012 at 12:26 UTC

    It is certainly easier and possibly faster to use separate regexes. Use Perl's logic rather than the regex engine to combine the results.

    Bill
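
Applied to the delimited fields from the question, the separate-regex idea might look roughly like the sketch below. One simple regex pulls out each field, and plain Perl decides whether all conditions hold. The `@conditions` list and the variable names are assumptions made for illustration:

```perl
use strict;
use warnings;

my $text = 'A0 peter Z0 ... A42 peter, paul and mary Z42 ... A99 mary Z9 +9';

# Hypothetical conditions; every one of them must match the field body.
my @conditions = ( qr/peter/, qr/mary/ );

my @and_matches;

# First pass: extract the id and body of every A...Z field.
while ( $text =~ /A(\d+)([^Z]*)Z/g ) {
    my ( $id, $body ) = ( $1, $2 );

    # Second pass: count how many conditions match this body.
    my $hits = grep { $body =~ $_ } @conditions;

    # Keep the field only if all conditions matched.
    push @and_matches, $id if $hits == @conditions;
}

print "@and_matches\n";    # 42
```

The AND lives in ordinary Perl here, so adding a pattern means adding one entry to the list.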
Re: realizing AND in regex?
by pvaldes (Chaplain) on Sep 13, 2012 at 22:39 UTC

    I got the requirement to find fields which match multiple regex at the same time

while (<DATA>){
    if (/foo/ && /bar/){
        print $_
    }
}
__DATA__
fobur
barfoo
foobar
fobar
bar foo
