PerlMonks  

Re: Delimited Backtracking with Regex

by ikegami (Pope)
on Apr 21, 2006 at 02:06 UTC (#544756)


in reply to Delimited Backtracking with Regex

Take advantage of regexp backtracking to find all the possibilities.

    my $str = "TXXXABCDGXXXCCCDTGYYYCCCYYYCC";

    local our @matches;
    $str =~ m/
        (XXX.*YYY)                 # Search and capture
        (?{ push @matches, $1 })   # Save result
        (?!)                       # Try again
    /x;

    print "$_\n" foreach @matches;

outputs

    XXXABCDGXXXCCCDTGYYYCCCYYY
    XXXABCDGXXXCCCDTGYYY
    XXXCCCDTGYYYCCCYYY
    XXXCCCDTGYYY
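As a cross-check on what "all the possibilities" means here, the same list can be produced without regexp code blocks by pairing every start delimiter with every end delimiter that follows it. This is only an illustrative sketch, not the code from the post: the helper name all_delimited is made up, and it assumes the string contains no newlines (where "." in the regexp would behave differently).

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring of $str that starts with
# $open and ends with a later $close, longest first for each start,
# which mirrors the order a greedy .* gives up characters.
sub all_delimited {
    my ($str, $open, $close) = @_;
    my @found;
    my $start = -1;
    while (($start = index($str, $open, $start + 1)) >= 0) {
        # Collect every position where $close begins after the opener.
        my @ends;
        my $end = $start + length($open) - 1;
        while (($end = index($str, $close, $end + 1)) >= 0) {
            push @ends, $end;
        }
        # Emit the longest candidate first, then the shorter ones.
        push @found, substr($str, $start, $_ + length($close) - $start)
            for reverse @ends;
    }
    return @found;
}

my @found = all_delimited("TXXXABCDGXXXCCCDTGYYYCCCYYYCC", "XXX", "YYY");
print "$_\n" for @found;
```

For the sample string this prints the same four lines, in the same order, as the regexp version above.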

Update: Followed japhy's suggestion

Replies are listed 'Best First'.
Re^2: Delimited Backtracking with Regex
by japhy (Canon) on Apr 21, 2006 at 02:14 UTC
Re^2: Delimited Backtracking with Regex
by ikegami (Pope) on Apr 21, 2006 at 02:26 UTC
    I've been asked why I used a package variable instead of a lexical variable. It's because regexps close around the lexicals that exist when they are first run.
                                                       # pass
                                                       #  1   2   3
                                                       # --- --- ---
        sub test {
            my @matches;
            '' =~ /
                (?{ push @matches, 'a' })
                (?{ print(scalar(@matches), "\n") })   #  1   2   3
            /xg;
            print(scalar(@matches), "\n");             #  1   0   0
        }

        test() for 1..3;

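    A new variable called @matches is created every time test is called, but the regexp always uses the variable from the first call. That is why the count printed inside the regexp keeps growing while the count printed after it drops to 0 on the second and third passes.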
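For comparison, here is a minimal sketch of the same experiment using "local our" instead of "my". The sub name count_pushes is made up for illustration. Because the code block refers to a package variable, there is no lexical for the regexp to capture, and "local" gives each call a fresh, empty array. As far as I know, Perl 5.18 reworked regexp code blocks so that they close over lexicals like ordinary code, so the first-call capture described above applies to older perls. The package-variable form should behave the same on both.

```perl
use strict;
use warnings;

# Hypothetical test sub: each call localizes the package array @matches,
# so the code block inside the regexp pushes onto a fresh, empty array.
sub count_pushes {
    local our @matches;
    '' =~ / (?{ push @matches, 'a' }) /x;   # matches once on the empty string
    return scalar @matches;
}

my @counts = map { count_pushes() } 1 .. 3;
print "@counts\n";   # prints "1 1 1"
```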

      You can still do the same thing using lexicals (if you are allergic to using symbol-table variables). The only thing to be careful about is to reuse the same push statement and the same target array every time. This is adapted from the code in Re^3: Regexes: finding ALL matches (including overlap):

          {
              my @matches;
              my $push = qr/(?{ push @matches, $1 })/;

              sub match_all_ways {
                  my ($string, $regex) = @_;
                  @matches = ();
                  $string =~ m/($regex)$push(?!)/;
                  return @matches;
              }
          }

          print match_all_ways( "TXXXABCDGXXXCCCDTGYYYCCCYYYCC", qr/XXX.*YYY/ );

      Technically, $push is not needed -- you could just include (?{push @matches, $1}) in the m// statement inline. However, I like this way as it makes it much more obvious that this part of the regex is only compiled once.

      blokhead
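The inline variant mentioned above might look like the following sketch. The name match_all_ways_inline is made up for illustration, and the caller is assumed to pass a qr// object, as in the example above. The one array shared by every call is what keeps the code block and the sub looking at the same @matches.

```perl
use strict;
use warnings;

{
    my @matches;   # a single array shared by every call to the sub

    # Hypothetical inline variant: the code block is written directly
    # in the pattern instead of being interpolated from $push.
    sub match_all_ways_inline {
        my ($string, $regex) = @_;
        @matches = ();   # reset the shared array rather than redeclaring it
        $string =~ m/($regex)(?{ push @matches, $1 })(?!)/;
        return @matches;
    }
}

my @all = match_all_ways_inline("TXXXABCDGXXXCCCDTGYYYCCCYYYCC", qr/XXX.*YYY/);
print join("\n", @all), "\n";
```

Joining with newlines is only for readability. Printing the returned list directly, as in the call above, runs all the matches together on one line.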

Re^2: Delimited Backtracking with Regex
by neversaint (Deacon) on Apr 21, 2006 at 02:24 UTC
    Dear ikegami,
    To confirm my understanding on the use of "local our".
    You mean in this context right?
        foreach my $ss (@str_set) {
            local our @matches;
            $ss =~ m/
                (XXX.*YYY)                 # Search and capture
                (?{ push @matches, $1 })   # Save result
                (?!)                       # Try again
            /x;
            print Dumper \@matches;
        }
    Namely if I use "my" instead it will return:
        $VAR = [
            # with something
        ]
        # and then empty....
        $VAR = [];
        $VAR = [];
        etc...


    ---
    neversaint and everlastingly indebted.......
