http://www.perlmonks.org?node_id=544756


in reply to Delimited Backtracking with Regex

Take advantage of regexp backtracking to find all the possibilities.

    my $str = "TXXXABCDGXXXCCCDTGYYYCCCYYYCC";

    local our @matches;
    $str =~ m/
       (XXX.*YYY)                 # Search and capture
       (?{ push @matches, $1 })   # Save result
       (?!)                       # Try again
    /x;

    print "$_\n" foreach @matches;

outputs

    XXXABCDGXXXCCCDTGYYYCCCYYY
    XXXABCDGXXXCCCDTGYYY
    XXXCCCDTGYYYCCCYYY
    XXXCCCDTGYYY
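For comparison, here is a rough sketch of the same enumeration done without (?{ }) code blocks, using plain index() and substr(). It is not the approach used above, just a swapped-in nested scan, and the sub name all_delimited is made up for this example. It assumes the delimiters are literal strings. Unlike the regexp's ".", it will also span newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring that starts with $open and
# ends with a later, non-overlapping $close, longest first for each start.
sub all_delimited {
    my ($str, $open, $close) = @_;
    my @found;
    my $start = 0;

    # Visit every position where the opening delimiter occurs.
    while ((my $i = index($str, $open, $start)) >= 0) {
        my @ends;
        my $from = $i + length $open;

        # Collect the end offset of every closing delimiter after it.
        while ((my $j = index($str, $close, $from)) >= 0) {
            push @ends, $j + length $close;
            $from = $j + 1;
        }

        # Longest candidate first, mirroring greedy .* backtracking.
        push @found, map { substr($str, $i, $_ - $i) } reverse @ends;
        $start = $i + 1;
    }
    return @found;
}

print "$_\n" for all_delimited("TXXXABCDGXXXCCCDTGYYYCCCYYYCC", "XXX", "YYY");
```

On the sample string this should list the same four substrings, in the same order, as the regexp version.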

Update: Followed japhy's suggestion

Re^2: Delimited Backtracking with Regex
by japhy (Canon) on Apr 21, 2006 at 02:14 UTC
Re^2: Delimited Backtracking with Regex
by ikegami (Pope) on Apr 21, 2006 at 02:26 UTC
    I've been asked why I used a package variable instead of a lexical variable. It's because regexps close around the lexicals that exist when they are first run.
    #                                               pass
    #                                            1   2   3
    #                                           --- --- ---
    sub test {
       my @matches;
       '' =~ /
          (?{ push @matches, 'a' })
          (?{ print(scalar(@matches), "\n") })  # 1   2   3
       /xg;
       print(scalar(@matches), "\n");           # 1   0   0
    }

    test() for 1..3;

    A new variable called @matches is created every time test is called, but the regexp always uses the variable from the first call.
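    For contrast, a minimal sketch of the same kind of sub written with a package variable. The name count_pushes is made up for illustration. Because "local our" gives the one package array a fresh, empty value on each call, the code block and the sub body always see the same array.

```perl
use strict;
use warnings;

# Hypothetical variant of test() that uses a localized package variable.
sub count_pushes {
    local our @matches;                 # fresh value on every call
    '' =~ / (?{ push @matches, 'a' }) /x;
    return scalar @matches;             # same array the code block pushed to
}

print count_pushes(), "\n" for 1..3;
```

    Each call should report one element, since the empty string matches once and the push runs once per call.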

      You can still do the same thing using lexicals (if you are allergic to using symbol-table variables). The only thing to be careful about is to reuse the same push statement and the same target array every time. This is adapted from the code in Re^3: Regexes: finding ALL matches (including overlap):
      {
          my @matches;
          my $push = qr/(?{ push @matches, $1 })/;

          sub match_all_ways {
              my ($string, $regex) = @_;
              @matches = ();
              $string =~ m/($regex)$push(?!)/;
              return @matches;
          }
      }

      print match_all_ways( "TXXXABCDGXXXCCCDTGYYYCCCYYYCC", qr/XXX.*YYY/ );
      Technically, $push is not needed -- you could just include (?{push @matches, $1}) inline in the m// statement. However, I like this way, as it makes it much more obvious that this part of the regex is only compiled once.

      blokhead

Re^2: Delimited Backtracking with Regex
by neversaint (Deacon) on Apr 21, 2006 at 02:24 UTC
    Dear ikegami,
    To confirm my understanding of the use of "local our":
    You mean in this context, right?
    foreach my $ss (@str_set) {
        local our @matches;
        $ss =~ m/
           (XXX.*YYY)                 # Search and capture
           (?{ push @matches, $1 })   # Save result
           (?!)                       # Try again
        /x;
        print Dumper \@matches;
    }
    Namely, if I use "my" instead, it will return:
    $VAR = [
        # with something
    ]
    # and then empty....
    $VAR = [];
    $VAR = [];
    etc...


    ---
    neversaint and everlastingly indebted.......