PerlMonks  

Perl Pattern Matching & RegEx's

by jaiieq (Novice)
on Mar 07, 2013 at 13:31 UTC (#1022226=perlquestion)
jaiieq has asked for the wisdom of the Perl Monks concerning the following question:

Say I have the following string: AAABCDAAADCBAAABBDAAA

I need to extract all instances of AAA(anything)AAA, so I used the following to try and do that:

    my $string = 'AAABCDAAADCBAAABBDAAA';
    my @matches = $string =~ /AAA\w+AAA/g;

The only match returned is the full string, whereas I need all of these:

    AAABCDAAA
    AAADCBAAA
    AAABBDAAA
    AAABCDAAADCBAAA
    AAADCBAAABBDAAA
    AAABCDAAADCBAAABBDAAA

Any ideas?

Replies are listed 'Best First'.
Re: Perl Pattern Matching & RegEx's
by choroba (Canon) on Mar 07, 2013 at 14:29 UTC
    The main problem is that matches under /g cannot overlap. This can be solved by using a look-ahead, though:
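
        #!/usr/bin/perl
        use warnings;
        use strict;
        use feature qw(say);

        my $string    = 'AAAbcdAAAdcbAAAbbdAAAxAAAA';
        my $delimiter = 'AAA';

        # The look-ahead is zero-width, so every offset where the
        # delimiter starts is recorded, including overlapping ones.
        my @positions;
        push @positions, pos($string) while $string =~ /(?=$delimiter)/g;

        # Pair each start with every later delimiter that leaves at
        # least one character in between.
        for my $from (@positions) {
            for my $to (grep $_ - length $delimiter > $from, @positions) {
                say substr($string, $from, $to - $from) . $delimiter;
            }
        }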

    Update: Typo fixed. Thanks jaiieq, damn netbooks.

    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      This is exactly what I needed. It also lets me easily change the delimiter to, say, 'AA' and still produce the output I need. Thank you.

      There is a small typo in your code: you have a stray quote after $delimiter in the say line.

      Wow, substr is so much better here than split/join!
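
The look-ahead approach above can be wrapped in a sub so the delimiter is easy to swap, as jaiieq mentions. What follows is only a sketch under assumptions: the name `delimited_substrings` is hypothetical and not from the thread, and `\Q...\E` is added so that a delimiter containing regex metacharacters is still matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return every substring of
# $string that starts and ends with $delimiter and has at least one
# character in between, overlapping candidates included.
sub delimited_substrings {
    my ($string, $delimiter) = @_;
    my $len = length $delimiter;

    # A zero-width look-ahead records every offset where the delimiter
    # starts; \Q...\E keeps any metacharacters in it literal.
    my @positions;
    push @positions, pos($string) while $string =~ /(?=\Q$delimiter\E)/g;

    my @found;
    for my $from (@positions) {
        # Keep only end positions that leave a non-empty middle part.
        for my $to (grep { $_ - $len > $from } @positions) {
            push @found, substr($string, $from, $to - $from + $len);
        }
    }
    return @found;
}

print "$_\n" for delimited_substrings('AAABCDAAADCBAAABBDAAA', 'AAA');
```

Calling it with 'AA' instead of 'AAA' should list the candidates for the shorter delimiter without touching the loop itself.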
Re: Perl Pattern Matching & RegEx's
by Dallaylaen (Monk) on Mar 07, 2013 at 14:16 UTC

    Why not split the string into chunks delimited by AAA, then combine the chunks as you want and join them back? As in:

        #!/usr/bin/perl -w
        use strict;

        my $string = shift || 'AAABCDAAADCBAAABBDAAA';

        my @between = split /AAA/, $string, -1;
        pop @between;
        shift @between;

        for (my $i = 0; $i < @between; $i++) {
            for (my $j = $i; $j < @between; $j++) {
                print join "AAA", "", @between[ $i .. $j ], "\n"
            };
        };

    This won't solve the problem if your string contains AAAA, though.

    UPDATE: This substr-based solution is much better: it doesn't suffer from the AAAA problem and probably uses less memory, too.
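
To illustrate the AAAA problem mentioned here, this minimal sketch compares what split and a zero-width look-ahead each see on a short string ending in AAAA. The test string 'AAAxAAAA' is an assumption chosen for illustration, not taken from the thread.

```perl
use strict;
use warnings;

my $string = 'AAAxAAAA';

# split consumes each AAA it finds, so the run AAAA is cut only once
# and a stray "A" is left over as the trailing piece.
my @pieces = split /AAA/, $string, -1;
print join('|', @pieces), "\n";    # prints "|x|A"

# A zero-width look-ahead consumes nothing, so both starting offsets
# inside AAAA are reported.
my @positions;
push @positions, pos($string) while $string =~ /(?=AAA)/g;
print "@positions\n";              # prints "0 4 5"
```

With split, the candidate AAAxAAAA (ending at the second possible AAA) can no longer be rebuilt from the pieces, whereas the recorded positions still allow it.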
      This looks to be exactly what I was looking for. I am going to try it on a few other test cases and see how it works. Thank you!
Re: Perl Pattern Matching & RegEx's
by Athanasius (Abbot) on Mar 07, 2013 at 14:28 UTC

    Here is a regex-based solution. As Anonymous Monk has pointed out, the non-greedy quantifier ? is an important component. But to get all the matches, you need to loop:

        #! perl
        use strict;
        use warnings;

        my %matches;
        my $s = 'zzAAABCDAAADCBAAABBDAAA';
        my $t = $s =~ s/^[^A]*?(AAA.*)/$1/r;

        while ($t =~ /^AAA.+?AAA/)
        {
            my $u = $t;

            while ($u =~ /^(AAA.+?AAA)/)
            {
                my $match = $1;
                $match =~ s/\|/AAA/g;
                ++$matches{$match};
                $u =~ s/(AAA.+?)AAA/$1\|/;
            }

            $t =~ s/^AAA.+?(AAA.*)/$1/;
        }

        print $_, "\n" for sort keys %matches;

    Output:

        0:15 >perl 563_SoPW.pl
        AAABBDAAA
        AAABCDAAA
        AAABCDAAADCBAAA
        AAABCDAAADCBAAABBDAAA
        AAADCBAAA
        AAADCBAAABBDAAA

        0:23 >

    The inner loop finds successively longer matches by changing the AAA at the end of each match into a non-word character (|), which is turned back into AAA before the match is recorded. The outer loop truncates the search string by removing everything up to the second AAA.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Perl Pattern Matching & RegEx's
by Anonymous Monk on Mar 07, 2013 at 14:00 UTC

    You want to use +?, as in \w+?; see perlfaq6 and perlrequick.

    The awkward-to-use and outdated YAPE::Regex::Explain can help explain the problem (\w+) and the solution (\w+?).
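
A short sketch of what the two quantifiers return on the original string may help here. It suggests that the non-greedy form on its own yields only the non-overlapping matches, which is why the other replies add a loop or a look-ahead. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $string = 'AAABCDAAADCBAAABBDAAA';

# Greedy: \w+ grabs as much as it can, so a single match spans the
# whole string.
my @greedy = $string =~ /AAA\w+AAA/g;

# Non-greedy: \w+? stops at the first closing AAA, but /g resumes
# after the end of each match, so AAADCBAAA in the middle is skipped.
my @lazy = $string =~ /AAA\w+?AAA/g;

print "greedy: @greedy\n";    # greedy: AAABCDAAADCBAAABBDAAA
print "lazy:   @lazy\n";      # lazy:   AAABCDAAA AAABBDAAA
```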
