PerlMonks  

Regex help

by Anonymous Monk
on Dec 11, 2013 at 07:01 UTC ( [id://1066550]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

In the code below, I'm trying to match and capture a lowercase letter that is flanked by exactly 4 capital letters on both its left and its right. My regex captures the letter "s" on Line 1 (at position 12), but on Line 2 it captures only the letter "c" (position 6) and not "f" (position 11).

    # Each line of $text is independent of the other and not joined as a continuous line.
    $text = q~
    adfRadfaUYBGsQWERaeYETEWoyMSn
    nbPOIVcRCVVfOOPQbHbnRIIqWweRT
    ~;
    $result = "";
    while( $text =~ /[a-z]+[A-Z]{4}([a-z]{1})[A-Z]{4}[a-z]+/g) {
        $result .= $1;
    }
    print $result; # prints sc but should print scf

How do I modify my code to also match the "f" on the second line, which is likewise flanked by exactly 4 capital letters on each side?

Thanks in anticipation!

Replies are listed 'Best First'.
Re: Regex help
by Your Mother (Archbishop) on Dec 11, 2013 at 07:09 UTC

    One way:

    my $text = <<"";
    adfRadfaUYBGsQWERaeYETEWoyMSn
    nbPOIVcRCVVfOOPQbHbnRIIqWweRT

    for my $match ( $text =~ /(?<=[A-Z]{4})([a-z])(?=[A-Z]{4})/g )
    {
        print $match, $/;
    }
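For anyone wondering why the original pattern skips the "f": as far as I can tell, its `[A-Z]{4}` and `[a-z]+` parts consume the characters they match, so the match around "c" swallows "RCVVf" and the next attempt starts at "OOPQ". Lookarounds are zero-width and leave those characters available. A minimal sketch comparing the two on the second sample line (the variable names are made up for illustration):

```perl
use strict;
use warnings;

# Second sample line from the question.
my $line = 'nbPOIVcRCVVfOOPQbHbnRIIqWweRT';

# Original pattern: the flanking capitals and the trailing lowercase run
# are consumed, so the first match eats "RCVVf" and the scan resumes at "OOPQ".
my $consuming = join '', $line =~ /[a-z]+[A-Z]{4}([a-z])[A-Z]{4}[a-z]+/g;

# Lookaround pattern: only the lowercase letter itself is consumed,
# so "RCVV" can serve as the right flank of "c" and the left flank of "f".
my $lookaround = join '', $line =~ /(?<=[A-Z]{4})([a-z])(?=[A-Z]{4})/g;

print "consuming:  $consuming\n";    # prints consuming:  c
print "lookaround: $lookaround\n";   # prints lookaround: cf
```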

      That will also extract single lc alphas that are preceded or followed by more than four uc alphas:

      >perl -wMstrict -le
      "my $text = 'XXXXXaXXXXX';
       ;;
       for my $match ( $text =~ /(?<=[A-Z]{4})([a-z])(?=[A-Z]{4})/g )
       {
           print $match, $/;
       }
      "
      a

      If AnonyMonk wants single lc alphas that are preceded and followed by exactly four uc alphas (and also concatenated into a string), here's one way:

      >perl -wMstrict -le
      "my $text = qq{XXXXaXXXXbYYYYYcYYYYYdXXXXeXXXXfgXXXX\nXXXXhXXXXiYYYYY};
       print qq{[[$text]]};
       ;;
       my $result =
          join '',
          $text =~ m{ (?<= (?<! [[:upper:]]) [[:upper:]]{4})
                      [[:lower:]]
                      (?= [[:upper:]]{4} (?! [[:upper:]]))
                    }xmsg;
       print qq{'$result'};
      "
      [[XXXXaXXXXbYYYYYcYYYYYdXXXXeXXXXfgXXXX
      XXXXhXXXXiYYYYY]]
      'aeh'

      (If some look-around is good, more is better!)
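For completeness, here is a rough sketch of the same exactly-four pattern run against the two sample lines from the original question, written as a script rather than a one-liner. The name `$exactly_four` is only illustrative, and the data is put in a double-quoted string instead of the `q~...~` block for brevity:

```perl
use strict;
use warnings;

# The two sample lines from the question, separated by a newline.
my $text = "adfRadfaUYBGsQWERaeYETEWoyMSn\nnbPOIVcRCVVfOOPQbHbnRIIqWweRT\n";

# A lowercase letter with exactly four uppercase letters on each side:
# the nested negative lookarounds reject a fifth uppercase letter.
my $exactly_four = qr{
    (?<= (?<! [[:upper:]] ) [[:upper:]]{4} )
    [[:lower:]]
    (?= [[:upper:]]{4} (?! [[:upper:]] ) )
}x;

# In list context with /g and no capture groups, the match
# returns every whole match, here one lowercase letter each.
my $result = join '', $text =~ /$exactly_four/g;

print "$result\n";   # prints scf
```

The "o" in "YETEWoyMSn" is skipped because it follows five capitals, which is the case the plain `{4}` lookbehind would have let through had the right-hand side also qualified.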

      Update: Here are the beginnings of a test bed for playing with this and other regexen:

      >perl -wMstrict -le
      "for my $text (qw(
             XXXXaXXXX  XXXXaXXXXxyXXXXbXXXXxZZZxZZZxYYYYY
             XXXXxZZZ  ZZZxXXXX  XXXXxYYYYY  YYYYYxXXXX
             XXXXxyXXXX  XXXXxyXXXXxyXXXX  YYYYYaYYYYY  ZZZaZZZ)
             ) {
         my $result =
            join '',
            $text =~ m{ (?<= (?<! [[:upper:]]) [[:upper:]]{4})
                        [[:lower:]]
                        (?= [[:upper:]]{4} (?! [[:upper:]]))
                      }xmsg;
         print qq{'$text' -> '$result'};
         }
      "
      'XXXXaXXXX' -> 'a'
      'XXXXaXXXXxyXXXXbXXXXxZZZxZZZxYYYYY' -> 'ab'
      'XXXXxZZZ' -> ''
      'ZZZxXXXX' -> ''
      'XXXXxYYYYY' -> ''
      'YYYYYxXXXX' -> ''
      'XXXXxyXXXX' -> ''
      'XXXXxyXXXXxyXXXX' -> ''
      'YYYYYaYYYYY' -> ''
      'ZZZaZZZ' -> ''

        Thank you so much! I tried yours and it works like a champ.

        Thanks everyone for helping :)

Re: Regex help
by mendeepak (Scribe) on Dec 11, 2013 at 08:56 UTC
