PerlMonks  

Why my Regex doesn't work

by flappygoat (Initiate)
on Apr 11, 2017 at 13:40 UTC ( [id://1187643]=perlquestion )

flappygoat has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone. I want a regex that matches every occurrence of a letter sandwiched between two identical letters (where the middle letter differs from the two outer ones) and changes the middle letter so that all three letters are the same. For instance, it would change CCHHHCHHCHC into CCHHHHHHHCC. I only work with three characters (C, H, E), but I wrote the regex for the general case, and it doesn't work the way I want. My regex is:

s/(\w)[^\1]\1/$1$1$1/gi

I don't understand why it matches HHH when I wrote that the middle letter cannot be the same as the first. Why is this wrong, and how can I make it work?

Replies are listed 'Best First'.
Re: Why my Regex doesn't work
by AnomalousMonk (Archbishop) on Apr 11, 2017 at 14:01 UTC

    A numbered regex backreference such as \1 is not active in a character class; [^\1] is equivalent to [^1]. (Update: Nope. Not quite. See Update below.)

    Try:

    c:\@Work\Perl\monks>perl -wMstrict -le "my $s = 'CCHHHCHHCHC'; print qq{'$s'}; ;; $s =~ s/(\w) (?!\1). \1/$1$1$1/xgi; print qq{'$s'}; "
    'CCHHHCHHCHC'
    'CCHHHHHHHHC'
    (note the use of the  /x modifier — for clarity only).

    Update: In fact,  \1 in a character class is an octal escape sequence:

    c:\@Work\Perl\monks\flappygoat>perl -wMstrict -le "my $s = qq{\1\x01\o{001}\cA}; print 'match' if $s =~ /[^\1]/; print 'count: ', $s =~ tr/\1//; "
    count: 4
    (no match; nothing printed). Wonderful what you can find out if you actually test stuff.
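
To connect the update back to the original question, here is a minimal sketch of why the OP's pattern still matches HHH. The helper name matches_original is made up for illustration; the pattern itself is the one from the question.

```perl
use strict;
use warnings;

# Inside a bracketed character class, \1 is the octal escape for chr(1),
# not a backreference, so [^\1] means "any character except chr(1)".
my $pattern = qr/(\w)[^\1]\1/;

# Hypothetical helper: returns 1 if the OP's pattern matches, else 0.
sub matches_original {
    my ($s) = @_;
    return $s =~ $pattern ? 1 : 0;
}

print "HHH:     ", matches_original('HHH'),     "\n";  # middle H is not chr(1), so it matches
print "HCH:     ", matches_original('HCH'),     "\n";  # the intended case also matches
print "H\\x01H:  ", matches_original("H\x01H"), "\n";  # chr(1) is the only middle character rejected
```

Under that reading, the only middle character the class excludes is chr(1), which is why the "middle letter must differ" condition never takes effect.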


    Give a man a fish:  <%-{-{-{-<

      The OP wanted CCHHHHHHHCC, not CCHHHHHHHHC. Maybe that's a typo, or maybe he wants all the surrounded characters substituted simultaneously, as in a cellular automaton. In that case,
      s/(?<=(\w))(?!\1).(?=\1)/$1/g;

        Yeah. The problem seems underspecified.

        For example, what output would the OP want from CHCHC?

        AnomalousMonk's snippet outputs CCCHC, while anonymonk's gives CCHCC...
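
The difference between the two substitutions can be checked side by side. This is a small sketch using the two regexes posted above (the first without the /x spacing); the sub names fill_sequential and fill_simultaneous are invented here for illustration.

```perl
use strict;
use warnings;

# Sequential: each match consumes all three characters, so a triple that
# overlaps the previous match (such as the trailing CHC) is never examined.
sub fill_sequential {
    my ($s) = @_;
    $s =~ s/(\w)(?!\1).\1/$1$1$1/gi;
    return $s;
}

# Simultaneous: the lookbehind and lookahead consume nothing, so every
# surrounded character is tested against its neighbours in the original string.
sub fill_simultaneous {
    my ($s) = @_;
    $s =~ s/(?<=(\w))(?!\1).(?=\1)/$1/g;
    return $s;
}

for my $in (qw(CCHHHCHHCHC CHCHC)) {
    printf "%-12s sequential=%-12s simultaneous=%s\n",
        $in, fill_sequential($in), fill_simultaneous($in);
}
```

For CCHHHCHHCHC this gives CCHHHHHHHHC and CCHHHHHHHCC respectively, and for CHCHC it gives CCCHC and CCHCC, matching the outputs quoted in the thread.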

        Thank you so much! That is exactly what I needed. I wonder, though, whether it's possible to do this without "lookaround".
      Thank you. That is what I was trying to code, though I see now that it has a problem: it won't match the last 3 chars CHC. The other comment solved it.
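
On the question of doing it without lookaround, nobody in the thread posted an answer. One possible approach, offered only as a sketch, is to skip the regex and walk the string with substr, always reading neighbours from the untouched input so that the replacements behave as if simultaneous. The sub name fill_without_lookaround is hypothetical, and the sketch assumes the input contains only word characters such as C, H and E.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread. Assumes the input holds only
# word characters (the regex versions require \w for the left neighbour).
sub fill_without_lookaround {
    my ($in) = @_;
    my $out = $in;                      # write to a copy, read from the original
    for my $i (1 .. length($in) - 2) {  # empty range for strings shorter than 3
        my $left  = substr($in, $i - 1, 1);
        my $mid   = substr($in, $i,     1);
        my $right = substr($in, $i + 1, 1);
        # Replace the middle character only when its neighbours agree
        # with each other and differ from it.
        substr($out, $i, 1, $left) if $left eq $right && $mid ne $left;
    }
    return $out;
}

print fill_without_lookaround('CCHHHCHHCHC'), "\n";
print fill_without_lookaround('CHCHC'), "\n";
```

Because it reads from the original string, this should agree with the lookaround substitution rather than with the sequential one.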
