http://www.perlmonks.org?node_id=163907

I was helping a coworker with some regular expressions, and I realized he didn't understand several of the nuances of s///g. So I thought I would write out a few examples. For fun, I gave Tye a quiz to see if he could determine (without Perl) what the value of $_ would be after each regex. He aced it. (Of course.) Can you do the same?

For each regex, $_ starts out with '1234*5678'.
   s/(.\d)\d/$1/g;
   s/(.\d)\d+/$1/g;
   s/(.\d)\d\d?/$1/g;
   s/(?<=.)(\d)\d/$1/g;
   s/(?<=\d)(\d)./$1/g;
   s/(?<=\d)(\d)\d/$1/g;
   s/(?<=\D)(\d)\d/$1/g;
   s/(?<!^)\d+(\d)/$1/g;

Here is some code to help you check your answers:

#!perl
use strict;

my $nums = "1234*5678";
my @regexes = (
    q/(.\d)\d/,
    q/(.\d)\d+/,
    q/(.\d)\d\d?/,
    q/(?<=.)(\d)\d/,
    q/(?<=\d)(\d)./,
    q/(?<=\d)(\d)\d/,
    q/(?<=\D)(\d)\d/,
    q/(?<!^)\d+(\d)/,
);

$^A = "";
for my $regex ( @regexes ) {
    my $target = $nums;
    $target =~ s/$regex/$1/g;
    formline "\$_ = '$nums'; \@<<<<<<<<<<<<<<<<<<<<<<< ".
             "# Result: \$_ eq '$target'\n",
             "s/$regex/\$1/g;";
    print $^A;
    $^A = "";
}

By the way, Tye isn't sure that all the cases are as obvious as one might think. In fact, we are pretty sure that some of the behavior demonstrated is undocumented. Enjoy!

(MeowChow) Re: RegEx Challenge
by MeowChow (Vicar) on May 04, 2002 at 02:14 UTC
    I'll see your challenge and raise another: without looking, who can tell me the value of $_ after:
    [/.(?{$_.=$&})/g]
    or how about:
    /.(?{$_.=$&})^/
    Yes, this is a trick question.
       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
Re: RegEx Challenge
by japhy (Canon) on May 03, 2002 at 21:18 UTC
    As a regex fanatic (who aced the test), I'm curious what behaviors you feel aren't documented.

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      The look-behind. It looks at the original scalar rather than the modified one. It makes sense, but it could really go either way.
        Ah yes, that. I've witnessed people getting bitten by that.
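
        A minimal sketch of the look-behind behavior described above. The input 'aaaa' and the variable names are made up for illustration and are not part of the quiz. It contrasts a single s///g pass, where the look-behind sees the original characters, with repeated non-/g substitutions, where each attempt restarts on the already-modified string:

```perl
use strict;
use warnings;

# Hypothetical input chosen for illustration; it is not from the quiz.
my $original = 'aaaa';

# One pass with /g: the look-behind inspects the original string,
# so every 'a' that follows an original 'a' is replaced.
(my $single_pass = $original) =~ s/(?<=a)a/b/g;

# Repeated substitutions without /g: each attempt starts over on the
# modified string, so an 'a' that now follows a 'b' no longer matches.
my $restarted = $original;
1 while $restarted =~ s/(?<=a)a/b/;

print "single s///g pass:   $single_pass\n";   # abbb
print "restarted each time: $restarted\n";     # abab
```

        If the look-behind consulted the modified text during a /g pass, both lines would print the same thing.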
