PerlMonks  

Scope of regular expression variables

by FauxPasIII (Initiate)
on Aug 29, 2000 at 22:46 UTC (#30185=perlquestion)

FauxPasIII has asked for the wisdom of the Perl Monks concerning the following question:

What is the scope of the back-referencing variables in
regular expressions? Here's a code sample I was just
working on:

1>  if ($_ =~ /ORG\:(.+)/) {
2>      if ($1 =~ /^(.+)\;(.+)\;(.+)$/) {
3>          $card{'org'}=$1; $card{'div'}=$2; $card{'dept'}=$3;
4>      } elsif ($1 =~ /^\;(.+)\;(.+)$/) {

After writing this, it occurred to me that the $1 from the
match on line 1 ought to be stomped by all the subsequent
matches, and yet it is not... so exactly what is the scope
of $1..$9, and why, exactly, does this code work? ;-)

Thanks!

Replies are listed 'Best First'.
Re: Scope of regular expression variables
by Shendal (Hermit) on Aug 29, 2000 at 22:54 UTC
    According to the camel book (page 65 in 2nd edition for those of you keeping score at home),
    The variables $1, $2, $3 ... are automatically localized, and their scope extends to the end of the enclosing block or eval string, or to the next successful pattern match, whichever comes first.
    So, your $1 is "stomped" within the local scope of the enclosing braces. To elaborate:
    if ($_ =~ /ORG\:(.+)/) {
        # $1 in scope A
        if ($1 =~ /^(.+)\;(.+)\;(.+)$/) {
            # $1 now in scope B
            $card{'org'}=$1; $card{'div'}=$2; $card{'dept'}=$3;
        }
        # $1 now in scope A again
        elsif ($1 =~ /^\;(.+)\;(.+)$/) {
            # $1 now in scope C
    Update: See merlyn's clarification below. Seems either (a) I didn't understand the book correctly, or (b) he (or Larry or Tom) should've written it clearer. ;-)
    Hope that helps,
    Shendal
      Uh, not quite. It's because it's a failed match. Apparently, the block that localizes the match variables doesn't begin until after the open curlies (and does not include the boolean expression of the if). Check this:
      $_ = "abcde";
      if (/(.)(.)/) { # first two
          if (/(.)(.)$/) { # last two, by your theory won't upset
              print "inner: $1 $2\n";
          }
          # by your theory, should see first two again
          print "outer: $1 $2\n";
      }
      for which I get
      inner: d e
      outer: d e
      But if you add a block:
      $_ = "abcde";
      if (/(.)(.)/) { # first two
          { # ADDED
              if (/(.)(.)$/) { # last two, by your theory won't upset
                  print "inner: $1 $2\n";
              }
          } # ADDED
          # by your theory, should see first two again
          print "outer: $1 $2\n";
      }
      we do in fact get the expected:
      inner: d e
      outer: a b
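
If the block-scoping subtleties above are more than you want to depend on, one common way around them is to copy the captures into lexical variables at once by running the match in list context. The following is only a minimal sketch of that idiom; the variable names ($str, $first, $second, $fourth, $fifth) are made up for illustration and do not come from the original post.

```perl
use strict;
use warnings;

my $str = "abcde";

# In list context a successful match returns its captures,
# so they can be stored without ever reading $1 or $2.
my ($first,  $second) = $str =~ /^(.)(.)/;   # first two characters
my ($fourth, $fifth)  = $str =~ /(.)(.)$/;   # last two characters

# The second match cannot disturb the copies made from the first.
print "first two: $first $second\n";
print "last two: $fourth $fifth\n";
```

Because the values live in ordinary lexicals, their lifetime follows the usual my() rules rather than the rules for the match variables.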

      -- Randal L. Schwartz, Perl hacker

        It works that way for manual localization in the conditional as well. That's fairly counterintuitive considering that my() does the opposite.
        ($one, $two) = (1,2);
        {
            if (local ($one, $two) = qw(abc def)) {
                print "local inner : $one $two\n";
            }
            print "local outer : $one $two\n";
        }
        print "local outside: $one $two\n";

        my ($a, $b) = (1,2);
        {
            if (my ($a, $b) = qw(abc def)) {
                print "my inner : $a $b\n";
            }
            print "my outer : $a $b\n";
        }
        print "my outside: $a $b\n";
        __END__
        local inner : abc def
        local outer : abc def
        local outside: 1 2
        my inner : abc def
        my outer : 1 2
        my outside: 1 2

        The behavior of the regex vars is similar to what is obtained by using "local" inside the if() conditional.

        I've modified Merlyn's code to show that the value of regex vars is sustained through a function call.

        sub PrintRegExVar {
            print "Inside called sub: $1 $2\n";
        }

        $_ = "abcde";
        if (/(.)(.)/) { # first two
            if (/(.)(.)$/) { # last two
                # This prints "d e"
                print "inner: $1 $2\n";
                &PrintRegExVar;
            }
            # This also prints "d e"
            print "outer: $1 $2\n";
            &PrintRegExVar;
        }

        $_ = "abcde";
        if (/(.)(.)/) { # first two
            { # ADDED EXTRA BLOCK
                if (/(.)(.)$/) { # last two
                    # This prints "d e"
                    print "inner: $1 $2\n";
                    &PrintRegExVar;
                }
            } # ADDED EXTRA BLOCK
            # This prints "a b"
            print "outer: $1 $2\n";
            &PrintRegExVar;
        }
        __END__
        inner: d e
        Inside called sub: d e
        outer: d e
        Inside called sub: d e
        inner: d e
        Inside called sub: d e
        outer: a b
        Inside called sub: a b
        I'm not sure whether I disagree or not. To wit:
        $_ = "abcde";
        if (/(.)(.)/) { # first two
            if (/(.)(.)$/) {
                print "inner: $1 $2\n";
            }
            print "outer: $1 $2\n";
        }
        print "outer: $1 $2\n";
        $1 and $2 aren't localized within the if block -- unless the regex is also within the same enclosing block.

        (At least, that makes it more explicit.)

RE: Scope of regular expression variables
by nuance (Hermit) on Aug 29, 2000 at 22:58 UTC

    The behaviour you are seeing is caused by the fact that $1 and friends are only overwritten by a successful match or substitution that has backreferences.

    So your code says: check this expression against $1; if it matches, then and only then will $1, $2 & $3 be given new values. If it didn't match, nothing has overwritten $1, so your elsif clause is checking the same value of $1.
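
To see the "failed matches leave $1 alone" point in isolation, here is a small sketch. The sample string and the @log array are assumptions invented for this illustration, not part of the original question.

```perl
use strict;
use warnings;

my $field = "Acme;Sales;East";   # hypothetical sample value
my @log;

if ($field =~ /^(\w+);/) {
    push @log, "after success: $1";   # $1 was set by the match above

    # This pattern cannot match, because $1 contains no digits.
    if ($1 =~ /^(\d+)$/) {
        push @log, "unexpected: $1";
    }

    # The failed match did not touch $1.
    push @log, "after failure: $1";
}

print "$_\n" for @log;
```

Both log lines report the same value, which is why the elsif in the original code still sees the $1 from line 1.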

    Nuance
