Localized Backreferences, If Statements & Blocks

by doran (Deacon)
on Sep 15, 2000 at 03:50 UTC
doran has asked for the wisdom of the Perl Monks concerning the following question:

I had (apparently mistakenly) thought that a backreference to a parenthesized substring in a regexp would be undefined or at least equal to '' if it didn't match, so that in:
    #!/usr/bin/perl -Tw
    use strict;
    $|++;

    my $valid   = 'realgoodname';
    my $invalid = 'bad name';
    my ($test1, $test2);

    if ($valid =~ /^([a-z]+)$/i) {
        $test1 = $1;
    }

    {
        $invalid =~ /^([a-z]+)$/i;
        $test2 = $1;
        die "Invalid test2" unless $test2;
    }

    print "Valid word\t= $valid\nInvalid word\t= $invalid\n\n";
    print "Valid test\t= $test1\nInvalid test\t= $test2\n";
    exit();

$test2 would always be left without a value (causing the script to die in this case). I certainly thought this given that the $test1=$1; line was in a block after that if statement. Plus I created another block by throwing that pair of braces around the taint-checking/untainting of $invalid and $test2. This feeling was reinforced when I checked the Camel book, which sez:

"The variables $1, $2, $3, ... are automatically localized, and their scope...extends to the end of the enclosing block or eval string, or to the next successful pattern match, whichever comes first."

Because of this I felt relatively safe in throwing a pair of braces around these tests, thinking that the backreference would never escape my enclosing block.

However, it seems that the block after the if statement isn't enough. In the above example, $test2 is left with 'realgoodname' as a value. If I throw another block around the if statement, it seems to fix it.

I guess my question is: How is it that $1 (in this case) can still be 'seen' outside of its enclosing block? True, the second pattern match wasn't successful, but I sure thought that closing brace after the if statement would have prevented the previous value of $1 from carrying over to the next use of $1.

If this is clearly documented someplace, feel free to point me in the correct direction. -Thanks

Replies are listed 'Best First'.
(Ovid) Re: Localized Backreferences
by Ovid (Cardinal) on Sep 15, 2000 at 04:20 UTC
    Hmm, you get the results you expect if you place an extra pair of braces around the if construct.
        #!/usr/bin/perl -w
        use strict;
        $|++;

        my $valid   = 'realgoodname';
        my $invalid = 'bad name';
        my ($test1, $test2);

        {
            if ($valid =~ /^([a-z]+)$/i) {
                $test1 = $1;
            }
        }

        {
            $invalid =~ /^([a-z]+)$/i;
            $test2 = $1;
            die "Invalid test2" unless $test2;
        }

        print "Valid word\t= $valid\nInvalid word\t= $invalid\n\n";
        print "Valid test\t= $test1\nInvalid test\t= $test2\n";
        exit();
    Apparently, stuff inside of the braces on an if statement is not localized. Therefore, the behaviour of $1 does not fit any of the conditions you specified from the Camel. I learn something new about Perl every day :)


    Update: Duh! The regex wasn't in the block. It's amazing how simple these things often turn out to be. Thanks, chromatic.
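
    One way to sidestep the stale $1 entirely, sketched here as an illustration rather than as anything posted in this thread, is to take the capture from the match's return value in list context. A failed match returns an empty list, so the variable ends up undefined instead of inheriting an earlier capture. The helper name capture_word is hypothetical.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical helper: returns the captured word, or undef when the
    # pattern does not match. The capture comes from the match's list-context
    # return value, so a stale $1 from an earlier match is never read.
    sub capture_word {
        my ($input) = @_;
        my ($word) = $input =~ /^([a-z]+)$/i;
        return $word;
    }

    my $test1 = capture_word('realgoodname');
    my $test2 = capture_word('bad name');

    print "test1 = $test1\n";
    print "test2 is ", (defined $test2 ? "'$test2'" : 'undef'), "\n";
    ```

    With this shape, a check like die "Invalid test2" unless defined $test2; fires for 'bad name' regardless of which blocks the matches sit in.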


      It's not the scope of the if block that matters here, it's the scope of the regex. $1 is localized to the same scope. When you put the regex in that block, you enforced a more specific scope, and the behavior is more what you expect.

      Apparently, the interpreter doesn't treat a regex any differently if it's in an if statement or not -- consistency is good.

      In the original post, the regex was in package scope (not in an enclosing block). That means the end of the scope is the end of the file -- or the next successful backreference match which will assign something else to $1. Since the next regex doesn't match, and $1 is used before the end of the file, it still contains the last successful capture.
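
      The two effects described above can be seen side by side in a small sketch. The variable names and the @seen array are made up for illustration: a failed match leaves $1 alone, and a successful match inside a bare block changes $1 only until that block ends.

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;

      my @seen;                      # illustrative log of what $1 held at each step
      my $valid   = 'realgoodname';
      my $invalid = 'bad name';
      my $inner   = 'inner';

      # Successful match at file scope: $1 is set for the rest of the file
      # (or until the next successful match in this scope).
      $valid =~ /^([a-z]+)$/i;

      # Failed match: $1 is not reset, so it still holds the earlier capture.
      $invalid =~ /^([a-z]+)$/i;
      push @seen, $1;

      {
          # Successful match inside a bare block: the new $1 lives only here.
          $inner =~ /^([a-z]+)$/i;
          push @seen, $1;
      }

      # After the block, $1 is back to the capture from the enclosing scope.
      push @seen, $1;

      print join(', ', @seen), "\n";   # prints "realgoodname, inner, realgoodname"
      ```

      This is why wrapping the if statement (condition included) in its own braces changes the outcome in the original script: the successful match then happens inside a block, and its $1 does not survive past the closing brace.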

        Ahhhhhhhhh, I see the light.
        Thank you Brother...
      Here is another new thing then. To quote Dan Sugalski about $1 etc in response to my claiming once that they were local:
      No they aren't. They're mutant pseudo-scalars attached to the underlying optree, and you may well get a $1 from the last time you matched that regex. (I've got a trivial program that shows the problem with a recursive call)
      (His "trivial program" causes threaded Perl to core-dump.)
Re: Localized Backreferences
by Carl-Joseph (Scribe) on Sep 16, 2000 at 02:03 UTC
