$1 trap

by chunlou (Curate)
on Aug 10, 2003 at 07:31 UTC (#282572, perlquestion)
chunlou has asked for the wisdom of the Perl Monks concerning the following question:

Consider the following code.

print chg("word", qr/(\d)/); # print "nope"
"2" =~ /(2)/;
print chg("word", qr/(\d)/); # print "yes"

sub chg {
    $_[0] =~ $_[1];
    return $1 ? "yes\n" : "nope\n";
}

"2" =~ /(2)/ sets $1 to 2, which inadvertently affects chg() since chg() checks $1 for valid condition but $1 was set from outside.

If I'm using someone else's code where it might be using $1 for various things that I might not be aware of, is there anything I can do to avoid such a trap? It's kind of hard to track down such things in a large script. Thanks.

Re: $1 trap
by Abigail-II (Bishop) on Aug 10, 2003 at 09:10 UTC
    This is a well-known "trap", and can easily happen without creating such a sub. The problem is your assumption that a failed regular expression would somehow undefine $1. But it doesn't. The proper way of writing your sub is:
    sub chg { $_ [0] =~ $_ [1] && $1 ? "yes\n" : "no\n"; }


      sub chg { $_ [0] =~ $_ [1] && $1 ? "yes\n" : "no\n"; }

      Isn't the && $1 redundant here? Or am I missing something?

      He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

      Chady

        The OP's code was actually returning "yes" if $1 contained a true value. So, the following would've returned "no":

        chg("w0rd", qr/(\d)/); # that's a zero, not capital O

        Black flowers blossom
        Fearless on my breath
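
The difference between "the match succeeded" and "$1 is true" can be sketched as follows. The sub names chg_truthy and chg_matched are hypothetical, introduced only for this illustration; the first mirrors the `&& $1` test, the second tests the match alone.

```perl
use strict;
use warnings;

# Hypothetical variant: requires a successful match AND a true $1.
sub chg_truthy {
    my ($str, $pat) = @_;
    return $str =~ $pat && $1 ? "yes\n" : "no\n";
}

# Hypothetical variant: only asks whether the match succeeded.
sub chg_matched {
    my ($str, $pat) = @_;
    return $str =~ $pat ? "yes\n" : "no\n";
}

print chg_truthy("w0rd", qr/(\d)/);   # the captured "0" is a false value
print chg_matched("w0rd", qr/(\d)/);  # the match itself succeeded
```

Which variant is wanted depends on whether a captured "0" should count as a hit, so the `&& $1` is a behavioural choice rather than dead code.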

Re: $1 trap
by pzbagel (Chaplain) on Aug 10, 2003 at 07:59 UTC

    Well, for one, you should only trust $1 if your regex succeeds. Remember, the $1..$x variables are set after each successful regex with capturing parentheses. However, they are not cleared when a regex doesn't match. You could put the return statements in an if-block. Even better, check out this code. No $1 to worry about.

    print BLAH::chg("word", qr/(\d)/); # print "nope"
    "2" =~ /(2)/;
    print BLAH::chg("word", qr/(\d)/); # print "nope"
    print BLAH::chg("3", qr/(\d)/);    # print "yes"

    package BLAH;

    sub chg {
        return $_[0] =~ $_[1] ? "yes\n" : "nope\n";
    }

    # Alternatively, you can do:
    #
    # if($_[0] =~ $_[1])
    # {
    #     return "yes\n";
    # }
    # return "nope\n";

    __OUTPUT__
    nope
    nope
    yes
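
When the captured text itself is needed, another way to avoid reading $1 is to run the match in list context, where it returns the captures on success and an empty list on failure. A minimal sketch, assuming a hypothetical helper named first_digit that is not part of the thread's code:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first captured digit, or undef if no match.
sub first_digit {
    my ($str) = @_;
    # List-context match: captures on success, empty list on failure,
    # so $digit is undef whenever the pattern does not match.
    my ($digit) = $str =~ /(\d)/;
    return $digit;
}

"2" =~ /(2)/;    # sets $1 in the caller; first_digit never reads it

for my $s ("word", "w0rd", "3") {
    my $d = first_digit($s);
    print defined $d ? "yes ($d)\n" : "nope\n";
}
```

Testing with defined rather than plain truth keeps a captured "0" from being mistaken for a failed match.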


Re: $1 trap
by ihb (Deacon) on Aug 10, 2003 at 17:47 UTC
    As of Perl 5.7.0 this behaviour has changed:
    The regular expression captured submatches ($1, $2, ...) are now more consistently unset if the match fails, instead of leaving false data lying around in them.

      Not to imply that it's actually done consistently though. Just more consistently.

Re: $1 trap
by shotgunefx (Parson) on Aug 10, 2003 at 21:41 UTC
Re: $1 trap
by chunlou (Curate) on Aug 10, 2003 at 18:18 UTC

    Thanks for the replies.

    Are there modules or debugging techniques that can help track down such a logical $1 error? Consider the code below.

    print chg("ctag\n", qr/t(g)/); # print "ctag"
    "g" =~ /(c)/;
    print chg("ctag\n", qr/t(g)/); # print "ctag"
    print chg("ctag\n", qr/t(g)/); # print "ctag"
    "g" =~ /(g)/;
    print chg("ctag\n", qr/t(g)/); # print "ctga"

    sub chg {
        my ($str, $pat) = @_;
        $str =~ /$pat/;
        eval "\$str =~ tr/a$1/$1a/" if $1;
        return $str;
    }

    The code works most of the time, except when something like "g" =~ /(g)/ happens beforehand. I can't force someone else, or even myself, to write bug-free code. Since the logical error occurs only once in a while, and may not be noticeable even when it does, $1 is probably not the first thing that comes to mind when I try to debug it, especially when no warning message points me to it.

    Are there similar issues when using other global variables such as $_?

    Update: Maybe I should rephrase my question as not just how to avoid this $1 mistake in the first place but how to pinpoint one afterwards should an error already be happening.

      Not to sound facetious, but several other posters and I basically answered this question (which is almost identical to your original post) by saying: "Don't do that!" I know it's Perl and TIMTOWTDI, but sometimes the way you are doing it is wrong. Yet again, the correct way to write the sub above is:

      sub chg {
          my ($str, $pat) = @_;
          eval "\$str =~ tr/a$1/$1a/" if $str =~ /$pat/;
          return $str;
      }
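
As a quick check of the guarded version, the sketch below reruns the problematic sequence from the question. The second input string, "ctga\n", is an assumption added here to exercise the success path; it does not appear in the original post.

```perl
use strict;
use warnings;

sub chg {
    my ($str, $pat) = @_;
    # The condition runs first, so $1 is interpolated into the eval string
    # only after this particular match has succeeded.
    eval "\$str =~ tr/a$1/$1a/" if $str =~ /$pat/;
    return $str;
}

"g" =~ /(g)/;                     # leaves a stale "g" in $1
print chg("ctag\n", qr/t(g)/);    # "ctag" has no "tg", so nothing is swapped
print chg("ctga\n", qr/t(g)/);    # "tg" matches, so "a" and "g" are swapped
```

The stale capture no longer matters because the eval is skipped whenever the sub's own match fails.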

      As for debugging other people's code that is written like this, I suppose firing up the perl debugger and stepping through the program may shed some light, although something as obscure as your example, which fails only on some singularly weird input, may be very difficult to figure out. However, if you just shuffle around your house hitting yourself in the head with your Camel book and repeating the mantra, "Never check $1 blindly, make certain my regex succeeded," you will, after much meditation and contemplation, literally "beat" it into your head, so that whenever you see $1 in someone's code, you will instantly make sure it is properly initialized before it is used.
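
For code that cannot be rewritten, one containment measure relies on the documented rule that capture variables are dynamically scoped to the enclosing block: once the block ends, $1 reverts to whatever it held before. The sketch below keeps the original, unguarded chg from the question and only wraps the incidental match in a bare block; treat it as an illustration of the scoping rule, not as a substitute for checking the match result.

```perl
use strict;
use warnings;

# The unguarded sub from the original question, unchanged.
sub chg {
    $_[0] =~ $_[1];
    return $1 ? "yes\n" : "nope\n";
}

{
    "2" =~ /(2)/;                  # $1 is "2" inside this block
    print chg("word", qr/(\d)/);   # the sub still sees the stale "2"
}
print chg("word", qr/(\d)/);       # block has ended, so the stale "2" is gone
```

Here the second call is made outside any scope containing a successful match, so $1 is undefined when chg tests it.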

