in reply to Matching against $_ behaves differently than matching against a named scalar?

Other monks have come close but have not said this specifically; choroba illustrated but did not explain why using a lexical instead of $_ produces a different result.

In Perl, the regex capture variables ($1, $2, etc.) are implicitly local to every containing block, but within a block they retain their values until the next successful match replaces them. Introducing a lexical in the loop header implicitly introduces another block scope, so the regex capture variables are implicitly reset on every loop iteration. (Strictly, each loop iteration has its own set of regex capture variables.) Your second example is also subtly different because you forgot to test defined(my $line = <$fh>), so a line that evaluates to a false value will cause that loop to terminate early. (But see the edit note at the end of this post.)
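The block scoping described above can be seen in a small sketch. The strings and the @seen array are made up for illustration; this is not the original poster's code, only a minimal demonstration of captures being restored when an inner block ends.

```perl
use strict;
use warnings;

my @seen;

"alpha beta" =~ /^(\w+)/;        # successful match in the outer scope
push @seen, $1;                  # "alpha"

{
    "gamma delta" =~ /^(\w+)/;   # successful match inside a nested block
    push @seen, $1;              # "gamma", but only until the block ends
}

push @seen, $1;                  # the outer scope's capture is visible again

print join(",", @seen), "\n";    # prints "alpha,gamma,alpha"
```

The inner match does not overwrite the outer $1; it gets its own copy that goes away when the block is left.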

The regex match itself returns a boolean value indicating success in Perl, and standard practice is to test that return value to determine if the regex matched, rather than relying on the truth of the capture variables.

Here's a slightly different example to illustrate:

open(my $fh, '<', 'text.txt') or die "Cannot open text.txt: $!";
while (<$fh>) {
    # Use $1 and $2 only when the match itself succeeded.
    print "$1 $2\n" if /^([^ ]+) ([^ ]+)/;
}
close($fh);

The exact rules for the regex capture variables are prickly, with lots of sharp edges, so good practice is to treat the capture variables as valid only from a successful match until the next match is attempted, and as having unspecified values at all other times.
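One way to follow that practice is to copy the captures into lexicals in the same statement as the match, so nothing later can see a stale $1. This is only a sketch; the sample lines and the names $first, $second, and @pairs are assumptions for illustration.

```perl
use strict;
use warnings;

my @pairs;
for my $line ("foo bar", "nospace", "baz qux") {
    # In list context the match returns its captures, or an empty list on
    # failure, so the assignment is true only when the match succeeded.
    if (my ($first, $second) = $line =~ /^([^ ]+) ([^ ]+)/) {
        push @pairs, "$first $second";
    }
}

print "$_\n" for @pairs;   # prints "foo bar" then "baz qux"
```

The line without a space fails to match and is skipped, rather than reusing the captures from the previous iteration.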

Edited by jcb: As davido pointed out, the defined test is implicit when an I/O operator is used in a loop test.
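The implicit defined test can be checked with a short sketch that reads from an in-memory filehandle. The sample data is made up: its last line is "0" with no trailing newline, which is false as a plain value but still defined.

```perl
use strict;
use warnings;

# In-memory "file" whose final line is "0" with no trailing newline.
my $data = "first\n0";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @lines;
while (my $line = <$fh>) {   # Perl treats this as while (defined(my $line = <$fh>))
    chomp $line;
    push @lines, $line;
}
close($fh);

print scalar(@lines), " lines read\n";   # prints "2 lines read"
```

If the loop tested plain truth instead of definedness, the final "0" line would end it early and only one line would be read.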

Re^2: Matching against $_ behaves differently than matching against a named scalar?
by davido (Cardinal) on Apr 21, 2020 at 03:08 UTC

    I want to clarify something based on documentation from perlop:

    while (my $line = <STDIN>) { print $line }

    In these loop constructs, the assigned value (whether assignment is automatic or explicit) is then tested to see whether it is defined. The defined test avoids problems where the line has a string value that would be treated as false by Perl; for example a "" or a "0" with no trailing newline.

    So in this case the defined test doesn't need to be done explicitly, it's already being done implicitly.


      You are correct. I had forgotten about that particular bit of DWIM, since I do not rely on it in my own code.