http://www.perlmonks.org?node_id=902039

Here's a link to a recent blog post of mine. Nothing novel, but I think it usefully collects and distills what can be found by Googling.

http://dencklatronic.blogspot.com/2011/04/underscore-while-and-angle-brackets-in.html

In reading the guidelines for posting here, I didn't see anything discouraging links to external content like this. If it is inappropriate, please let me know whether I should not post like this at all, or whether posting is okay but I should re-post the content here rather than link to it.

Update: Based on feedback received, I have included the content below. It is somewhat "PerlMonkified."

This post discusses some pitfalls of the Perl construct "while (<>)". We'll refer to it as WAB (While Angle Bracket).

WAB sets $_ but does not localize *_ (the underscore glob). This can cause undesired interactions with other constructs that set $_. These constructs include "for ()", "foreach ()", "map", and "grep".
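To make the implicit assignment concrete, here is a minimal sketch. It assumes an in-memory filehandle (the names $text and $fh are mine, chosen for illustration) in place of <> so that it runs without any piped input; as far as I know, the implicit assignment to $_ works the same way for both.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for standard input: an in-memory file with two lines.
my $text = "first\nsecond\n";
open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";

$_ = 'before';

# "while ( <$fh> )" behaves like "while ( defined( $_ = <$fh> ) )":
# each read is assigned to the global $_, with no localization.
my @seen;
while ( <$fh> ) { push @seen, $_ }

# The read that hit end-of-file left $_ undefined, so 'before' is gone.
my $state = defined $_ ? "still defined" : "undef";
print "lines read: ", scalar(@seen), "; \$_ is now $state\n";
```

This should print "lines read: 2; $_ is now undef".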

In general, if a WAB is dynamically enclosed by one of these other constructs, it will try to stomp on the enclosing $_. If $_ is not a constant, it will succeed in stomping on it. If $_ is a constant, recent Perls will die with "Modification of a read-only value attempted".

The program below, perl-underscore.pl, shows this. Its WAB stomps on the $_ set by the enclosing "for ()". What's more, since $_ is just an alias to the members of the list given to "for ()", the WAB stomps on the list, too!

#!/usr/bin/perl

use warnings;
use strict;

use Data::Dumper;

sub f { while ( <> ) {} }

my $a = 1;

for ( $a )
{
    f();
    print Dumper( $_, $a );
}

The command "true | ./perl-underscore.pl" gives the following output.

$VAR1 = undef;
$VAR2 = undef;

The effect is more dramatic if the list given to "for ()" contains constants. If we modify perl-underscore.pl with the following patch and run it under a recent Perl, it dies with "Modification of a read-only value attempted".

12c12
< for ( $a )
---
> for ( 1, $a )
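For anyone who wants to see the error without editing the script, here is a rough self-contained sketch. It assumes an in-memory filehandle instead of <> and wraps the call in eval so the exception can be inspected; $text and $error are names I made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for standard input, so the sketch needs no piped data.
my $text = "a line\n";

sub f
{
    open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
    while ( <$fh> ) {}
}

my $error = '';
for ( 1 )
{
    # $_ is aliased to the literal 1 here, so the assignment to $_ inside
    # f() dies. The eval traps the exception so that we can look at it.
    eval { f(); 1 } or $error = $@;
}

print "caught: $error";
```

The caught message should begin with "Modification of a read-only value attempted".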

There are various ways to avoid WAB's behavior. One way is to explicitly localize *_. For example, we could modify perl-underscore.pl with the following patch.

8c8
< sub f { while ( <> ) {} }
---
> sub f { local *_; while ( <> ) {} }

The output would then be as follows.

$VAR1 = 1;
$VAR2 = 1;
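The same fix as a self-contained sketch, again assuming an in-memory filehandle in place of <>. The list deliberately mixes a constant and a string to show that neither is disturbed; @results is just an illustrative name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for standard input.
my $text = "a line\n";

sub f
{
    local *_;    # give this call its own underscore glob
    open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
    while ( <$fh> ) {}    # assigns to the localized $_, not the caller's
}

my @results;
for ( 1, 'two' )
{
    f();
    push @results, $_;    # the caller's $_ is intact after f() returns
}

print "@results\n";
```

This should print "1 two", with no read-only error even though both list elements are constants.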

We can also just stop using WAB. For example, we could modify perl-underscore.pl with the following patch.

8c8
< sub f { while ( <> ) {} }
---
> sub f { while ( my $f = <> ) {} }
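A sketch of the lexical-variable form in a runnable setting, assuming an in-memory filehandle in place of <>. The sub name count_lines and the variable names are hypothetical. Perl applies the same implicit defined() test to this form, so a line containing just "0" does not end the loop early.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for standard input.
my $text = "first\nsecond\n";

sub count_lines
{
    open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
    my $count = 0;

    # Treated as "while ( defined( my $line = <$fh> ) )"; $_ is never touched.
    while ( my $line = <$fh> ) { $count++ }

    return $count;
}

my @pairs;
for ( 'x', 'y' )
{
    my $n = count_lines();
    push @pairs, "$_=$n";    # $_ still holds the loop element
}

print "@pairs\n";
```

This should print "x=2 y=2".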

That concludes the main body of this post. Some additional notes appear below, for the more curious.

Some additional notes

Though WAB's behavior is often undesirable, it is far from undocumented. See, for example, the "I/O Operators" section of perlop in the official Perl documentation.

Constructs other than WAB that set $_ work fine together because they localize *_. These constructs include "for ()", "foreach ()", "map", and "grep".

It is not sufficient to localize $_, i.e. to do "local $_". We need to localize the entire glob for underscore, i.e. we need to do "local *_". This is needed in case $_ is currently aliased to a magic constant like $1. In such a case, doing "local $_" gets fresh storage for $_ but still leaves it as a constant, i.e. read-only.

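In Perl 5.10 through 5.22, you can use "my $_" to achieve an effect similar to "local *_". The effect is only similar, not identical, because this makes the scope of $_ lexical rather than dynamic. Note that lexical $_ was marked experimental in Perl 5.18 and removed in Perl 5.24, so this option is not available in current Perls.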

We could continue to use WAB but still avoid undesired interactions if we stopped using the other constructs that set $_, or started using them in a WAB-defensive way. This feels a little like "blaming the victim," but who said programming was fair?

Where "my $_" is available, a WAB-defensive way to use any of these constructs is to precede them with "my $_". The dynamically-enclosed code may have to be changed because this makes the scope of $_ lexical rather than dynamic.

If "my $_" is unavailable or undesirable, we can use alternate forms of "for" and "foreach" that do not set $_. For example, we could modify perl-underscore.pl with the following patch.

12c12
< for ( $a )
---
> for my $i ( $a )
15c15
<     print Dumper( $_, $a );
---
>     print Dumper( $i, $a );
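Here is the named-loop-variable approach as a self-contained sketch, assuming an in-memory filehandle in place of <>. Here f is left unmodified, WAB and all; $x, $y, and @seen are illustrative names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for standard input.
my $text = "a line\n";

# Unmodified WAB: this still clobbers the global $_ on every call.
sub f
{
    open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
    while ( <$fh> ) {}
}

my ( $x, $y ) = ( 1, 2 );

my @seen;
for my $i ( $x, $y )
{
    f();               # stomps on the global $_, which this loop does not use
    push @seen, $i;    # $i is a lexical alias that f() cannot reach
}

print "@seen / $x $y\n";
```

This should print "1 2 / 1 2": both the loop variable and the underlying list survive.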

Unlike "for ()" and "foreach ()", "map" and "grep" don't have alternate forms that would allow us to avoid setting $_. But, if "my $_" is unavailable or undesirable, we can use "map" and "grep" in the following WAB-defensive way.

  1. Capture $_ in a "my" variable before any WAB has a chance to change it and then use that "my" variable instead of $_.
  2. Copy the list to be operated on to a temporary, so that WAB's attempts to stomp (a) succeed rather than die on a read-only value and (b) are invisible to the original list.

E.g. imagine "f" might use WAB in the code below.

map { f(); g( $_ ) } @a;

We might WAB-defend this code as follows.

{ map { my $x = $_; f(); g( $x ) } ( my @tmp = @a ) }

This is cumbersome, but might be the best choice if it is costly to change "f".
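To try the defended form end to end, here is a rough sketch. The bodies of f and g are my own placeholders (f reads an in-memory filehandle with WAB, g doubles its argument); only the shape of the map call comes from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for standard input.
my $text = "a line\n";

# Placeholder f: uses WAB and therefore clobbers whatever $_ is aliased to.
sub f
{
    open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
    while ( <$fh> ) {}
}

# Placeholder g: any function of the element would do.
sub g { return $_[0] * 2 }

my @a = ( 1, 2, 3 );

# $x captures the element before f() runs; @tmp absorbs the clobbering.
my @doubled = map { my $x = $_; f(); g( $x ) } ( my @tmp = @a );

print "doubled: @doubled\n";
print "a:       @a\n";
```

This should report 2 4 6 for the doubled list and leave @a as 1 2 3. Mapping over @a directly would leave its elements undefined after the call.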

I conclude this post by listing some relevant links below.