
How do closures and variable scope (my,our,local) interact in perl?

by ELISHEVA (Prior)
on Jun 16, 2009 at 11:37 UTC ( #771986=perlquestion )
ELISHEVA has asked for the wisdom of the Perl Monks concerning the following question:

I'm really puzzled. Recently when reading through the perlref article, I came across a little example of closures. The example first confused me, and then made more sense, and then confused me again. Here is how.

At first, I was confused because, according to my thinking, each of the generated functions should print out the same value: <FONT COLOR='violet'>....</FONT>. After all, 'violet' is the last known value of $name. Here is the example:

    for my $name (@colors) {
        no strict 'refs';    # allow symbol table manipulation
        *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

Then it occurred to me that I hadn't looked carefully enough: $name was local to the loop (it is declared for my $name ....) so maybe the value set by one loop iteration didn't affect the value in the next. Problem solved?

Well, no. To test this idea out I decided to experiment a bit by moving the declaration outside of the for loop. Just for kicks I did this using both my and our to declare it. This resulted in some odd behavior that I'm having trouble understanding. When our is used to declare $name, the closure sometimes complains of an undefined variable. When my is used to declare $name, perl sometimes refuses to change the value of the variable. Is this normal? Is it a bug? I would be grateful to any fellow monks who can help me form a mental model to explain this behavior. I don't feel I understand this issue well enough to make a judgment either way.

I've put together an annotated script with some examples:

    use strict;
    use warnings;

    # declaration of $name
    # pick one and comment out the other
    my $name;
    #our $name;

    # why is red printing out red rather than violet since
    # $name is always changing? Shouldn't it be picking up the
    # most recent value of $name, i.e. 'violet'?

    my @colors = qw(red blue green yellow orange purple violet);
    for $name (@colors) {
        no strict 'refs';    # allow symbol table manipulation
        *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

    # And another oddity: when we declare <c>our $name</c> *both*
    # red and violet complain about uninitialized values even
    # though we have clearly set <c>$name</c> inside the for loop.
    # No such problems when we declare <c>$name</c> using <c>my</c>.

    print "----- \$name is constant inside sub ------\n";
    print "red: ${\(red())}\n";
    print "violet: ${\(violet())}\n";

    # is it because our lack of use of $name within the generated
    # sub caused it to be compiled as a constant?
    # no: here we use $name, but it *still* prints out the value
    # at sub definition time, rather than the current value.

    for $name (@colors) {
        no strict 'refs';          # allow symbol table manipulation
        no warnings 'redefine';    #ignore noise about redefinitions
        *$name = sub {
            my $sOutput = "<FONT COLOR='$name'>@_</FONT>";
            $name = 'something wacky and wonderful';    #change the value
            return $sOutput;
        };
    }

    # when we declare <c>our $name</c> red, but not violet
    # complains about uninitialized values. There are no
    # complaints when we declare <c>my $name</c>. Hmmm...

    print "----- \$name given new value inside sub ------\n";
    print "red: ${\(red())}\n";
    print "violet: ${\(violet())}\n";

    # so maybe the closure picks up the value at the time of
    # definition and always uses that as the initial value even
    # if the value changes after subdefinition?
    #
    # but if so, why do red and violet use their own last value
    # setting of the value of name? Didn't we just redefine the
    # subroutines? shouldn't they be using the value at the time
    # of definition?

    print "----- sub used inside loop that defined it ------\n";
    for $name (@colors) {
        no strict 'refs';          # allow symbol table manipulation
        no warnings 'redefine';    #ignore noise about redefinitions

        # why isn't $name getting set to the array element value
        # when the array element is red or violet?
        # Note: it gets set properly when we declare
        # <c>our $name</c> but not when we declare <c>my $name</c>

        print "trying out $name: ";
        *$name = sub {
            my $sOutput = "<FONT COLOR='$name'>@_</FONT>";
            $name = 'something wacky and wonderful';    #change the value
            return $sOutput;
        };
        print $name->() . "\n";
    }

    print "----- sub used outside of loop that assigned it ------\n";
    print "red: ${\(red())}\n";
    print "green: ${\(green())}\n";
    print "violet: ${\(violet())}\n";

Here is the output generated when $name is declared using my $name.

    ----- $name is constant inside sub ------
    red: <FONT COLOR='red'></FONT>
    violet: <FONT COLOR='violet'></FONT>
    ----- $name given new value inside sub ------
    red: <FONT COLOR='red'></FONT>
    violet: <FONT COLOR='violet'></FONT>
    ----- sub used inside loop that defined it ------
    trying out something wacky and wonderful: <FONT COLOR='something wacky and wonderful'></FONT>
    trying out blue: <FONT COLOR='blue'></FONT>
    trying out green: <FONT COLOR='green'></FONT>
    trying out yellow: <FONT COLOR='yellow'></FONT>
    trying out orange: <FONT COLOR='orange'></FONT>
    trying out purple: <FONT COLOR='purple'></FONT>
    trying out something wacky and wonderful: <FONT COLOR='something wacky and wonderful'></FONT>
    ----- sub used outside of loop that assigned it ------
    red: <FONT COLOR='something wacky and wonderful'></FONT>
    green: <FONT COLOR='something wacky and wonderful'></FONT>
    violet: <FONT COLOR='something wacky and wonderful'></FONT>

And here is the same script run when my $name is commented out and replaced by our $name.

    ----- $name is constant inside sub ------
    Use of uninitialized value in concatenation (.) or string at ...
    red: <FONT COLOR=''></FONT>
    Use of uninitialized value in concatenation (.) or string at ...
    violet: <FONT COLOR=''></FONT>
    ----- $name given new value inside sub ------
    Use of uninitialized value in concatenation (.) or string at ...
    red: <FONT COLOR=''></FONT>
    violet: <FONT COLOR='something wacky and wonderful'></FONT>
    ----- sub used inside loop that defined it ------
    trying out red: <FONT COLOR='red'></FONT>
    trying out blue: <FONT COLOR='blue'></FONT>
    trying out green: <FONT COLOR='green'></FONT>
    trying out yellow: <FONT COLOR='yellow'></FONT>
    trying out orange: <FONT COLOR='orange'></FONT>
    trying out purple: <FONT COLOR='purple'></FONT>
    trying out violet: <FONT COLOR='violet'></FONT>
    ----- sub used outside of loop that assigned it ------
    red: <FONT COLOR='something wacky and wonderful'></FONT>
    green: <FONT COLOR='something wacky and wonderful'></FONT>
    violet: <FONT COLOR='something wacky and wonderful'></FONT>

Many thanks in advance, beth

Re: How do closures and variable scope (my,our,local) interact in perl?
by duelafn (Priest) on Jun 16, 2009 at 11:56 UTC

    The "our" case makes sense to me if the for loop is localizing the variable inside the loop. Thus when you call the subs outside the loop $name is undefined until you set it to something different (outside the loop). Here is the output if you comment out the second set of tests:

        ----- $name is constant inside sub ------
        Use of uninitialized value $name in concatenation (.) or string at /tmp/test.pl line 17.
        red: <FONT COLOR=''></FONT>
        Use of uninitialized value $name in concatenation (.) or string at /tmp/test.pl line 17.
        violet: <FONT COLOR=''></FONT>
        ----- sub used inside loop that defined it ------
        trying out red: <FONT COLOR='red'></FONT>
        trying out blue: <FONT COLOR='blue'></FONT>
        trying out green: <FONT COLOR='green'></FONT>
        trying out yellow: <FONT COLOR='yellow'></FONT>
        trying out orange: <FONT COLOR='orange'></FONT>
        trying out purple: <FONT COLOR='purple'></FONT>
        trying out violet: <FONT COLOR='violet'></FONT>
        ----- sub used outside of loop that assigned it ------
        Use of uninitialized value $name in concatenation (.) or string at /tmp/test.pl line 52.
        red: <FONT COLOR=''></FONT>
        green: <FONT COLOR='something wacky and wonderful'></FONT>
        violet: <FONT COLOR='something wacky and wonderful'></FONT>

    I'm not so sure what is happening in the my case.

    Good Day,
        Dean

      Thanks. In a private message Corion pointed me to perlsyn which has this to say about foreach loops and localization:
      If the variable is preceded with the keyword my, then it is lexically scoped, and is therefore visible only within the loop. Otherwise, the variable is implicitly local to the loop and regains its former value upon exiting the loop. If the variable was previously declared with my, it uses that variable instead of the global one, but it's still localized to the loop. This implicit localisation occurs only in a foreach loop.

      As you have observed, apparently even without the use of my, the variable used for iteration in a foreach loop is implicitly localized. That is why it has a value inside the loop but not outside of it.
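The "regains its former value upon exiting the loop" part of that perlsyn passage can be checked with a small sketch. The variable names ($pkg, $lex, @seen) are made up for illustration and are not from the original script:

```perl
use strict;
use warnings;

our $pkg = 'outer';    # package variable
my  $lex = 'outer';    # lexical declared before the loop
my @seen;

# Neither loop uses "my" in the loop header, so both loop
# variables are implicitly local to their loop.
for $pkg (qw(a b)) { push @seen, "pkg=$pkg" }
for $lex (qw(a b)) { push @seen, "lex=$lex" }

print "$_\n" for @seen;
print "after: pkg=$pkg lex=$lex\n";    # after: pkg=outer lex=outer
```

Inside each loop the variable takes the list values; once the loop exits, both the package variable and the pre-declared lexical are back to 'outer'.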

      However, this is supposed to happen for both my and our variables, and it appears to be happening only for the our variables. Hmmm.

      Best, beth

Re: How do closures and variable scope (my,our,local) interact in perl?
by shmem (Canon) on Jun 16, 2009 at 12:23 UTC

    Declaring a variable with our effectively creates a package variable (i.e. a typeglob entry) and bequeaths a lexical scope upon the short name symbol. A for() loop aliases its loop variable, so that our $name loop-variable behaves like a local. That's why declaring $name with our results in $name being undefined after the for() loop run, hence the generated subroutines output an empty $name.

    Why does the my-variable version work as expected? - Because the for() loop aliases the my variable as - a my variable. Now, for those the compiler inserts an opcode which clears it after (or before?) every iteration (on ENTER or LEAVE), so that you have a brand new $name (with its own private storage!) each time through the loop.

    If you close over a localized our() variable, as per your example, all your generated subs share the same local, which isn't visible outside.

    Perl is trying really hard to dwim here, and imho does a good job ;-)

    update:

    Actually, there's more to it, since the loop variable gets an alias of the list it iterates over. Consider:

        my $name;
        for $name (qw(red blue green yellow orange purple violet)) {
            no strict 'refs';          # allow symbol table manipulation
            no warnings 'redefine';    #ignore noise about redefinitions
            *$name = sub {
                my $sOutput = "<FONT COLOR='$name'>@_</FONT>";
                $name = 'something wacky and wonderful';    #change the value
                return $sOutput;
            };
        }
        print "red: ${\(red())}\n";
        __END__
        Modification of a read-only value attempted at - line 7.

    Here the for() iterates over a list of literals, and those are read-only. But! ...

        our $name;
        for $name (qw(red blue green yellow orange purple violet)) {
            no strict 'refs';          # allow symbol table manipulation
            no warnings 'redefine';    #ignore noise about redefinitions
            *$name = sub {
                my $sOutput = "<FONT COLOR='$name'>@_</FONT>";
                $name = 'something wacky and wonderful';    #change the value
                return $sOutput;
            };
        }
        print "red: ${\(red())}\n";
        __END__
        red: <FONT COLOR=''></FONT>

    ...why? In this case, the sub closes over the localized $name, which is undefined after the loop finished. That's why you get

        ----- sub used inside loop that defined it ------
        trying out something wacky and wonderful: <FONT COLOR='something wacky and wonderful'></FONT>

    inside the last loop - you had modified $colors[0] in the previous loop.
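The aliasing described above (the loop variable is another name for the current list element) can be sketched on its own, without closures. This is a minimal illustration with made-up data, not part of the original script:

```perl
use strict;
use warnings;

my @colors = qw(red blue);
for my $name (@colors) {
    $name = uc $name;    # writes through the alias into @colors
}
print "@colors\n";       # RED BLUE

# Elements of a literal list are read-only, so the same kind of
# assignment dies at run time; eval traps the error.
my $ok = eval {
    for my $name (qw(red blue)) {
        $name = 'x';
    }
    1;
};
print $ok ? "literal list was modified\n" : "died: $@";
```

The first loop changes the array because each iteration's $name is the array element itself; the second loop fails with the same "Modification of a read-only value attempted" error shown in the read-only example above.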

      Wow. @colors was modified. Didn't think of that. That explains a lot. So it all boils down to a subtle difference between localizing and aliasing:

      • our $name. Inside a foreach loop, the subroutine closes over the $main::name. It gets the localized value whenever $main::name is localized and the global value when not, just as does $main::name. Any assignment to $main::name within the closure changes the global variable at whatever localization level it happens to be running in.
      • my $name;. Inside the foreach loop, the subroutine closes over whatever $name happens to be aliased to, in this case, $colors[0] when $name eq 'red' and $colors[-1] when $name eq 'violet'. No matter where the subroutine runs, any assignment to $name within the closure changes the thing aliased, i.e. an element of @colors

      In the above quote from perlsyn it says both my and our are localized and doesn't make a distinction between aliasing and localizing. In your opinion is this a documentation bug? a perl bug? or neither?

      Thanks, beth

      Update: LanX's comment below is helpful here. He points out that lexical variables (i.e. my $name) can't be localized, so temporary aliasing is a way of faking it. Inside the loop itself, temporary aliasing is pretty much indistinguishable from localizing, but the differences between the two (localization and aliasing) become much more noticeable if the variable is captured by a closure.

        I'd say neither, since

            qwurx [shmem] ~ > perl -le 'local my $foo'
            Can't localize lexical variable $foo at -e line 1.

        localizing in that context doesn't mean a local opcode is involved; rather, a local instance of whatever thing the loop variable is will be allocated. So localizing means that, for the loop variable, but aliasing is what happens to the current element of the list iterated over. Your code is a fine example for my/local, space/time (was: Re: The difference between my and local). The aliasing happens no matter what scoping rules apply to the loop variable:

            our $name;
            for $name (qw(red blue green yellow orange purple violet)) {
                $name = "foo";
            }
            __END__
            Modification of a read-only value attempted at - line 3.

        But since a localized our variable works in time, at calling time $name is just the localized instance of $main::name, and the list it was used to iterate over is gone.

        It gets the localized value whenever $main::name is localized and the global value when not

        No. When $name is executed, it gets the current value of $main::name. There's nothing conditional about it.

        Any assignment to $main::name within the closure changes the global variable at whatever localization level it happens to be running in.

        Again, it simply changes the variable $main::name. There's nothing conditional about it. Localisation just means a backed-up value will be assigned to the variable later.
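A minimal sketch of that point, assuming a hypothetical package variable $name and accessor $get: the sub always reads whatever $main::name currently holds, and local merely saves the old value and puts it back when the enclosing block exits.

```perl
use strict;
use warnings;

our $name = 'global';
my $get = sub { $name };    # refers to $main::name; nothing is captured

my @seen = ( $get->() );
{
    local $name = 'temporary';    # old value is saved, new one assigned
    push @seen, $get->();
}                                 # saved value is assigned back here
push @seen, $get->();

print "@seen\n";    # global temporary global
```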

        Inside the foreach loop, the subroutine closes over whatever $name happens to be aliased to,

        It captures the variable, whether it's an alias or not.

        No matter where the subroutine runs, any assignment to $name within the closure changes the thing aliased, i.e. an element of @colors

        If the captured variable is an alias, yes. That's what an alias is. It's got nothing to do with captures.

        In the above quote from perlsyn it says both my and our are localized and doesn't make a distinction between aliasing and localizing. In your opinion is this a documentation bug? a perl bug? or neither?

        They're independent.

        • "for my $x" creates(*) $x and aliases it.
        • "my $x; for $x" localises $x and aliases it.
        • "for our $x" localises $x and aliases it.

        I don't see anything wrong in perlsyn. Which statement is giving you pause?

        * — In practice, my vars aren't actually created at declaration and destroyed at scope exit, but that's how they're specified to behave. In reality, it might actually simply be a localisation in this case.
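The three forms listed above can be put side by side. This is only a sketch with made-up names (@fresh, @predeclared, @package, $x, $y, $z); each loop stores closures and they are all called after the loops have finished:

```perl
use strict;
use warnings;

my (@fresh, @predeclared, @package);

# "for my $x": a new lexical per iteration.
for my $x (qw(red blue)) { push @fresh, sub { $x } }

# "my $y; for $y": pre-declared lexical, localised and aliased.
my $y = 'before';
for $y (qw(red blue)) { push @predeclared, sub { $y } }

# "our $z; for $z": package variable, localised and aliased.
our $z = 'before';
for $z (qw(red blue)) { push @package, sub { $z } }

print join(' ', map { $_->() } @fresh),       "\n";    # red blue
print join(' ', map { $_->() } @predeclared), "\n";    # red blue
print join(' ', map { $_->() } @package),     "\n";    # before before
```

Both lexical forms give each sub its own captured element, matching the my-variable output earlier in the thread. The package-variable subs all read $main::z, which has been restored to its pre-loop value by the time they run.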

Re: How do closures and variable scope (my,our,local) interact in perl?
by Zarchne (Novice) on Jun 16, 2009 at 14:47 UTC
    I believe the key here is to understand that when a closure is created, it makes its own copy of the lexical environment; that's what distinguishes a "closure" from an ordinary procedure (sub). So whatever lexical variables are in scope now have an independent existence in the closure.

      That's not entirely true:

          my $name;
          $name = 'red';
          *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
          $name = 'blue';
          *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
          $name = '<none>';
          print red(), "\n";
          print blue(), "\n";
          __END__
          <FONT COLOR='<none>'></FONT>
          <FONT COLOR='<none>'></FONT>

      ... but when the same is done in a loop, with a lexical variable declared beforehand, the result changes:

          my $name;
          for $name ('red', 'blue') {
              *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
          };
          $name = '<none>';
          print red(), "\n";
          print blue(), "\n";
          __END__
          <FONT COLOR='red'></FONT>
          <FONT COLOR='blue'></FONT>

      The behaviour changes again, if you use a global:

          # global $name;
          for $name ('red', 'blue') {
              *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
          };
          $name = '<none>';
          print red(), "\n";
          print blue(), "\n";
          __END__
          <FONT COLOR='<none>'></FONT>
          <FONT COLOR='<none>'></FONT>
        .. but when the same is done in a loop, with a lexical variable declared beforehand, the result changes:

        because a loop is a block of its own which changes. IIRC the essence of closures is that whenever the outer block is entered (decided at run-time!) the my variables are associated with another lexpad (please correct my terminology if I name something wrong).

        so with

        for my $var ( 1,2,3) { my $x =sub {print $var } }

        the opcode for $var points for each run into different instances of the lexpads of the for loop.

        UPDATE ----

        it's more like this in pseudocode

            while ( $LEXPAD{for-block}={}; "$LEXPAD{for-block}"->{var} = (1,2,3)->next() ) {
                my $x =sub { print "$LEXPAD{for-block}"->{var} }
            }

        ----- UPDATE

        OTOH there is nothing like a "closure with packagevars", they always point to the same symboltable (decided at compile-time!)

        Now the extra complexity comes because ELISHEVA declares the loop variable in advance with my / our, which implies localising, i.e. saving and restoring the variable at runtime. (Nota bene: Normally there is nothing like local() with lexvars)

        Cheers Rolf

      Could you please give an example? IMHO closures mean exactly the contrary of what I understand you saying.

      The problem here is that ELISHEVA is adding to the very different concepts of package vs lexical variables the dark forces of for-loops creating local aliases interfering differently with compile-time and run-time behavior of local variables.

      Cheers Rolf

      UPDATE: OK, I think the source of misunderstandings is what people mean when they say "closure". Do you mean the outer or the inner sub/block ?

        Perhaps I'm being too simple-minded, but please consider this variant:
            use strict qw(vars subs);
            my $name;
            {
                my $name = 'red';
                *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
            }
            {
                my $name = 'blue';
                *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
            }
            $name = '<none>';
            print red(), "\n";
            print blue(), "\n";
            __END__
            <FONT COLOR='red'></FONT>
            <FONT COLOR='blue'></FONT>
        Without the additional block delimiters ({}) and my declarations, the variable $name indeed always refers to the same storage (which, I admit, was not clear in my own mind when I wrote the above comment). Closures (red() and blue() here) only get a copy of a portion of their lexical environment (that is, become "closures", properly so-called) when they escape a region where that portion is (lexically, of course) in scope. Without a block that the thread of execution leaves, no closure is created and the code is indistinguishable from a regular sub. Note that red() and blue() (i.e. &red and &blue) exist outside the blocks because they are created in the package's (here, main) symbol table -- as using a type glob (*) always implies. Here's the code with another wrinkle to illustrate:
            use strict qw(vars subs);
            {
                my $name;
                {
                    my $name = 'red';
                    *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
                }
                {
                    my $name = 'blue';
                    *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
                }
                *foo = sub {
                    print red($name), "\n";
                    print blue($name), "\n";
                };
                $name = 'another wrinkle';
            }
            foo();
            print red('hi mom');
            __END__
            <FONT COLOR='red'>another wrinkle</FONT>
            <FONT COLOR='blue'>another wrinkle</FONT>
            <FONT COLOR='red'>hi mom</FONT>
Re: How do closures and variable scope (my,our,local) interact in perl?
by ikegami (Pope) on Jun 16, 2009 at 16:24 UTC
    Simplified example:
        $ perl -le'$x="green"; for our $x ("red", "blue") { push @a, sub { $x } } print $_->() for @a'
        green
        green

        $ perl -le'for my $x ("red", "blue") { push @a, sub { $x } } print $_->() for @a'
        red
        blue

    Package variables never go out of scope. Each anon sub in the first command of the example therefore refers to the same variable, $main::x. A for loop with a package variable as its iterator will backup the value of the variable and restore it when the loop is exited to avoid clobbering its parent, so when the anon subs in the first command are executed, they display $main::x which was restored to 'green'.

    On the other hand, a new my variable is created each time the scope that contains it is executed(*). That means that my variables that exist outside of a sub need to be captured by the sub to ensure that they are still around when the sub is executed. For named subs, this is done when the sub is compiled. For anon subs, this is done when the sub expression is executed. Even if the variable goes out of scope, it will be kept alive and the sub will see the variable that existed when it was captured.

    For example, even though $x went out of scope at the end of the file in the following code, set and get keep it alive and can use it to exchange data.

        # Module.pm
        package Module;
        my $x;
        sub set { $x = shift }
        sub get { $x }
        1;

        # script
        use Module;
        Module::set(123);
        print Module::get();    # 123

    When the anon subs in the second command of the top example are executed, they return the current value of the variable they captured. One of those subs captured a variable that has 'red' for value. The other captured a variable that has 'blue' for value.

    * — The implementation varies slightly due to optimisation.
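The statement that anon subs capture when the sub expression is executed can be sketched with the usual counter generator. make_counter and $count are hypothetical names, not something from the thread:

```perl
use strict;
use warnings;

sub make_counter {
    my $count = shift;    # a new $count every time make_counter runs
    # The anon sub captures this particular $count when the
    # "sub { ... }" expression is evaluated.
    return sub { return $count++ };
}

my $c1 = make_counter(10);
my $c2 = make_counter(100);

my @got = ( $c1->(), $c1->(), $c2->(), $c1->() );
print "@got\n";    # 10 11 100 12
```

Each call to make_counter creates a separate $count, so the two subs keep independent state even though $count went out of scope when make_counter returned.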

Re: How do closures and variable scope (my,our,local) interact in perl?
by LanX (Canon) on Jun 16, 2009 at 17:52 UTC
    Thanx thats fun! 8 )

    One can even combine it with loop-var-aliasing!

        lanx$ perl -e ' $a='a';$b='b'; my $var; for $var ( $a,$b ) { *$var=sub {print "$var \n" } };$var="x"; a();$b="y";b(); '
        a
        y
        lanx$ perl -e ' $a='a';$b='b'; for $var ( $a,$b ) { *$var=sub {print "$var \n" } };$var="x"; a();$b="y";b(); '
        x
        x

    Obfuscators, here we are! ;-)

    Well seriously, I think as a rule of thumb you discovered, that a "localised" lexvar my $var;for $var(){} has the same "closuring"-effects like a  for my $var () {} .

    I bet they act exactly the same...

    Cheers Rolf

      a "localised" lexvar my $var;for $var(){} has the same "closuring"-effects like a for my $var () {} .

      A capture is really a form of aliasing. A slot in the new sub's pad is aliased to the SV of the captured variable. Or three, if another variable already refers to that SV.

          $ perl -MDevel::Peek -e'
              my $foo = "foo";
              my $x;
              Dump($x);
              Dump($foo);
              for $x ($foo) {
                  Dump($x);
                  $f = sub { Dump($x) }
              }
              $f->()
          '
          SV = NULL(0x0) at 0x814f69c
          SV = PV(0x814fb00) at 0x814ecdc
          SV = PV(0x814fb00) at 0x814ecdc
          SV = PV(0x814fb00) at 0x814ecdc

      All three names ($foo, loop's $x and sub's $x) reference (are aliased to) the same SV (0x814ecdc).
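The same observation can be made without Devel::Peek by comparing reference addresses. This is only a sketch; $in_loop and $in_sub are names invented for the illustration:

```perl
use strict;
use warnings;

my $foo = 'foo';
my $x;
my ($in_loop, $in_sub);

for $x ($foo) {
    $in_loop = \$x;          # reference to whatever $x is aliased to
    my $f = sub { \$x };     # the closure captures that same SV
    $in_sub = $f->();
}

# References compare by address in numeric context.
print "same SV\n" if \$foo == $in_loop && $in_loop == $in_sub;
```

If the three references compare equal, then $foo, the loop's $x and the closure's $x are all names for one scalar, which is what the Dump output above shows.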

Node Type: perlquestion [id://771986]
Approved by Corion
Front-paged by Corion