PerlMonks  

closure clarity, please

by 7stud (Deacon)
on Nov 24, 2009 at 05:14 UTC (#809000=perlquestion)
7stud has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have a little problem:

    use warnings;
    use 5.010;

    for my $num (1 .. 5) {
        my $a = $num;
        {say "in block: $a"};

        sub test {
            say "in function: $a";
        }

        test()
    }

    --output:--
    in block: 1
    in function: 1
    in block: 2
    in function: 1
    in block: 3
    in function: 1
    in block: 4
    in function: 1
    in block: 5
    in function: 1

I assume a closure is what's causing the sub to print the same output over and over again. However, the only thing I can find in the docs about closures is in relation to anonymous subs, yet this program is using a named sub.

Is the closure problem an artifact of perl trying to be efficient and not redefining the function every time through the loop? Why don't the docs mention that a named sub can create a closure? Could someone give a blow by blow description of what's happening?

That closure problem sprang up when sorting a hash inside a loop. The following program sorts two hashes by their values, which are integers. The hashes are stored in an array, so a for loop is used to step through the array. Then one hash is sorted each time through the loop, and its results are immediately displayed:

    use strict;
    use warnings;
    use 5.010;

    my %h1 = (
        'a' => 5,
        'b' => 8,
        'c' => 1
    );

    my %h2 = (
        'a' => 200,
        'b' => 150,
        'c' => 100
    );

    my @AoH = (\%h1, \%h2);

    for my $href (@AoH) {
        my %hash = %$href;
        my @sorted_keys = sort by_val keys %hash;

        sub by_val {
            $hash{$a} <=> $hash{$b}
        };

        for my $key (@sorted_keys) {
            say "$key = $hash{$key}";
        }
        say "=" x 20;
    }

    --output:--
    c = 1
    a = 5
    b = 8     #ok, the first hash is sorted perfectly.
    ====================
    c = 100
    a = 200
    b = 150   #but what happened here?
    ====================

The problem can be cured by using a block with sort instead of defining a sub. This example shows that you should not be indifferent to using a block v. defining a sub when using sort().
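A minimal sketch of the block-based cure, assuming the same two hashes as above. The helper name keys_by_value is hypothetical and not part of the original program. Because the comparison is a block rather than a named sub, it sees whichever hash reference is current at the time sort runs:

```perl
use strict;
use warnings;
use 5.010;

my %h1 = (a => 5,   b => 8,   c => 1);
my %h2 = (a => 200, b => 150, c => 100);

# Hypothetical helper: return the keys of a hash ordered by value.
sub keys_by_value {
    my ($href) = @_;
    # The sort block refers to this call's $href, so nothing stale
    # is captured from an earlier pass.
    return sort { $href->{$a} <=> $href->{$b} } keys %$href;
}

for my $href (\%h1, \%h2) {
    say join ' ', keys_by_value($href);
}
# prints:
# c a b
# c b a
```

Passing the hash reference in as an argument also avoids relying on any variable outside the comparison code.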

edit: I also wanted to ask about this variation:

    use strict;
    use warnings;
    use 5.010;

    for my $num (1 .. 5) {
        say "start of for loop: $num";
        {say "in block: $num"};

        sub test {
            say "in function: $num";
        }

        test()
    }

    --output:--
    start of for loop: 1
    in block: 1
    Use of uninitialized value $num in concatenation (.) or string at 3perl.pl line 11.
    in function:
    start of for loop: 2
    in block: 2
    Use of uninitialized value $num in concatenation (.) or string at 3perl.pl line 11.
    in function:
    ..
    ..

In that variation, the sub can't see the my variable $num. Why?

Re: closure clarity, please
by BrowserUk (Pope) on Nov 24, 2009 at 05:40 UTC

    sub definitions are parsed once only. The closure retains the value it had at that point in time. Try this:

        use warnings;
        use 5.010;

        for my $num (1 .. 5) {
            my $a = $num;
            {say "in block: $a"};

            *test = sub {
                say "in function: $a";
            };

            test()
        }
        __END__
        c:\test>junk11
        in block: 1
        in function: 1
        in block: 2
        Subroutine main::test redefined at C:\test\junk11.pl line 11.
        in function: 2
        in block: 3
        Subroutine main::test redefined at C:\test\junk11.pl line 11.
        in function: 3
        in block: 4
        Subroutine main::test redefined at C:\test\junk11.pl line 11.
        in function: 4
        in block: 5
        Subroutine main::test redefined at C:\test\junk11.pl line 11.
        in function: 5

    You can disable the redefinition warning.
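One way to do that, as a sketch: scope a `no warnings 'redefine'` to the smallest block around the glob assignment. The variable is named $copy here (an illustrative rename, so that sort's $a is not masked), and the sub returns its string so that the caller prints it:

```perl
use warnings;
use 5.010;

for my $num (1 .. 5) {
    my $copy = $num;
    {
        # Silences only "Subroutine ... redefined", and only in this block.
        no warnings 'redefine';
        *test = sub { return "in function: $copy" };
    }
    say test();
}
# prints "in function: 1" through "in function: 5"
```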


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      sub definitions are parsed once only. The closure retains the value it had at that point in time.

      Yes, subs are only parsed once, but that's irrelevant.

      The capturing occurs when the code ref is created. That's when the sub is defined for named subs, and that's when the sub op is executed for anonymous subs.

      Furthermore, closures capture variables, not values. The value of the variable can be changed, from both inside and outside the sub.
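A small sketch of that last point. The names $count, $show and $bump are illustrative only. Both subs share the one captured variable, so a change made outside or inside the subs is visible on the next call:

```perl
use strict;
use warnings;
use 5.010;

my $count = 1;

# Both anonymous subs capture the variable $count itself, not the value 1.
my $show = sub { return "count is $count" };
my $bump = sub { $count++ };

say $show->();   # count is 1
$count = 10;     # change made from outside the subs
say $show->();   # count is 10
$bump->();       # change made from inside a sub
say $show->();   # count is 11
```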

        Different words, same result.


Re: closure clarity, please
by ikegami (Pope) on Nov 24, 2009 at 07:26 UTC

    Why don't the docs mention that a named sub can create a closure?

    There's a whole section on it called "Persistent variables with closures".
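The pattern that section describes looks roughly like this (a sketch in the spirit of perlsub, with the hypothetical names $counter and next_id): a named sub defined inside a bare block captures a my variable declared in that block.

```perl
use strict;
use warnings;
use 5.010;

{
    my $counter = 0;     # visible only inside this block

    # A named sub: it captures $counter when it is compiled, and the
    # variable lives on after the block has finished running.
    sub next_id {
        return ++$counter;
    }
}

say next_id();   # 1
say next_id();   # 2
```

This works because the block runs only once. Inside a loop or another sub, the enclosing scope is entered many times, which is where the surprises in this thread come from.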

    Could someone give a blow by blow description of what's happening?

    In short, named subs capture at compile time.

    You seem to believe that subs are scoped to the block that contains them. That's not true. Subs are global.

      I did not know that a named sub is global no matter where it is defined, but how is that relevant? I definitely was looking at the sub's containing scope to determine what variables the sub could see at the time its closure was formed.

      However, knowing that a named sub is global no matter where it is defined, and knowing that a closure around a sub is formed at compile time doesn't give me any insight into which variables are contained in the closure. How do you figure that out?

      And why does a my variable that is in scope inside the loop's block (in my last example) not make it into the closure?

        How do you figure that out?

        First, don't put named subs inside loops or other subs. It makes no sense. The latter will even warn ("will not stay shared") if it captures anything from the sub it's in.

        Then the answer is simple: All variables declared before it in a scope that hasn't been closed.

            my $x;          <- this one.
            {
                my $y;      <- not this one.
                ...
            }
            {
                my $z;      <- this one.
                sub f { ... }
            }

        For anonymous subs, it's a different story. Feel free to use them in loops and other subs. They'll pick up the variables that are visible when the sub is executed.

        And why does a my variable that is in scope inside the loop's block

        for localizes its iterator variable. Localizing means temporarily replacing a variable with a new one.

        You captured $num before the loop started. This is different from the $num of the first pass, which is different from $num of the second pass, ..., which is different from $num of the fifth pass.

        (Ok, I lied. I believe an optimisation will actually cause the variable of the first pass to be reused for the subsequent passes, but that's transparent here.)
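For contrast, a sketch of the anonymous-sub case (the names @subs and $copy are illustrative). Each pass through the loop executes the sub expression again, so each closure holds that pass's variable:

```perl
use strict;
use warnings;
use 5.010;

my @subs;
for my $num (1 .. 3) {
    my $copy = $num;
    # "sub { ... }" runs on every pass and creates a new closure
    # over the $copy that belongs to that pass.
    push @subs, sub { return "in function: $copy" };
}

say $_->() for @subs;
# prints:
# in function: 1
# in function: 2
# in function: 3
```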

      doing some more experimenting...
Re: closure clarity, please
by vitoco (Pilgrim) on Nov 24, 2009 at 13:15 UTC

    As ikegami said, subs are global:

        #!perl
        use v5.10;
        use strict;
        use warnings;

        my $a = shift;

        sub f {
            say "global f";
        }

        sub g {
            say "global g";
            my ($a) = @_;
            sub f {
                say "local f ($a)"; # line 16
            }
            f();
        }

        f();
        g(shift);
        f();

        __END__
        C:\Temp>localsub.pl
        Variable "$a" will not stay shared at C:\Temp\localsub.pl line 16.
        Subroutine f redefined at C:\Temp\localsub.pl line 15.
        Use of uninitialized value $a in concatenation (.) or string at C:\Temp\localsub.pl line 16.
        local f ()
        global g
        Use of uninitialized value $a in concatenation (.) or string at C:\Temp\localsub.pl line 16.
        local f ()
        Use of uninitialized value $a in concatenation (.) or string at C:\Temp\localsub.pl line 16.
        local f ()

        C:\Temp>localsub.pl AAA BBB
        Variable "$a" will not stay shared at C:\Temp\localsub.pl line 16.
        Subroutine f redefined at C:\Temp\localsub.pl line 15.
        Use of uninitialized value $a in concatenation (.) or string at C:\Temp\localsub.pl line 16.
        local f ()
        global g
        local f (BBB)
        local f (BBB)

    What is a strange behavior is that variables got stuck with a value, even if it is assigned after its first use... That does not seem to be a compile time assignment.

    Conclusion: always define global subroutines or pass variables as parameters to them if defined locally for clarity.

      What is a strange behavior is that variables got stuck with a value, even if it is assigned after its first use... That does not seem to be a compile time assignment.

      What does this mean? The behaviour your code displays is what I'd expect; is there some gotcha that you didn't expect?

      I think that one has to be careful speaking of compile time in this example; there's the compilation of the program, which happens at (well) compile time, and then the compilation (if that's the right word) of the sub declaration, which happens at run time **. When the sub is compiled, $a has a value, and it's this value * that is compiled into the sub.

      * I'm intentionally being a bit sloppy here: as ikegami mentions, closures close over variables, not values; but, as the warning that you quote mentions, the actual variable that is closed over will no longer be accessible at the end of the subroutine invocation, so that there is no further way to change its value.
      ** Wrong; see below.

        Well, without reading the explanation for the warning, I would expect ()(BBB)() or (AAA)(BBB)(AAA) or just ()()() like the case of no parameters.

        It seems that sub f (the second one) compiles at run time, and the internal $a is not assigned the first time it is called, but when called from g, it glues itself to the first value it receives.

        Let me show an improved example:

            #!perl
            use v5.10;
            use strict;
            use warnings;

            my $a = shift;

            sub f {
                say "global f ($a)";
            }
            f();

            sub g {
                my ($a) = @_;
                say "global g ($a)";
                sub f {
                    say "local f ($a)"; # line 17
                }
                f();
            }

            f();
            g(shift);
            say "main1 $a";
            $a = shift;
            say "main2 $a";
            f();
            g(shift);
            f();

            __END__
            C:\Temp>localsub.pl AAA BBB CCC DDD
            Variable "$a" will not stay shared at C:\Temp\localsub.pl line 17.
            Subroutine f redefined at C:\Temp\localsub.pl line 16.
            Use of uninitialized value $a in concatenation (.) or string at C:\Temp\localsub.pl line 17.
            local f ()
            Use of uninitialized value $a in concatenation (.) or string at C:\Temp\localsub.pl line 17.
            local f ()
            global g (BBB)
            local f (BBB)
            main1 AAA
            main2 CCC
            local f (BBB)
            global g (DDD)
            local f (BBB)
            local f (BBB)

        As you can see, the internal variable is empty (uninitialized) the first two times f is called, but as soon as it is defined, it keeps that value forever. Weird...
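A sketch of the usual way to get a per-call inner sub, assuming the goal is for the inner code to see the argument of the current call to g. The inner sub is anonymous and stored in a lexical (the name $f is illustrative), so a fresh closure is created every time g runs:

```perl
use strict;
use warnings;
use 5.010;

sub g {
    my ($val) = @_;
    # An anonymous sub is created on each call to g(), so it closes
    # over this call's $val rather than the one from the first call.
    my $f = sub { return "inner sees ($val)" };
    return $f->();
}

say g('AAA');   # inner sees (AAA)
say g('BBB');   # inner sees (BBB)
```

No "will not stay shared" warning is expected here, since that warning concerns named subs nested inside other subs.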

      Conclusion: always define global subroutines

      You have no choice. Named subroutines are always global. You're lying to yourself when you said "local f".

      What is a strange behavior is that variables got stuck with a value,

      Why do you think that creating a variable somewhere should replace a variable in some unrelated sub?

        You have no choice. Named subroutines are always global. You're lying to yourself when you said "local f".

        You are right. I meant "define all subroutines at the same global level."

        Why do you think that creating a variable somewhere should replace a variable in some unrelated sub?

        It is not clear to me when a lexical variable is used and when it is lost during program execution, as in my last example, where I'd expect an uninitialized $a in all of the "local f" messages.

Re: closure clarity, please
by 7stud (Deacon) on Nov 27, 2009 at 00:43 UTC
    You seem to believe that subs are scoped to the block that contain them. That's not true. Subs are global.

    Based on vitoco's example and JadeNB's analysis, I believe that statement should now be considered false. There appear to be two facets of scope that subs demonstrate. On the one hand, no matter where a sub is defined, it can be called anywhere in your program. In that sense, a sub has global scope.

    On the other hand, the variables that a sub can see depend on the scope in which the sub is defined. In that sense, a sub has local scope.

    Here is a simplified version of vitoco's example:

        use strict;
        use warnings;
        use 5.010;

        my $val = 10;

        sub f {
            say "in global f, \$val is: $val";
        }

        f();
        #Demonstrates that the global definition of f is overwritten
        #by the local definition of f (below) at compile time.  As a
        #result, you won't see output from global f.  When the local f
        #executes instead, the global $val above is hidden by a local
        #$val in the definition of f (below).

        say '=' x 20;

        sub g {
            my $val = shift;
            say "in g, \$val is: $val";

            #This definition of f closes over $val in previous line.
            sub f {   #line 40
                say "in local f, \$val is: $val";
            }

            f();
        }

        g('hello');
        say '=' x 20;

        g('goodbye');
        say '=' x 20;

        --output:--
        Variable "$val" will not stay shared at 4perl.pl line 41.
        Subroutine f redefined at 4perl.pl line 40.
        Use of uninitialized value $val in concatenation (.) or string at 4perl.pl line 41.
        in local f, $val is:
        ====================
        in g, $val is: hello
        in local f, $val is: hello
        ====================
        in g, $val is: goodbye
        in local f, $val is: hello
        ====================

    If a sub truly had global scope, then the local $val would have no effect on the locally defined sub. Instead, the sub would have closed over the global $val, and changes to the local $val would not be seen by the sub.

    *Yes, I know that "local $val" and "global $val" are really my variables. However, referring to them as "the my variable $val that was defined in the same block as the nested sub definition" or "the my variable $val declared outside of any blocks" is too unwieldy.

      I believe [ the statement "Subs are global" ] should now be considered false.

      The scope of something is the area from which that something can be seen. Subroutines can be seen from anywhere, so they are global.

      On the other hand, the variables that a sub can see depend on the scope in which the sub is defined.

      You're talking of the scope of the variables now, not the scope of the sub. Yes, they are visible to the sub. We gave a number of examples of this. You can't have closures without this.

        You're talking of the scope of the variables now, not the scope of the sub. Yes, they are visible to the sub. We gave a number of examples of this. You can't have closures without this.

        No, I don't think I am. It matters where the sub is defined. Saying that a sub is global, end of discussion is not at all helpful in determining what variables a sub can see. Clearly, it matters where a sub is defined in determining what values it can see.

        A sub defined outside any blocks cannot see a my variable declared inside a block, yet if the sub is defined inside the block, it can see the my variable. The scope of the my variable is the same in both cases. I don't see how saying that the scope of a variable is "wherever it can be seen" is of any use. The goal is to determine where a variable can be seen.
