PerlMonks  

sub and anonymous sub

by BrowserUk (Patriarch)
on Jun 20, 2002 at 20:18 UTC

BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

The other day I was writing a recursive subroutine in which I wanted the sub to know the current depth of recursion.

In C, I would have used a local static. When I asked here in the CB, a kind fellow monk (sorry, I forgot which monk!) gave me the hint to use a 'my $variable' outside of the sub, but inside an enclosing anonymous block.

{ #(1)
    my $depth = -1;
    sub recurse { #(2)
        ++$depth;
        print +(" .") x $depth . "current depth:$depth\n";
        recurse() if $depth < 3;
        print +(" .") x $depth . "current depth:$depth\n";
        --$depth;
    } #(3)
} #(4)
recurse(); # (5)
The above (trivial) example produces:
current depth:0
 .current depth:1
 . .current depth:2
 . . .current depth:3
 . . .current depth:3
 . .current depth:2
 .current depth:1
current depth:0

Which is great. Exactly what I wanted. The only problem now is that I don't understand HOW it works.

My (possibly flawed) conceptual understanding is that the sub keyword creates a named (or unnamed in the case of anonymous subs) block of code. That is to say, an entry in a symbol table somewhere with a "key" of "recurse" (in this case) and a value of the address where the compiler stored the code between the opening "{" (2) and its matching "}" (3). And when the label "recurse" is encountered (in the proper context), the label is looked up in the symbol table, the address of the code obtained and invoked.

To my (still very green) concept of these things, when the program is interpreted/compiled/run, the my $depth var should come into being at the top of the enclosing block (1) and disappear at the bottom (4). Meaning it wouldn't exist by the time the sub is invoked at (5)?

Anyone care to attempt to explain how this works to me* without reference to perlguts?

*Someone who has read the docs on my and local several times, including the space-time explanation, and is still somewhat confused.

Replies are listed 'Best First'.
(Revelation) Re: sub and anonymous sub
by Revelation (Deacon) on Jun 21, 2002 at 03:05 UTC
    Here's a little explanation, using as little perlguts as possible :)

    First of all, what is lexical scoping? We put a variable into play until it leaves the range we decided it should live within. Lexical scoping creates private constructs that are only visible within their scopes and can't be touched anywhere else. That scope is fixed when the program is compiled, and should be obvious from the code. Think of 'lexical scoping (my)' as a person who can only live within a certain range of time, and can't be everywhere at once; we declare this person out of scope later on, and he can no longer be modified or called back by name, although we may see the effects of his actions lingering. Outside of their scope, variables (and people) are invisible and can't be altered in any way. That's why you can talk behind a lexical variable's back :)

    Note: local() creates temporary values, and is not lexical, but dynamic. This means that the variable is global, but its value is temporarily local()ized. Localized values are temporary, and the old values are restored when they go out of their scope. Localization has been optimized to not create temporary objects in some cases, but that's talk for another node.
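
To make the my/local distinction in the note above concrete, here is a minimal sketch. The names `$name`, `show_name`, `with_local` and `with_my` are made up for illustration and are not from the thread. It shows that a value set with local is seen by subs called from that scope, while a my variable is not.

```perl
use strict;
use warnings;

our $name = 'global';             # package variable, visible to every sub

sub show_name { return $name }    # reads whatever the package variable holds now

sub with_local {
    local $name = 'localized';    # temporary value, restored when this sub returns
    return show_name();           # the callee sees 'localized' (dynamic scope)
}

sub with_my {
    my $name = 'lexical';         # a brand-new variable, invisible to show_name()
    return show_name();           # the callee still sees the package variable
}

print with_local(), "\n";         # localized
print with_my(),    "\n";         # global
print show_name(),  "\n";         # global
```

The last line shows that the package variable is back to its old value once with_local() has returned.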

    Why do we scope lexically: Because we've been told to do so :) Seriously though, there's no necessity to scope lexically, unless you're coding mod_perl scripts, in which case you may end up with memory leaks and the like. However, it is good programming practice, as it frees up memory when the subroutine is not being executed. Lexical scoping can save us from namespace problems as well. You might not want to alter variables in subroutines that have the same name as global variables.

    How does lexical scoping work: my($x) creates a new variable that is only visible in the enclosing block. Visibility is decided at compile time, which is why it is called lexical or static scoping. By putting braces (1 and 4) around the code, we identify it as an 'enclosed block*'. From here perl takes over: seeing the my you have given it, perl scopes the variable lexically and arranges to get rid of it when we're done; however, everything within that block can still see our lexical variable and will be able to use it. When we close the block, the variable goes out of scope, and the same name can be declared and used again!

    How did this apply to your code: For most intents and purposes, you are right. Your code creates an enclosing block with an initialized variable after (1), and the name $depth goes out of scope at (4), so nothing at (5) can refer to it by name. The variable itself survives, however, because the recurse subroutine was compiled *within* that lexical scope and still refers to it, so recurse() can keep reading and updating that data. Frankly, his idea to lexically scope into an enclosed block was crafty manipulation of the code that merits a thumbs up. Your friendly monk enforced good coding practices on the fly, and saved us the pains that I have previously outlined. The code is a good example of a monk using good practice, although it's not necessary for functional code (try removing the braces.) :)
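
The same pattern can be reduced to a minimal sketch: a named sub declared inside a bare block keeps using the block's my variable after the block has closed. The names `$calls` and `next_id` are hypothetical and not from the original post.

```perl
use strict;
use warnings;

{
    my $calls = 0;                    # visible by name only inside this block
    sub next_id { return ++$calls }   # compiled inside the block, so it can see $calls
}

# Out here the name $calls no longer exists (under strict, using it would be a
# compile-time error), but the variable lives on because next_id still refers to it.
print next_id(), "\n";   # 1
print next_id(), "\n";   # 2
```

Note that the block has to be executed before the first call for the `= 0` initialisation to have happened, which is the case here because the block sits above the calls in the file.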

    A good read would be This meta tutorial, when you get the time. It explains the issue in perlgutsy language, but also clears up most questions one could have. Dominus has a good reference as well, located here. Localizing variables is a tad bit more complicated than I made it out to be, but these two tutorials can iron you out on the specifics, and interesting little facts about them.

    I hope I've given you enough of a primer to understand what lexical scope is, and how and when to use my.
    *A `my' declares the listed variables to be confined (lexically) to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file

      Why thank you; that monk using good practice was me. The my lexically scopes the variable from the end of the my statement to the end of the smallest enclosing lexical block, where a lexical block is a pair of curlies, a file, or an eval string (and probably some other places, anybody?). By adding the extra {}s, you make a scope exactly as large as you need.
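
A small sketch of the "from the end of the my statement" rule described above. The sub name `scope_demo` is invented for this example. Inside the inner block, the right-hand side of the declaration still sees the outer variable, because the new name only takes effect once the statement is finished.

```perl
use strict;
use warnings;

sub scope_demo {
    my $x = 10;
    my @seen;
    {
        my $x = $x + 1;    # right-hand $x is still the outer one (10)
        push @seen, $x;    # the inner $x is 11
    }                      # the inner $x goes out of scope here
    push @seen, $x;        # the outer $x is untouched: 10
    return @seen;
}

print join(' ', scope_demo()), "\n";   # 11 10
```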


      We are using here a powerful strategy of synthesis: wishful thinking. -- The Wizard Book

        Thank you, kind monk theorbtwo.

        ...but look what you started me on:)

      Thank you! I really think I get it now. That was almost worth creating a new monk just so I could ++ you twice!

      Just for the record, it wasn't just idle (nor idol:) curiosity, nor some form of masochism, that prompted the question.

      From long experience going (way) back to school, I was never good at "rote learning". I found early on that if I could build a mental picture of the way things work, even if it was technically flawed, so long as it fit my observations, knowledge would stick. I won't embarrass myself by quoting my homebrew derivation of (-b +/- sqrt(b^2 - 4ac)) / 2a, but the formula stuck.

      References bookmarked for future reading when my eyes are up to it. It's nearly 5 am here.

      Update:Just to show I did do my homework assignment...I especially like the "seven good uses of local()".

•Re: sub and anonymous sub
by merlyn (Sage) on Jun 20, 2002 at 20:35 UTC
    I'll give you a scarier one:
    sub my_sub {
        local $my_sub_level = $my_sub_level + 1;
        ..
    }
    Yes, the localization for the variable doesn't happen until the end of the statement, so the value on the right is the previous value (plus one), and it's undone at the end of the block.
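
For readers who want to run the idea above, here is a self-contained sketch. It assumes a package variable declared with our, and the names `$level` and `levels_seen` are invented for this example and are not merlyn's. Each call sees the previous level plus one, and every level is undone as the calls return.

```perl
use strict;
use warnings;

our $level = 0;    # package variable, so local() has something to localise

sub levels_seen {
    my ($limit) = @_;
    local $level = $level + 1;    # right-hand side is the caller's value
    # Collect this level, then whatever the deeper calls report.
    return ($level, $level < $limit ? levels_seen($limit) : ());
}

print join(' ', levels_seen(3)), "\n";   # 1 2 3
print "$level\n";                        # 0, every local() has been undone
```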

    -- Randal L. Schwartz, Perl hacker

      Thanks Merlyn, "scarier" is just what I needed:)

      In your example, wouldn't $my_sub_level need to have been declared previously in order to have something to localise?

      From perlsub: "A local just gives temporary values to global (meaning package) variables. It does not create a local variable."

        I was presuming a global variable was already available. If you have use strict, you'd have to "use vars" the variable, yes.

        Or, just make it a package variable explicitly:

        sub my_subroutine {
            local $main::my_subroutine_level = $main::my_subroutine_level + 1;
            ....
        }

        -- Randal L. Schwartz, Perl hacker

Re: sub and anonymous sub
by Courage (Parson) on Jun 20, 2002 at 20:43 UTC
    1. As your program proved, the "$depth" variable does not go away by "#5", because it is used in sub "recurse" (hence REFCNT($depth)++ a few times). It was also allocated during the compile stage, which lets perl perform some optimizations here.

    2. I suggest you read "perlsub" rather than "perlguts"; after that, reading "perlcall" may also help you get deeper into your question.

    Courage, the Cowardly Dog.
    PS. Something fishy is going on there, or my name is Vadim Konovalov. And it's not.

      Re: 1) Would it be fair to say that $depth has its ref count increased when the ++$depth is compiled, as opposed to when the sub recurse is actually invoked?

      And does this mean that the symbol table entry for $depth will persist for the life of the program (albeit inaccessible), and never be garbage collected, even if the sub recurse is never invoked?

      Re: 2) It was reading and trying to digest perlsub that raised the question in my mind. I scanned perlcall, and it's interesting. Hopefully a few more reads and a few more weeks of immersion in Perl, and it will make more sense. I had looked in perlguts, just long enough to know that I am not ready for that yet.

      Thanks.

        I'll answer only the 1st item, because I see the 2nd is already resolved.

        To the best of my understanding, yes, the ref count for $depth is increased when "sub recurse" is compiled. And if, say, we compiled an anonymous sub as my $r=sub{$depth} and then $r goes away due to refcounting, I expect the ref count for $depth will be decreased once (even if $depth was used several times inside that sub).
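
One way to observe this from plain Perl, without perlguts, is to capture an object that has a DESTROY method and watch when it fires. This is only a sketch: the `Tracker` class, the `@events` log and the `closure_lifetime` sub are invented for the demonstration. It shows the captured variable outliving its block and being released when the anonymous sub itself goes away.

```perl
use strict;
use warnings;

our @events;    # log of what happened, in order

package Tracker;
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { push @main::events, 'destroyed' }    # runs when the last reference goes

package main;

sub closure_lifetime {
    @events = ();
    my $closure;
    {
        my $captured = Tracker->new;
        $closure = sub { return ref $captured };   # the anonymous sub captures $captured
    }
    push @events, 'block exited';                  # the object is still alive here
    push @events, 'closure sees ' . $closure->();
    undef $closure;                                # drop the sub, and with it the capture
    push @events, 'closure dropped';
    return @events;
}

print join("\n", closure_lifetime()), "\n";
# block exited
# closure sees Tracker
# destroyed
# closure dropped
```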

        Let me note that I just checked "perlsub" and see exactly your question explained at the section "Persistent Private Variables"

        Best wishes,
        Courage, the Cowardly Dog.
        PS. Something fishy is going on there, or my name is Vadim Konovalov. And it's not.

Re: sub and anonymous sub
by rinceWind (Monsignor) on Jun 21, 2002 at 12:59 UTC
    Interestingly, I think there is a cleaner way to do it. My experience of other programming languages is showing. The algorithm you have given is non-reentrant, hence non-threadsafe.

    Instead, I prefer passing in a level number to recurse, which means that separate threads calling recurse will not interfere with one another.

    #(1)
    sub recurse { #(2)
        my ($depth) = @_;
        print +(" .") x $depth . "current depth:$depth\n";
        recurse($depth + 1) if $depth < 3;
        print +(" .") x $depth . "current depth:$depth\n";
    } #(3)
    #(4)
    recurse(0); # (5)
    I know perl is not thread safe, but perl 6 and/or perl 5.x might be.

    My $0.02 --rW

    Update: Gerbil is right. I don't need the extra braces. This is what comes with cutting and pasting someone else's example rather than rewriting from scratch.

    Have edited the example to take on Gerbil's suggestions

      I am far away from being a mighty Perl hacker, but I've done recursion depth counting this way all the time. It's "cleaner" because you don't have to understand Perl-specific techniques. And it's thread-safe, too? Cool ;) Didn't know that... But why keep the curly brackets?
      sub recurse {
          my ($depth) = @_;
          print +(" .") x $depth . "current depth:$depth\n";
          recurse($depth + 1) if $depth < 3;
          print +(" .") x $depth . "current depth:$depth\n";
      }
      recurse(-1);
      This is the same, right? Additionally, I would start with recurse(0);
Re: sub and anonymous sub
by grantm (Parson) on Jun 21, 2002 at 20:49 UTC

    A related question that may help shed some light is "why is 'my' documented in the 'perlfunc' man page?"

    The answer would seem to be that 'my' has both compile-time and run-time effects. At compile time, the Perl interpreter decides the scope of the variable based on lexical scoping rules. At run time, Perl allocates memory for the variable and starts tracking references to that memory. This latter step happens each time the line is executed.

    This is why a mod_perl script should never use 'my' to declare a 'global' variable. At compile time, any block which refers to the variable will get pointed at the first instantiation. On the second invocation, when the 'my' line is reached, a new variable is created but all the other blocks are still pointing at the old one.
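
The effect can be reproduced outside mod_perl with a named sub nested inside another sub, which is roughly the situation a script ends up in when it is wrapped in a handler sub (that wrapping is an assumption about the mod_perl setup, not something shown in this thread). The names `handler`, `helper` and `handler_fixed` are hypothetical. With warnings on, perl normally reports `Variable "$request" will not stay shared` for this code, and the sketch silences that warning only to keep the output clean.

```perl
use strict;
use warnings;
no warnings 'closure';    # hide "Variable "$request" will not stay shared"

sub handler {
    my ($request) = @_;
    sub helper { return $request }    # named sub: bound to the first $request only
    return helper();
}

sub handler_fixed {
    my ($request) = @_;
    my $helper = sub { return $request };   # anonymous sub: rebuilt on every call
    return $helper->();
}

print handler('first'),        "\n";   # first
print handler('second'),       "\n";   # first (the stale variable)
print handler_fixed('first'),  "\n";   # first
print handler_fixed('second'), "\n";   # second
```

Passing the value to the helper as an argument avoids the problem as well.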

Re: sub and anonymous sub
by Anonymous Monk on Jun 21, 2002 at 18:27 UTC
    A minor nitpick from the staid and plaid realm of the anal retentive: It's recur, NOT recurse. Thank you for your time.

      I bow in deference to your superior knowledge........

      although this is possibly at odds with your statement, and this doesn't directly support it.

        Don't bother bowing. The previous AM is either a troll, or knows nothing about programming. (Odds are that you already noticed this.)

        To recur means for a situation to happen again. When you recurse a function, you cause yourself to recur immediately.

Node Type: perlquestion [id://176143]
Approved by TGI
Front-paged by jarich