Perl closures: internal details.(Updated)

by BrowserUk (Patriarch)
on Apr 12, 2012 at 08:50 UTC ( [id://964726]=perlquestion )

BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

Update: I'm fully conversant with how closures work at the Perl level. I am interested in the details of the internal implementation. I don't know how to make this question clearer.

I'm looking to understand the internal details of how Perl implements closures. Are there any documents, blogs or other articles that explain the internal implementation?

Specifically, I'm interested in understanding the mechanism that allows a coderef to 'remember' (the location of) the variables it closes over.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Replies are listed 'Best First'.
Re: Perl closures: internal details.(Updated)
by dave_the_m (Monsignor) on Apr 12, 2012 at 13:49 UTC
    Ok here's how it works.

    Each sub has three phases: compilation, creation, and execution. For named subs, creation happens at the same time as compilation; for anon subs, it's later, and there can be multiple creations, which is why they're more interesting.

    At compile time, a note is made for each sub, of what outer lexicals it makes use of; at creation time, the sub is given its pad, with the first instance of each of its lexical vars created and stored within; also, a reference to the current instance of each outer lexical is stored in the pad. So the pad contains references to both its own lexical vars and to any outer ones.

    On first execution, all the vars are available in the pad. On return from the sub, all the sub's own lexical vars are abandoned, and new empty ones created in the pad, ready for the next execution (if any).
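    The "multiple creations" point can be seen from plain Perl. The following is a minimal sketch, not taken from the perl source; make_counter is a hypothetical name used only for illustration. Each call to it runs the creation step again, so each returned coderef holds a reference to a different instance of $count.

```perl
use strict;
use warnings;

# make_counter() is a hypothetical name used only for this illustration.
sub make_counter {
    my $count = shift;                  # a fresh $count instance on every call
    return sub { return $count++ };     # creation: captures the current $count
}

my $from_ten = make_counter(10);        # first creation of the anon sub
my $from_one = make_counter(1);         # second creation, different $count

# The two closures advance independently of each other.
my @seen = ( $from_ten->(), $from_ten->(), $from_one->(), $from_ten->() );
print "@seen\n";    # prints "10 11 1 12"
```

    Both coderefs run the same compiled body; only the captured instance of the outer lexical differs.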

    There's a bit more to it than that, but that's the main effect. For more details, look at pad.c in the perl source: S_pad_findlex() is the main function that does the compile-time stuff; I wrote that about 10 years ago, and don't remember much of what it does any more :-(. Then, Perl_cv_clone() shows what happens at create time for anon subs (the creation for named subs happens in parallel with compilation, and nothing special needs doing for it). Finally, in pp_*.c, the pp_pad?v functions show what happens to a lexical variable on entry to a scope (basically SAVECLEARSV() is called), while in scope.c, the SAVEt_CLEARSV branch of Perl_leave_scope shows what happens on scope exit, including an optimisation to just clear the var in place if nothing else is using it.
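    The scope-exit behaviour can be observed indirectly by comparing the addresses of a lexical across calls. This is a rough sketch using the core Scalar::Util module; keeps_ref and drops_ref are hypothetical names. When something else still refers to $x at scope exit, the pad has to be given a new variable, so the addresses differ. When nothing else refers to it, the in-place clearing mentioned above may let the same address reappear, but that is an implementation detail and the sketch only prints it rather than relying on it.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my @kept;    # outer storage that keeps each call's $x alive

# Both sub names are hypothetical, chosen for this illustration.
sub keeps_ref { my $x = shift; push @kept, \$x; return refaddr( \$x ) }
sub drops_ref { my $x = shift; return refaddr( \$x ) }

my @kept_addrs    = map { keeps_ref($_) } 1 .. 3;
my @dropped_addrs = map { drops_ref($_) } 1 .. 3;

# Count how many different addresses each sub handed out.
my %distinct_kept    = map { $_ => 1 } @kept_addrs;
my %distinct_dropped = map { $_ => 1 } @dropped_addrs;

printf "kept:    %d distinct addresses in %d calls\n",
    scalar( keys %distinct_kept ), scalar @kept_addrs;
printf "dropped: %d distinct addresses in %d calls\n",
    scalar( keys %distinct_dropped ), scalar @dropped_addrs;

# The retained variables still hold the value from their own call.
print join( ",", map { $$_ } @kept ), "\n";    # prints "1,2,3"
```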

    Also try running with -DX or -DXv on debugging perl builds.

    Dave.

      That is perfect!

      Just enough words for an overview, and references to find the details. Many, many thanks for taking the time to type that up.

      That's my next few weeks diversion sorted :)



Re: Perl closures: internal details.
by moritz (Cardinal) on Apr 12, 2012 at 09:34 UTC

    In the abstract it's rather simple: a Perl-level coderef basically has a pointer to the actual code optree, and a pointer to the outer lexical pad (the data structure that holds the lexical variables) that the closure closes over. When the outer subroutine is run, a new pad is created, and anonymous subroutines reference that new pad.

    (This is the reason why named inner subs aren't closures -- they can be run before the lexical pad of the outer subroutine is created, so they cannot reference it).
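    A small sketch of that difference, with hypothetical sub names. The named inner sub is created once, at compile time, so it stays bound to the first instance of $x; the anonymous one is created on every call and sees the current instance. This is the situation perl warns about with "Variable "$x" will not stay shared", which the sketch silences deliberately.

```perl
use strict;
use warnings;
no warnings 'closure';    # silence: Variable "$x" will not stay shared

sub outer_named {
    my $x = shift;
    sub inner_named { return $x }      # named: created once, at compile time
    return inner_named();
}

sub outer_anon {
    my $x = shift;
    my $inner = sub { return $x };     # anonymous: created on every call
    return $inner->();
}

my @named = ( outer_named(1), outer_named(2) );
my @anon  = ( outer_anon(1),  outer_anon(2) );

print "named: @named\n";    # prints "named: 1 1"
print "anon:  @anon\n";     # prints "anon:  1 2"
```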

    My first attempt to demonstrate that failed, because perl is smart and reuses references when possible:

        use 5.010;
        sub f { my $x = shift; sub () { $x } }
        say f(1);
        say f(2);
        __END__
        CODE(0x18cede8)
        CODE(0x18cede8)

    The culprit here is that the ref count of the return value from f(1) goes to zero as soon as it has been printed. To demonstrate that the new association between a lexical pad and a code block indeed creates a new reference, we have to keep the old reference around:

        use 5.010;
        sub f { my $x = shift; sub () { $x } }
        say my $x = f(1);
        say f(2);
        __END__
        CODE(0x1d5cde8)
        CODE(0x1d5cf98)

    Finally two different addresses from the same anon subroutine.
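    The core B module gives a slightly closer look at the same thing. This is a sketch under the assumption that dereferencing a B object yields the address of the underlying C structure, which is how B's own backends use it. It omits the () prototype used above so that perl has no reason to treat the sub as a constant. The two closures should turn out to be different CVs with different padlists that share one optree.

```perl
use strict;
use warnings;
use B ();

sub f { my $x = shift; return sub { return $x } }

my ( $c1, $c2 ) = ( f(1), f(2) );    # keep both closures alive

# Wrap each coderef in a B::CV object for introspection.
my ( $cv1, $cv2 ) = map { B::svref_2object($_) } $c1, $c2;

# A B object is a reference to the C-level address, so $$obj is numeric.
my $same_cv     = $$cv1 == $$cv2;
my $same_optree = ${ $cv1->ROOT } == ${ $cv2->ROOT };
my $same_pad    = ${ $cv1->PADLIST } == ${ $cv2->PADLIST };

printf "same CV:      %s\n", $same_cv     ? "yes" : "no";
printf "same optree:  %s\n", $same_optree ? "yes" : "no";
printf "same padlist: %s\n", $same_pad    ? "yes" : "no";
```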

    In the concrete it's more complicated, because closures can close over multiple outer lexpads, and care must be taken that recursion doesn't lead to lexical confusion.

    I don't know much about the Perl 5 internals; I fear you have to either ask one of the Perl 5 porters (Zefram or Nicholas would be good candidates) or browse the sources. perlguts and perlapi seem to be silent on this matter. But maybe it helps you to look at the output from B::Concise, because it gives you an idea of what opcodes are involved:

        perl -MO=Concise,f -e 'sub f { my $x = shift; sub () { $x } };'
        main::f:
        9  <1> leavesub[1 ref] K/REFC,1 ->(end)
        -     <@> lineseq KP ->9
        1        <;> nextstate(main 1 -e:1) v ->2
        4        <2> sassign vKS/2 ->5
        2           <0> shift s* ->3
        3           <0> padsv[$x:1,3] sRM*/LVINTRO ->4
        5        <;> nextstate(main 3 -e:1) v ->6
        8        <1> refgen K/1 ->9
        -           <1> ex-list lKRM ->8
        6              <0> pushmark sRM ->7
        7              <$> anoncode[CV ""] lRM ->8
        -e syntax OK

    It seems to be the interplay of refgen and anoncode that is responsible for creating the closure correctly.

    (Update: several small wording updates).

    Another update:

    Update: I'm fully conversant with how closures work at the Perl level. I am interested in the details of the internal implementation. I don't know how to make this question clearer.

    Well, for me the explanation on the Perl level ends at "subroutines can access the variables from outer scopes that were there at the time the reference to the subroutine was taken". Everything else (the fact that coderefs store pointers to lexpads and code blocks, the names of the ops, etc.) is already "details of the internal implementation".

    If what I wrote isn't the detail you are looking for, what is it you are looking for? I genuinely don't understand that.

Re: Perl closures: internal details.
by davido (Cardinal) on Apr 12, 2012 at 09:13 UTC

    This reference may be a little light on "guts" but is excellent on explanation. Pages 73 through 80 in Higher Order Perl paint a pretty clear picture.


    Dave

      a little light on "guts"

      I don't mean to be rude, but I specifically asked about "articles that explain the internal implementation". I.e. "the guts".



        Well, it does both :) The pad (lexpad, lexical pad) is attached to the CV (Code Value, subroutine); each scope creates a new pad ... Maybe that is not gutsy enough, but http://search.cpan.org/dist/illguts/index.html elaborates more on the structure of the guts -- or were you looking for something different?
