PerlMonks  

modular file scoping

by Pstack (Scribe)
on Oct 10, 2017 at 02:38 UTC

Pstack has asked for the wisdom of the Perl Monks concerning the following question:

#..................."~/test/DataBank.pm"
package DataBank;

our @EXPORT_OK = qw( set_webpath get_scanpath get_findpath);

my %webpaths = (
    mypath   => "",
    scanpath => "~/WebTabs/auto/htmlscan.0",
    findpath => "~/WebTabs/auto/htmlfilter.0"
);

sub get_scanpath {return ($webpaths{"mypath"} || $webpaths{"scanpath"});}
sub get_findpath {return ($webpaths{"mypath"} || $webpaths{"findpath"});}
sub set_webpath {
    my $pathme = shift;
    unless (-f $pathme){$pathme = "";}
    $pathme ||= get_scanpath();
    $webpaths{"mypath"} = $pathme;
}

#..................."~/test/Scanner.pm"
package Scanner; ;
our @EXPORT_OK = qw( postscan);
use Importer 'SpecsGet' => qw( pathout );

sub webget {.........} # download heaps
sub postscan {
    my $pathx = pathout();
    print "\n$pathx\n\n";
}

#..................."~/test/SpecsGet.pm"
package SpecsGet;
our @EXPORT_OK = qw( pathout);

use Importer 'DataBank' => qw( get_scanpath get_findpath );
sub pathout {return get_scanpath();}

#..................."~/test/testme.pl"
use Importer 'DataBank' => qw (set_webpath);
use Importer 'Scanner' =>qw (postscan);
use Cwd qw(getcwd cwd);

set_webpath (getcwd()."/xxx");
#........................................# hours later
postscan();

Given the 4 files above suitably 'stricted' etc & closed off properly:

#> perl testme.pl          ==>  "~/xxx"   ... (as expected)

It seems to function perfectly well, deploying DataBank.pm as a central store for 'modularised' semi-static specifications whose values may occasionally get changed by yet other modules. My problem is that I don't understand exactly why it works, so can I rely on it?

There are myriad net explanations of the seldom used "local", "our", & "package" scopings but very little on this "file" scoping that I (and maybe others) use extensively. If the "file"-scoped lexicals of DataBank.pm are forced to stay in scope throughout, what is doing the forcing with respect to scoping rules?

Re: modular file scoping (updated)
by haukex (Archbishop) on Oct 10, 2017 at 05:43 UTC
    There are myriad net explanations of the seldom used "local", "our", & "package" scopings but very little on this "file" scoping

    Well, local, our, and package scoped variables (the latter two being pretty much the same anyway, see our) are actually very common, I'd say roughly as common as lexically scoped (my) variables, including at the lexical scope of the file. Admittedly, when writing a normal script one probably types my a lot more often than our, local, or state, but package and dynamic scoping may be being used a lot under the hood, like in the modules one loads. Anyway, if I am understanding your question correctly, you're asking about the my %webpaths variable, and why your three functions in package DataBank can keep using it despite the execution of the file DataBank.pm having already finished?

    That's because the three subs refer to it, and so long as there is something that refers to the variable, Perl keeps it around. These "references" are uses of the variable in its lexical scope, as in your case, or lexical variables used by closures, but also references created explicitly, as in my $hashref = \%webpaths. So the answer is yes, this is the intended behavior you can rely on. Maybe this helps:

    use warnings;
    use strict;

    {
        package Tracer;
        sub new { my $c=shift; bless {@_}, $c }
        sub DESTROY { print "DESTROY ".shift->{name}."\n" }
    }
    END { print "END\n" }

    my $one = Tracer->new(name=>'one');
    my $two = Tracer->new(name=>'two');

    sub foo {
        my $three = Tracer->new(name=>'three');
        my $four = Tracer->new(name=>'four');
        $one->{foo}++;
        print "end of sub foo\n";
        return $three;
    }

    my $th = foo();
    print "clearing \$th\n";
    $th = undef;
    print "end of main\n";

    __END__

    end of sub foo
    DESTROY four
    clearing $th
    DESTROY three
    end of main
    DESTROY two
    END
    DESTROY one

    Here, objects of my little class Tracer simply print a message when they are destroyed, which in Perl happens when the last reference to a variable goes away and it is garbage collected. You can see that:

    • $four is declared in the scope of sub foo and only used there, so when the call to sub foo finishes executing, the variable is destroyed,
    • $three is declared in the scope of sub foo, but the reference to the object is returned from the sub, so the outside code now holds a reference to it in $th, so it is not destroyed until we get rid of that reference,
    • $two is declared at the scope of the file, and there are no other references to it, so when the scope of the file ends, it is destroyed,
    • $one is declared at the scope of the file, and since sub foo uses it, it is kept around until global destruction.

    Further reading: perlsub, in particular "Private Variables via my()", perlref, and perhaps also perlmod. Perhaps the bit of info you were missing was (update 2) the keyword "lexical scope", and the fact that "file scope" is not really special, but just another lexical scope.

    Update: Since I glossed over it above, it may be important to note that my has both a compile-time and run-time effect. At compile time, it declares that there is a lexical variable of that name so the compiler knows what that variable is when it sees it in the following code, but the initialization doesn't actually happen until runtime. This means that, for example, in sub bar { my %h; ... }, every call to bar() will create a new hash %h (unlike package variables, as I described in this recent thread). Also made a few small updates to above wording for clarification.
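The run-time effect of my described in the update above can be seen in a small sketch. The sub and variable names here (fresh_count, shared_count, %shared) are made up for illustration and do not come from the thread. It contrasts a hash declared inside a sub, which is created anew on each call, with a file-scoped lexical that the sub merely refers to:

```perl
use strict;
use warnings;

# Hypothetical names, chosen only for this illustration.
my %shared;                  # file-scoped lexical: a single hash for the whole file

sub fresh_count {
    my %h;                   # a new, empty hash is created on every call
    $h{calls}++;
    return $h{calls};
}

sub shared_count {
    $shared{calls}++;        # always the same hash, so the count accumulates
    return $shared{calls};
}

my @fresh  = map { fresh_count() }  1 .. 3;
my @shared = map { shared_count() } 1 .. 3;

print "fresh:  @fresh\n";    # fresh:  1 1 1
print "shared: @shared\n";   # shared: 1 2 3
```

The second sub behaves like the accessors in DataBank.pm: it does not declare the hash, it only uses one that was declared once at file scope.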

      Thank you indeed.

      "....you're asking about the my %webpaths variable, and why your three functions in package DataBank can keep using it despite the execution of the file DataBank.pm having already finished?..."

      Yes, the persisting STATE of %webpaths was perplexing me somewhat!

      But I take your point about more persisting going on 'under the hood' than I generally pay attention to, and I am thankful indeed that mostly I have been spared the need to. I think my problem is just explicitness (which I tend to practise stringently in terms of style). I use 'our @EXPORT_OK = qw(subx suby subz)' a lot, of course, but in that very limiting way. And with 'my' can generally see exactly its scope within the same page (aka file) I am working on. Now I am wondering if the key factor for Perl retaining the state of %webpaths is due specifically to the precedence of calls in testme.pl to subs defined in DataBank.pm, and will therefore persist while testme.pl stays alive, even though other modules which altered the state of %webpaths are themselves long dead? (That at least would satisfy my need for order!)

      I am most grateful for the time and effort you put into an erudite explanation, which I really will have to study in some depth (and perhaps open my eyes to possible trajectories I have hitherto been too nervous to explore).

      Thanks again, haukex!

        Now I am wondering if the key factor for Perl retaining the state of %webpaths is due specifically to the precedence of calls in testme.pl to subs defined in DataBank.pm, ...

        Yes, that's it (I assume you meant s/precedence/presence/). From Persistent variables with closures:

        Unlike local variables in C or C++, Perl's lexical variables don't necessarily get recycled just because their scope has exited. If something more permanent is still aware of the lexical, it will stick around. So long as something else references a lexical, that lexical won't be freed--which is as it should be. You wouldn't want memory being free until you were done using it, or kept around once you were done. Automatic garbage collection takes care of this for you. ... If declared at the outermost scope (the file scope), then lexicals work somewhat like C's file statics. They are available to all functions in that same file declared below them, but are inaccessible from outside that file. This strategy is sometimes used in modules to create private variables that the whole module can see.

        That "something more permanent" are the subs, and I'll get more into why those are "more permanent" below.
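A minimal sketch of the "persistent variables with closures" idea from the quoted passage, assuming a made-up helper called make_counter (it is not part of the thread's code). The lexical $count goes out of scope when make_counter returns, yet it survives because the returned anonymous sub still refers to it:

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper used only for illustration.
sub make_counter {
    my $count = 0;                     # lexical; its scope ends when the sub returns
    return sub { return ++$count };    # ...but the closure keeps it alive
}

my $first  = make_counter();
my $second = make_counter();           # gets its own, separate $count

my @seen = ($first->(), $first->(), $second->(), $first->());
print "@seen\n";    # 1 2 1 3
```

Each closure holds on to the particular $count that existed when it was created, which is why the two counters do not interfere with each other.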

        ... and will therefore persist while testme.pl stays alive, even though other modules which altered the state of %webpaths are themselves long dead?

        Well, obviously it's a bit more complicated than "dead" and "alive" :-) Especially in dynamic languages like Perl, where the lines between the traditional "compile time" and "run time" can be blurred - you can run code at compile time with BEGIN and use, and compile code at runtime with eval, do, and require. In this case, consider what is going on when you write use DataBank; (or in your case use Importer 'DataBank';, which as I understand it is equivalent): During the compilation of testme.pl, when the compiler encounters the line use DataBank;, basically it will immediately compile and execute all of the code in DataBank.pm (plus the import/export code, the details of which I'll skip over for now), so when the line use DataBank; finishes compiling, the code in DataBank.pm will have finished executing (and all of its scopes have ended).
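The blurring of compile time and run time can be observed directly with a BEGIN block, which is the same mechanism that use relies on. This is only a sketch; the @order array is an illustrative name, declared with our so that it already exists when the BEGIN block runs:

```perl
use strict;
use warnings;

our @order;    # package variable, so it is available while the file is still compiling

push @order, "run time: first statement";
BEGIN { push @order, "compile time: BEGIN block" }
push @order, "run time: second statement";

print "$_\n" for @order;
# compile time: BEGIN block
# run time: first statement
# run time: second statement
```

Although the BEGIN block appears second in the file, it runs first, because it is executed as soon as the compiler has finished parsing it, before any ordinary statement in the file runs.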

        Now the code in DataBank.pm includes statements like sub set_webpath { ... }. The important thing to keep in mind is that this does not actually run the code inside the sub, all it does is install that code into the symbol table under the names &DataBank::set_webpath and via the import/export mechanism also as &main::set_webpath, so that it can be run later, when other code says set_webpath(...). So I hope it's obvious that the subs from DataBank.pm need to stick around until after it has finished compiling and executing, because otherwise testme.pl couldn't call those functions, and the entire concept of modules exporting functions would break down. And since those functions need the %webpaths variable, it makes sense to keep that around as well, as described above.
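What the paragraph above describes can be imitated in a single file. This is a hand-rolled sketch, not how Importer or Exporter are actually implemented; the package name Bank and the hash %paths are stand-ins invented for the example, and the bare block plays the role of the module file:

```perl
use strict;
use warnings;

# The bare block stands in for a module file such as DataBank.pm.
{
    package Bank;                          # hypothetical package name
    my %paths = (mypath => "");            # lexical, visible only inside this block
    sub set_path { $paths{mypath} = shift; return }
    sub get_path { return $paths{mypath} }
}

# Roughly what an export does: alias the subs into the caller's package.
# (An assumption-laden imitation, not the real import machinery.)
{
    no strict 'refs';
    *{"main::set_path"} = \&Bank::set_path;
    *{"main::get_path"} = \&Bank::get_path;
}

# The block's scope ended long ago, but the subs are still in the symbol
# table, and they still share the same %paths.
set_path("/some/where");
my $got = get_path();
print "$got\n";    # /some/where
```

Defining the subs did not run their bodies; it only made them callable later under two names. Because both subs mention %paths, the hash stays alive for as long as the subs do.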

      What an excellent and thorough explanation of this subject.   Thanks so much for taking the time to write and share it.

Node Type: perlquestion [id://1201062]
Approved by haukex
Front-paged by choroba