|Perl: the Markov chain saw|
Lexical scoping like a foxby broquaint (Abbot)
|on Nov 18, 2002 at 19:19 UTC||Need Help??|
The first thing this tutorial will do is explain what lexical scoping means, so as to keep things simple from the start.
Firstly, don't go to a dictionary as that won't help you in this particular case. In perl, when we speak of something in the terms of it being lexically scoped, we are talking about the area of code where the given thing is visible e.g
In the above code $foo can only be seen between the opening and closing braces. This is because they delimit the length of the lexical scope, and after the ending brace that particular instance of $foo no longer exists.
So a lexical scope is a section of code where things can live temporarily. I say they live temporarily because anything created within a lexical scope will be deleted once the scope has been exited e.g
There is an exception to this rule however - if something is still referring to something created within a lexical scope upon exit of the scope, that thing will not be deleted since it is still being referred to by something. This does not mean you can still refer to it directly, it just means that perl has yet to clean it up.
So we can see that $foo is still being referred to by $ref but the user can't refer directly to it.
Notice how all the variables are being declared with my()? This wasn't done to comply with strict (although strict does encourage the use of lexical variables, and with good reason too), but because my() creates lexically scoped variables, or simply, lexical variables.
So every variable created with my() lives within the current lexical scope. What about other variables you may ask? Well anything that is not declared with a my() lives in the current package (for more info on package global variables see. Of Symbol Tables and Globs).
Here's a brief example to illustrate the difference between lexical variables and package global variables
There $foo lives within its lexical scope, $bar lives within the current package, so doesn't disappear until it is explicitly deleted from the symbol table.
Another thing to be noted about my() is that it is a compile-time directive (this is because all things lexical are calculated at compile-time). This is the phase when the perl interpreter is putting the code together. So once our scopes and variables have been set they cannot be changed at runtime, like package globals can.
What this means is that lexical variables are declared at compile-time, not initialised e.g
This demonstrates that $foo is declared, since strict does not have a problem, but is still undefined since it hasn't has anything assigned to it.
More than naked
So far we've been using naked blocks to delimit the length of our lexical scopes. How else, you might wonder, are lexical scopes defined?
Well firstly there's the lexical file scope, which is the length of a given perl source file e.g
Now on the command-line
As we can see there, $foo only lives for the length of the file lextut1.pl, and has fallen out of scope by the time require has finished doing its thing.
Secondly, the braces around subroutine code delimit a lexical scope, so anything declared within a subroutine cannot be seen from outside it e.g
So subroutines scope follows along the same lines as the scope in naked blocks.
For conditional statements and loop statements the case is somewhat different as lexicals can be declared in the condition block/loop assignment, which occurs before the braces e.g
Although somewhat convoluted the above example demonstrates the fact that the condition of the if and the loop assignment in the foreach are lexically scoped to the braces which delimit the respective statements.
Note, however, that statement modifiers do not create a new lexical scope (this should be obvious through their lack of braces) e.g
The remaining ways of creating a lexical scope are as follows
A lot of literature when talking about lexical variables refers to them as private variables. This is because they cannot be seen outside their given lexical scope. As has already been illustrated, lexical variables are deleted once the end of their given scope is reached (exceptions withstanding), so they really are private to their respective scope.
A feature which is an essential part of lexical scoping is that scopes can be nested and inner scopes will not effect outer scopes e.g
There, the inner scope is a new scope (much like the outer scope is a new sub scope of the file scope), so a new instance of $foo is created leaving the outer $foo untouched when the inner scope exits. And because the inner $foo only lives within that scope, it private to that scope, and nothing else can see it.
This is not to say that nested scopes do not affect the rest of the program (as any new scopes are just sub scopes of the file level lexical scope), it just means that anything created within them is private to that given scope e.g
So even though we create a new scope with the if/else statement, we're still changing $w in the scope above (which in turn is modifying the elements of list since $w is just an alias to each element) as we haven't created a new $w for that particular scope (and of course, it wouldn't do us a lot of good as it would've fallen out of scope by the time we came to print it).
Well, we've been putting it off long enough and now it is time face that most confounding of functions - local.
The first thing that we absolutely must declare is that local does not create variables! Not only does it not create variables, it has nothing to do with lexical variables.
With that said, what local does do is change the value of an existing package global for the length of a given dynamic scope. A dynamic scope is just like a lexical scope but is defined by the length of scope, not the visibility of the scope. So local is localising a package globals value for the length of a given lexical scope e.g
As we can see the value of $x is still set to 'altered state' in foo() even though its outside of the initial lexical scope. But because $x has been dynamically scoped with local and foo() was called within the surrounding lexical scope $x will stay set to 'altered state' until the lexical scope exits.
You might also see examples of it being used to create private variables - this is rather misguided as it is auto-vivifying (creating it upon request of its existence) the variable e.g
So local has forced $x's temporary creation and then it dutifully fall's out of scope, leaving it undefined but still with an existing entry in the symbol table.
In the first case we've set the input separator to undefined, so when $fh is read, it reads right to the end of the file. And in the second case we localise the list separator for stringfied lists to a comma followed by a space, and the original list describes its final output.
This is somewhat of an oddball in the world of variables in that it creates a package level variable which is visible for the remaining lexical scope e.g
So our $x has created the package global $foo::x, but it is also visible in the remaining lexical scope which can still be seen in the package bar. This illustrates why our is somewhat of a two-faced function and best left alone unless the behaviour is specifically desired (at least in this humble tutorial author's opinion).
Ok, you say, I can see what lexical scoping is about and have an understanding of how it works, but what use is it to me?
Firstly, you can neatly encapsulate separate groups of operations into individual lexical scopes to avoid namespace collision and the like (this is widely demonstrated through the use of subroutines and modules). This in turn leads to nicely encapsulated sections of code which can be isolated from the main body of code, which in turns means that the variables will tie very closely to the surrounding code.
Secondly, because lexical scoping is determined at compile-time, if there are any errors they will be picked up before the program can even run (this is doubly true if you're running with strictures on, you are use()ing strict right?).
Thirdly, at the exit of a lexical scope all the variables are destroyed (except of course, for those that are still in use), which means your memory won't keep growing and growing as more variables are created. Also quite handily, any objects will have the DESTROY method called upon exit, so you can handle how your objects are cleaned up.
Now we're done with our learning, let's have some doing!
The below example will recurse through a given directory and will list each .pl and .pm with the amount of lines in the file.
Wow, there's quite a lot of lexical scoping going on there, both explicitly (i.e the naked block containing the core of the program) and implicitly (i.e count_lines()' lexical scope) and at this point it should all be pretty straight forward (and I imagine the comments help too :).
A lexical scope defines an area of code in which any variables declared within that area will live for only duration of the execution of that area of code, unless a variable is still referenced after the area of code has been left. A dynamic scope is orthogonal to a lexical scope and is defined by the length of the scope (as opposed to the visibility of the scope).
my declares lexically scoped variables at compile time, local changes a package global's value throughout a dynamic scope and our creates a package global which is visible throughout its given lexical scope.
And there we have it! I hope you've enjoyed this tutorial and gotten everything out of it that you had intended to, and can now go forth and frolic in the land of lexical scoping with glee and pride!