PerlMonks  

The first thing this tutorial will do is explain what lexical scoping means, so as to keep things simple from the start.

Firstly, don't go to a dictionary, as that won't help you in this particular case. In Perl, when we speak of something being lexically scoped, we are talking about the area of code in which that thing is visible, e.g.

    { # beginning of lexical scope
        my $foo;
    } # end of lexical scope
In the above code $foo can only be seen between the opening and closing braces. This is because they delimit the length of the lexical scope, and after the ending brace that particular instance of $foo no longer exists.

So a lexical scope is a section of code where things can live temporarily. I say they live temporarily because anything created within a lexical scope will be deleted once the scope has been exited e.g
    { # begin lexical scope
        my $foo = "a string";
        print " \$foo is: ", (defined $foo ? $foo : "undefined"), $/;
    } # end lexical scope
    print "\$foo is: ", (defined $foo ? $foo : "undefined"), $/;

    __output__
     $foo is: a string
    $foo is: undefined
There is an exception to this rule, however: if something still refers to a thing created within a lexical scope when that scope exits, the thing will not be deleted. This does not mean you can still refer to it directly by name; it just means that perl has yet to clean it up.
    my $ref;
    { # begin lexical scope
        my $foo = "something in a lexical scope";
        $ref = \$foo;
    } # end lexical scope
    print "\$ref refers to: $$ref", $/;
    print "\$foo is: ", (defined $foo ? $foo : "undefined"), $/;

    __output__
    $ref refers to: something in a lexical scope
    $foo is: undefined
So we can see that $foo is still being referred to by $ref but the user can't refer directly to it.
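
The same exception is what lets a subroutine reference hold on to a lexical after its scope has gone. Here is a minimal sketch of that idea; make_counter() and the variable names are made up for illustration and are not part of the examples above.

```perl
use strict;
use warnings;

## make_counter() is a hypothetical helper, used only for illustration
sub make_counter {
    my $count = 0;                     ## lexical, created afresh on each call
    return sub { return ++$count };    ## the anonymous sub still refers to $count
}

my $first  = make_counter();
my $second = make_counter();           ## holds its own, separate $count

print $first->(),  $/;    ## 1
print $first->(),  $/;    ## 2
print $second->(), $/;    ## 1
```

Each $count has left its scope by the time the counters are used, yet it survives because the returned subroutine still refers to it.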

my variables

Notice how all the variables are being declared with my()? This wasn't done to comply with strict (although strict does encourage the use of lexical variables, and with good reason too), but because my() creates lexically scoped variables, or simply, lexical variables.

So every variable created with my() lives within the current lexical scope. What about other variables, you may ask? Well, anything that is not declared with my() lives in the current package (for more info on package global variables see Of Symbol Tables and Globs).

Here's a brief example to illustrate the difference between lexical variables and package global variables

    {
        my $foo = "a lexical variable";
        $bar    = "a package variable";
        print " \$foo is: ", (defined $foo ? $foo : "undefined"), $/;
        print " \$bar is: ", (defined $bar ? $bar : "undefined"), $/;
    }
    print "\$foo is: ", (defined $foo ? $foo : "undefined"), $/;
    print "\$bar is: ", (defined $bar ? $bar : "undefined"), $/;

    __output__
     $foo is: a lexical variable
     $bar is: a package variable
    $foo is: undefined
    $bar is: a package variable
There, $foo lives only within its lexical scope, whereas $bar lives within the current package and so doesn't disappear until it is explicitly deleted from the symbol table.

Another thing to be noted about my() is that it is a compile-time directive (this is because all things lexical are calculated at compile-time). This is the phase when the perl interpreter is putting the code together. So once our scopes and variables have been set they cannot be changed at runtime, like package globals can.

What this means is that lexical variables are declared at compile-time, not initialised e.g

    use strict;

    my $foo = "defined";
    BEGIN {
        print "foo is ", defined($foo) ? $foo : 'undef', " during BEGIN phase\n";
    }
    print "foo is ", defined($foo) ? $foo : 'undef', " at runtime\n";

    __output__
    foo is undef during BEGIN phase
    foo is defined at runtime
This demonstrates that $foo is declared, since strict does not have a problem with it, but is still undefined since it hasn't had anything assigned to it yet.

More than naked

So far we've been using naked blocks to delimit the length of our lexical scopes. How else, you might wonder, are lexical scopes defined?

Well firstly there's the lexical file scope, which is the length of a given perl source file e.g

    ## lextut1.pl
    my $foo = "in lextut1.pl's lexical file scope";
    print "\$foo is: ", (defined $foo ? $foo : "undefined"), $/;
Now on the command-line
    perl -e 'require "./lextut1.pl";
        print "\$foo is: ", (defined $foo ? $foo : "undefined"), $/;'
    $foo is: in lextut1.pl's lexical file scope
    $foo is: undefined
As we can see there, $foo only lives for the length of the file lextut1.pl, and has fallen out of scope by the time require has finished doing its thing.

Secondly, the braces around subroutine code delimit a lexical scope, so anything declared within a subroutine cannot be seen from outside it e.g

    sub foo { # begin lexical scope
        my $x = "a string";
        print "\$x in foo() is: ", (defined $x ? $x : "undefined"), $/;
        bar();
    } # end lexical scope

    sub bar { # begin lexical scope
        print "\$x in bar() is: ", (defined $x ? $x : "undefined"), $/;
    } # end lexical scope

    foo();

    __output__
    $x in foo() is: a string
    $x in bar() is: undefined
So subroutine scope follows the same lines as the scope in naked blocks.

For conditional statements and loop statements the case is somewhat different as lexicals can be declared in the condition block/loop assignment, which occurs before the braces e.g

    open(SRC, $0) or die("ack: $!");
    my @lines = <SRC>;

    ## $line is declared *before* the braces
    foreach my $line (@lines) {
        ## $w is declared within the condition, which is
        ## also before the braces
        if(my($w) = $line =~ /\b(\w+)\b/) {
            print "bareword found: $w\n";
        }
        print "\$w is: ", (defined $w ? $w : "undefined"), $/
            if $line eq $lines[$#lines];
    }
    print "\$line is: ", (defined $line ? $line : "undefined"), $/;

    __output__
    bareword found: open
    bareword found: my
    bareword found: foreach
    bareword found: if
    bareword found: print
    bareword found: print
    bareword found: if
    bareword found: print
    $w is: undefined
    $line is: undefined
Although somewhat convoluted the above example demonstrates the fact that the condition of the if and the loop assignment in the foreach are lexically scoped to the braces which delimit the respective statements.

Note, however, that statement modifiers do not create a new lexical scope (this should be obvious through their lack of braces) e.g

    ## otherwise $r would be auto-vivified as a package global
    use strict;
    print $r,$/ if my $r = 10 % 5;

    __output__
    Global symbol "$r" requires explicit package name at - line 1.
    Execution of - aborted due to compilation errors.
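
One way around this, sketched here on the assumption that the intent is simply to print $r when it is true, is to declare the variable in a statement of its own. The modulus has been changed to 10 % 3 (my choice, not the original's) so that the condition holds and something is printed.

```perl
use strict;
use warnings;

## declare $r in its own statement, so it is in scope for everything after it
my $r = 10 % 3;

## the statement modifier now refers to the $r declared above
print $r, $/ if $r;
```

Because the declaration has finished before the print statement is compiled, strict no longer complains.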
The remaining ways of creating a lexical scope are as follows
  • builtin functions which take code blocks e.g map, grep, exec, sort etc
  • anonymous subroutines (since they behave just like normal subroutines in this respect)
  • naked blocks, anonymous or labelled
  • and the nasty but occasionally necessary string eval.
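
A few of the cases listed above can be sketched in one go. This is only an illustration under my own assumptions: the variable names are invented, and each inner $label is deliberately given the same name as the outer one to show that the outer one is left alone.

```perl
use strict;
use warnings;

my $label = "outer";

## the block given to map is a scope of its own
my @doubled = map { my $label = $_ * 2; $label } 1 .. 3;

## so is the body of an anonymous subroutine
my $anon = sub { my $label = "inside anonymous sub"; return $label };

## a string eval is compiled as a new scope nested in the current one
my $evaled = eval q{ my $label = "inside string eval"; $label };
die $@ if $@;

print "@doubled\n";        ## 2 4 6
print $anon->(), "\n";     ## inside anonymous sub
print "$evaled\n";         ## inside string eval
print "$label\n";          ## outer
```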

In private

A lot of literature, when talking about lexical variables, refers to them as private variables. This is because they cannot be seen outside their given lexical scope. As has already been illustrated, lexical variables are deleted once the end of their given scope is reached (exceptions notwithstanding), so they really are private to their respective scope.

A feature which is an essential part of lexical scoping is that scopes can be nested, and inner scopes will not affect outer scopes, e.g.

    my $foo = "file scope";
    {
        my $foo = "outer scope";
        {
            my $foo = "inner scope";
            print " \$foo is: $foo\n";
        }
        print " \$foo is: $foo\n";
    }
    print "\$foo is: $foo\n";

    __output__
     $foo is: inner scope
     $foo is: outer scope
    $foo is: file scope
There, the inner scope is a new scope (much like the outer scope is a new sub scope of the file scope), so a new instance of $foo is created, leaving the outer $foo untouched when the inner scope exits. And because the inner $foo only lives within that scope, it is private to that scope, and nothing else can see it.

This is not to say that nested scopes do not affect the rest of the program (as any new scopes are just sub scopes of the file level lexical scope), it just means that anything created within them is private to that given scope e.g

    my @list = qw(a list of words);
    for my $w (@list) {
        if($w =~ /^[aeiou]/) {
            $w = "$w: begins with a vowel";
        } else {
            $w = "$w: begins with a consonant";
        }
        print $w, $/;
    }

    __output__
    a: begins with a vowel
    list: begins with a consonant
    of: begins with a vowel
    words: begins with a consonant
So even though we create a new scope with the if/else statement, we're still changing $w in the scope above (which in turn modifies the elements of @list, since $w is just an alias to each element), as we haven't created a new $w for that particular scope. Of course, a new $w wouldn't do us a lot of good, as it would have fallen out of scope by the time we came to print it.
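
If the elements of @list should stay as they were, one option is to work on a lexical copy inside the loop body. This is a sketch of that variation; the names $copy and @described are invented for illustration.

```perl
use strict;
use warnings;

my @list = qw(a list of words);
my @described;

for my $w (@list) {
    ## $copy is a new lexical in the loop body, so changing it
    ## leaves the aliased element of @list alone
    my $copy = $w;
    $copy .= $copy =~ /^[aeiou]/ ? ": begins with a vowel"
                                 : ": begins with a consonant";
    push @described, $copy;
}

print "$_\n" for @described;
print "@list\n";    ## a list of words
```

Here @described is declared outside the loop so that the results are still in scope once the loop has finished.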

local debunked

Well, we've been putting it off long enough and now it is time to face that most confounding of functions - local.

The first thing that we absolutely must declare is that local does not create variables! Not only does it not create variables, it has nothing to do with lexical variables.

With that said, what local does do is change the value of an existing package global for the length of a given dynamic scope. A dynamic scope begins and ends where the enclosing lexical scope does, but it is defined by how long the scope runs, not by what is visible within it, so it also covers any code called from within that scope. So local is localising a package global's value until the enclosing lexical scope exits, e.g.

    sub foo {
        print " \$x is: $x\n";
    }

    $x = "original state";
    { # beginning of lexical scope
        local $x = "altered state";
        foo();
    } # end of lexical scope
    print "\$x is: $x\n";

    __output__
     $x is: altered state
    $x is: original state
As we can see, the value of $x is still set to 'altered state' in foo(), even though it's outside of the initial lexical scope. Because $x has been dynamically scoped with local and foo() was called within the surrounding lexical scope, $x stays set to 'altered state' until that lexical scope exits.
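
To put local and my side by side, here is a small sketch that runs under strict. The subroutine and variable names are invented for illustration, and the package global is written fully qualified as $main::mode purely to keep strict happy.

```perl
use strict;
use warnings;

$main::mode = "global";          ## a package global, fully qualified
my $note    = "file lexical";    ## a lexical in the file scope

## report() sees the package global and the file-level lexical
sub report { return "$main::mode / $note" }

## local changes the global for everything called from this scope
sub with_local { local $main::mode = "localised"; return report() }

## my creates a new $note which report() knows nothing about
sub with_my { my $note = "shadowed"; return report() }

print with_local(), "\n";    ## localised / file lexical
print with_my(),    "\n";    ## global / file lexical
print report(),     "\n";    ## global / file lexical
```

The localised value reaches report() because it was called during the dynamic scope, whereas the new lexical in with_my() is only visible to code written inside that subroutine.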

You might also see examples of it being used to create private variables - this is rather misguided as it is auto-vivifying (creating it upon request of its existence) the variable e.g

    { # begin lexical scope
        local $x = "auto-vivified";
        print " \$x is: ", (defined $x ? $x : "undefined"), $/;
    } # end lexical scope
    print "\$x is: ", (defined $x ? $x : "undefined"), $/;
    print "*x is: ", (exists $main::{x} ? $main::{x} : "undefined"), $/;

    __output__
     $x is: auto-vivified
    $x is: undefined
    *x is: *main::x
So local has forced $x's temporary creation and then it dutifully falls out of scope, leaving it undefined but still with an existing entry in the symbol table.

So generally you'll want to use my instead of local. However local does have its uses, such as localising punctuation globals e.g

    use IO::File;

    my $file;
    {
        ## this trick is known as file slurping
        local $/;
        my $fh = IO::File->new("lextut1.pl") or die("ack: $!");
        $file = <$fh>;
    }
    print $file;

    my @foo = qw( a comma separated list of words );
    {
        local $" = ', ';
        print "@foo\n"
    }

    __output__
    my $foo = "in lextut1.pl's lexical file scope";
    print "\$foo is: ", (defined $foo ? $foo : "undefined"), $/;
    a, comma, separated, list, of, words
In the first case we've set the input record separator to undefined, so when $fh is read, it reads right to the end of the file. In the second case we localise the list separator for stringified lists to a comma followed by a space, and the original list describes its final output.
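
The slurping trick is often wrapped in a subroutine so that $/ is only changed for as long as the read takes. Below is a rough sketch of that idea; slurp() is a made-up helper, and an in-memory filehandle stands in for a real file so the example needs nothing on disk.

```perl
use strict;
use warnings;

## slurp() is a hypothetical helper: it reads whatever is left on a filehandle
sub slurp {
    my ($fh) = @_;
    local $/;               ## undefined for the length of this call only
    my $content = <$fh>;
    return $content;
}

my $text = "line one\nline two\n";

## an in-memory filehandle, assumed here to avoid needing a file
open(my $in, '<', \$text) or die("ack: $!");
my $all = slurp($in);
close($in);

## $/ has been restored, so this read goes line by line again
open($in, '<', \$text) or die("ack: $!");
my @lines = <$in>;
close($in);

print length($all), " characters, ", scalar(@lines), " lines\n";
```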

our variables

our is somewhat of an oddball in the world of variables, in that it creates a package-level variable which is visible for the remainder of the lexical scope, e.g.

    {
        package foo;
        our $x = "in foo";
        package bar;
        ## $x can still be seen as it is still in scope
        print " \$x is: $x\n";
    }
    print "\$foo::x is: $foo::x\n";

    __output__
     $x is: in foo
    $foo::x is: in foo
So our $x has created the package global $foo::x, but it is also visible in the remaining lexical scope which can still be seen in the package bar. This illustrates why our is somewhat of a two-faced function and best left alone unless the behaviour is specifically desired (at least in this humble tutorial author's opinion).
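
When the behaviour is desired, keeping the our declaration inside a block limits how far the short name can be seen. The following is a sketch under strict, with a made-up package called Counter.

```perl
use strict;
use warnings;

{
    package Counter;                  ## hypothetical package, for illustration only
    our $total = 0;                   ## short name for $Counter::total until the closing brace
    sub bump { return ++$total }
}

Counter::bump() for 1 .. 3;

## outside the block only the fully qualified name will do
print "\$Counter::total is: $Counter::total\n";
```

Using a bare $total after the closing brace would be a compile-time error under strict, since the short name created by our has fallen out of scope even though the package global itself lives on.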

Scoping schmoping

Ok, you say, I can see what lexical scoping is about and have an understanding of how it works, but what use is it to me?

Firstly, you can neatly encapsulate separate groups of operations into individual lexical scopes to avoid namespace collision and the like (this is widely demonstrated through the use of subroutines and modules). This in turn leads to nicely encapsulated sections of code which can be isolated from the main body of code, which in turn means that the variables will tie very closely to the surrounding code.

Secondly, because lexical scoping is determined at compile-time, if there are any errors they will be picked up before the program can even run (this is doubly true if you're running with strictures on, you are use()ing strict right?).

Thirdly, at the exit of a lexical scope all the variables are destroyed (except of course, for those that are still in use), which means your memory won't keep growing and growing as more variables are created. Also quite handily, any objects will have the DESTROY method called upon exit, so you can handle how your objects are cleaned up.
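
The DESTROY behaviour can be watched with a small sketch like the one below. The Guard class and the @log array are invented for illustration; @log simply records the order in which things happen.

```perl
use strict;
use warnings;

my @log;    ## records events so the order can be inspected afterwards

{
    package Guard;    ## hypothetical class, for illustration only
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @log, "destroyed $self->{name}" }
}

{
    my $guard = Guard->new("inner");
    push @log, "inside scope";
}    ## $guard held the only reference, so DESTROY is called here
push @log, "after scope";

print "$_\n" for @log;
```

The entry from DESTROY lands between the other two, which shows the object being cleaned up as the scope exits and not at the end of the program.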

Something useful

Now we're done with our learning, let's have some doing!

The example below will recurse through a given directory and list each .pl and .pm file along with the number of lines in the file.

    ## set stricture checking for the rest of the file scope
    use strict;
    ## see man perllexwarn for why this is double-plus good
    use warnings;

    ## ah, heaven-sent
    use File::Find::Rule;
    ## for lexically scoped file-handles
    use IO::File;

    ## prevent subroutines from being able to access program level variables
    { ## naked block's lexical scope
        my @files = File::Find::Rule->file()
                                    ->name("*.{pl,pm}")
                                    ->in( shift @ARGV );

        ## $fl is in the foreach lexical scope
        foreach my $fl (@files) {
            ## ditto with $lc
            my $lc = count_lines($fl);
            print "$fl: $lc line".($lc != 1 && 's')." of code\n";
        }
    }

    sub count_lines {
        ## will be closed when we exit the current scope
        my $fh = IO::File->new(shift) or die("ack: $!");

        ## scoped (and therefore private) to count_lines()
        my $count = 0;
        $count++ while <$fh>;

        return $count;
    }
Wow, there's quite a lot of lexical scoping going on there, both explicitly (i.e. the naked block containing the core of the program) and implicitly (i.e. count_lines()'s lexical scope), and at this point it should all be pretty straightforward (and I imagine the comments help too :).

In review

A lexical scope defines an area of code in which any variables declared within that area will live only for the duration of the execution of that area of code, unless a variable is still referenced after the area of code has been left. A dynamic scope, by contrast, is defined by the length of the scope (as opposed to the visibility of the scope), so it also takes in any code called while the scope is running.

my declares lexically scoped variables at compile time, local changes a package global's value throughout a dynamic scope and our creates a package global which is visible throughout its given lexical scope.

And there we have it! I hope you've enjoyed this tutorial and gotten everything out of it that you had intended to, and can now go forth and frolic in the land of lexical scoping with glee and pride!

Many thanks to adrianh, AltBlue, BrowserUk, davis, dingus, Elian, jdporter, jpl, robartes and tye for their input and help in knocking out the various bugs.

_________
broquaint


In reply to Lexical scoping like a fox by broquaint
