PerlMonks  

Welcome to Of Symbol Tables and Globs where you'll be taken on a journey through the inner workings of those mysterious perlish substances: globs and symbol tables. We'll start off in the land of symbol tables where the globs live and in the second part of the tutorial progress onto the glob creatures themselves.

Symbol tables

Perl has two different types of variables: lexical and package global. In this tutorial we'll only be covering package global variables, as lexical variables have nothing to do with globs or symbol tables (see Lexical scoping like a fox for more information on lexical variables).

Now a package global variable can only live within a symbol table. It is visible from anywhere in the program through its fully qualified name, and it can be given a temporary, dynamically scoped value (versus the lexical scoping of my variables). To be more accurate, these variables live in slots within globs, which themselves live in the symbol tables.
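
To make the "dynamically scoped" part a little more concrete, here's a small sketch using perl's local builtin. The names $level, show and inner are made up for the example and aren't used anywhere else in this tutorial.

```perl
use strict;
use warnings;

our $level = "outer";          # package global, lives in *main::level

sub show  { return $level }    # reads whatever value is current at call time
sub inner {
    local $level = "inner";    # temporary value, restored when inner() returns
    return show();
}

print inner(), "\n";           # inner
print show(),  "\n";           # outer
```

show() sees "inner" only while inner() is still running, which is something a lexical (my) variable could never do.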

Symbol tables come about in various ways, but the most common is through the package declaration. Every variable, subroutine, filehandle and format declared within a package will live in a glob slot within the given package's symbol table (this is of course excluding any lexical declarations).

    ## create an anonymous block to limit the scope of the package
    {
        package globtut;

        $var = "a string";
        @var = qw( a list of strings );

        sub var { }
    }

    use Data::Dumper;
    print Dumper(\%globtut::);

    __output__

    $VAR1 = {
              'var' => *globtut::var
            };

There we create a symbol table with package globtut, then the scalar, array and subroutine are all 'put' into the *var glob because they all share the same name. This is implicit behavior for the vars, so if we wanted to explicitly declare the vars into the globtut symbol table we'd do the following

    $globtut::var = "a string";
    @globtut::var = qw( a list of strings );

    sub globtut::var { }

    use Data::Dumper;
    print Dumper(\%globtut::);

    __output__

    $VAR1 = {
              'var' => *globtut::var
            };

Notice how we didn't use a package declaration there? This is because the globtut symbol table is auto-vivified when $globtut::var is declared.

Something else to note about the symbol table is that it has two colons appended to the name, so globtut became %globtut::. This means that any packages that live below it will have :: prepended to their name, so if we add a child package it is neatly separated from its parent by the double colons, e.g.

    use Data::Dumper;
    {
        package globtut;
        package globtut::child;
        ##             ^^
    }

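
If you want to see where a child package ends up, here is a rough sketch. The package names GlobTutDemo and GlobTutDemo::Child and the variable $marker are invented for the demo. The child's symbol table should show up in the parent's symbol table under a key ending in ::.

```perl
use strict;
use warnings;

{
    package GlobTutDemo::Child;   # hypothetical package name
    our $marker = 1;
}

# Keys ending in '::' are the entries for nested symbol tables.
my @nested = grep { /::\z/ } keys %GlobTutDemo::;
print "nested: @nested\n";        # nested: Child::

# The child symbol table is reachable through the parent one.
my $child = \%{ $GlobTutDemo::{'Child::'} };
print "child has marker\n" if exists $child->{marker};
```
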
Another attribute of symbol tables, demonstrated when %globtut:: was dumped above, is that they are accessed just like normal perl hashes. In fact, they are like normal hashes in many respects: you can perform all the normal hash operations on a symbol table and add normal key-value pairs. If you're brave enough to look under the hood you'll notice that they are in fact hashes, but with a touch of that perl Magic. Here are some examples of hash operations being used on symbol tables

    use Data::Dumper;
    {
        package globtut;

        $foo = "a string";
        $globtut::{bar} = "I'm not even a glob!";

        %globtut::baz:: = %globtut::;
        print Data::Dumper::Dumper(\%globtut::baz::);

        print "keys: ",    join(', ', keys %globtut::), $/;
        print "values: ",  join(', ', values %globtut::), $/;
        print "each: ",    join(' => ', each %globtut::), $/;
        print "exists: ",  (exists $globtut::{foo} && "exists"), $/;
        print "delete: ",  (delete $globtut::{foo} && "deleted"), $/;
        print "defined: ", (defined $globtut::{foo} || "no foo"), $/;
    }

    __output__

    $VAR1 = {
              'foo' => *globtut::foo,
              'bar' => 'I\'m not even a glob!',
              'baz::' => *{'globtut::baz::'}
            };
    keys: foo, bar, baz::
    values: *globtut::foo, I'm not even a glob!, *globtut::baz::
    each: foo => *globtut::foo
    exists: exists
    delete: deleted
    defined: no foo

So to access the globs within the globtut symbol table we access the desired key which will correspond to a variable name

    {
        package globtut;

        $variable = "a string";
        @variable = qw( a list of strings );

        sub variable { }

        print $globtut::{variable}, "\n";
    }

    __output__

    *globtut::variable

And if we want to add another glob to a symbol table we add it exactly like we would with a hash

    {
        package globtut;

        $foo = "a string";
        $globtut::{variable} = *foo;

        print "\$variable: $variable\n";
    }

    __output__

    $variable: a string

If you'd like to see some more advanced uses of symbol tables and symbol table manipulation then check out the Symbol module which comes with the core perl distribution, and more specifically the Symbol::gensym function.
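
As a taste of what Symbol::gensym gives you, here's a minimal sketch. gensym returns a reference to a fresh anonymous glob, which is then used here as a filehandle for an in-memory file. The variables $fh, $text and $first are just names picked for the example.

```perl
use strict;
use warnings;
use Symbol qw(gensym);

my $fh = gensym;                       # reference to a fresh anonymous glob
print ref($fh), "\n";                  # GLOB

my $text = "first line\nsecond line\n";
open($fh, '<', \$text) or die "cannot open in-memory file: $!";
my $first = <$fh>;                     # the handle lives in the glob's IO slot
close($fh);

print $first;                          # first line
```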

Globs

So we can now see that globs live within symbol tables, but that doesn't tell us a lot about globs themselves and so this section of the tutorial shall endeavour to explain them.

Within a glob are 6 slots where the various perl data types will be stored. The 6 slots which are available are

  • SCALAR - scalar variables
  • ARRAY - array variables
  • HASH - hash variables
  • CODE - subroutines
  • IO - directory/file handles
  • FORMAT - formats

All of these slots can be accessed directly. (When this tutorial was first written the FORMAT slot was the odd one out, but perlref now documents *foo{FORMAT} alongside the others.)
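
A quick way to get a feel for the slots is to walk over them and see which ones are filled. This is only a sketch: the name thing is invented for the demo, and it checks just five of the slots listed above.

```perl
use strict;
use warnings;

our ($thing, @thing);          # hypothetical names, chosen for the demo
$thing = "a string";
@thing = qw( a list );
sub thing { "a subroutine" }

my %filled;
for my $slot (qw( SCALAR ARRAY HASH CODE IO )) {
    my $ref = *thing{$slot};   # a reference, or undef for an unused slot
    $filled{$slot} = defined $ref ? 1 : 0;
    printf "%-6s %s\n", $slot, defined $ref ? ref($ref) : 'empty';
}
```

One wrinkle worth knowing: according to perlref the SCALAR slot always hands back a reference, even if the scalar has never been used.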

It may be asked as to why there isn't a GLOB type, and the answer would be that globs are containers or meta-types (depending on how you want to see it) not data types.

Accessing globs is similar to accessing hashes, except we use the * sigil and the only keys are those data types listed above

    $scalar = "a simple string";
    print *scalar{SCALAR}, "\n";

    __output__

    SCALAR(0x8107e78)

"$Exclamation", you say, "I was expecting 'a simple string', not a reference!". This is because the slots within the globs only contain references, and these references point to the values. So what we really wanted to say was

    $scalar = "a simple string";
    print ${ *scalar{SCALAR} }, "\n";

    __output__

    a simple string

Which is essentially just a complex way of saying

    $scalar = "a simple string";
    print $::scalar, "\n";

    __output__

    a simple string

So as you can probably guess perl's sigils are the conventional method of accessing the individual data types within globs. As for the likes of IO it has to be accessed specifically as perl doesn't provide an access sigil for it.
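
Since IO has no sigil of its own, here's a small sketch of reaching it by name, using the built-in STDOUT handle. The variable $io is just a name for the example, and the class reported by ref depends on your version of perl, so no exact output is promised for that line.

```perl
use strict;
use warnings;

my $io = *STDOUT{IO};                 # reference to the IO object in *STDOUT
print "class: ", ref($io), "\n";      # some IO:: class, varies by perl version
print {$io} "written through the IO slot\n";
```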

Something you may have noticed is that we're referencing the globs directly, without going through the symbol table. This is because globs are "global" and are not affected by strict. But if we wanted to access the globs via the symbol table then we would do it like so

    $scalar = "a simple string";
    print ${ *{$main::{scalar}}{SCALAR} }, "\n";

    __output__

    a simple string

Now the devious among you may be thinking something along the lines of "If it's a hash then why don't I just put any old value in there?". The answer to this of course, is that you can't as globs aren't hashes! So we can try, but we will fail like so

    ${ *scalar{FOO} } = "the FOO data type";

    __output__

    Can't use an undefined value as a SCALAR reference at - line 1.

So we can't force a new type into the glob, we'll only ever get an undefined value when an undefined slot is accessed. But if we were to use SCALAR instead of FOO then the $scalar variable would contain "the FOO data type".

Another thing to be noted from the above example is that you can't assign to glob slots directly, only through dereferencing them.

    ## this is fine as we're dereferencing the stored reference
    ${ *foo{SCALAR} } = "a string";

    ## this will generate a compile-time error
    *foo{SCALAR} = "a string";

    __output__

    Can't modify glob elem in scalar assignment at - line 5, near ""a string";"

As one might imagine, having to dereference a glob slot of the correct data type every time one wants to assign to a glob can be tedious and occasionally prohibitive. Thankfully, globs come with some of perl's yet to be patented Magic, so that when you assign a reference to a glob the correct slot will be filled depending on the data type being used in the assignment, e.g.

    *foo = \"a scalar";
    print $foo, "\n";

    *foo = [ qw( a list of strings ) ];
    print @foo, "\n";

    *foo = sub { "a subroutine" };
    print foo(), "\n";

    __output__

    a scalar
    alistofstrings
    a subroutine

Note that we're using references there as globs only contain references, not the actual values. If you assign a value to a glob, it will assign the glob to a glob of the name corresponding to the value. Here's some code to help clarify that last sentence

    use Data::Dumper;

    ## use a fresh uncluttered package for minimal Dumper output
    {
        package globtut;

        *foo = "string";

        print Data::Dumper::Dumper(\%globtut::);
    }

    __output__

    $VAR1 = {
              'string' => *globtut::string,
              'foo' => *globtut::string
            };

So when the glob *foo is assigned "string" it then points to the glob *string. But this is generally not what you want, so moving on swiftly ...
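
Going back to assigning references for a moment, it's worth seeing that the glob ends up pointing at the very same data rather than a copy of it. A minimal sketch, with @original and @alias being names invented for the example:

```perl
use strict;
use warnings;

our (@original, @alias);
@original = (1, 2, 3);

*alias = \@original;      # fills only the ARRAY slot of *alias
push @alias, 4;           # so this changes @original too

print "@original\n";      # 1 2 3 4
print \@alias == \@original ? "same array\n" : "different arrays\n";
```

This aliasing is exactly what an import method relies on when it assigns a subroutine reference into somebody else's symbol table.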

Bringing it all together

Now that we have some knowledge of symbol tables and globs let's put them to use by implementing an import method.

When use()ing a module the import method is called from that module. The purpose of this is so that you can import things into the calling package. This is what Exporter does: it imports the things listed in @EXPORT and, on request, those in @EXPORT_OK (see the Exporter docs for more details). An import method will do this by assigning things to the caller's symbol table.

We'll now write a very simple import method to import all the subroutines into the caller's package

    ## put this code in Foo.pm
    package Foo;

    use strict;

    sub import {
        ## find out who is calling us
        my $pkg = caller;

        ## while strict doesn't deal with globs, it still
        ## catches symbolic de/referencing
        no strict 'refs';

        ## iterate through all the globs in the symbol table
        foreach my $glob (keys %Foo::) {
            ## skip anything without a subroutine and 'import'
            next if not defined *{$Foo::{$glob}}{CODE}
                 or $glob eq 'import';

            ## assign subroutine into caller's package
            *{$pkg . "::$glob"} = \&{"Foo::$glob"};
        }
    }

    ## this won't be imported ...
    $Foo::testsub = "a string";
    ## ... but this will
    sub testsub {
        print "this is a testsub from Foo\n";
    }
    ## and so will this
    sub fooify {
        return join " foo ", @_;
    }

    q</package Foo>;

Now for the demonstration code

    use Data::Dumper;

    ## we'll stay out of the 'polluted' %main:: symbol table
    {
        package globtut;

        use Foo;

        testsub();
        print "no \$testsub defined\n" unless defined $testsub;
        print "fooified: ", fooify(qw( ichi ni san shi )), "\n";

        print Data::Dumper::Dumper(\%globtut::);
    }

    __output__

    this is a testsub from Foo
    no $testsub defined
    fooified: ichi foo ni foo san foo shi
    $VAR1 = {
              'testsub' => *globtut::testsub,
              'BEGIN' => *globtut::BEGIN,
              'fooify' => *globtut::fooify
            };

Hurrah, we have successfully imported Foo's subroutines into the globtut symbol table (the BEGIN there is somewhat magical and created during the use).
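
If importing everything feels a bit heavy-handed, one possible variation is to import only the names the caller asks for. The following is just a sketch of that idea and not how Exporter itself is implemented. The packages GlobTutExport and GlobTutCaller and the subs shout and whisper are all invented for the example, and everything is kept in one file so the import is called from a BEGIN block rather than through use.

```perl
use strict;
use warnings;

{
    package GlobTutExport;            # hypothetical exporting package

    sub import {
        my ($class, @wanted) = @_;
        my $caller = caller;

        no strict 'refs';             # we need symbolic glob lookups
        for my $name (@wanted) {
            ## look in the CODE slot of the requested glob
            my $code = *{"${class}::$name"}{CODE}
                or die "$class has no subroutine named $name\n";
            ## fill the CODE slot of the caller's glob
            *{"${caller}::$name"} = $code;
        }
    }

    sub shout   { return uc join ' ', @_ }
    sub whisper { return lc join ' ', @_ }
}

{
    package GlobTutCaller;            # hypothetical calling package

    BEGIN { GlobTutExport->import('shout') }

    our $got = shout('hello there');
    print "$got\n";                   # HELLO THERE
    print __PACKAGE__->can('whisper')
        ? "whisper imported\n"
        : "whisper not imported\n";   # whisper not imported
}
```

Only shout was asked for, so only the CODE slot of *GlobTutCaller::shout gets filled and whisper stays at home.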

Summary

So in summary, symbol tables store globs and can be treated like hashes. Globs are accessed like hashes and store references to the individual data types. I hope you've learned something along the way and can now go forth and munge these two no longer mysterious aspects of perl with confidence!

_________
broquaint


In reply to Of Symbol Tables and Globs by broquaint
