Welcome to Of Symbol Tables and Globs, where you'll be taken on a journey through the inner workings of those mysterious perlish substances: globs and symbol tables. We'll start off in the land of symbol tables, where the globs live, and in the second part of the tutorial move on to the glob creatures themselves.
Perl has two different types of variables: lexical and package global. In this particular tutorial we'll only be covering package global variables, as lexical variables have nothing to do with globs or symbol tables (see Lexical scoping like a fox for more information on lexical variables).
Now a package global variable can only live within a symbol table, and it is visible program-wide through its package name rather than being lexically scoped (it can be given a temporary, dynamically scoped value with local). To be more accurate, these variables live in slots within globs, which themselves live in the symbol tables.
A symbol table comes about in various ways, but the most common way to create one is through the package declaration. Every variable, subroutine, filehandle and format declared within a package will live in a glob slot within the given package's symbol table (this is of course excluding any lexical declarations).
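The listing that belongs here isn't reproduced, so what follows is only a minimal sketch of such a package declaration. The names globtut and var are taken from the surrounding text; the values and the use of our (so that the code also runs under strict) are assumptions.

```perl
use strict;
use warnings;

package globtut;

# Assumed example data: a scalar, an array and a subroutine sharing one name.
# `our` declares package globals, so all three end up in the *globtut::var glob.
our $var = "a string";
our @var = qw( a list of strings );
sub var { return "a subroutine" }

package main;

print "$globtut::var\n";
print "@globtut::var\n";
print globtut::var(), "\n";
```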
There we create a symbol table with package globtut, then the scalar, array and subroutine are all 'put' into the *var glob because they all share the same name. This is implicit behavior for the vars, so if we wanted to explicitly declare the vars into the globtut symbol table we'd do the following
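A sketch of the explicit form, again with assumed values, followed by a simple dump of the resulting symbol table. The dump loop is an illustration and not necessarily how the original listing displayed it.

```perl
use strict;
use warnings;

# No `package globtut;` here: the fully qualified names are enough.
$globtut::var = "a string";
@globtut::var = qw( a list of strings );
sub globtut::var { return "a subroutine" }

# Each key of %globtut:: is a name and each value is the glob for that name.
print "$_ => $globtut::{$_}\n" for sort keys %globtut::;
```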
Notice how we didn't use a package declaration there? This is because the globtut symbol table is auto-vivified when $globtut::var is declared.
Something else to note about the symbol table is that it has two colons appended to the name, so globtut became %globtut::. This means that any packages that live below it will have :: prepended to their names, so if we add a child package it will be neatly separated by the double colons, e.g.
Another attribute of symbol tables, demonstrated when %globtut:: was dumped above, is that they are accessed just like normal perl hashes. In fact, they are like normal hashes in many respects: you can perform all the normal hash operations on a symbol table and add normal key-value pairs. If you're brave enough to look under the hood you'll notice that they are in fact hashes, but with a touch of that perl Magic. Here are some examples of hash operations being used on symbol tables
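As a rough illustration, a child package can be brought into being the same way. The name globtut::child and the values are assumptions made up for this sketch.

```perl
use strict;
use warnings;

$globtut::var        = "parent";
$globtut::child::var = "child";    # auto-vivifies the nested symbol table

print "$globtut::var / $globtut::child::var\n";

# The child table shows up in its parent under a key ending in '::'.
print "globtut holds: ", join(", ", sort keys %globtut::), "\n";
print "globtut::child holds: ", join(", ", sort keys %globtut::child::), "\n";
```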
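A small sketch of the sort of hash operations meant here. The variable names are assumptions, and deleting entries from a live symbol table is shown purely for illustration.

```perl
use strict;
use warnings;
no warnings 'once';    # each name below is only mentioned once on purpose

$globtut::var  = "a string";
@globtut::list = qw( a list of strings );

# exists, keys and element lookup all behave as they do on an ordinary hash.
print "var exists\n" if exists $globtut::{var};
print "entries: ", join(", ", sort keys %globtut::), "\n";
print "$_ => $globtut::{$_}\n" for sort keys %globtut::;

# delete removes the entry (and with it the name) from the symbol table.
delete $globtut::{list};
print "list is gone\n" unless exists $globtut::{list};
```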
So to access the globs within the globtut symbol table we access the desired key which will correspond to a variable name
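For instance, something along these lines (the values are assumed):

```perl
use strict;
use warnings;

$globtut::var = "a string";
@globtut::var = qw( a list of strings );

# The value stored under the key 'var' is the glob *globtut::var.
my $glob = $globtut::{var};
print "$glob\n";

# Dereferencing the glob with a sigil reaches the matching variable.
print ${ *$glob }, "\n";
print "@{ *$glob }\n";
```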
And if we want to add another glob to a symbol table we add it exactly like we would with a hash
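A hedged sketch of adding an entry: the key alias is a made-up name, and the new entry simply reuses the existing glob so that both names share the same data. The new name is looked up at run time (hence the no strict 'refs') so the example doesn't depend on what the compiler has already created.

```perl
use strict;
use warnings;

$globtut::var = "a string";

# Store the existing glob under a new key, exactly as with a hash.
$globtut::{alias} = *globtut::var;

{
    no strict 'refs';    # look the new name up symbolically at run time
    print ${"globtut::alias"}, "\n";
}
```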
If you'd like to see some more advanced uses of symbol tables and symbol table manipulation then check out the Symbol module which comes with the core perl distribution, and more specifically the Symbol::gensym function.
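As a taste of it, here is a minimal sketch using Symbol::gensym, which returns a reference to a brand new anonymous glob. The data stored in it is invented for the example.

```perl
use strict;
use warnings;
use Symbol qw( gensym );

my $sym = gensym;              # reference to a fresh, anonymous glob
print ref($sym), "\n";

# The anonymous glob has the usual slots, reached by dereferencing.
${ *$sym } = "scalar slot";
@{ *$sym } = qw( array slot );

print ${ *$sym }, "\n";
print "@{ *$sym }\n";
```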
So we can now see that globs live within symbol tables, but that doesn't tell us a lot about globs themselves and so this section of the tutorial shall endeavour to explain them.
Within a glob are 6 slots where the various perl data types will be stored. The 6 slots which are available are SCALAR (scalar variables), ARRAY (array variables), HASH (hash variables), CODE (subroutines), IO (file and directory handles) and FORMAT (formats).
All these slots are accessible, bar the FORMAT slot on older perls. Why that was I don't know, but I don't think it was any great loss; more recent perls document *foo{FORMAT} in perlref alongside the others.
You may ask why there isn't a GLOB type. The answer is that globs are containers or meta-types (depending on how you want to see it), not data types.
Accessing globs is similar to accessing hashes, except we use the * sigil and the only keys are those data types listed above
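A minimal sketch of that access, assuming a package scalar named $scalar holding 'a simple string' as the following paragraph suggests:

```perl
use strict;
use warnings;

our $scalar = "a simple string";

# Ask the *scalar glob for its SCALAR slot.
# This prints a reference, something like SCALAR(0x...).
print *scalar{SCALAR}, "\n";
```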
"$Exclamation", you say, "I was expecting 'a simple string', not a reference!". This is because the slots within the globs only contain references, and these references point to the values. So what we really wanted to say was
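In other words, something like this sketch (same assumed $scalar as before):

```perl
use strict;
use warnings;

our $scalar = "a simple string";

# The slot holds a reference, so dereference it to reach the value.
print ${ *scalar{SCALAR} }, "\n";
```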
Which is essentially just a complex way of saying
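That is, the plain sigil form. The sketch below also compares the two references to suggest that both routes reach the same storage:

```perl
use strict;
use warnings;

our $scalar = "a simple string";

print $scalar, "\n";

# A reference taken with \ and the one in the SCALAR slot compare equal.
print "same scalar\n" if \$scalar == *scalar{SCALAR};
```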
So, as you can probably guess, perl's sigils are the conventional method of accessing the individual data types within globs. As for the likes of IO, it has to be accessed specifically, as perl doesn't provide an access sigil for it.
Something you may have noticed is that we're referencing the globs directly, without going through the symbol table. This is because globs are "global" and are not affected by strict. But if we wanted to access the globs via the symbol table then we would do it like so
Now the devious among you may be thinking something along the lines of "If it's a hash then why don't I just put any old value in there?". The answer to this, of course, is that you can't, as globs aren't hashes! So we can try, but we will fail like so
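A short sketch of reaching the IO slot, using the built-in STDOUT handle as the example. The class name that ref reports varies between perl versions, so it is printed rather than assumed.

```perl
use strict;
use warnings;

# There is no sigil for IO, so ask the glob for the slot by name.
my $io = *STDOUT{IO};
print ref($io), "\n";

# The IO object can be used wherever a filehandle is expected.
print {$io} "written through the IO slot\n";
```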
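A sketch of the symbol table route, assuming the $scalar variable lives in package main:

```perl
use strict;
use warnings;

our $scalar = "a simple string";

# Fetch the glob from the %main:: symbol table, take its SCALAR slot,
# then dereference the slot to get the value.
print ${ *{ $main::{scalar} }{SCALAR} }, "\n";
```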
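A hedged sketch of that failed attempt. The eval is there because, under strict refs, the assignment is expected to die rather than quietly do nothing; the exact error text isn't shown since it may differ between perls.

```perl
use strict;
use warnings;

our $scalar;

# Asking for a slot that doesn't exist just yields undef.
my $slot = *scalar{FOO};
print defined $slot ? "got a FOO slot\n" : "no FOO slot: undef\n";

# Trying to assign through it gets us nowhere.
my $ok = eval { ${ *scalar{FOO} } = "the FOO data type"; 1 };
print "assignment failed\n" unless $ok;

print defined $scalar ? "\$scalar is: $scalar\n" : "\$scalar is still undefined\n";
```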
So we can't force a new type into the glob, we'll only ever get an undefined value when an undefined slot is accessed. But if we were to use SCALAR instead of FOO then the $scalar variable would contain "the FOO data type".
Another thing to be noted from the above example is that you can't assign to glob slots directly, only through dereferencing them.
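To illustrate, here is a sketch assigning through dereferenced slots. The names and values are assumptions. The direct form is wrapped in a string eval because it is expected to be rejected when compiled.

```perl
use strict;
use warnings;

our ( $scalar, @list );

# Assigning through the dereferenced slots works.
${ *scalar{SCALAR} } = "assigned through the slot";
@{ *list{ARRAY} }    = qw( a list of strings );
print "$scalar\n@list\n";

# Assigning to the slot itself is not an option.
my $ok = eval q{ *scalar{SCALAR} = \"direct"; 1 };
print "direct slot assignment was rejected\n" unless $ok;
```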
As one might imagine, having to dereference a glob with the correct data type every time one wants to assign to it can be tedious and occasionally prohibitive. Thankfully, globs come with some of perl's yet to be patented Magic, so that when you assign a reference to a glob the correct slot will be filled depending on the data type being used in the assignment, e.g.
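A minimal sketch of glob assignment, with the glob name foo and all the values assumed for illustration:

```perl
use strict;
use warnings;

our ( $foo, @foo, %foo );

# The type of reference decides which slot of *foo gets filled.
*foo = \"a scalar";                   # SCALAR slot (a read-only literal here)
*foo = [ qw( a list ) ];              # ARRAY slot
*foo = { key => 'value' };            # HASH slot
*foo = sub { return "a subroutine" }; # CODE slot

print "$foo\n";
print "@foo\n";
print "$foo{key}\n";
print foo(), "\n";
```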
Note that we're using references there as globs only contain references, not the actual values. If you assign a value to a glob, it will assign the glob to a glob of the name corresponding to the value. Here's some code to help clarify that last sentence
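Something along these lines, as a sketch (the array contents are assumed):

```perl
use strict;
use warnings;

our ( @foo, @string );

# A plain string is taken as the name of another glob,
# so this behaves like *foo = *string.
*foo = "string";

@string = qw( a list of strings );
print "\@foo is: @foo\n";
```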
So when the glob *foo is assigned "string" it then points to the glob *string. But this is generally not what you want, so moving on swiftly ...
Bringing it all together
Now that we have some knowledge of symbol tables and globs let's put them to use by implementing an import method.
When use()ing a module, the import method is called from that module. The purpose of this is so that you can import things into the calling package. This is what Exporter does: it imports the things listed in @EXPORT and optionally @EXPORT_OK (see the Exporter docs for more details). An import method will do this by assigning things to the caller's symbol table.
We'll now write a very simple import method to import all the subroutines into the caller's package
Now for the demonstration code
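Neither listing is reproduced here, so the following is only a sketch covering both the import method and its demonstration. The module name Foo and the package globtut come from the text; the subroutines foo and bar are made up. To keep it runnable as a single file, Foo is inlined in a BEGIN block and its import is called by hand, which is an assumption standing in for a real Foo.pm loaded with use Foo.

```perl
use strict;
use warnings;

# Stand-in for Foo.pm, inlined so the sketch runs as one file.
BEGIN {
    package Foo;

    sub import {
        my $caller_pkg = caller;
        no strict 'refs';    # symbolic glob names are needed below
        for my $name ( keys %Foo:: ) {
            next if $name eq 'import';
            next unless defined &{"Foo::$name"};    # only names with a sub
            # Assigning a code ref fills the CODE slot of the caller's glob.
            *{"${caller_pkg}::$name"} = \&{"Foo::$name"};
        }
    }

    sub foo { return "foo" }
    sub bar { return "bar" }
}

package globtut;

# Roughly what `use Foo;` does once the module has been loaded.
BEGIN { Foo->import }

print foo(), " and ", bar(), "\n";
print join( ", ", sort keys %globtut:: ), "\n";
```

The check on defined &{"Foo::$name"} is a design choice for the sketch: it works from the name rather than poking at the symbol table value, since not every entry in a symbol table is guaranteed to be a full glob.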
Hurrah, we have successfully imported Foo's subroutines into the globtut symbol table (the BEGIN there is somewhat magical and created during the use).
So in summary, symbol tables store globs and can be treated like hashes. Globs are accessed like hashes and store references to the individual data types. I hope you've learned something along the way and can now go forth and munge these two no longer mysterious aspects of perl with confidence!