<?xml version="1.0" encoding="windows-1252"?>
<node id="211441" title="Of Symbol Tables and Globs" created="2002-11-08 09:53:49" updated="2005-08-14 10:29:40">
<type id="120">
perlmeditation</type>
<author id="87452">
broquaint</author>
<data>
<field name="doctext">
Welcome to Of Symbol Tables and Globs, where you'll be taken on a journey through the
inner workings of those mysterious perlish substances: globs and symbol tables.
We'll start off in the land of symbol tables, where the globs live, and in the
second part of the tutorial progress on to the glob creatures themselves.
&lt;p/&gt;

&lt;readmore&gt;

&lt;b&gt;Symbol tables&lt;/b&gt;
&lt;p/&gt;

Perl has two different types of variables: lexical and package global. In this
particular tutorial we'll only be covering package global variables, as lexical
variables have nothing to do with globs or symbol tables (see [id://213855] for more information on lexical variables).
&lt;p/&gt;

Now a package global
variable can only live within a symbol table and is visible from anywhere in the
program (versus a lexical variable, which is confined to its enclosing scope); its
value can be dynamically scoped with &lt;tt&gt;local&lt;/tt&gt;. To be more accurate,
package global variables live in slots within globs which themselves live
in the symbol tables.
&lt;p/&gt;
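
To make the distinction concrete, here's a small sketch (the variable names
are made up for illustration) that asks the symbol table whether it knows
about a package global and a lexical of the same package

&lt;code&gt;
use strict;
use warnings;

{
  package globtut;

  our $global  = "I live in a glob";
  my  $lexical = "I live in a scratchpad";

  ## only the package global has an entry in %globtut::
  print "global:  ", (exists $globtut::{global}  ? "found" : "missing"), "\n";
  print "lexical: ", (exists $globtut::{lexical} ? "found" : "missing"), "\n";
}

## prints:
## global:  found
## lexical: missing
&lt;/code&gt;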

Symbol tables come about in various ways, but the most
common way in which they are created is through the &lt;tt&gt;[perlfunc:package|package]&lt;/tt&gt;
declaration. Every variable, subroutine, filehandle and format
declared within a package will live in a glob slot within the given package's
symbol table (this is of course excluding any lexical declarations)

&lt;code&gt;
## create an anonymous block to limit the scope of the package
{
  package globtut;

  $var = "a string";
  @var  = qw( a list of strings );

  sub var { }
}

use Data::Dumper;

print Dumper(\%globtut::);

__output__

$VAR1 = { 
          'var' =&gt; *globtut::var
        };
&lt;/code&gt;

There we create a symbol table with &lt;tt&gt;package globtut&lt;/tt&gt;, then
the scalar, array and subroutine are all 'put' into the &lt;tt&gt;*var&lt;/tt&gt;
glob because they all share the same name. The variables land in the current
package's symbol table implicitly, so if we wanted to explicitly declare them into
the &lt;tt&gt;globtut&lt;/tt&gt; symbol table we'd do the following

&lt;code&gt;
$globtut::var = "a string";
@globtut::var  = qw( a list of strings );

sub globtut::var { }

use Data::Dumper;

print Dumper(\%globtut::);

__output__

$VAR1 = { 
          'var' =&gt; *globtut::var
        };
&lt;/code&gt;

Notice how we didn't use a &lt;tt&gt;[perlfunc:package|package]&lt;/tt&gt; declaration
there? This is because the &lt;tt&gt;globtut&lt;/tt&gt; symbol table is
&lt;i&gt;auto-vivified&lt;/i&gt; when &lt;tt&gt;$globtut::var&lt;/tt&gt; is declared.
&lt;p/&gt;

Something else to note about the symbol table
is that it has two colons appended to the name, so &lt;tt&gt;globtut&lt;/tt&gt;
became &lt;tt&gt;%globtut::&lt;/tt&gt;. This means that any packages that live below it will
have &lt;tt&gt;::&lt;/tt&gt; prepended to their names, so if we add a child package it will be neatly
separated by the double colons, e.g.

&lt;code&gt;
use Data::Dumper;
{
  package globtut;
  package globtut::child;
  ##             ^^
}
&lt;/code&gt;
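
As a rough check of that nesting (the &lt;tt&gt;$var&lt;/tt&gt; variable is only there
for illustration), we can ask the parent's symbol table for the child's entry,
which is stored under the key &lt;tt&gt;child::&lt;/tt&gt;

&lt;code&gt;
use strict;
use warnings;

{
  package globtut::child;
  our $var = "a string";
}

## the child's symbol table is the 'child::' entry of the parent's
print "child:: found in the globtut symbol table\n"
    if exists $globtut::{'child::'};

## and the fully qualified name reaches the variable from anywhere
print "same variable: $globtut::child::var\n";

## prints:
## child:: found in the globtut symbol table
## same variable: a string
&lt;/code&gt;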

Another attribute of symbol tables, demonstrated when &lt;tt&gt;%globtut::&lt;/tt&gt; was dumped above,
is that they are accessed just like normal perl hashes. In fact, they are like normal
hashes in many respects: you can perform all the normal hash operations on a symbol
table and add normal key-value pairs. If you're brave enough to look under the hood
you'll notice that they are in fact hashes, but with a touch of that perl &lt;i&gt;Magic&lt;/i&gt;.
Here are some examples of hash operations being used on symbol tables (the order of
the keys may differ on your perl)

&lt;code&gt;
use Data::Dumper;
{
  package globtut;

  $foo = "a string";

  $globtut::{bar} = "I'm not even a glob!";
  %globtut::baz:: = %globtut::;

  print Data::Dumper::Dumper(\%globtut::baz::);

  print "keys:    ", join(', ', keys %globtut::),   $/;
  print "values:  ", join(', ', values %globtut::), $/;
  print "each:    ", join(' =&gt; ', each %globtut::), $/;

  print "exists:  ", (exists $globtut::{foo}  &amp;&amp; "exists"),  $/;
  print "delete:  ", (delete $globtut::{foo}  &amp;&amp; "deleted"), $/;
  print "defined: ", (defined $globtut::{foo} || "no foo"),  $/;
}

__output__

$VAR1 = {
          'foo' =&gt; *globtut::foo,
          'bar' =&gt; 'I\'m not even a glob!',
          'baz::' =&gt; *{'globtut::baz::'}
        };
keys:    foo, bar, baz::
values:  *globtut::foo, I'm not even a glob!, *globtut::baz::
each:    foo =&gt; *globtut::foo
exists:  exists
delete:  deleted
defined: no foo
&lt;/code&gt;
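
Since a symbol table can be walked like a hash, one practical use is finding out
which names in a package have a subroutine attached. This is only a sketch with
made up names, and it looks each glob up by name, which is why
&lt;tt&gt;strict 'refs'&lt;/tt&gt; is switched off

&lt;code&gt;
use strict;
use warnings;

{
  package globtut;

  our $answer = 42;
  sub hello   { "hello" }
  sub goodbye { "goodbye" }
}

## symbolic glob lookups by name need strict refs relaxed
no strict 'refs';

## sort the keys since hash order is not guaranteed
foreach my $name (sort keys %globtut::) {
    my $has_code = defined *{"globtut::$name"}{CODE};
    print "$name: ", ($has_code ? "subroutine" : "no subroutine"), "\n";
}

## prints:
## answer: no subroutine
## goodbye: subroutine
## hello: subroutine
&lt;/code&gt;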

So to access the globs within the &lt;tt&gt;globtut&lt;/tt&gt; symbol table we access
the desired key which will correspond to a variable name

&lt;code&gt;
{
  package globtut;

  $variable = "a string";
  @variable  = qw( a list of strings );

  sub variable { }

  print $globtut::{variable}, "\n";
}

__output__

*globtut::variable
&lt;/code&gt;

And if we want to add another glob to a symbol table we add it exactly
like we would with a hash

&lt;code&gt;
{
  package globtut;

  $foo = "a string";
  $globtut::{variable} = *foo;

  print "\$variable: $variable\n";
}

__output__

$variable: a string
&lt;/code&gt;

If you'd like to see some more advanced uses of symbol tables and symbol
table manipulation then check out the &lt;tt&gt;[Symbol]&lt;/tt&gt; module which
comes with the core perl distribution, and more specifically the
&lt;tt&gt;Symbol::gensym&lt;/tt&gt; function.
&lt;p/&gt;
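
As a taste of what &lt;tt&gt;Symbol::gensym&lt;/tt&gt; gives you, here's a minimal sketch
that creates an anonymous glob and fills two of its slots. It borrows the slot
access and glob assignment syntax that is explained in the Globs section

&lt;code&gt;
use strict;
use warnings;
use Symbol qw( gensym );

## gensym returns a reference to a brand new anonymous glob
my $sym = gensym;

## fill two of its slots by assigning references to the glob
*{$sym} = \"a scalar in an anonymous glob";
*{$sym} = [ qw( a list of strings ) ];

print ref($sym), "\n";                        ## GLOB
print ${ *{$sym}{SCALAR} }, "\n";             ## a scalar in an anonymous glob
print join(' ', @{ *{$sym}{ARRAY} }), "\n";   ## a list of strings
&lt;/code&gt;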

&lt;b&gt;Globs&lt;/b&gt;
&lt;p/&gt;

So we can now see that globs live within symbol tables, but that doesn't tell
us a lot about the globs themselves, so this section of the tutorial shall endeavour
to explain them.
&lt;p/&gt;

Within a glob are 6 slots where the various perl data types will be stored.
The 6 slots which are available are
&lt;p/&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;tt&gt;SCALAR&lt;/tt&gt; - scalar variables
  &lt;li&gt;&lt;tt&gt;ARRAY&lt;/tt&gt; - array variables
  &lt;li&gt;&lt;tt&gt;HASH&lt;/tt&gt; - hash variables
  &lt;li&gt;&lt;tt&gt;CODE&lt;/tt&gt; - subroutines
  &lt;li&gt;&lt;tt&gt;IO&lt;/tt&gt; - directory/file handles
  &lt;li&gt;&lt;tt&gt;FORMAT&lt;/tt&gt; - formats
&lt;/ul&gt;
&lt;p/&gt;

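All these slots are accessible [id://201967|bar the FORMAT slot], at least on older
perls. Why this is I don't know, but I don't think it's of any great loss (and as far
as I can tell from perlref, more recent perls do accept &lt;tt&gt;*foo{FORMAT}&lt;/tt&gt;).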
&lt;p/&gt;
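
To see the slots side by side, here's a sketch that fills four of them under one
name (&lt;tt&gt;thing&lt;/tt&gt; is just an illustrative name) and asks the glob what each
slot holds, using the &lt;tt&gt;*name{SLOT}&lt;/tt&gt; syntax covered below. A slot that has
never been used comes back undefined

&lt;code&gt;
use strict;
use warnings;

our $thing = "a string";
our @thing = qw( a list of strings );
our %thing = ( key =&gt; "value" );
sub thing { "a subroutine" }

foreach my $slot (qw( SCALAR ARRAY HASH CODE IO )) {
    my $ref = *thing{$slot};
    print "$slot: ", (defined $ref ? ref($ref) : "empty"), "\n";
}

## prints:
## SCALAR: SCALAR
## ARRAY: ARRAY
## HASH: HASH
## CODE: CODE
## IO: empty
&lt;/code&gt;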

It may be asked why there isn't a &lt;tt&gt;GLOB&lt;/tt&gt; type in that list.
&lt;tt&gt;*foo{GLOB}&lt;/tt&gt; is valid, but it only hands back a reference to the glob
itself, as globs are containers or meta-types (depending on how you want to see
it) not data types.
&lt;p/&gt;

Accessing globs is similar to accessing hashes, except we use
the &lt;tt&gt;*&lt;/tt&gt; sigil and the only keys are those data types listed above
&lt;p/&gt;

&lt;code&gt;
$scalar = "a simple string";
print *scalar{SCALAR}, "\n";

__output__

SCALAR(0x8107e78)
&lt;/code&gt;

"$Exclamation", you say, "I was expecting 'a simple string', not a reference!".
This is because the slots within the globs only contain references, and these
references point to the values. So what we really wanted to say was

&lt;code&gt;
$scalar = "a simple string";
print ${ *scalar{SCALAR} }, "\n";

__output__

a simple string
&lt;/code&gt;

Which is essentially just a complex way of saying

&lt;code&gt;
  $scalar = "a simple string";
  print $::scalar, "\n";

  __output__

  a simple string
&lt;/code&gt;

So, as you can probably guess, perl's sigils are the conventional method of
accessing the individual data types within globs. The likes of &lt;tt&gt;IO&lt;/tt&gt;, however,
have to be accessed explicitly as perl doesn't provide an access sigil
for them.
&lt;p/&gt;
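
For instance, here's a minimal sketch of pulling the &lt;tt&gt;IO&lt;/tt&gt; slot out of the
built-in &lt;tt&gt;*STDOUT&lt;/tt&gt; glob and printing through it

&lt;code&gt;
use strict;
use warnings;

## grab the IO slot of the STDOUT glob
my $io = *STDOUT{IO};

## the reference can be used wherever a filehandle is expected
print {$io} "written through the IO slot\n";
&lt;/code&gt;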

Something you may have noticed is that we're referencing the globs directly,
without going through the symbol table. This is because globs are "global"
and are not affected by &lt;tt&gt;[strict]&lt;/tt&gt;. But if we wanted to access the
globs via the symbol table then we would do it like so

&lt;code&gt;
$scalar = "a simple string";
print ${ *{$main::{scalar}}{SCALAR} }, "\n";

__output__

a simple string
&lt;/code&gt;

Now the devious among you may be thinking something along the lines of
"If it's a hash then why don't I just put any old value in there?".
The answer to this of course, is that you can't as globs aren't hashes!
So we can try, but we will fail like so

&lt;code&gt;
${ *scalar{FOO} } = "the FOO data type";

__output__

Can't use an undefined value as a SCALAR reference at - line 1.
&lt;/code&gt;

So we can't force a new type into the glob, we'll only ever get an undefined
value when an undefined slot is accessed.  But if we were to use &lt;tt&gt;SCALAR&lt;/tt&gt;
instead of &lt;tt&gt;FOO&lt;/tt&gt; then the &lt;tt&gt;$scalar&lt;/tt&gt; variable would contain
&lt;tt&gt;"the FOO data type"&lt;/tt&gt;.
&lt;p/&gt;

Another thing to be noted from the above example is that you can't assign to
glob slots directly, only through dereferencing them.

&lt;code&gt;
## this is fine as we're dereferencing the stored reference
${ *foo{SCALAR} } = "a string";

## this will generate a compile-time error
*foo{SCALAR} = "a string";

__output__

Can't modify glob elem in scalar assignment at - line 5, near ""a string";"
&lt;/code&gt;

As one might imagine, having to dereference a glob slot of the correct data type every
time one wants to assign to a glob can be tedious and occasionally prohibitive.
Thankfully, globs come with some of perl's yet to be patented
&lt;i&gt;Magic&lt;/i&gt;, so that when you assign a reference to a glob the correct slot will be
filled depending on the type of reference being used in the assignment, e.g.

&lt;code&gt;
*foo = \"a scalar";
print $foo, "\n";

*foo = [ qw( a list of strings ) ];
print @foo, "\n";


*foo = sub { "a subroutine" };
print foo(), "\n";

__output__

a scalar
alistofstrings
a subroutine
&lt;/code&gt;
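
One consequence worth spelling out is that assigning a reference only replaces
the matching slot, so this is a handy way to alias a single variable. Here's a
sketch with made up names showing that the array slot is shared while the scalar
slot is left alone

&lt;code&gt;
use strict;
use warnings;

our @original = qw( ichi ni san );
our @alias;
our $alias = "still a separate scalar";

## only the ARRAY slot of *alias is replaced
*alias = \@original;

## this push is visible through both names
push @alias, "shi";

print "@original\n";   ## ichi ni san shi
print "$alias\n";      ## still a separate scalar
&lt;/code&gt;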

Note that we're using references there as globs only contain references, not
the actual values. If you assign a plain string to a glob instead, the glob is
aliased to the glob whose name matches that string. Here's some code to help clarify
that last sentence

&lt;code&gt;
use Data::Dumper;
## use a fresh uncluttered package for minimal Dumper output
{
  package globtut;

  *foo = "string";

  print Data::Dumper::Dumper(\%globtut::);
}

__output__

$VAR1 = {
      'string' =&gt; *globtut::string,
      'foo' =&gt; *globtut::string
};
&lt;/code&gt;

So when the glob &lt;tt&gt;*foo&lt;/tt&gt; is assigned &lt;tt&gt;"string"&lt;/tt&gt; it then points to
the glob &lt;tt&gt;*string&lt;/tt&gt;. But this is generally not what you want, so moving
on swiftly ...
&lt;p/&gt;
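
Assigning one glob to another on purpose is a different story, as it aliases
every slot in one go. A small sketch, again with made up names

&lt;code&gt;
use strict;
use warnings;

our $long_name = "a string";
our @long_name = qw( a list of strings );
our ($short, @short);

## every slot of *short now refers to the same things as *long_name
*short = *long_name;

$short = "changed through the alias";

print "$long_name\n";          ## changed through the alias
print scalar(@short), "\n";    ## 4
&lt;/code&gt;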

&lt;b&gt;Bringing it all together&lt;/b&gt;
&lt;p/&gt;

Now that we have some knowledge of symbol tables and globs let's put them to
use by implementing an &lt;tt&gt;[perlfunc:import|import]&lt;/tt&gt; method.
&lt;p/&gt;

When &lt;tt&gt;[perlfunc:use|use]&lt;/tt&gt;()ing a module, that module's &lt;tt&gt;[perlfunc:import|import]&lt;/tt&gt; method
is called (if it has one).
The purpose of this is so that the module can import things into the calling package.
This is what &lt;tt&gt;[Exporter]&lt;/tt&gt; does: it imports the things listed in &lt;tt&gt;@EXPORT&lt;/tt&gt; and,
on request, those listed in &lt;tt&gt;@EXPORT_OK&lt;/tt&gt; (see the [Exporter|Exporter docs] for more details).
An import method will do this by assigning things to the caller's symbol table.
&lt;p/&gt;

We'll now write a very simple import method to import all the subroutines into
the caller's package

&lt;code&gt;
## put this code in Foo.pm

package Foo;

use strict;

sub import {
    ## find out who is calling us
    my $pkg = caller;

    ## while strict doesn't deal with globs, it still
    ## catches symbolic de/referencing
    no strict 'refs';

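    ## iterate through all the names in the symbol table
    foreach my $glob (keys %Foo::) {
        ## skip anything without a subroutine and 'import';
        ## looking the glob up by name also copes with perls
        ## that keep a bare code reference in the symbol table
        next if not defined *{"Foo::$glob"}{CODE}
                or $glob eq 'import';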

        ## assign subroutine into caller's package
        *{$pkg . "::$glob"} = \&amp;{"Foo::$glob"};
    }
}

## this won't be imported ...
$Foo::testsub = "a string";

## ... but this will
sub testsub {
    print "this is a testsub from Foo\n";
}

## and so will this
sub fooify {
    return join " foo ", @_;
}

q&lt;/package Foo&gt;;
&lt;/code&gt;

Now for the demonstration code

&lt;code&gt;
use Data::Dumper;
## we'll stay out of the 'polluted' %main:: symbol table
{
  package globtut;

  use Foo;

  testsub();

  print "no \$testsub defined\n"
      unless defined $testsub;

  print "fooified: ", fooify(qw( ichi ni san shi )), "\n";

  print Data::Dumper::Dumper(\%globtut::);
}

__output__

this is a testsub from Foo
no $testsub defined
fooified: ichi foo ni foo san foo shi
$VAR1 = {
          'testsub' =&gt; *globtut::testsub,
          'BEGIN' =&gt; *globtut::BEGIN,
          'fooify' =&gt; *globtut::fooify
        };
&lt;/code&gt;
 
Hurrah, we have successfully imported &lt;tt&gt;Foo&lt;/tt&gt;'s subroutines into
the &lt;tt&gt;globtut&lt;/tt&gt; symbol table (the &lt;tt&gt;BEGIN&lt;/tt&gt; there is somewhat magical
and created during the &lt;tt&gt;[perlfunc:use|use]&lt;/tt&gt;).
&lt;p/&gt;
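
Importing everything is rarely what you want, so here's a sketch of a variation
that only imports the subroutines named in the import list. It isn't how
&lt;tt&gt;Exporter&lt;/tt&gt; does it, and to keep it in a single file the package is
defined inline and &lt;tt&gt;import&lt;/tt&gt; is called by hand from a &lt;tt&gt;BEGIN&lt;/tt&gt; block

&lt;code&gt;
package Foo;

use strict;
use warnings;

sub import {
    my($class, @names) = @_;
    my $pkg = caller;

    no strict 'refs';

    foreach my $name (@names) {
        ## look the subroutine up in our own symbol table
        my $code = *{"${class}::$name"}{CODE}
            or die "$class has no subroutine called '$name'\n";

        ## assign it to the CODE slot of the caller's glob
        *{"${pkg}::$name"} = $code;
    }
}

sub testsub { "this is a testsub from Foo" }
sub fooify  { join " foo ", @_ }

package main;

## use() would call this for us if Foo lived in Foo.pm
BEGIN { Foo-&gt;import('fooify') }

print fooify(qw( ichi ni san )), "\n";   ## ichi foo ni foo san

## prints "testsub was not imported" as we never asked for it
print "testsub was not imported\n"
    unless exists $main::{testsub};
&lt;/code&gt;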

&lt;b&gt;Summary&lt;/b&gt;
&lt;p/&gt;

So in summary, symbol tables store globs and can be treated like hashes.
Globs are accessed &lt;i&gt;like&lt;/i&gt; hashes and store references to the individual data types.
I hope you've learned something along the way and can now go forth and
munge these two no longer mysterious aspects of perl with confidence!

&lt;/readmore&gt;

&lt;p/&gt;
&lt;tt&gt;_________&lt;br&gt;&lt;u&gt;broquaint&lt;/u&gt;&lt;/tt&gt;</field>
</data>
</node>
