Tabari has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks

In an attempt at long-overdue code cleanup, I wanted to separate data from logic in my module, so I tried to read from the <DATA> filehandle in a BEGIN block, with the data placed neatly at the end of the file.
This did not work, although the same code worked fine in a normal subroutine. I finally reverted to a here-document instead.
This raises the following questions, in order of ascending vagueness:

1/ Is the DATA handle not initialized in BEGIN?
2/ Can I force such an initialization somehow, be it via another package?
3/ Is there a cleaner way for all this?

Tabari

Re: __DATA__ used in BEGIN
by moritz (Cardinal) on Nov 06, 2007 at 11:27 UTC
    You can't use <DATA> in a BEGIN block because that BEGIN block is executed at compile time, i.e. before the rest of the file is even parsed.

    If you don't need BEGIN time, you could try to stuff your routine that works with the data into a CHECK or INIT block.

    But it's hard to give advice without knowing why you need which data at compile time.
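The phase order described above can be sketched as follows. This is a minimal illustration, assuming the code is run as a plain script with perl; the @order array is a hypothetical log introduced only to record when each block fires, and the blocks are deliberately written in "reverse" order to show that position in the file does not decide when they run.

```perl
use strict;
use warnings;

# @order is a hypothetical log used only to record when each block runs.
our @order;

push @order, 'run time';          # ordinary statement: executes last

INIT  { push @order, 'INIT'  }    # runs after compilation, just before run time
CHECK { push @order, 'CHECK' }    # runs once compilation of the main program ends
BEGIN { push @order, 'BEGIN' }    # runs as soon as the parser has read this block

print join(' -> ', @order), "\n";
```

Run as a script, this should print "BEGIN -> CHECK -> INIT -> run time". Because BEGIN fires while the parser is still partway through the file, anything that appears later in the file, including a __DATA__ section, has not been seen yet.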

      Some further context does indeed seem appropriate.
      In our version control system, we have rules to associate the names of some objects with a specific location in a directory structure.
      We wrote a Perl wrapper around our VC system to automate this logic and to avoid the creation of too many categories. At that time, we were already faced with a history of objects which did not fit the pattern.
      For these objects, hashes were created as a kind of exception to the rule, and these were initialized when the module was loaded.
      Alas, as so often happens, exceptions proliferated, so this list became rather long, which led to my question above.
      In short, a (huge) hash has to be initialised at run time.
      Tabari

        Why can you not just:

        my %hugeHash = eval do {local $/; <DATA>};
        ...
        __DATA__
        exception1 => 'wibble',
        ...

        to load your hash from the data section? You could make it a whole lot more robust, but if you want quick and dirty that gets you going.


        Perl is environmentally friendly - it saves trees
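One way the eval-based loader above might be made more robust is to parse the entries explicitly instead of evaluating them as code. The sketch below is only an illustration: load_exceptions is a hypothetical name, the line format (name => 'value',) is assumed from the snippet above, and an in-memory filehandle stands in for DATA so the example runs on its own.

```perl
use strict;
use warnings;

# Parse lines of the form  name => 'value',  from any filehandle.
# The line format is an assumption taken from the snippet in the thread.
sub load_exceptions {
    my ($fh) = @_;
    my %hash;
    while (my $line = <$fh>) {
        next if $line =~ /^\s*(?:#.*)?$/;     # skip blank lines and comments
        if ($line =~ /^\s*(\w+)\s*=>\s*'([^']*)'\s*,?\s*$/) {
            $hash{$1} = $2;
        }
        else {
            chomp $line;
            die "Malformed exception entry at line $.: $line\n";
        }
    }
    return %hash;
}

# An in-memory handle stands in for DATA so the sketch is self-contained.
open my $fh, '<', \<<'__EOI__' or die "Can't open in-memory handle: $!";
exception1 => 'wibble',
exception2 => 'wobble',
__EOI__

my %hugeHash = load_exceptions($fh);
print "$_ => $hugeHash{$_}\n" for sort keys %hugeHash;
```

In a real module the call would presumably be load_exceptions(\*DATA), made from ordinary run-time code. Unlike the eval version, a typo in the data section produces an error message with a line number rather than silently yielding an empty or partial hash.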
        For these objects, hashes were created as a kind of exception to the rule, and these were initialized when the module was loaded.

        ...

        In short, a (huge) hash has to be initialised at run time.

        This is still confusing. Are you sure you need to do this in a BEGIN block? If you really need to initialize this hash "at runtime" as you say, then you can do it in normal code. (And if you can do it in normal code, I would suggest calling a sub that returns the hash, rather than using __DATA__.)

        Also unclear is whether this initialization is happening in a script or in a module. If it's a module and you just need the hash to be created when the module is used, then again a BEGIN block is not necessary. If, on the other hand, it's in a script and you need the hash to be created before any modules are loaded, then BEGIN is necessary.

        It would help to see some actual code here.
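The suggestion of calling a sub that returns the hash might look roughly like this. It is a sketch under assumptions: exception_map is a hypothetical name and the entries are placeholders. The sub is placed at the end of the file so the data stays apart from the logic; named subs are compiled before run time begins, so the call near the top works.

```perl
use strict;
use warnings;

# Run-time initialization: no BEGIN block and no __DATA__ section required.
my %hugeHash = exception_map();
print scalar(keys %hugeHash), " exceptions loaded\n";

# The data lives at the end of the file. exception_map() is a hypothetical
# name and the entries are placeholders.
sub exception_map {
    return (
        exception1 => 'wibble',
        exception2 => 'wobble',
    );
}
```

Since the data is ordinary Perl source, syntax errors in it are caught when the file is compiled, which is one possible reason to prefer this over reading a __DATA__ section.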

Re: __DATA__ used in BEGIN
by shmem (Canon) on Nov 06, 2007 at 12:08 UTC
    As moritz pointed out, BEGIN blocks are executed at compile time, so the rest of the file is not known at that time. So, no __DATA__ token has been seen yet, and its filehandle can't be opened.

    If you absolutely need __DATA__ in a BEGIN block, you have to roll your own __DATA__ handling (opening the file and positioning the handle, i.e. skipping everything up to the __DATA__ token):

    package Foo::Bar::Quux;
    print "now in runtime...\n";
    BEGIN {
        (my $package = __PACKAGE__) =~ s!::!/!g;
        my $file = $INC{$package.'.pm'};
        open DATA , '<', $file or die "Can't open '$file': $!\n";
        my $data = 0;
        while (<DATA>) {
            /^__DATA__$/ and $data = 1 and next;
            next unless $data;
            print "DATA: $_";
        };
    }
    1;
    __DATA__
    foo
    bar
    quux
    now in runtime...
    now in main

    qwurx [shmem] ~ > perl -e 'print "now in main\n"; use Foo::Bar::Quux'
    DATA: foo
    DATA: bar
    DATA: quux
    now in runtime...

    That's for a module. In a script, you would open $0.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Using a heredoc would have the same effect with less magic, although the code layout would be different.
      print "now in runtime...\n";
      BEGIN {
          open my $fh , '<', \<<'__EOI__' or die;
      foo
      bar
      quux
      now in runtime...
      now in main
      __EOI__
          while (<$fh>) {
              print "DATA: $_";
          }
      }
      ...
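Applied to the original problem, the heredoc approach could fill a hash at compile time along these lines. This is a sketch only: the %exception hash, the $count_at_compile_time variable, and the name/location entries are all made up for illustration, and the whitespace-separated line format is an assumption.

```perl
use strict;
use warnings;

our %exception;   # hypothetical package variable filled at compile time

BEGIN {
    # Opening a reference to a string gives an in-memory filehandle.
    open my $fh, '<', \<<'__EOI__' or die "Can't open in-memory handle: $!";
foo legacy/foo
bar legacy/misc/bar
__EOI__
    while (my $line = <$fh>) {
        chomp $line;
        my ($name, $location) = split ' ', $line, 2;   # placeholder format
        $exception{$name} = $location;
    }
}

# A later BEGIN block already sees the populated hash at compile time.
our $count_at_compile_time;
BEGIN { $count_at_compile_time = keys %exception }

print "entries known at compile time: $count_at_compile_time\n";
```

The trade-off is layout: the data sits inside the BEGIN block rather than at the end of the file, and the heredoc terminator has to start in the first column.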
Re: __DATA__ used in BEGIN
by KurtSchwind (Hermit) on Nov 06, 2007 at 13:18 UTC
    As others have said, you can't refer to the DATA section in a BEGIN block. A heredoc isn't a bad solution, but if you'd like to better separate the data from the code and still load it in your BEGIN block, you could use do on a separate file that contains the heredoc. This might lead to a somewhat cleaner design.
    --
    I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.
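The separate-file idea could be sketched as below. Note the substitution: instead of a file containing a heredoc, the data file here returns a hash reference, which do hands back directly. Everything specific is an assumption for illustration: the %exception hash, the entries, and the temporary file, which is written on the spot only so that the example is self-contained. In practice the data file would presumably ship alongside the module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

our %exception;   # hypothetical package variable

BEGIN {
    # For a self-contained demo, write the hypothetical data file first.
    my ($out, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$out} "+{\n    exception1 => 'wibble',\n    exception2 => 'wobble',\n};\n";
    close $out or die "Can't close '$path': $!";

    # do FILE compiles and runs the file, returning its last expression.
    my $data = do $path;
    die "Can't load '$path': ", ($@ || $!), "\n" unless ref $data eq 'HASH';
    %exception = %$data;
}

print "$_ => $exception{$_}\n" for sort keys %exception;
```

The leading + in the data file makes Perl treat the braces as an anonymous hash rather than a block. Because do runs at whatever point it is called, this works inside BEGIN as well as in ordinary run-time code.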