PerlMonks  

Best practices - if any?

by AriSoft (Sexton)
on Feb 20, 2010 at 21:51 UTC (perlquestion, id://824435)

AriSoft has asked for the wisdom of the Perl Monks concerning the following question:

Hello dear Monks,

I feel a little stupid because my problem is so trivial. I want to split my source file into two parts, but I don't understand how this is meant to work.

C programmers would put the line #include "second_part.c" at the end of the first part. I tried putting do 'second_part.pl' in the main block of code, but it started to whine about missing global (our) variables.

I am not writing a new module or class. Just putting couple of big subs to a safe place.

Replies are listed 'Best First'.
Re: Best practices - if any?
by biohisham (Priest) on Feb 20, 2010 at 22:51 UTC
    "I am not writing a new module or class. Just putting couple of big subs to a safe place"
    If these subs are part of the same package, this is possible: you can spread the package over several files and 'require' each .pl file that contains part of it (file boundaries are not package boundaries in Perl). However, I can't say whether this is the right path for you, because I don't know what your code looks like.

    Following is an example of the same package's subroutines being spread over a couple of files, then called and checked for which __PACKAGE__ they belong to.

    #FileOne.pl
    package ProgramSpread;
    BEGIN{}
    sub subroutine1{
        print "Hello from the sub 1 in ", __PACKAGE__, "\n";
    }
    return 1;
    END{}

    #FileTwo.pl
    package ProgramSpread; #The same package above
    BEGIN{}
    sub subroutine2{
        print "Hello from the sub 2 in ",__PACKAGE__,"\n";
    }
    return 1;
    END{}

    #using the package
    require "FileOne.pl"; #File names containing the package
    require "FileTwo.pl";
    ProgramSpread::subroutine1();
    ProgramSpread::subroutine2();
    UPDATE: Though this is possible, it is still not recommended for proper design.


    Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.

      Why BEGIN{} ... END{} ??

        That is just to show the package skeleton in general. It is not required in this case.

        There are situations like when you want to initialize some variables in the start or do some cleanup/deallocation at the end, for such cases you might wanna use BEGIN{} and END{}.

        package Constructor_Destructor;
        BEGIN{
            our $text;
            $text = "Hello from BEGIN\n\n";
        }
        sub subroutine{
            print $text;
        }
        END{
            print "DESTROYING...\n";
            $text=0;
            print "Now \$text is $text\n";
            print "Exiting with $?\n"
        }
        #return 1; #did not return since I am calling from the same package

        #Use the package:
        Constructor_Destructor::subroutine();
        You can also use multiple BEGIN{} and END{} subroutines, the BEGIN{} ones would execute in the order encountered and the END{} ones would execute in the reverse order they were defined in order to match the BEGIN{} subroutines..
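
        A minimal sketch of that ordering, using core Perl only. The @order array and the block labels are made up for illustration. BEGIN blocks run at compile time in source order; END blocks run at exit, last defined first:

        ```perl
        use strict;
        use warnings;

        # Hypothetical log of which BEGIN block ran; "our" so every block
        # refers to the same package array.
        our @order;

        BEGIN { push @order, 'BEGIN 1' }
        BEGIN { push @order, 'BEGIN 2' }

        # END blocks are queued as they are compiled and run last-in, first-out.
        END { print "END 1 (defined first, runs last)\n" }
        END { print "END 2 (defined last, runs first)\n" }

        print "BEGIN order: @order\n";
        ```

        The main print comes out first, followed by the "END 2" line and then the "END 1" line.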


Re: Best practices - if any?
by desemondo (Hermit) on Feb 20, 2010 at 22:05 UTC
    Maybe you need to require second_part.pl in your main script?

    If that's not it, maybe showing us a little of your code will help clarify what you're trying to do.

      It gives me a long list of errors like these:

      Variable "$datalock" is not imported at agent.pl line 277.
      Variable "$debug" is not imported at agent.pl line 285.
      Global symbol "$datalock" requires explicit package name at agent.pl line 277.
      Global symbol "$debug" requires explicit package name at agent.pl line 285.

      I tried do and require. I had to copy use commands from the main part to the second one to get it compiled this far but now it whines about many variables like:

      our $datalock = MyLock::new;
      our $debug = 1; #Debug messages

      I understand that do works in limited lexical view but how should I originally declare variables which spans to global scope if "our" is not global enough?

        how should I originally declare variables which spans to global scope if "our" is not global enough?

        our is lexically scoped, and files included via require and friends have their own implicit lexical scope (so you'd need to redeclare your our variables).

        You might also declare your global package variables using use vars in order to share them across files with strictures enabled.  As another alternative, just fully qualify every occurrence of a global variable — as long as they're in the main namespace, that would simply be something like $::foo. The advantage of the latter approach is that they're immediately evident, and that the required additional typing helps to keep them at a minimum :)

        That said, think twice before you do so!  What is the real idea behind splitting the code? You say "putting couple of big subs to a safe place", but why are they unsafe in their original place?  If modularisation/reuse is the idea, why not create proper modules?  Also, having to share many variables across different files typically is an indication of bad design in the first place...
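
        For reference, a minimal sketch of the two alternatives mentioned above (use vars and full qualification), assuming everything lives in package main. The variable name $debug is borrowed from the question; the sub names and the temporary file standing in for second_part.pl are invented so the sketch runs as a single script:

        ```perl
        use strict;
        use warnings;
        use File::Temp qw(tempfile);

        # Global named after $debug in the question, declared with "use vars".
        use vars qw($debug);
        $debug = 1;

        # Hypothetical stand-in for second_part.pl, written to a temporary file.
        my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
        print {$fh} q{
            use strict;
            use warnings;

            # Option 1: fully qualify the name; $::debug is short for $main::debug.
            sub debug_qualified { return $::debug }

            # Option 2: "use vars" is package scoped, not lexically scoped.
            # Repeating it here is harmless and lets this file compile on its own.
            use vars qw($debug);
            sub debug_via_vars { return $debug }

            1;
        };
        close $fh or die "Cannot close $path: $!";

        # require would behave the same way here; do is what the question tried.
        my $loaded = do $path;
        die "Cannot load $path: ", ($@ || $!), "\n" unless $loaded;

        print "qualified: ", debug_qualified(), "\n";
        print "use vars:  ", debug_via_vars(), "\n";
        ```

        Both subs see the value set in the main script, and neither file trips strict.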

Re: Best practices - if any?
by BrowserUk (Patriarch) on Feb 21, 2010 at 00:17 UTC

    Files constitute scopes, even when "incorporated" via do. Just declare any our variables used within the subs at the top of the separate file, or better within the sub bodies they are used in, (as well as in the main file).
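
    A minimal sketch of that advice, assuming package main throughout. $debug comes from the question; debug_state and the temporary file standing in for the second source file are invented for illustration:

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    our $debug = 1;    # lives in the symbol table as $main::debug

    # Hypothetical second file, written to a temporary location so the
    # sketch runs as a single script.
    my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} q{
        use strict;
        use warnings;

        sub debug_state {
            our $debug;    # re-declare in the sub that uses it; no value is reset
            return "debug is $debug";
        }

        1;
    };
    close $fh or die "Cannot close $path: $!";

    my $loaded = do $path;
    die "Cannot load $path: ", ($@ || $!), "\n" unless $loaded;

    print debug_state(), "\n";
    ```

    The second declaration only creates another lexical alias to the same package variable, so the sub reports the value assigned in the main file.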


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Ok. I was already wondering what the practical difference is between "my" and "our" variable declarations at the main level. I knew that "our" goes to the symbol table but I did not realize what that means until now. I can declare the same variable in many files and they all point to the same symbol as long as the module is the same one. Right?

        I can declare the same variable in many files and they all point to the same symbol as long as the module is the same one. Right?

        With the above correction, yes.

        The only unfortunate exception is when you use threads. Then each thread inherits a cloned, non-shared copy of those globals already in existence in the thread from which it is cloned. If you want a shared global, you must postfix it with :shared everywhere it is declared. Use it in some places and omit it in others, and things get really messy.



        Variables declared with our create a package variable and a lexically scoped variable which is an alias to that package variable, visible through the entire scope (file or block) even spanning packages:

        # file foo.pl
        use strict;
        our $foo; # that's $main::foo
        {
            package Foo;
            our $foo = "foo"; # package variable $Foo::foo created

            print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";

            package Bar;
            print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";
        }
        print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";
        print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$Foo::foo is '$Foo::foo'\n";

        # file bar.pl
        use strict;
        {
            package Bar;
            our $foo; # package variable $Bar::foo created

            print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";

        } # end of scope
        package Foo;
        our $foo; # package variable $Foo::foo initialized in 'foo.pl'
        print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";

        package Bar;
        print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";

        #!/usr/bin/perl
        use strict;
        our $foo = 'bar'; # package variable main::foo

        require 'foo.pl';
        require 'bar.pl';

        print __FILE__,' ',__LINE__,' ',__PACKAGE__,":: \$foo is '$foo'\n";

        Running main.pl yields

        foo.pl 8 Foo:: $foo is 'foo'
        foo.pl 11 Bar:: $foo is 'foo'
        foo.pl 13 main:: $foo is 'bar'
        foo.pl 14 main:: $Foo::foo is 'foo'
        bar.pl 7 Bar:: $foo is ''
        bar.pl 12 Foo:: $foo is 'foo'
        bar.pl 15 Bar:: $foo is 'foo'
        main.pl 8 main:: $foo is 'bar'

        Updated as per JavaFan's comment below. Of course there's only one variable and its alias in the current scope.

Re: Best practices - if any?
by afoken (Chancellor) on Feb 21, 2010 at 12:27 UTC

    If you have that much code that you feel the need to spread it over several files, think about modularising it.

    In C, you had better NOT simply #include "second_part.c", but instead split your code into smaller parts, compile them separately into object files, and use the linker to create a single executable. You would perhaps end up with something like main.c, inputreader.c, logger.c, smoothify.c, prettyprint.c, and perhaps utils.c and globals.c. For most of the files, there would be a corresponding *.h file containing the "public" interface, i.e. those functions that are called by one of the other files (logger.h would perhaps contain something like extern int initlogger(const char * logfile); and extern void log(int level, const char * message);, globals.h would instead declare the few needed global variables, e.g. extern int verbose; extern char frobnicate;). All functions (and global variables) not needed outside one of the source files would be declared as static, so that the linker does not try to resolve those names.

    In Perl, you would do pretty much the same: Put groups of functions into modules, have a public interface for each module (i.e. use Exporter for non-OOP code), and have a short main program that delegates to the modules.

    Because Perl already has a lot of modules, use a unique prefix for your module names. If you have no better idea, use the application name and / or your last name or your company's name. You would end up with AriSoft::Frobnicate for the main routines, AriSoft::Frobnicate::InputReader, AriSoft::Frobnicate::Logger, AriSoft::Frobnicate::Smoothify, AriSoft::Frobnicate::PrettyPrint, AriSoft::Frobnicate::Utils, and perhaps AriSoft::Frobnicate::Globals.
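
    A rough sketch of what one such module could look like, not a definitive layout. The package name follows the naming scheme above; init_logger, format_message, and $max_level are hypothetical. The module is defined inline only so the sketch runs as one script:

    ```perl
    use strict;
    use warnings;

    # In a real project this block would be the file
    # AriSoft/Frobnicate/Logger.pm (ending in "1;"), and the main program
    # would say: use AriSoft::Frobnicate::Logger qw(init_logger format_message);
    {
        package AriSoft::Frobnicate::Logger;
        use Exporter 'import';

        # The public interface: only these names can be imported.
        our @EXPORT_OK = qw(init_logger format_message);

        # Private to the module, much like a static variable in C.
        my $max_level = 1;

        sub init_logger {
            my ($level) = @_;
            $max_level = $level;
            return;
        }

        sub format_message {
            my ($level, $message) = @_;
            return '' if $level > $max_level;
            return "[level $level] $message\n";
        }
    }

    package main;

    # Stands in for the "use" line, because the module is defined inline here.
    AriSoft::Frobnicate::Logger->import(qw(init_logger format_message));

    init_logger(2);
    print format_message(1, 'starting up');
    print format_message(3, 'suppressed: above the configured level');
    ```

    The main program only touches the two exported subs; the level threshold stays hidden inside the module, which is the Perl counterpart of the static declarations described for C.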

    Thinking about "big subs":

    Some big things are a pleasure to the eye, but "big" subs spanning more than one or two screens (i.e. more than 50 lines) are a sure sign of wrong design. You will become confused when you need to change the code, you will pile up status variables and obscure if-then-else constructs, and perhaps you will even abuse goto. Split them into smaller, specialised functions. This is pretty much independent of Perl; you have exactly the same problem in nearly every other language.

    I'm currently earning my money by refactoring C-like code written by an ungifted amateur, full of bugs, copy-and-paste, gotos and cargo cult, without any proper indenting, with functions spanning literally thousands of lines, with loops and ifs nested more than 10 levels deep, and of course without any useful documentation. I've removed more than 30% of the code without any loss, and I will remove about another 30% before it goes back to the production machines. During the process, the number of functions will at least double. That code is a real nightmare, no one has a clue about what it does, and just deleting the crap and starting from scratch is not an option. All we can do is clean up every piece of code we need to touch, and hope for a slow improvement over time.

    Learn from that, start writing clean, structured, and documented code NOW.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      Let's test this theory. Here is one authentic sub from my project. Why should I break it into pieces, and how does that help keep it with the other subs in the same file? I still prefer to keep this in a separate file like a module.

        I am not going to get into the details of this, but just a couple of quick stylistic comments:

        • Instead of including anonymous subs in your hashes, why don't you name them? Instead of row => sub { # complex code here that AFAIK creates a record }, write row => \&create_record, and then define create_record a little further down. This way you give the sub a name, and it becomes easier to see the data structure without the big, often irrelevant, blob of code right in there.
        • $$flight{Reg}; can be written $flight->{Reg};, which is especially handy when writing $cols->[0] instead of @{$cols}[0].
        • The commented-out code does not belong in there. If you have to remove code, just do it; the source-control system will keep the old version.
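
        The first two points can be sketched as follows. The original sub is not shown here, so the record, its field values, and the %handlers hash are invented purely for illustration:

        ```perl
        use strict;
        use warnings;

        # Hypothetical record and column list; the values are made up.
        my $flight = { Reg => 'OH-ABC', Type => 'C172' };
        my $cols   = [ 'Reg', 'Type' ];

        # The hash now lists names instead of embedding blobs of code.
        my %handlers = (
            row => \&create_record,
        );

        # The named sub is defined separately, a little further down.
        sub create_record {
            my ($record, $columns) = @_;
            return join ', ', map { $record->{$_} } @{$columns};
        }

        # Arrow notation for element access through a reference.
        print "First column: $cols->[0]\n";
        print "Row: ", $handlers{row}->($flight, $cols), "\n";
        ```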
Re: Best practices - if any?
by cdarke (Prior) on Feb 21, 2010 at 08:39 UTC
    I'm going to recommend this node to anyone who questions the advice "don't use global variables".
      "don't use global variables".

      I have to comment that our variables are far from globals. It is impossible to keep all data in function parameters. If you are passing refs you are reinventing globals :-) Passing refs automatically by the compiler is called OOP.

      I will use "goto" and "our" without any pain and I also put parentheses in a "wrong" line. With perl you can do it in many ways but I am still missing inline functions.

        I must have been dreaming all these years I've kept all my data in function parameters. As for passing refs reinventing globals, I think you're mistaken. Maybe it makes sense if you're in the habit of addressing your variables by memory location (as in C; pointer arithmetic, for example). The only time a reference is global in Perl is if you've stored it in a global variable (although I'm sure someone is going to pop up with a counter-example). However, disregard that argument for a moment. You're definitely missing one of the key advantages of parameter passing - it effectively documents the flow of a variable through your code, so that when you change a variable you can trace the effects of that change. The problem with globals is that you can change the value of a variable and then find it hard to figure out what's affected downstream; and the bigger the codebase grows, the harder it becomes.

        Tim
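
        A tiny sketch of that difference, with invented names. Both subs do the same arithmetic, but only the second one shows at the call site which variable may change:

        ```perl
        use strict;
        use warnings;

        # Style 1: a global changed from inside the sub; the call gives no hint.
        our $total = 0;
        sub add_to_global { my ($n) = @_; $total += $n; return }

        # Style 2: the caller passes a reference, so the call names the variable.
        sub add_to { my ($sum_ref, $n) = @_; ${$sum_ref} += $n; return }

        add_to_global(5);      # what does this touch? You have to read the sub.

        my $sum = 0;
        add_to(\$sum, 5);      # visibly touches $sum

        print "global: $total, passed: $sum\n";
        ```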

Re: Best practices - if any?
by Anonymous Monk on Feb 22, 2010 at 20:55 UTC

    To put your functions and package global variables into their own package/module, just create a Foobar.pm file (in the same dir as your script) and put your subs in it like so:

    package Foobar;
    use strict;
    use warnings;

    our $some_global_var = 8;

    sub some_sub {
        print "Hi from Foobar::some_sub().\n";
    }

    1;

    Now, in your script, to use that new module you just created:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    use lib '.';
    use Foobar;

    Foobar::some_sub();
    print "The global var is $Foobar::some_global_var.\n";

    That's it.
