PerlMonks  

Using guards for script execution?

by R0b0t1 (Initiate)
on Feb 28, 2017 at 21:58 UTC

R0b0t1 has asked for the wisdom of the Perl Monks concerning the following question:

I've come from a Java, C, and Python background. My first inclination was to look up a pattern similar to the following:

    #!/usr/bin/env python3

    def main():
        print('Hello, world!')

    if __name__ == '__main__':
        main()

And, indeed, it exists:

    #!/usr/bin/env perl
    use warnings;
    use strict;

    sub main {
        print "Hello, world!\n";
    }

    unless (caller) {
        main;
    }

I attempted to find as much justification for the above as possible, and the main argument seems to be that it gives an enclosed scope to the main body of the program. However, based on the Perl scripts that I have read, I would agree with the detractors who say it is not idiomatic Perl, at least for very short programs and programs which provide a Unix-like interface.

While this may end up being a matter of opinion, I was hoping there may be people who can comment on the extensibility of small utilities which start in one form or the other, or who have experience to offer which goes in a direction I haven't thought of.

Replies are listed 'Best First'.
Re: Using guards for script execution?
by stevieb (Canon) on Feb 28, 2017 at 23:04 UTC

    Typically, scripts don't include the already-inclusive main, as anonymonk pointed out. Where it is a bit more idiomatic is when a package is included within a file along with the main code (I *only* use this for examples though). First (use warnings; use strict; is implied here):

    package main; { ... }

    Is effectively the exact same as:

    { ... }

    ...in a script as far as scoping is concerned. With the braces, the scope is determined. Without the braces, everything still resides in main, but is scoped at the file level, not at the block level. Now, onto the package and script in one file example (untested):

    package Blah;
    {
        sub foo {
            print "bar!\n";
        }
    }

    package main;
    {
        foo();
    }

    I believe that it allows the reader at a quick glance to see where the script begins.

    It is very unusual in Perl code to see main referenced directly. Sometimes hacking at the symbol table you'll see it for illustration purposes, but anything without a package prepended is main anyways:

    perl -wMstrict -E 'my $x; say *::x; say *main::x'
    *main::x
    *main::x

    To further, in say C, the main function must be defined explicitly, and that function handles any incoming arguments:

    int main (int argc, char* argv[])

    In Perl, the args happen at the file level (even *if* you scope main):

    my ($x, $y);

    if (@ARGV == 2){
        ($x, $y) = @ARGV;
    }
    else {
        ...
    }

    package main;
    {
        print "$x, $y\n";
    }

    In conclusion, with most compiled languages I write in, main is considered the "entry-point". In Perl scripts, there isn't really such a definition, and regardless of whether packages are declared or not, the entry point is the first executable line of code found. If none are found outside of a package declaration, it'll execute the first line of code within a main package, if found. If not, it'll do its compile time stuff, and nothing else (does this sound right all?).
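The point about entry points can be checked with a small experiment. This is only a sketch using core Perl; the names @trace, never_called and report are invented for the illustration. Top-level statements run in file order whatever package they sit in, while a sub body does not run until something calls it.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# @trace, never_called() and report() are hypothetical names for this sketch.
our @trace;

push @trace, 'statement 1 (package main, implicit)';

package Other;

# Still top-level code: it runs in file order even though the
# current package is now Other.
push @main::trace, 'statement 2 (package Other)';

# Compiled, but never executed, because nothing calls it.
sub never_called { push @main::trace, 'inside never_called' }

package main;

push @trace, 'statement 3 (package main, explicit)';

sub report { return join "\n", @trace }

print report(), "\n";
```

The three push statements are recorded in the order they appear in the file, and nothing from never_called shows up.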

    update: note that in Python, you don't necessarily *need* main either. Often, it's explicitly expressed when you want a library to optionally operate as a script as well, and the main definition would end up at the bottom of the library, with the:

    if __name__ == '__main__': main()

    For a Python file that you intend only to use as a script, no reference to main is necessary at all. Eg:

    x = 1
    y = 2
    print(x + y)
      package Blah;
      {
          sub foo {
              print "bar!\n";
          }
      }

      package main;
      {
          foo();
      }

      This code will define foo() in the Blah package; it will not be available in the main package unless exported to it or invoked as a fully qualified subroutine (called without warnings or strictures):

      c:\@Work\Perl\monks\R0b0t1>perl -le "package Blah; { sub foo { print \"bar!\n\"; } } package main; { foo(); } "
      Undefined subroutine &main::foo called at -e line 1.

      c:\@Work\Perl\monks\R0b0t1>perl -le "package Blah; { sub foo { print \"bar!\n\"; } } package main; { Blah::foo(); } "
      bar!
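For the "exported to it" route, here is a minimal single-file sketch. It assumes the core Exporter module and calls import() by hand, because in a one-file example there is no "use Blah" statement to do it; the package and sub names are reused from the example above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

package Blah;
use Exporter 'import';          # borrow Exporter's import() method
our @EXPORT_OK = qw(foo);       # foo may be imported on request

sub foo { return "bar!" }

package main;

# No "use Blah" in a single file, so request the import explicitly.
# This runs after @EXPORT_OK has been assigned above.
Blah->import('foo');

my $qualified = Blah::foo();    # the fully qualified call always works
my $imported  = foo();          # works only because of the import above

print "$qualified $imported\n";
```

With a real Blah.pm on disk, the explicit import() call would normally be replaced by "use Blah qw(foo);".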

      On a tangential note, the package syntax

      {
          package Foo;
          my $x = ...;
          ...
          sub bar { ... }
          ...
      }

      and from Perl version 5.14 onward

      package Foo {
          my $x = ...;
          ...
          sub bar { ... }
          ...
      }

      will cause the asserted package to be "turned off" at the end of the block, with reversion to the "original" package. E.g. (with full warnings and strictures):

      c:\@Work\Perl\monks\R0b0t1>perl -wMstrict -le
      "print 'perl version: ', $];
       ;;
       in_pkg('A');
       ;;
       package Foo { ::in_pkg('B'); }
       ;;
       in_pkg('C');
       ;;
       sub in_pkg { print qq{$_[0]: in package }, scalar caller; }
      "
      perl version: 5.014004
      A: in package main
      B: in package Foo
      C: in package main


      Give a man a fish:  <%-{-{-{-<

      Thanks, I appreciate the response.

      The suggestion to use main, a function, was to provide a namespace that was not package-level to prevent accidental use of "global" variables. In practice I have no idea how useful this is and how often that mistake occurs.

      I agree that it seems strange to reference a default and implicit namespace, and in a similar manner, it's particularly verbose to have logic which ensures a script was run and not imported in small programs. I thought to ask this question as a lot of python programs by default include a check that a script was not imported. It may be an antipattern, I'm not exactly sure.

        "... provide a namespace that was not package-level to prevent accidental use of "global" variables."

        I use anonymous namespaces for this. I use them often. They can be nested to an arbitrary depth. Here's a (highly contrived) example:

        my $global;

        {
            my $local_outer;
            # $global known here
            # $local_outer known here
            # $local_inner unknown here

            {
                my $local_inner;
                # $global known here
                # $local_outer known here
                # $local_inner known here
            }

            # $global known here
            # $local_outer known here
            # $local_inner unknown here

            # An entirely different $local_inner:
            my $local_inner;
        }

        # $global known here
        # $local_outer unknown here
        # $local_inner unknown here

        # An entirely different $local_outer:
        my $local_outer;
        # An entirely different $local_inner:
        my $local_inner;

        Beyond avoiding all the issues with global variables, there are additional benefits. When an anonymous block is exited, the lexical variables declared within it go out of scope and can be garbage collected. Also, if those variables were filehandles, Perl automatically closes them for you. Another contrived example:

        {
            open my $fh, '<', $filename;
            # ... read and process file contents here ...

            # As soon as the closing brace is reached:
            # 1) $fh goes out of scope - available for garbage collection
            # 2) an automatic "close $fh" is performed
        }
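The automatic close can be observed with a write handle. This is a sketch only, assuming the core File::Temp module for a scratch file; the file name is generated and the text written is arbitrary.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create an empty scratch file (removed again when the program exits).
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh or die "Cannot close $filename: $!";

{
    open my $fh, '>', $filename or die "Cannot open $filename: $!";
    print {$fh} "written inside the block\n";
    # No explicit close here: $fh goes out of scope at the closing
    # brace, which flushes the buffer and closes the handle.
}

# Reading the file back shows that the buffered data reached the file.
open my $in, '<', $filename or die "Cannot open $filename: $!";
my $line = <$in>;
close $in or die "Cannot close $filename: $!";

print "read back: $line";
```

One caveat on the design choice: an implicit close has no way to report a failure, so for data that matters an explicit "close $fh or die ..." is still worth writing.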
        "In practice I have no idea how useful this is and how often that mistake occurs."

        This is very useful, and a practice I recommend using as a default coding technique. It's very often the case that scripts, that start off being very short (e.g. a couple of dozen lines), are enhanced and extended and can end up with hundreds of lines. It's at this point that problems with global variables become apparent: you switch to debugging mode and start changing multiple $text variables, for instance, to $xxx_text, $yyy_text, and so on; then start the test/edit cycle, changing the $text variables missed on previous iterations, fixing incorrect renaming (s/$yyy_text/$xxx_text/) or typos (s/$xxxtext/$xxx_text/), and so on.

        This sort of problem does seem very common. We get lots of "What's wrong with my code?" questions where scoping is the underlying cause.

        Although I've focussed on anonymous namespaces here, the underlying objective is to use lexical variables in the smallest scope possible: that scope could also be provided by, for example, subroutine definitions and BEGIN blocks.
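As an illustration of "smallest scope possible", a bare block can also hide state that only a couple of subs need. This is a minimal sketch; $counter, next_id and reset_id are invented names.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

{
    # $counter is visible only inside this block...
    my $counter = 0;

    # ...but the subs defined here keep access to it (they are closures).
    sub next_id  { return ++$counter }
    sub reset_id { $counter = 0; return }
}

# Mentioning $counter out here would be a compile-time error under strict.
my @ids = (next_id(), next_id(), next_id());
print "@ids\n";    # prints "1 2 3"
```

Nothing outside the block can read or clobber the counter by accident; the only way in is through the two subs.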

        — Ken

        In Perl, globals are not exported by default anywhere, at any time (including functions and methods). In fact, nothing is. You need to explicitly export them before they can be imported into any other module/package that uses a Perl file, so that safeguard is built in.

        In other words, as I said in my last post, things are file-based scope. Anything in another file that uses a different file does not have implicit access (ie. namespaces won't be clobbered) unless that is specifically and explicitly configured.

        You can include other Perl files to your heart's content, and unless you explicitly export things from the included file (in Python, an import), you'll never be able to see them within your current namespace.

        Also, use-ing a Perl file does not call any of its subroutines: only the file's top-level statements run, once, when the file is loaded. That means that you can define a main() sub anywhere in a Perl file, and it will not be executed when said file is included in another file unless some top-level code calls it.

        I digress a bit. There *are* ways around this, but I believe my fellow monks would agree with me that those are round-about ways, and most definitely not common, standard practice that you'd find in any remotely reasonable example unless you were outright looking for such a way.

        provide a namespace that was not package-level to prevent accidental use of "global" variables

        For clarification:

        Variables declared with my are not package variables. At file-level, they are file scoped, so will be accessible from the declaration until the end of file. When declared inside a block, they are only accessible until the end of that block.

        Package variables are declared with our or use vars and are accessible from inside the package they are declared in.

        Also, package variables can be accessed by their fully qualified names from anywhere. So, in that sense, are also "global".
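The distinction can be seen in a single file. This is a minimal sketch; the package name Settings and the variable names are made up for the illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

package Settings;

our $verbose = 1;         # package variable: stored as $Settings::verbose
my  $secret  = 'hidden';  # lexical: file-scoped, not in any symbol table

package main;

# A package variable is reachable from anywhere by its full name.
my $seen_verbose = $Settings::verbose;

# The lexical is still in scope here, because a "package" statement
# does not end a lexical scope; only the enclosing block or file does.
my $seen_secret = $secret;

# There is no $Settings::secret, though: the symbol-table lookup is undef.
my $via_symbol_table = do { no warnings 'once'; $Settings::secret };

print "verbose via full name: $seen_verbose\n";
print "lexical seen in main:  $seen_secret\n";
print "symbol-table lookup:   ",
    (defined $via_symbol_table ? $via_symbol_table : 'undef'), "\n";
```

Wrapping the Settings part in its own block would keep $secret out of main as well, which is the block-scoping technique described earlier in the thread.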

Re: Using guards for script execution?
by Marshall (Canon) on Mar 01, 2017 at 05:05 UTC
    A few comments to your Perl code,
    sub main {
        print "Hello, world!\n";
    }

    unless (caller) {
        main;
    }
    This code is a weird formulation because the file that contains main() will always be executed as a main program? Normal practice would be to do away with this extra level of indentation implied by the subroutine main and just start writing the "main" code. The unless (caller){} does nothing useful. You could have just put a simple main(); instead to call the sub main.

    I attach demomain.pl and demo.pm below.
    Note that demomain.pl could have been called "demo.pl", but I didn't want to confuse you. Anyway note that you can have a .pm file of the same name as a .pl file.

    In my demo.pm file, I use caller(): test() if !caller;. If demo.pm is being run as a main program, test() will execute. If demo.pm is being "used" by another program, say by demomain.pl, all of the top-level code in demo.pm will run, but test() will not run because Perl knows that demo.pm is not being run as a main program. There is no need to do something similar in your main.pl program because your main is always a main program!
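The behaviour of the caller guard can be reproduced without separate attachments. The sketch below writes a hypothetical guarded.pl to a scratch directory, then runs it once directly and once through require, each in a child perl. It assumes a platform where the list form of a piped open is available (any Unix-like system; recent Perls on Windows); the helper run_perl is invented for this example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical file name and contents, written to a scratch directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'guarded.pl');

open my $out, '>', $path or die "Cannot write $path: $!";
print {$out} <<'END_OF_FILE';
use strict;
use warnings;
print "top level ran\n";
self_test() if !caller;    # true only when this file is the main program
sub self_test { print "self_test ran\n" }
1;
END_OF_FILE
close $out or die "Cannot close $path: $!";

# Run a child perl (list form, so no shell quoting) and return its output.
sub run_perl {
    my @args = @_;
    open my $pipe, '-|', $^X, @args or die "Cannot start perl: $!";
    local $/;                      # slurp mode
    my $output = <$pipe>;
    close $pipe;
    return $output;
}

my $as_script = run_perl($path);
my $as_loaded = run_perl('-e', 'require $ARGV[0]; print "loader finished\n"', $path);

print "--- run directly ---\n$as_script";
print "--- loaded with require ---\n$as_loaded";
```

In both runs the top-level print executes; only the direct run reaches self_test, because caller returns nothing at the top level of a main program.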

    This provides a very "lightweight" test framework. There are such things as .t files which are used in more complicated situations.

    It is possible to split a main program across multiple files, i.e., same package in multiple files. I don't demo that because I think it is a very bad idea.

    #!/usr/bin/perl
    # File: demo.pl
    use strict;
    use warnings;
    $|=1; # turn off buffering

    use Demo qw(example);  # test() is not exported by Demo.pm

    # this is the error produced...
    # Undefined subroutine &main::test called at C:\Projects_Perl\testing\demo.pl line 7.
    # my $x = test();

    my $y = Demo::test();  #this is ok , Fully qualified name
    print "test() returned $y\n";

    __END__
    Prints:
    top level in Demo.pm
    this is the test subroutine!
    test() returned 1

    #!/usr/bin/perl
    # File: Demo.pm
    use strict;
    use warnings;

    package Demo;
    use vars qw(@ISA @EXPORT @EXPORT_OK %EXPORT_TAGS $VERSION);
    use Exporter;
    our $VERSION=1.01;
    our @ISA = qw(Exporter);
    our @EXPORT = qw();
    our @EXPORT_OK = qw(example);
    our $DEBUG =0;

    test() if !caller; # runs test() if run as a "main program"

    sub test { print "this is the test subroutine!\n"; return 1;}
    sub example {return "this is an example"};

    print "top level in Demo.pm\n";
    1;

      Actually, there can be value to writing a script like:

      #!/usr/bin/perl -w
      use strict;
      use Some::Module qw< this that >;

      my $Config = ...;

      Main( @ARGV ) if $0 eq __FILE__;
      return 0;
      # Exits -- no run-time code below this point, just 'sub's.

      sub Main {
          ...
      }

      Because it allows you to write more types of tests for the code included in the script. An automated test can require or do the script in order to have the global configuration initialized (which should not involve any interesting calculations) and then the test can do some setup and then call subroutines (including Main) from the script. An automated test can even do some setup, set $0, then require/do the script and it will execute normally (except for things set up by the test) and then return control to the test where aspects of the process can be checked.

      Yes, you can also do even more work to move absolutely all of the code out of the script and into one or more modules and then do all of those types of tests using the module(s). Indeed, there can be even more benefits to moving all of the code to modules. But my experience is that very often this extra work is not warranted for at least some of the code in a script and the trivial work of allowing the script itself to be loaded without it always running is all that is needed to make complete automated testing possible.
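A rough sketch of that testing approach follows. The script adder.pl, its subs Main and add, and the flag $Ran_Main are all invented for this example, and the script is written to a scratch directory only so the example is self-contained. The "test" loads the script with do; because $0 names the test program rather than the script, the guard is false and the subs can be exercised directly.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

our $Ran_Main;    # shared with the hypothetical script below (package main)

# Write a hypothetical script that guards its entry point with $0 eq __FILE__.
my $dir    = tempdir(CLEANUP => 1);
my $script = File::Spec->catfile($dir, 'adder.pl');

open my $out, '>', $script or die "Cannot write $script: $!";
print {$out} <<'END_OF_SCRIPT';
use strict;
use warnings;
our $Ran_Main = 0;
Main(@ARGV) if $0 eq __FILE__;    # false when loaded by another program
sub Main { $Ran_Main = 1; print add(@_), "\n"; return 0 }
sub add  { my $sum = 0; $sum += $_ for @_; return $sum }
1;
END_OF_SCRIPT
close $out or die "Cannot close $script: $!";

# Load the script the way a test might. Its top-level code runs,
# but the guard keeps Main() from being called.
my $loaded = do $script;
die "Could not load $script: ", ($@ || $!) unless $loaded;

# The script's subs are now defined and can be tested one by one.
my $sum = add(2, 3, 4);

print "Main ran during load: ", ($Ran_Main ? 'yes' : 'no'), "\n";
print "add(2, 3, 4) = $sum\n";
```

Running adder.pl directly would make $0 and __FILE__ agree, so Main would be called with the command-line arguments.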

      - tye        

        That is far too 20th century for my tastes. You should not test scripts, you should test interfaces. Your scripts really should just be stubs that call one method: run()
Re: Using guards for script execution?
by Anonymous Monk on Feb 28, 2017 at 22:06 UTC
    Perl provides you with a "main" package out of the box:
    perl -le'print __PACKAGE__'
Re: Using guards for script execution?
by hakonhagland (Scribe) on Feb 28, 2017 at 23:37 UTC

      no, don't use modulinos, they offer no benefits in perl, they're actually an evil gimmick

      scripts shouldn't pretend to be modules, scripts should use modules

      Deciding what sub/method to run based on how the "module" (scriptfile) is loaded , is about as pointless and dumb as it gets

      Make it into a real module, then change the script to use it

      use App::NowModule;
      App::NowModule::Main(@ARGV);

      modulino and perldoc, Re: modulino and $VERSION (all code in module , script as module )

      Thank you for the link. That is what I was referring to, and it links to a PerlMonks post I will follow up on. I had been trying to find where the topic had been covered before and that may be it.
