PerlMonks  

Main routines, unit tests, and sugar

by stephen (Priest)
on Jun 13, 2013 at 04:43 UTC (#1038667=perlmeditation)

Recently, I've been thinking about a really, really minor perl issue: what's the best way to format your script's main routine? I'd also always wondered how you were supposed to unit-test the main routine in your script. I recently came up with an idea (inspired by a brian_d_foy article) that answers both questions for me.

When I first started coding, I just put everything in my main routine at top level, in global scope. My scripts looked something like this (translated from perl 4):

    #!/usr/bin/env perl
    use strict;
    use warnings;

    # Script variables
    our $Foo = 'bar';

    # Main routine
    my ($name, $greeted) = @ARGV;
    $name    //= 'Horace!';
    $greeted //= 'world';
    say_hello($greeted);

    # Subroutines
    sub say_hello {
        my ($name) = @_;
        print "Hello $name\n";
    }

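The problem with that format is that any variable I declare in the main routine, like $name above, is in scope for the rest of the file, including every subroutine below it. If I'd misspelled $name where say_hello() unpacks its arguments, the print would have silently picked up the file-scoped $name instead, and strict wouldn't have complained. That could have given rise to a hard-to-trace bug.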
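A minimal sketch of that failure mode, assuming the typo lands where the subroutine unpacks its arguments. The say_hello_buggy name and the hard-coded values are only for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# File-scoped lexicals, as in the script above; hard-coded here instead of
# read from @ARGV so the sketch is self-contained.
my ($name, $greeted) = ('Horace!', 'world');

# Illustrative variant of say_hello() with a deliberate typo: the argument
# lands in $nmae, so "$name" below quietly resolves to the file-scoped $name.
sub say_hello_buggy {
    my ($nmae) = @_;
    return "Hello $name\n";
}

print say_hello_buggy($greeted);    # prints "Hello Horace!", not "Hello world"
```

Neither strict nor warnings objects here, because $name really is declared; it is just not the variable the subroutine meant to use.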

I've seen some people solve this problem by declaring a subroutine called main(), then calling it immediately afterward. This solves the immediate problem, but makes the code a little more confusing, at least to my eyes:

    # Main routine
    sub main {
        my ($name, $greeted) = @ARGV;
        $name    //= 'Horace!';
        $greeted //= 'world';
        say_hello($greeted);
    }

    main();

It works, of course, but something about it bothers me. It's easy to miss the call to main() when reading the code, and my eyes tend to skip over the subroutine when looking for the main routine. Worse, if the main() call gets deleted, the script will silently do nothing: no error, no warning.

The style I adopted uses a named block to split the difference between subroutine and toplevel code, like so:

    # Main routine
    MAIN: {
        my ($name, $greeted) = @ARGV;
        $name    //= 'Horace!';
        $greeted //= 'world';
        say_hello($greeted);
    }

Here, I thought, was the One True Main Routine Style. My main routine is in a lexical block, but is automatically executed whenever I run the script. A friend of mine pointed out the issue with this code, though: I should call exit(0) afterward, to prevent any other code from being run:

    # Main routine
    MAIN: {
        my ($name, $greeted) = @ARGV;
        $name    //= 'Horace!';
        $greeted //= 'world';
        say_hello($greeted);
        exit(0);
    }

OK, that works, but the extra statement in every script got me thinking. Could I create some syntactic sugar that makes it obvious where the main routine is, but means that I don't have to remember to type 'exit' in every script?

The answer was inspired by a brian_d_foy article, "Five Ways To Improve Your Perl Programming", in which he describes modulinos, which are programs declared as modules. The key here is that he uses the 'sub main' method, but combines it with a function call that only happens when the file is run as a script. That lets you still use the package as a library, or write unit tests for it. Still, the code was a little more complicated than I wanted to write every day:

    package My::Routine;

    # ...

    # Main routine
    sub main {
        my ($name, $greeted) = @ARGV;
        $name    //= 'Horace!';
        $greeted //= 'world';
        say_hello($greeted);
    }

    # Run the main routine only when called as a script
    __PACKAGE__->main() unless caller;

Today, it occurred to me that I could write a module that provides some syntactic sugar for the perfect main routine. It would be simple and obvious, provide a lexical block, and exit afterward. In short, like this:

    #!/usr/bin/perl
    use Devel::Main 'main';

    # ...

    # Main routine
    main {
        my ($name, $greeted) = @ARGV;
        $greeted //= 'world';
        say_hello($greeted);
    };

It might seem odd to make syntactic sugar that does so little, but this is the best solution I've come up with. The 'main' block will run if this file is called as a script. If it's brought into another script via require() or use(), the main routine won't run, but it will create a function called run_main() that will call the main routine with @ARGV set to its arguments. That way, I can write a test script like so:

    #!/usr/bin/env perl
    # The explicit ./ matters: perl 5.26 and later no longer search '.' via @INC
    require './script_with_main.pl';
    print "Loaded script!\n";
    run_main('Shakespeare!', 'perlmonks');

Which would print:

    $ perl test_test_main.pl
    Loaded script!
    Hello perlmonks

So here's the code that provides the syntactic sugar. Honestly, I wouldn't be surprised if someone has already done this on CPAN, but a cursory search didn't show me anything.

    use strict;
    use warnings;

    # Devel::Main by stephen
    package Devel::Main {

        # We use Sub::Exporter so you can import main with different names
        # with: use Devel::Main 'main' => { -as => 'other' };
        use Sub::Exporter;
        Sub::Exporter::setup_exporter({ exports => [ qw/main/ ] });

        # Later versions will let you customize this
        our $Main_Sub_Name = 'run_main';

        sub main (&) {
            my ($main_sub) = @_;

            # If we're called from a script, run main and exit
            if ( !defined caller(1) ) {
                $main_sub->();
                exit(0);
            }

            # Otherwise, create a sub that turns its arguments into @ARGV
            else {
                no strict 'refs';
                my $package = caller;
                *{"${package}::$Main_Sub_Name"} = sub {
                    local @ARGV = @_;
                    return $main_sub->();
                };

                # Return 1 to make the script pass 'require'
                return 1;
            }
        }
    };

    1;
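The run_main() closure above leans on local @ARGV. Here is a minimal standalone sketch of just that step. The with_argv() helper and the $main code ref are hypothetical names for illustration and are not part of Devel::Main:

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a code ref so that the wrapper's arguments
# are visible to it as @ARGV for the duration of the call only.
sub with_argv {
    my ($code) = @_;
    return sub {
        local @ARGV = @_;    # restored automatically when the call returns
        return $code->();
    };
}

# A stand-in main routine that reads @ARGV the way the scripts above do.
my $main = sub {
    my ($name, $greeted) = @ARGV;
    $greeted //= 'world';
    return "Hello $greeted\n";
};

my $run_main = with_argv($main);
print $run_main->('Shakespeare!', 'perlmonks');    # prints "Hello perlmonks"
```

Because the assignment is localized, whatever the test script's own @ARGV held before the call is back in place afterward.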

stephen

Re: Main routines, unit tests, and sugar
by BrowserUk (Pope) on Jun 13, 2013 at 05:51 UTC
    The problem with that format is that any variable I have in the main routine, like $name above, is in scope for the whole file. If I'd misspelled $name in the argument list to say_hello(), it could have given rise to a hard-to-trace bug.

    Put your subroutines first, at the top of the file, and the 'main' code at the bottom.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      That's another variant that I see often, and of course it works. I don't prefer it because I then need to scroll down to the bottom to understand the rest of the code.

      stephen

        Then I just hit ^END and I'm there. (Unless I have a large __DATA__ section, in which case ^F__DATA__ takes me there.)

        The main advantage of putting subroutines at the top is that it avoids accidental closures and the need for pre-declarations for prototype checking.

        But I realise that it is a personal preference thing.
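        A small sketch of the prototype point, assuming a hypothetical apply_twice() helper: because the sub is defined before it is called, perl already knows its (&) prototype when it parses the call, and no forward declaration is needed.

```perl
use strict;
use warnings;

# Hypothetical helper with a (&) prototype. It is defined before the call
# below, so the prototype is already known when that call is parsed.
sub apply_twice (&) {
    my ($code) = @_;
    return ( $code->(), $code->() );
}

# Parsed as a block argument thanks to the prototype seen above.
my @results = apply_twice { 'hi' };
print "@results\n";    # prints "hi hi"
```

        With the definition placed after the call instead, the same line would need a forward declaration such as `sub apply_twice (&);` near the top of the file to be parsed the same way.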


Re: Main routines, unit tests, and sugar
by talexb (Canon) on Jun 13, 2013 at 19:34 UTC

    As per BrowserUk, for scripts I put subroutines first, then the main routine at the bottom. Here's roughly how your script would look with my formatting:

        #!/usr/bin/env perl
        use strict;
        use warnings;

        # Script variables
        our $Foo = 'bar';

        # Subroutines
        sub say_hello {
            my ($name) = @_;
            print "Hello $name\n";
        }

        {
            # Main routine
            my ($name, $greeted) = @ARGV;
            $name    //= 'Horace!';
            $greeted //= 'world';
            say_hello($greeted);
        }

    I don't see any value in having a subroutine called main which is then called once after being defined. (In fact, I find it distinctly *odd*. Personal preference.)

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: Main routines, unit tests, and sugar
by dcmertens (Beadle) on Jun 14, 2013 at 18:21 UTC
    This is a cool approach! Just goes to show, though: different strokes for different folks. Once I wrapped my head around how Perl's module handling worked, I started to move nearly all of my functions and class declarations into modules, keeping my scripts to really just be the main functionality.
Re: Main routines, unit tests, and sugar
by vsespb (Hermit) on Jun 16, 2013 at 15:07 UTC

    I prefer the following approach:

    The main .pm file is just a normal .pm file.

    There is a tiny .pl script with executable permissions which calls main() in the main module.

    This script is possibly autogenerated, or uses FindBin to set up @INC.

    An example is perlbrew.

      Reminds me of the guy that painted a bicycle on the side of his Hummer in an attempt to seem green. :)



        Why is that?

        All the important code resides under one App::MyApp namespace, rather than being split across App::MyApp and myapp.pl.

        Also, different deploy methods can generate different binaries (some with hardcoded paths).

        The same is done in the Ruby ecosystem: binaries are autogenerated and are almost empty.

        Also, if the binary is autogenerated, it might not contain a copyright/license notice.

Re: Main routines, unit tests, and sugar
by lee_crites (Beadle) on Jun 17, 2013 at 18:41 UTC

    I designed my own coding style to answer this issue for myself. Mine is the "main is the last code in the file" option, as already suggested by others.

    I upvoted this for one reason: seeing someone work through an issue and find an appropriate, personally meaningful, solution is always a good thing!

    Lee Crites
    lee@critesclan.com
Re: Main routines, unit tests, and sugar
by radiantmatrix (Parson) on Jun 17, 2013 at 21:47 UTC

    I'm not entirely sure how use Devel::Main 'main' is better syntactic sugar than calling main()... though I object to the main() rather than an executable body in the first place.

    It seems like your main concern with just having your 'main' be the body of the script is the risk of conflicting variable names in different scopes causing readability or debugging issues. The easy way around this is to either:

    • Abstract your subroutines into one or more modules. Switching files is a pretty good cue that you're in a different scope when you're reading listings.
    OR

    • Use different variable naming conventions for the outer scope; for example, when I have scripts large enough for this to be an issue, all my package-scoped variables begin with a capital letter.

    I suggest that if your scripts are large enough for this sort of confusion to be likely, you're probably best abstracting bits of it away into modules anyhow.

    <radiant.matrix>
    Ramblings and references
    “A positive attitude may not solve all your problems, but it will annoy enough people to make it worth the effort.” Herm Albright
    I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: Main routines, unit tests, and sugar
by perlfan (Curate) on Jun 19, 2013 at 15:47 UTC
    Personally, I really like brian's modulino approach, but I use it sparingly.

    Main methods are one of the things that LPW wanted to do away with, or make implied (as many other things are).

    Yes, the main() or however you do it will make it more familiar to those comfortable with compiled languages. However, it's not really useful unless you're going to be taking advantage of it for things like unit testing.

    Where brian's modulino approach shines for me is in situations where I am writing a utility that generates a useful set of subroutines for a related set of utilities. With the modulino, I get the ability to run it as a utility or include it as a library in a related util. I get to create unit tests as well.

    So, I would highly recommend that approach in these situations, rather than just arbitrarily creating a main() method.
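    A minimal sketch of the modulino shape described above. The Local::Greeter package name and the greeting() helper are hypothetical, chosen to mirror the say_hello example from the original post:

```perl
package Local::Greeter;    # hypothetical package name for this sketch
use strict;
use warnings;

# Library-style helper: returns the text instead of printing it, so a
# related utility or a unit test can call it directly.
sub greeting {
    my ($greeted) = @_;
    $greeted //= 'world';
    return "Hello $greeted\n";
}

# Entry point for script use: prints the greeting for the first argument.
sub main {
    my ($class, @args) = @_;
    print greeting( $args[0] );
    return 0;
}

# Run main only when the file is executed directly; when the file is
# loaded with require or use, caller is true and nothing runs.
__PACKAGE__->main(@ARGV) unless caller;

1;
```

    Saved as a file and run directly, this prints a greeting; loaded with require from a test script, it only defines the subs, so greeting() can be checked without any output capture.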
