
Recently, I've been thinking about a really, really minor Perl issue: what's the best way to format your script's main routine? I'd also always wondered how you're supposed to unit-test the main routine of a script. I recently came up with an idea (inspired by a brian_d_foy article) that answers both questions for me.

When I first started coding, I just put everything in my main routine at the top level of the file, in file scope. My scripts looked something like this (translated from Perl 4):

#!/usr/bin/env perl

use strict;
use warnings;

# Script variables
our $Foo = 'bar';

# Main routine
my ($name, $greeted) = @ARGV;

$name //= 'Horace!';
$greeted //= 'world';
say_hello($greeted);

# Subroutines
sub say_hello {
    my ($name) = @_;
    
    print "Hello $name\n";
}

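The problem with that format is that any variable I declare in the main routine, like $name above, stays in scope for the rest of the file, including every subroutine defined below it. If I'd misspelled $name in the parameter list of say_hello(), strict wouldn't have complained: the sub would have silently picked up the file-level $name instead, giving rise to a hard-to-trace bug.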
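Here's a minimal sketch of the kind of bug I mean. The greeting() sub and the $nmae typo are made up for illustration; the point is only that strict and warnings stay quiet.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# File-level lexical: visible in every sub defined below this line.
my $name = 'Horace!';

# Typo: the parameter is declared as $nmae, so the $name used in the
# string is the file-level one. strict has nothing to complain about.
sub greeting {
    my ($nmae) = @_;
    return "Hello $name\n";
}

print greeting('world');    # prints "Hello Horace!", not "Hello world"
```

If $name had been declared inside a block or a sub instead, the same typo would have been a compile-time error under strict.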

I've seen some people solve this problem by declaring a subroutine called main(), then calling it immediately afterward. This solves the immediate problem, but makes the code a little more confusing, at least to my eyes:

# Main routine
sub main {
    my ($name, $greeted) = @ARGV;
    $name //= 'Horace!';
    $greeted //= 'world';

    say_hello($greeted);
}

main();

It works, of course, but something about it bothers me. It's easy to miss the call to main() when reading the code, and my eyes tend to skip over the subroutine when looking for the main routine. Worse, if the main() call gets deleted, the script still compiles and runs, but silently does nothing: no error, no warning.

The style I adopted uses a labeled bare block to split the difference between a subroutine and top-level code, like so:

# Main routine
MAIN: {
    my ($name, $greeted) = @ARGV;
    $name //= 'Horace!';
    $greeted //= 'world';

    say_hello($greeted);
}

Here, I thought, was the One True Main Routine Style. My main routine is in a lexical block, but is automatically executed whenever I run the script. A friend of mine pointed out the issue with this code, though: I should call exit(0) afterward, to prevent any other code from being run:

# Main routine
MAIN: {
    my ($name, $greeted) = @ARGV;
    $name //= 'Horace!';
    $greeted //= 'world';

    say_hello($greeted);

    exit(0);
}
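To see why the exit(0) matters, here's a small sketch of a MAIN block without it. The @log array, helper(), and the stray statement are made up for illustration; they stand in for whatever top-level code happens to sit further down the file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @log;

MAIN: {
    push @log, 'main routine';
    # An exit(0) here would end the script before the stray line below.
}

sub helper { return 'helper' }

# Stray top-level statement, e.g. left over between subroutines.
# Without the exit(0) above, it runs right after the MAIN block.
push @log, 'stray top-level code';

print "$_\n" for @log;    # prints "main routine", then "stray top-level code"
```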

OK, that works, but the extra statement in every script got me thinking. Could I create some syntactic sugar that makes it obvious where the main routine is, but means that I don't have to remember to type 'exit' in every script?

The answer was inspired by a brian_d_foy article, "Five Ways To Improve Your Perl Programming", in which he describes modulinos, which are programs declared as modules. The key here is that he uses the 'sub main' method, but combines it with a function call that only happens when the file is run as a script. That lets you still use the package as a library, or write unit tests for it. Still, the code was a little more complicated than I wanted to write every day:

package My::Routine;

# ...

# Main routine
sub main {
    my ($name, $greeted) = @ARGV;
    $name //= 'Horace!';
    $greeted //= 'world';

    say_hello($greeted);
}

# Run the main routine only when called as a script
__PACKAGE__->main() unless caller;
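If you want to convince yourself that the "unless caller" trick behaves as described, here's a rough, self-contained sketch. It writes a throwaway modulino to a temp file (the file contents and the "main ran" message are made up for the demo), loads it with require, and then runs it as a script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical modulino source, written to a temp file for the demo.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_MODULINO';
package My::Routine;
use strict;
use warnings;

sub main { print "main ran\n"; return 1 }

# At file level, caller() is empty only when this file is the script itself.
__PACKAGE__->main() unless caller;
1;
END_MODULINO
close $fh or die "close failed: $!";

# Loaded as a library: caller() is true inside the file, so main() is skipped.
require $path;
print "required without running main\n";

# Run as a script: caller() is empty, so main() runs.
my $output = qx{"$^X" "$path"};
print "as a script: $output";    # prints "as a script: main ran"
```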

Today, it occurred to me that I could write a module that provides some syntactic sugar for the perfect main routine. It would be simple and obvious, provide a lexical block, and exit afterward. In short, like this:

#!/usr/bin/perl

use Devel::Main 'main';

# ...

# Main routine
main {
    my ($name, $greeted) = @ARGV;
    $greeted //= 'world';
    say_hello($greeted);
};

It might seem odd to make syntactic sugar that does so little, but this is the best solution I've come up with. The 'main' block will run if this file is called as a script. If it's brought into another script via require() or use(), the main routine won't run, but it will create a function called run_main() that will call the main routine with @ARGV set to its arguments. That way, I can write a test script like so:

#!/usr/bin/env perl
require 'script_with_main.pl';
print "Loaded script!\n";
run_main('Shakespeare!', 'perlmonks');

Which would print:

$ perl test_test_main.pl 
Loaded script!
Hello perlmonks
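The run_main() idea boils down to localizing @ARGV around a code reference. Here's a stripped-down sketch of just that part, without the module or the exporter; make_runner() is a hypothetical helper name, and the main routine returns its greeting instead of printing it so the result is easy to check.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: wrap a main routine so its arguments become @ARGV.
sub make_runner {
    my ($main_sub) = @_;
    return sub {
        local @ARGV = @_;    # the old @ARGV comes back when this sub returns
        return $main_sub->();
    };
}

my $run_main = make_runner(sub {
    my ($name, $greeted) = @ARGV;
    $greeted //= 'world';
    return "Hello $greeted\n";
});

@ARGV = ('original');
print $run_main->('Shakespeare!', 'perlmonks');    # Hello perlmonks
print $run_main->();                               # Hello world
print "@ARGV\n";                                   # original
```

Because of the local, a test script can call the wrapper as often as it likes without clobbering its own command-line arguments.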

So here's the code that provides the syntactic sugar. Honestly, I wouldn't be surprised if someone has already done this on CPAN, but a cursory search didn't show me anything.

use strict;
use warnings;

# Devel::Main by stephen

package Devel::Main {

    # We use Sub::Exporter so you can import main under a different name
    # with: use Devel::Main 'main' => { -as => 'other' };
    use Sub::Exporter;
    Sub::Exporter::setup_exporter({ exports => [ qw/main/ ]});

    # Later versions will let you customize this
    our $Main_Sub_Name = 'run_main';

    sub main (&) {
        my ($main_sub) = @_;

        # If there is no stack frame above our caller, the calling file is
        # the top-level script (not require'd or use'd): run main and exit
        if ( !defined caller(1) ) {
            $main_sub->();
            exit(0);
        }
        # Otherwise, create a sub that turns its arguments into @ARGV
        else {
            no strict 'refs';
            my $package = caller;
            *{"${package}::$Main_Sub_Name"} = sub {
                local @ARGV = @_;
                return $main_sub->();
            };
            
            # Return 1 to make the script pass 'require'
            return 1;
        }
    }


};

1;

stephen

In reply to Main routines, unit tests, and sugar by stephen
