appleb has asked for the wisdom of the Perl Monks concerning the following question:

My perl script accepts many arguments via Getopt::Std, for example

perl myScript -i origSamples -p 1 -o 100

all lovely jubbly. Then I decided to improve it so that I can run it many times without having to ask it explicitly, by turning the arguments into lists.

perl myScript -i origSamples,moreSamples -p 1,3 -o 100,200

It calls the main processing subroutine many times

processOne('origSamples',1,100)
processOne('origSamples',1,200)
processOne('origSamples',3,100)
processOne('origSamples',3,200)
processOne('moreSamples',1,100)
...

but in order to achieve this, there are now many foreach statements that run each argument in combination with each of the other arguments in the other lists. There is one of the following for each argument option, all nested:

if (scalar(@incls) > 0) {
    foreach $inc (@incls) {
        $args{i} = $inc;
        processOneCheckP();
    }
}
else {
    processOneCheckP();
}

My customer has already added more arguments, so I want to store the arguments in a data structure that will allow me to combine each argument from each list in turn. Then if I add another argument or change it into a list I shouldn't have to recode. Where do I start? Arrays, hashes, hashes of arrays? Something efficient, but allows easy parsing into a list of arguments to send to a subroutine.

Cheers in anticipation.

Re: Storing multiple arguments in a data structure that allows for future expansion
by Moron (Curate) on May 02, 2007 at 08:57 UTC
    I use a hash for the options whose keys are option letters and whose values can be either a scalar for a single or empty value or an array reference for multiple values, e.g.:
    use Getopt::Std;
    my %opt;
    getopt('abc',\%opt);
    Multivalued(\%opt);
    #...
    sub Multivalued { # convert lists to array refs.
        my $href = shift;
        while( my ($k, $v ) = each %$href ) {
            my @anon = split /\,/, $v or next;
            $href -> { $k } = \@anon;
        }
    }
    And to iterate the combinations of the arrays, see Math::Combinatorics.

    Update: and if you have to support protected commas in quotes, instead of just splitting, parse with something like Text::CSV::Simple.
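
    If a module is not an option, the cross product of the lists can also be built by hand. A minimal sketch, assuming the option hash already holds array refs as produced above; all_combinations is a hypothetical helper name, not part of any module:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from any module): expand a hash of array refs
# into a list of hashrefs, one per combination of values.
sub all_combinations {
    my ($opt) = @_;
    my @combos = ( {} );                 # start with one empty combination
    for my $key ( sort keys %$opt ) {    # sorted so the order is repeatable
        my @next;
        for my $combo (@combos) {
            for my $value ( @{ $opt->{$key} } ) {
                # copy the partial combination and add this key's value
                push @next, { %$combo, $key => $value };
            }
        }
        @combos = @next;
    }
    return @combos;
}

my %opt = (
    i => [qw(origSamples moreSamples)],
    p => [ 1, 3 ],
    o => [ 100, 200 ],
);

# One line per run: 2 x 2 x 2 = 8 combinations.
for my $run ( all_combinations( \%opt ) ) {
    print join( ' ', map { sprintf '-%s %s', $_, $run->{$_} } sort keys %$run ), "\n";
}
```

    Adding a fourth option then only means adding a fourth key to the hash; the loop itself does not change.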


    ^M Free your mind!

Re: Storing multiple arguments in a data structure that allows for future expansion
by jbert (Priest) on May 02, 2007 at 09:07 UTC
    I'm not sure this is the simplest way, but the problem suggested recursion to me. The code below demonstrates what I think you want to achieve. It's a 'first class functions' approach and uses the ability of a closure (anonymous subroutine) to capture information.

    My starting point is where I think you are already - you have a hash whose keys are the options and whose values are array references containing the multiple-choice values. If you don't have that already, then simply go through the values of your options hash, replacing each value with [ split(/,/, $value) ] or similar.

    I then pass the multi-option hash and the function I want executed to the 'run_all_perms' function. This then plucks one key and list-of-values from the options hash.

    If we have more options to process, it recurses once for each value in the list, but we wrap the 'work' function in a closure which installs the particular value in a hashref passed to the work function.

    If we don't have any more options to process, it can actually call the passed in work function, passing in the particular value.

    Here's the code:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper qw/Dumper/;

    my %opts = (
        a => [qw/1 2 3/],
        b => [qw/4 5 6/],
        c => [qw/foo bar baz/],
    );

    # The work function should expect a hashref containing the values
    # for this run
    my $work = sub {
        my $hr = shift;
        print "A $hr->{a} B $hr->{b} C $hr->{c}\n";
    };

    run_all_perms(\%opts, $work);
    exit 0;

    sub run_all_perms {
        my $orig_opts = shift;
        my $work = shift;

        my %multi_opts = %$orig_opts;
        my @keys = sort keys %multi_opts;
        my $key = $keys[0];
        # We don't want to pass this key on
        my $vals = delete $multi_opts{$key};
        foreach my $val (@$vals) {
            if (keys %multi_opts) {
                # We have more perms to do
                my $new_work = sub {
                    my $hr = shift;
                    # Capture this arg in the hashref
                    $hr->{$key} = $val;
                    # Run the previous work function with the accumulated args
                    $work->($hr);
                };
                # Recurse with our accumulating work function
                run_all_perms(\%multi_opts, $new_work);
            }
            else {
                # We've captured all the args we need, let's go
                $work->({$key => $val});
            }
        }
    }

Re: Storing multiple arguments in a data structure that allows for future expansion
by Herkum (Parson) on May 02, 2007 at 11:59 UTC

    Don't use an array, use a hash reference. It will be so much easier. Take this example,

    processOne('Sample', 2, '', '', 1000, '', 'write', 'read', 'cache');

    If you use an array, you are dependent on the order of the array to pass the arguments. Even for switches you do not need, you have to include a value anyway just to make sure the array has the correct structure. It becomes easy to miss a value and you will spend a lot of time trying to identify what the problem is. In addition, none of the values are descriptive, so you have no idea what they mean just by looking at them. If you have to go back, you may forget the order, and someone else who needs to change it will have a hard time trying to figure it out.

    Using a hash reference makes it easier to read and add switches.

    processOne({
        file_type       => 'Sample',
        read_lines_to   => 1000,
        write_output    => 1, # TRUE
        cache_file      => 1, # TRUE
        cache_in_memory => 0, # FALSE
        fetch_column_no => 2,
    });

    The next person who comes along at least has a chance of identifying the values and what they are supposed to be used for.
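
    On the receiving side, a hash reference also makes defaults and typo checks cheap. A minimal sketch, assuming the option names from the example above; the default values and the %DEFAULT table are hypothetical, and this processOne only returns the merged options instead of doing real work:

```perl
use strict;
use warnings;

# Hypothetical defaults; the option names follow the example above.
my %DEFAULT = (
    file_type       => 'Sample',
    read_lines_to   => 1000,
    write_output    => 0,
    cache_file      => 0,
    cache_in_memory => 0,
    fetch_column_no => 1,
);

sub processOne {
    my ($arg) = @_;

    # Reject misspelled option names early.
    for my $name ( keys %$arg ) {
        die "Unknown option '$name'\n" unless exists $DEFAULT{$name};
    }

    # Caller-supplied values override the defaults.
    my %opt = ( %DEFAULT, %$arg );

    return \%opt;    # a real version would do the processing here
}

my $used = processOne( { fetch_column_no => 2, write_output => 1 } );
print "$_ => $used->{$_}\n" for sort keys %$used;
```

    A new switch then needs one more line in the defaults table, and existing callers keep working unchanged.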

Re: Storing multiple arguments in a data structure that allows for future expansion
by dragonchild (Archbishop) on May 02, 2007 at 17:15 UTC
    You're looking for Algorithm::Loops

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Storing multiple arguments in a data structure that allows for future expansion
by Anonymous Monk on May 02, 2007 at 20:05 UTC
    Trying to understand exactly what you mean by "more arguments".

    Are you saying that you currently have arguments -i, -p, and -o, and your customer wants to add arguments -x, -y, and -z?

    Or, are you saying you currently handle two arguments for each parameter, and your customer wants to use 5 for each parameter?

    If you're talking about the former, I'd imagine you'd have to make some mods in your program to do that anyway. Before, the processOne subroutine was only taking three parameters. Now, you'd have to not only pass in five parameters, but modify the code to actually use them. If you are referring to the same three parameters, but more values for them, you could do something like this:

    my (@opt_i_list, @opt_o_list, @opt_p_list);

    #
    # ####Put options into Arrays
    #
    if ($opt_i) {
        @opt_i_list = split(",", $opt_i);
    }
    else {    #No $opt_i parameter: Use default
        $opt_i_list[0] = $DEFAULT_OPT_I_VALUE;
    }

    if ($opt_o) {
        @opt_o_list = split(",", $opt_o);
    }
    else {
        $opt_o_list[0] = $DEFAULT_OPT_O_VALUE;
    }

    if ($opt_p) {
        @opt_p_list = split(",", $opt_p);
    }
    else {
        $opt_p_list[0] = $DEFAULT_OPT_P_VALUE;
    }

    #
    # ####Now run processOne for each arrangement of parameters
    #
    foreach my $opt_i_value (@opt_i_list) {
        foreach my $opt_o_value (@opt_o_list) {
            foreach my $opt_p_value (@opt_p_list) {
                # No quotes around $opt_i_value: single quotes would not interpolate
                processOne($opt_i_value, $opt_p_value, $opt_o_value);
            }
        }
    }
    That will process all values of all the options against each other. So, if you have two values for each of the three options, the processOne subroutine will execute eight times. If there are three values for each, the processOne subroutine will execute 27 times.

    If you add a parameter -x, you'd have to modify your processOne subroutine anyway. Adding another foreach loop in that structure would probably be relatively simple compared to the other coding changes you'd have to make. Is it efficient? Well, the number of times a loop has to process quickly grows with each additional value. That's not exactly efficient, but that's what you said you wanted.
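
    The growth is just the product of the list lengths. A small sketch to make the arithmetic concrete; run_count is a hypothetical helper, not part of the code above:

```perl
use strict;
use warnings;

# Hypothetical helper: number of processOne calls for the given lists,
# i.e. the product of the list lengths.
sub run_count {
    my @lists = @_;
    my $count = 1;
    $count *= scalar @$_ for @lists;
    return $count;
}

print run_count( [ 1, 2 ], [ 1, 2 ], [ 1, 2 ] ), "\n";             # 8
print run_count( [ 1, 2, 3 ], [ 1, 2, 3 ], [ 1, 2, 3 ] ), "\n";    # 27
```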

Re: Storing multiple arguments in a data structure that allows for future expansion
by appleb (Novice) on May 03, 2007 at 03:53 UTC
    Thanks for all the help guys - really appreciate it. I wanted to be able to add more options (-x, -y, -z) easily, and to do it I ended up using an array of hashes - @arrAllArgs. When I enter:

    perl myScript -i origSamples,moreSamples -p 1,3 -o 100,200

    There is now a separate statement for parsing each argument:

    parseNumericArgList($args{p}, 'pipeline', 1);
    parseNumericArgList($args{o}, 'overlap', 1);

    And the subroutine (it's long winded but I can understand it):

    sub parseNumericArgList {
        my ($arg, $tag, $deflt) = @_;
        my $sing = 0;
        my @new = ();
        if (defined($arg)) {
            # a list was entered
            if ($arg =~ /,/) {
                my @singles = split(",", $arg);
                foreach $sing (@singles) {
                    if ($sing =~ /\D/) {
                        print "\n\nWrong value in $tag\n\n";
                        die $USAGE
                    }
                    else {
                        if (scalar(@arrAllArgs) == 0) {
                            push (@new, addToRef($allArgs, $tag, $sing));
                        }
                        else {
                            foreach $allArgs (@arrAllArgs) {
                                push (@new, addToRef($allArgs, $tag, $sing));
                            }
                        }
                    }
                }
            }
            else {
                # a single value was entered
                if ($arg =~ /\D/) {
                    print "\n\nWrong value in $tag\n\n";
                    die $USAGE
                }
                if (scalar(@arrAllArgs) == 0) {
                    push (@new, addToRef($allArgs, $tag, $arg));
                }
                else {
                    foreach $allArgs (@arrAllArgs) {
                        push (@new, addToRef($allArgs, $tag, $arg));
                    }
                }
            }
        }
        else {
            # nothing was entered, use default
            if (scalar(@arrAllArgs) == 0) {
                push (@new, addToRef($allArgs, $tag, $deflt));
            }
            else {
                foreach $allArgs (@arrAllArgs) {
                    push (@new, addToRef($allArgs, $tag, $deflt));
                }
            }
        }
        @arrAllArgs = ();
        @arrAllArgs = @new;
    }

    sub addToRef {
        my ( $refIn, $addNm, $addVal ) = @_;
        my $ref = {};
        my $k = "";
        foreach $k (keys %{$refIn}) {
            $ref->{$k} = $refIn->{$k};
        }
        $ref->{$addNm} = $addVal;
        return $ref;
    }
    This provides me with the array of hashes @arrAllArgs:

    0  HASH(0x9e2930)
       'overlap' => 100
       'pipeline' => 1
    1  HASH(0x9e55f0)
       'overlap' => 200
       'pipeline' => 1
    2  HASH(0x9e2910)
       'overlap' => 100
       'pipeline' => 3
    3  HASH(0xd8ead0)
       'overlap' => 200
       'pipeline' => 3

    And then when I get to the main block of code it's a simple matter of

    foreach $run (@arrAllArgs) {
        foreach $ak (keys %{$run}) {
            $runArgs{$ak} = $run->{$ak};
        }
        processOne();
    }

    Have to add a bit to cope with non-numeric arguments and I'm there. Hope this helps anyone else in the same sitch....
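
    For anyone landing here later, the same array of hashes can be built from a small table instead of one parse call per option. A rough sketch only, assuming the option letters and tags from this thread; @SPEC, expand_args, the 'input' tag, the defaults and the validation patterns are made up for illustration and are not the code above:

```perl
use strict;
use warnings;

# Hypothetical spec table: option letter, tag, default, validation pattern.
my @SPEC = (
    [ i => 'input',    'origSamples', qr/^\w+$/ ],
    [ p => 'pipeline', 1,             qr/^\d+$/ ],
    [ o => 'overlap',  1,             qr/^\d+$/ ],
);

# Turn the raw option strings into one hashref per combination of values.
sub expand_args {
    my ($args) = @_;
    my @runs = ( {} );    # one empty run to start from
    for my $spec (@SPEC) {
        my ( $letter, $tag, $default, $valid ) = @$spec;

        # Fall back to the default when the option was not given.
        my $raw    = defined $args->{$letter} ? $args->{$letter} : $default;
        my @values = split /,/, $raw;

        for my $value (@values) {
            die "Wrong value '$value' in $tag\n" unless $value =~ $valid;
        }

        # Cross every run built so far with every value of this option.
        @runs = map {
            my $run = $_;
            map { +{ %$run, $tag => $_ } } @values;
        } @runs;
    }
    return @runs;
}

my @arrAllArgs = expand_args( { p => '1,3', o => '100,200' } );
for my $run (@arrAllArgs) {
    print join( ', ', map { sprintf '%s=%s', $_, $run->{$_} } sort keys %$run ), "\n";
}
```

    With this layout a non-numeric option is one more row in the table with a different pattern, rather than a second parse subroutine.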