PerlMonks  

Re^2: Parallelization of heterogenous (runs itself Fortran executables) code

by Dominus (Parson)
on Nov 20, 2007 at 14:16 UTC (#651934)


in reply to Re: Parallelization of heterogenous (runs itself Fortran executables) code
in thread Parallelization of heterogenous (runs itself Fortran executables) code

You didn't steal it; it was a gift.

It has evolved a little since I posted it. Here is the current version:

    #!/usr/bin/perl

    use Getopt::Std;
    my %opt = (n => 1);
    getopts('r:n:v', \%opt) or usage();

    my $cmd = shift;
    @ARGV = shuffle(@ARGV) if $opt{r};

    my %pid;
    while (@ARGV) {
      if (keys(%pid) < $opt{n}) {
        $pid{spawn($cmd, split /\s+/, shift @ARGV)} = 1;
      } else {
        delete $pid{wait()};
      }
    }

    1 while wait() >= 0;

    sub spawn {
      my $pid = fork;
      die "fork: $!" unless defined $pid;
      return $pid if $pid;
      warn "@_\n" if $opt{v};
      exec @_;
      die "exec: $!";
    }

    sub usage {
      print STDERR "Usage: $0 [-n N] [-r] [-v] command arg1 arg2...
    Run command arg1, command arg2, etc., concurrently.
    Run no more than N processes simultaneously (default 1)
      -r: run commands in random order instead of specified order (unimpl.)
      -v: verbose mode
    ";
      exit 1;
    }

The major missing feature is that at present there's no way to get it to run cmd -x arg1, cmd -x arg2...; there's no way to get the constant -x in there.

I hereby put this program in the public domain.

Share and enjoy!


Re^3: Parallelization of heterogenous (runs itself Fortran executables) code
by codeacrobat (Chaplain) on Nov 20, 2007 at 21:40 UTC
    Here is a version using Text::ParseWords and a modified spawn subroutine, which uses exec in scalar context. This allows parameterized command execution.
        #!/usr/bin/perl

        use Getopt::Std;
        use Text::ParseWords;
        my %opt = (n => 1);
        getopts('r:n:v', \%opt) or usage();

        my $cmd = shift;
        @ARGV = shuffle(@ARGV) if $opt{r};

        my %pid;
        while (@ARGV) {
          if (keys(%pid) < $opt{n}) {
            $pid{spawn($cmd, map { shellwords($_) } shift @ARGV)} = 1;
          } else {
            delete $pid{wait()};
          }
        }

        1 while wait() >= 0;

        sub spawn {
          my $pid = fork;
          die "fork: $!" unless defined $pid;
          return $pid if $pid;
          warn "@_\n" if $opt{v};
          exec qq(@_);
          die "exec: $!";
        }

        sub usage {
          print STDERR "Usage: $0 [-n N] [-r] [-v] command arg1 arg2...
        Run command arg1, command arg2, etc., concurrently.
        Run no more than N processes simultaneously (default 1)
          -r: run commands in random order instead of specified order (unimpl.)
          -v: verbose mode
        ";
          exit 1;
        }

    The following snippet will ping google.com and amazon.com once in the first chunk and yahoo.com in the second.
        runN -v -n 2 'ping -n 1' google.com amazon.com yahoo.com
        ping -n 1 google.com
        ping -n 1 amazon.com
        Ping google.com [72.14.207.99] mit 32 Bytes Daten:
        Antwort von 72.14.207.99: Bytes=32 Zeit=116ms TTL=242
        Ping-Statistik für 72.14.207.99:
            Pakete: Gesendet = 1, Empfangen = 1, Verloren = 0 (0% Verlust),
        Ca. Zeitangaben in Millisek.:
            Minimum = 116ms, Maximum = 116ms, Mittelwert = 116ms
        Ping amazon.com [72.21.210.11] mit 32 Bytes Daten:
        ping -n 1 yahoo.com
        Ping yahoo.com [66.94.234.13] mit 32 Bytes Daten:
        Antwort von 66.94.234.13: Bytes=32 Zeit=185ms TTL=54
        Ping-Statistik für 66.94.234.13:
            Pakete: Gesendet = 1, Empfangen = 1, Verloren = 0 (0% Verlust),
        Ca. Zeitangaben in Millisek.:
            Minimum = 185ms, Maximum = 185ms, Mittelwert = 185ms
        Zeitüberschreitung der Anforderung.
        Ping-Statistik für 72.21.210.11:
            Pakete: Gesendet = 1, Empfangen = 0, Verloren = 1 (100% Verlust)

    print+qq(\L@{[ref\&@]}@{['@'x7^'!#2/"!4']});
      Here is a version using Text::ParseWords and a modified spawn subroutine, which uses exec in scalar context.
      I think that was an extremely bad move, because now you have to worry about shell metacharacters in arguments, which you formerly did not. For example, consider:
      runN echo '*'
      We are expecting this to run one echo to print out a star, but with your implementation, it does not; the quotes are stripped off and it echoes a list of the filenames in the current directory.

      Similarly, suppose a file in the current directory is named foo bar. Then runN command * will run the command not with the argument foo bar but with the two arguments foo and bar.

      Now let's suppose there is a file in the current directory named `rm -rf /`. (Note backquotes!) Then running your version of runN command * will erase the entire filesystem.

      I wrote a blog entry about the perils of multiple shell evaluation in this context. I said:

      My fear was that by introducing a double set of shell-like interpretation, I'd be opening a horrible can of escape character worms and weird errors, and my hope was that if I ignored the issue the problems might be simpler, and might never arise in practice.
      This is exactly the sort of thing I was worried about. Your implementation certainly proves that I was correct about at least the first part of this prediction!
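
The difference between handing Perl a list of arguments and handing it one command string can be seen in a small experiment. The sketch below is only an illustration, not part of runN: it assumes a Unix-like system with echo on the PATH and a /bin/sh, and the helper name capture and the scratch file foo bar are made up for the demonstration. It uses a pipe open, which follows the same list-versus-string rule as exec and system.

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

# Scratch directory holding a single file whose name contains a space.
my $dir = tempdir(CLEANUP => 1);
chdir $dir or die "chdir: $!";
open my $fh, '>', 'foo bar' or die "open: $!";
close $fh;

# Hypothetical helper: run a command and return what it wrote to stdout.
# Several arguments: the program is executed directly, as with exec LIST.
# One string containing metacharacters: it is handed to the shell first.
sub capture {
    my @command = @_;
    open(my $out, '-|', @command) or die "cannot run @command: $!";
    my $text = do { local $/; <$out> };
    close $out;
    return $text;
}

my $list_form   = capture('echo', '*');   # echo receives a literal star
my $string_form = capture('echo *');      # the shell expands the star first

print "list form:   $list_form";
print "string form: $string_form";

# Leave the scratch directory so that it can be removed at exit.
chdir '/' or die "chdir: $!";
```

In the list form the star reaches echo untouched; in the string form the shell replaces it with the file name before echo ever runs, which is the behavior the examples above warn about.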

Re^3: Parallelization of heterogenous (runs itself Fortran executables) code
by blazar (Canon) on Nov 21, 2007 at 15:58 UTC

    Ok, you are mjd so I probably shouldn't "dare" to comment, but...

    use Getopt::Std;

    I personally believe that, since we recommend that newbies and more experienced programmers alike always use strict and warnings, and since this is a nice little utility likely to be picked up as an example, it would be a good thing if it had

    use strict; use warnings;

    at the top. So to build a better future for our children...

    Also,

    use List::Util 'shuffle';

    would implement the -r switch straight ahead.

    getopts('r:n:v', \%opt) or usage();

    The docs do not say anything about getopts() return value, and indeed experimental evidence is that it can't be relied upon for failure checking. Suitable hooks are provided instead, although admittedly I don't like the interface. (Suitably named subs in main::)

        sub usage {
          print STDERR "Usage: $0 [-n N] [-r] [-v] command arg1 arg2...
        Run command arg1, command arg2, etc., concurrently.
        Run no more than N processes simultaneously (default 1)
          -r: run commands in random order instead of specified order (unimpl.)
          -v: verbose mode
        ";
          exit 1;
        }

    Any good reason for basically reimplementing die? Incidentally, I would have used a here-doc instead. Personally, I like to implement a USAGE sub like this:

        sub USAGE () {
            my $name=basename $0; # File::Basename's
            <<".EOT";
        Usage: $name [args]
        [actual usage here]
        .EOT
        }

    So if the user explicitly asks for help, then I print to STDOUT and exit regularly, for in that case I wouldn't consider the program termination to be "abnormal". Else, I regularly die USAGE.

      Also, use List::Util 'shuffle'; would implement the -r switch straight ahead.
      Sure, but if nobody uses the feature, my implementation is better.

      The docs do not say anything about getopts() return value, and indeed experimental evidence is that it can't be relied upon for failure checking.
      It is very strange that the manual does not mention the return value of getopts(), but it does in fact return true on success and false on failure, and has done so since perl 4.0.

      I would like to see the "experiments" that you tried. The source code is quite clear:

          sub getopts ($;$) {
              my ($argumentative, $hash) = @_;
              ...
              # no "return" anywhere...
              ...
              $errs == 0;
          }

      So it seems to me that what we have here is a documentation failure.

      Addendum: Every Getopt::Std test suite ever distributed with Perl, going back to 5.004_04, tests for this behavior. So now I would really like to see your experiments.

        Addendum: Every Getopt::Std test suite ever distributed with Perl, going back to 5.004_04, tests for this behavior. So now I would really like to see your experiments.

        I personally believe that you're right. More precisely, you're obviously right. Anyway I probably just tried to add an unknown switch to a cmd line that worked:

            C:\temp>runn.pl -q
            Unknown option: q
            Usage: C:\temp\runn.pl [-n N] [-r] [-v] command arg1 arg2...
            Run command arg1, command arg2, etc., concurrently.
            Run no more than N processes simultaneously (default 1)
              -r: run commands in random order instead of specified order (unimpl.)
              -v: verbose mode

            C:\temp>runn.pl -n 2 dir *.txt *.pl -q
            exec: No such file or directory at C:\temp\runn.pl line 27.
            exec: No such file or directory at C:\temp\runn.pl line 27.
            exec: No such file or directory at C:\temp\runn.pl line 27.

        I personally believe that the latter should also exit early and print the usage screen.
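
A likely explanation for the second command line above is that Getopt::Std stops parsing at the first argument that is not a switch, so a trailing -q is never examined and getopts() still reports success. The sketch below illustrates this under that assumption; try_getopts is a hypothetical helper, and the option string is the one from the script.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: parse an argument list with the script's option
# string and return getopts' result plus whatever was left unparsed.
sub try_getopts {
    local @ARGV = @_;
    my %opt = (n => 1);
    my $ok = getopts('r:n:v', \%opt) ? 1 : 0;
    return ($ok, [@ARGV]);
}

# Unknown switch in leading position: getopts complains and returns false.
my ($ok_leading, $rest_leading) = try_getopts('-q', 'dir');

# Unknown switch after the first non-switch argument: parsing has already
# stopped at 'dir', so '-q' is left in @ARGV as an ordinary argument.
my ($ok_trailing, $rest_trailing) =
    try_getopts('-n', '2', 'dir', '*.txt', '*.pl', '-q');

print "leading  -q: ok=$ok_leading, left over: @$rest_leading\n";
print "trailing -q: ok=$ok_trailing, left over: @$rest_trailing\n";
```

If that reading is right, the return value is reliable for switches that getopts actually parses, and the trailing -q simply ends up among the arguments passed to the command.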

Re^3: Parallelization of heterogenous (runs itself Fortran executables) code
by Dominus (Parson) on Nov 21, 2007 at 22:19 UTC
    The major missing feature is that at present there's no way to get it to run cmd -x arg1, cmd -x arg2...; there's no way to get the constant -x in there.
    I tried fixing it by changing:
    my $cmd = shift;
    to:
    my @cmd = split /\s+/, shift;
    and:
    $pid{spawn($cmd, split /\s+/, shift @ARGV)} = 1;
    to:
    $pid{spawn(@cmd, shift @ARGV)} = 1;
    (I also decided that splitting the remaining arguments was a fairly dumb mistake, so got rid of that.)

    Now, to get it to run cmd -x arg1, cmd -x arg2..., you just run runN "cmd -x" arg1 arg2...

    I think this has worked out pretty well in practice so far, but only time will tell.

    Complete code is now:

        #!/usr/bin/perl

        use Getopt::Std;
        my %opt = (n => 1);
        getopts('r:n:v', \%opt) or usage();

        my @cmd = split /\s+/, shift;
        @ARGV = shuffle(@ARGV) if $opt{r};

        my %pid;
        while (@ARGV) {
          if (keys(%pid) < $opt{n}) {
            $pid{spawn(@cmd, shift @ARGV)} = 1;
          } else {
            delete $pid{wait()};
          }
        }

        1 while wait() >= 0;

        sub spawn {
          my $pid = fork;
          die "fork: $!" unless defined $pid;
          return $pid if $pid;
          warn "@_\n" if $opt{v};
          exec @_;
          die "exec: $!";
        }

        sub usage {
          print STDERR "Usage: $0 [-n N] [-r] [-v] command arg1 arg2...
        Run command arg1, command arg2, etc., concurrently.
        Run no more than N processes simultaneously (default 1)
          -r: run commands in random order instead of specified order (unimpl.)
          -v: verbose mode
        ";
          exit 1;
        }
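
To make the effect of the change concrete, here is a minimal sketch of how the argument vectors come out once the command string is split and the per-job arguments are left alone. The helper build_jobs and the sample arguments are hypothetical and exist only for this illustration; the split is the same one used in the script.

```perl
use strict;
use warnings;

# Hypothetical helper: the command string is split on whitespace once,
# while every job argument is passed through untouched.
sub build_jobs {
    my ($command, @args) = @_;
    my @cmd = split /\s+/, $command;
    return map { [@cmd, $_] } @args;
}

my @jobs = build_jobs('cmd -x', 'arg1', 'foo bar');

# Show each argument vector with its elements separated by ' | '.
print join(' | ', @$_), "\n" for @jobs;
```

Each job gets the constant -x, and an argument such as foo bar stays a single element instead of being broken into two.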
