PerlMonks  

Fork and wait question

by chris68805 (Initiate)
on Jun 12, 2008 at 15:43 UTC ( #691696=perlquestion )
chris68805 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am writing a Perl program that has to process 16 files using another program written in C (the filename is passed on the command line) and wait until all files are completed before continuing the Perl program. I have researched online and found fork() and wait(). I have added them to my Perl program and they create the processes; however, the program does not wait until all files are completed before continuing. Please help.
#This is just the command to be called in execu
my $inputfile="perloutput";
my $endfile=".txt ";
my $smb="> ";
my $outputfile="output";
my $smb2="&";
while ($index<16) {
    my $testpath="chirp_md5sum ";
    $testpath=$testpath.$inputfile.$index.$endfile.$smb.$outputfile.$index.$endfile.$smb2;
    chomp $testpath;
    print ("$testpath \n");
    $id=fork();
    if($id eq 0) {
        exec ($testpath);
        wait();
        exit();
    }
    $index++;
}

Replies are listed 'Best First'.
Re: Fork and wait question
by kyle (Abbot) on Jun 12, 2008 at 16:07 UTC

    You wait in the child instead of in the parent. You might want to look at something like Parallel::ForkManager to handle this kind of thing for you. Otherwise, the easiest thing to do is wait 16 times outside the loop to collect all the children.

    You might also want to keep track of all the child PIDs so you can tell which you've reaped and which are still hanging around. If something is taking too long, it might be helpful to be able to kill it.
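The PID bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the child command is a hypothetical stand-in (a Perl one-liner run through `$^X`) rather than chirp_md5sum, and the 2-second timeout and the names `%child`, `%killed`, and `%result` are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ':sys_wait_h';    # provides WNOHANG

my $timeout = 2;    # seconds; an arbitrary limit chosen for this sketch
my %child;          # pid => job index, for children not yet reaped
my %killed;         # pids we have sent TERM to
my %result;         # job index => 'ok', 'failed' or 'killed'

for my $index (0 .. 3) {
    # Hypothetical workload: job 3 sleeps far past the timeout.
    my $seconds = $index == 3 ? 60 : 0;
    my @cmd = ($^X, '-e', "sleep $seconds");

    my $pid = fork();
    die "unable to fork: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: exec only returns if it fails.
        exec(@cmd) or die "unable to exec @cmd: $!\n";
    }
    $child{$pid} = $index;    # parent: remember which job this pid is
}

my $deadline = time() + $timeout;
while (%child) {
    for my $pid (keys %child) {
        next if waitpid($pid, WNOHANG) == 0;    # still running
        my $index = delete $child{$pid};
        $result{$index} = $killed{$pid} ? 'killed'
                        : $? == 0       ? 'ok'
                        :                 'failed';
    }
    if (time() >= $deadline) {
        for my $pid (grep { !$killed{$_} } keys %child) {
            kill 'TERM', $pid;    # taking too long: ask it to stop
            $killed{$pid} = 1;
        }
    }
    select(undef, undef, undef, 0.1) if %child;    # short pause between polls
}

print "job $_: $result{$_}\n" for sort keys %result;
```

Because `%child` shrinks as each PID is reaped, it always shows which children are still hanging around.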

      kyle is right - your call to wait is in the child and it should be in the parent. Actually, since it is right after the exec it will never get called under normal circumstances.

      To spawn a new child process, use:

      my $id = fork();
      die "unable to fork: $!" unless defined $id;
      if ($id == 0) {
          exec($testpath) or die "unable to exec $testpath: $!\n";
      }
      and after the spawning loop, use:
      while (wait > -1) {};
      to wait for all of your children to terminate.
        OK, I added what you suggested. However, the program still continues and finishes without waiting. I can pull up the process list on my Linux box and it shows the two instances of the chirp_md5sum program still running.
        $index=0;
        my $inputfile="perloutput";
        my $endfile=".txt ";
        my $smb="> ";
        my $outputfile="output";
        my $smb2="&";
        while ($index<2) {
            my $testpath="chirp_md5sum ";
            $testpath=$testpath.$inputfile.$index.$endfile.$smb.$outputfile.$index.$endfile.$smb2;
            chomp $testpath;
            print ("$testpath \n");
            $id=fork();
            if ($id == 0) {
                exec($testpath) or die "unable to exec $testpath: $!\n";
            }
            $index++;
        }
        while (wait > -1) {};
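A likely cause, going by the code above, is the trailing & in the command string. Because the string contains shell metacharacters, exec hands it to the shell; the shell puts chirp_md5sum in the background and exits at once, so wait reaps the shell rather than the real job. Below is a minimal sketch of the same loop without the &, doing the output redirection in the child instead of through the shell. The child command is a hypothetical stand-in (a Perl one-liner), and the output files go into a temporary directory purely so the sketch runs anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp 'tempdir';

my $dir = tempdir(CLEANUP => 1);    # scratch directory, for this sketch only

for my $index (0 .. 1) {
    my $outfile = "$dir/output$index.txt";

    # Stand-in for "chirp_md5sum perloutput$index.txt" (an assumption).
    # Note: no "> file" and no trailing "&" in the command.
    my @cmd = ($^X, '-e', 'sleep 1; print "done $ARGV[0]\n"', $index);

    my $pid = fork();
    die "unable to fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: redirect STDOUT ourselves, then become the command.
        open(STDOUT, '>', $outfile) or die "cannot write $outfile: $!\n";
        exec(@cmd) or die "unable to exec @cmd: $!\n";
    }
}

# Parent: blocks here until every child has exited.
while (wait() > -1) { }

# By now each job has written its output file.
my @finished = grep { -s "$dir/output$_.txt" } 0 .. 1;
print "finished jobs: @finished\n";
```

fork already runs each child alongside the parent, so the & adds nothing except detaching the job from the process that wait is watching.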
Re: Fork and wait question
by ikegami (Pope) on Jun 12, 2008 at 16:08 UTC

    Do you have a particular need to execute the 16 processes in parallel? If not, use system.

    for (0..15) {
        my $idx = $_ || '';  # Don't actually include '0' in the file name.
        my $in_file  = "perloutput$idx.txt";
        my $out_file = "output$idx.txt";
        my $rv = system("chirp_md5sum $in_file > $out_file");
        die("Error launching child: $!\n") if $rv == -1;
        die(sprintf("Child died with signal %d\n", $rv & 127)) if $rv & 127;
        die(sprintf("Child returned error %d\n", $rv >> 8)) if $rv;
    }
Re: Fork and wait question
by samtregar (Abbot) on Jun 12, 2008 at 16:56 UTC
    I'd like to second the recommendation to use Parallel::ForkManager. Friends don't let friends write their own fork() management code.

    But I'd also like to draw your attention to something not yet pointed out - you seem to believe that you can run code after calling exec(). That's incorrect - exec() replaces the process with a new one and never returns. Sometimes you'll see:

       exec(...) or die(...);

    But that's just there to handle the possibility that exec() might fail. If exec() succeeds then that process won't be running your code anymore.

    -sam
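A small sketch of the point about exec() not returning. The child command here is a hypothetical stand-in (a Perl one-liner that exits with status 7); any external program would behave the same way.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $pid = fork();
die "unable to fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: on success, exec replaces this Perl program entirely,
    # so nothing after this statement ever runs in the child.
    exec($^X, '-e', 'exit 7') or die "unable to exec: $!\n";
}

# Only the parent reaches this point.
waitpid($pid, 0);
my $exit_code = $? >> 8;
print "child exit code: $exit_code\n";
```

The exit code the parent sees comes from the exec'd command, not from any Perl code written after the exec call.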

Re: Fork and wait question
by zentara (Archbishop) on Jun 12, 2008 at 17:27 UTC
    Just screwing around... but I thought of using a piped open as an easy fork-and-exec. One problem: the filehandle stored in a hash won't print out. Anyone know why? Would this type of approach work?
    #!/usr/bin/perl
    use warnings;
    use strict;

    my %pids;
    my @to_be_processed = (1..20);
    my $cmd = 'echo';

    foreach my $p (@to_be_processed){
        $pids{$p}{'pid'} = open($pids{$p}{'fh'}, "$cmd $p 2>&1 |");
    }

    # get output doesn't work? get globs
    foreach my $key(keys %pids){
        while (<{ $pids{$key}{'fh'} }>){print $_}
    }

    foreach my $key(keys %pids){
        waitpid($pids{$key}{'pid'}, 0);
    }
    print "done\n";

    I'm not really a human, but I play one on earth CandyGram for Mongo
      The <> operator is super fussy due to being overloaded with ancient glob() craziness. It only works if you put in a simple scalar, not any kind of expression. This works:

      my $fh = $pids{$key}{'fh'};
      while (<$fh>){print $_}

      This kind of thing will work fine until your sub-processes need to produce more than 4k of data (or whatever your buffer size is). Then the sub-procs will block when they print() and won't work in parallel while you're collecting results. To fix this you need to switch to using something like IO::Select or EV to pull data from each process as it's ready.

      -sam
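A rough sketch of the IO::Select approach mentioned above, reading from whichever child has data ready. The child command is a hypothetical stand-in (a Perl one-liner that prints 100,000 bytes), and the names `%job_for` and `%output` are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Select;

my $select = IO::Select->new;
my (%job_for, %output);

for my $job (1 .. 3) {
    # Hypothetical stand-in command that writes well over 4k.
    my $pid = open(my $fh, '-|', $^X, '-e', 'print "x" x 100_000')
        or die "cannot start job $job: $!\n";
    $job_for{ fileno($fh) } = $job;
    $output{$job} = '';
    $select->add($fh);
}

# Read from whichever child has data ready, so none blocks on a full pipe.
while ($select->count) {
    for my $fh ($select->can_read) {
        my $job  = $job_for{ fileno($fh) };
        my $read = sysread($fh, my $chunk, 65536);
        die "read error on job $job: $!\n" unless defined $read;
        if ($read) {
            $output{$job} .= $chunk;
        }
        else {
            # End of file: the child closed its end of the pipe.
            $select->remove($fh);
            close $fh;    # also waits for that child to exit
        }
    }
}

printf("job %d produced %d bytes\n", $_, length($output{$_}))
    for sort keys %output;
```

The loop uses sysread rather than `<$fh>` so that Perl's own input buffering does not interfere with what select reports as readable.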

      Calling close on the file handles will also wait for the processes to complete.
        Where do you put the close for the file handles?
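One possible placement, offered as a sketch rather than the only answer: close each handle right after reading it to the end. The child command is a hypothetical stand-in (a Perl one-liner that prints one line and exits with its job number).

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %job;
for my $n (1 .. 3) {
    # Hypothetical stand-in command; each child exits with status $n.
    my $pid = open(my $fh, '-|', $^X, '-e', "print qq{job $n\\n}; exit $n")
        or die "cannot start job $n: $!\n";
    $job{$n} = { pid => $pid, fh => $fh };
}

my %exit_code;
for my $n (sort keys %job) {
    my $fh = $job{$n}{fh};
    print while <$fh>;           # drain the child's output first
    close $fh;                   # blocks until that child has exited
    $exit_code{$n} = $? >> 8;    # close on a piped handle sets $?
}

my @codes = map { $exit_code{$_} } sort keys %exit_code;
print "exit codes: @codes\n";
```

With the close in place, a separate waitpid loop is not needed for these children, since close on a piped handle already reaps the process.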
