tradez has asked for the wisdom of the Perl Monks concerning the following question:
The issue I bring you today is a mix of distributed computing, a favorite of all my SETI friends, and forking (which until 5.8 was a favorite of no one). I am currently using Proc::Queue for fork management and I am loving it. However, I have downtime while processes switch off, and I would like to eliminate it by allowing my children to spawn children of their own. That is scary enough that I thought I would bounce it off the wall here first. Consider the following:
    Proc::Queue::size(5);
    print "Starting some procs\n";
    foreach my $switchName (@switchList) {
        $ga->{'switchName'} = $switchName;
        my $hashRef = $ga->{'outputString'}{$switchName};
        run_back {
            print "Working on $switchName\n";
            print "Running remote script on $switchName\n";
            my $cmd = `/usr/bin/ssh psg01\@$switchName /home/psg01/serverSnapshot.pl 2>/dev/null`;
            print "Getting the files from $switchName\n";
            $cmd = `/usr/bin/scp psg01\@$switchName:/tmp/snapshot*.out /tmp/ >/dev/null 2>&1`;
            print "Got the files, time to process for $switchName\n";
            processFiles(switchName => $switchName, outputStringHashRef => $hashRef);
            print " HEY!! I am so done with $switchName :-) \n";
        };
    }
    1 while wait != -1;
    print "Finished!\n";

    Proc::Queue::size(30);
    foreach my $switchName (@switchList) {
        my $loadID    = $ompLoadHash{$switchName};
        my @formNames = sort keys %{ $activeFormHash{$loadID} };
        foreach my $formName (@formNames) {
            run_back {
                my $db_input = "/tmp/snapshot.$switchName.$formName.input";
                print "Going into DB for $switchName\n";
                `psql test -U postgres -c "COPY ${formName}_snapshot (parm_id, value, element_id, collect_date) from '$db_input'"`;
                unlink $db_input;
                print "I am done going into DB for $switchName on $formName :-{}\n";
            };
        }
    }
    1 while wait != -1;
    print "Finished!\n";
This all works great. The only problem is that the first two steps in the first run_back (running the remote script and getting the files from the remote box) cause a lot of local idle time. What I would like to do is drop the second run_back entirely and instead have the first run_back launch up to 30 procs to do the COPY, while the next proc in the Proc::Queue line goes off and instantiates the script on the next remote box. Does this logic work? What is my best path to follow? Oh let your wisdom shine down upon me.
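Roughly what I am picturing, sketched with plain fork so the nesting is explicit and it runs without Proc::Queue (do_fetch, forms_for, and do_copy are placeholders standing in for the ssh/scp and psql COPY steps above):

```perl
#!/usr/bin/perl
# Sketch: each fetch child spawns its own loader grandchildren for the
# COPY step, so the parent can move on to the next remote box.
use strict;
use warnings;

$| = 1;    # unbuffered output so forked prints don't interleave oddly

my @switchList = qw(sw1 sw2 sw3);
my $MAX_FETCH  = 5;    # like Proc::Queue::size(5)
my $MAX_LOAD   = 3;    # per-switch cap on loader grandchildren

my $running = 0;
for my $switch (@switchList) {
    wait(), $running-- while $running >= $MAX_FETCH;
    my $pid = fork;
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                       # fetch child
        do_fetch($switch);                 # ssh + scp would go here
        my $loads = 0;
        for my $form (forms_for($switch)) {
            wait(), $loads-- while $loads >= $MAX_LOAD;
            my $lp = fork;
            die "fork: $!" unless defined $lp;
            if ($lp == 0) {                # loader grandchild
                do_copy($switch, $form);   # psql COPY would go here
                exit 0;
            }
            $loads++;
        }
        1 while wait != -1;                # reap our own loaders
        exit 0;
    }
    $running++;
}
1 while wait != -1;                        # reap fetch children
print "Finished!\n";

sub do_fetch  { print "fetched $_[0]\n" }
sub forms_for { qw(formA formB) }
sub do_copy   { print "loaded $_[1] for $_[0]\n" }
```

Each fetch child waits for its own loaders before exiting, so nothing gets orphaned and the top-level `1 while wait != -1;` still reaps everything cleanly.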
Tradez
"Every official that come in
Cripples us leaves us maimed
Silent and tamed
And with our flesh and bones
He builds his homes"
- Zach de la Rocha