PerlMonks  

controling the num of fork sessions

by Anonymous Monk
on Nov 27, 2000 at 02:50 UTC ( [id://43414]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need to run a single command on multiple boxes. I am pulling the CPU name from an array and plugging each one into a loop. I then fork inside the loop so that all of them run at once.

The problem I am experiencing is: is there a way to control the number of sessions created at once by the looped fork?

If I am running the command on 20 servers I am OK. However, if I need to run it on all 2000 of the boxes, then I run into MAJOR system resource problems. As a temporary fix I am forking 20 of them, then running 1 outside of the fork. When the 1 outside of the fork ends, the whole process starts over. This allows me to do them in CHUNKS of 20. What I am looking for is a way that allows a new session to start whenever one ends, so that I have a CONSTANT stream of 20 sessions, NOT chunks of 20.

    ## this runs the entire list at once
    @list = (1,2,3);        # list
    while (<@list>) {       # I loop the list
        if (fork) {         # with the loop I run the whole list at once
            print "$_\n";   # I do my stuff
            exit;
        }
    }

Thanks, BRN

Replies are listed 'Best First'.
Re: controling the num of fork sessions
by Fastolfe (Vicar) on Nov 27, 2000 at 03:48 UTC
    I don't quite understand how forking your script will permit it to run on multiple boxes. Multiple CPUs, OK, but how do you cause it to run on another system? What you are describing is inconsistent with what you are asking and with your sample code.

    If you're asking how to keep 20 child processes active at any given time, keep track of them and use waitpid to see how many are gone:

    use POSIX ':sys_wait_h';

    my $max_children = 20;
    my $cur_children = 0;

    $SIG{CHLD} = sub {
        $cur_children-- while waitpid(-1, &WNOHANG) > 0;
        &spawn_children;
    };

    sub spawn_children {
        while ($cur_children < $max_children) {
            my $pid = fork;
            die "fork: $!" unless defined $pid;
            &child_process if !$pid;
            $cur_children++;
        }
    }

    sub child_process {
        # what the kid does
        exit 0;    # important
    }

    &spawn_children;
    But like I said, this makes no sense with respect to processes running on other systems. You may have to give us more information if this doesn't help you.
      A nice solution, but I would suggest ( as I almost always do with this kind of question ) a few improvements.

      Why go through all this pain with REAPERs and counters? Handle the child's death yourself in the main loop of the code. This removes some complexity and also gets around Perl's rarely seen interrupt problems ( honestly, I have never seen this interrupt problem in about 6 years of Perl programming, but that is merely anecdotal ).

      What I would suggest ( and have used several times ) is a loop like this:

      while ( $again ) {
          #--
          # Initial loop to spawn children
          #--
          if ( $ref < $max ) {
              if ( $cpid = fork ) {
                  $kids{ $cpid } = 1;
              } else {
                  # do interesting process here
                  exit;
              }
              $ref++;
          } else {
              do {
                  select undef, undef, undef, 0.5;
                  $dpid = waitpid( -1, 1 );
              } until $dpid;
              last if ( $dpid == -1 );   # No children left
              delete $kids{$dpid};
              if ( $cpid = fork ) {
                  $kids{ $cpid } = 1;
              } else {
                  # Same interesting process
                  exit;
              }
          }
          $total++;
      }
      There is some complexity here - I was using this to seriously abuse some cycles :) The if() portion merely checks to see if all the children have been spawned. If they haven't, spawn another off and log the creation into a hash.

      If all the children have been spawned, poll the system every 1/2 second until one dies. I make sure I have a real pid, dropping out of the loop if not, and remove that entry from the tracking hash. This hash can be used to log when a child has died and what it was doing. Spawn another child off and loop again.

      Because I am handling the deaths myself and not waiting for quasi-mystical signals, even if 100 children die at the same time, each child will remain in the process table until I have processed it. I can then be certain that I will spawn 100 more children off, no matter how or when they die.

      Oh. This loop was actually run as a child process - my loop exited when the signal handler cleared $again. You can replace this with a variety of exit conditions - timeouts, all the children having been reaped, etc.

      mikfire

        I was using the signal approach so as to free up the parent process for other things. The use of waitpid in a loop, as I was doing, should also catch all 100 children if they die simultaneously (under the announcement of a single signal). The only problem might be with unreliable signals. It's just a preference thing: I would rather not devote a process solely to the task of keeping track of children. Signals (or, I guess, a more complex event loop) let me do that while allowing me to work on other things at the same time.

        I really don't see it as a pain keeping track of stuff like that. You're using a hash of PIDs to keep track of your kids; I was going under the assumption that a count was adequate. TMTOWTDI.

      sorry,

      In my example I have a print. In the full .pl I have a telnet function that logs into individual boxes at that point and runs an EXE. Thanks for the info on waitpid. That may solve my problem.

      BRN
Re: controling the num of fork sessions
by cephas (Pilgrim) on Nov 27, 2000 at 03:40 UTC
    Well, for starters it will be easier if you do your work in the children, and keep track of them using the parent.

    Here's a bit of untested code that should probably do the trick...

    #!/usr/bin/perl
    use POSIX ":sys_wait_h";

    $MAX = 20;               # max number of children at a given time
    $SIG{CHLD} = \&REAPER;   # deal with our children
    @list = (1,2,3);
    $count = 0;              # number of children outstanding

    foreach $item (@list) {
        while ($count >= $MAX) {   # too many children, wait a while
            sleep(1);
        }
        if (fork) {     # Parent
            $count++;
            next;
        } else {        # Child
            print("$item\n");      # do some child stuff here
            exit;
        }
    }

    sub REAPER {
        my $pid;
        while ($pid = waitpid(-1, &WNOHANG)) {
            last if ($pid == -1);
            $count--;              # Decrement our outstanding children
        }
        $SIG{CHLD} = \&REAPER;     # Reinstall the signal handler
    }


    cephas
Re: controling the num of fork sessions
by rpc (Monk) on Nov 27, 2000 at 03:17 UTC
    Is this the most elegant approach?
    I can see this working ok for 10-20 servers, but for 2000 wouldn't a client/server architecture be more appropriate? What about having a listening process (rpc perhaps?) on each machine, and then whenever you need to feed it commands, your client can send messages to the servers in a simple loop.
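    For illustration only, the listener idea might be sketched like this with the core IO::Socket::INET module; the port number, the one-line protocol, and the run_update.exe command name are all invented for the example, and a real deployment would loop the client over every hostname:

        #!/usr/bin/perl
        # Sketch of the client/server idea: a small listener on each box
        # accepts a one-line command over TCP. Both ends are shown in one
        # script here, with the "remote" listener simulated by a fork.
        use strict;
        use warnings;
        use IO::Socket::INET;

        my $port = 17891;   # hypothetical port, chosen for the example

        # "Server" side: what would run persistently on each remote box.
        my $server_pid = fork;
        die "fork: $!" unless defined $server_pid;
        if (!$server_pid) {
            my $listen = IO::Socket::INET->new(
                LocalAddr => '127.0.0.1',
                LocalPort => $port,
                Listen    => 5,
                ReuseAddr => 1,
            ) or die "listen: $!";
            my $conn = $listen->accept;
            chomp(my $cmd = <$conn>);
            print $conn "ran: $cmd\n";   # a real listener would system($cmd)
            exit 0;
        }

        # "Client" side: connect to a box and hand it the command.
        sleep 1;   # crude wait for the listener to come up
        my $sock = IO::Socket::INET->new(
            PeerAddr => '127.0.0.1',
            PeerPort => $port,
        ) or die "connect: $!";
        print $sock "run_update.exe\n";
        print scalar <$sock>;
        waitpid($server_pid, 0);

    With 2000 boxes the client would just iterate the hostname list, opening one short-lived connection per box, so no forking is needed on the client side at all.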

    For the way you're implementing it now, I would suggest maybe using Threads over forking new processes.

      Thanks, and yes, in a perfect world I would run it as a client/server process. However, in this instance I am updating a minor portion of servers that I do not have complete say over, so I am attempting to use only what already exists on the server side. (Where I put "print" in my example, I have a telnet proc that runs an EXE on the remote box.)

      As for "threads", I am worried about their stability when running in a Windows NT environment.

      Thanks, BRN

Re: controling the num of fork sessions
by merlyn (Sage) on Nov 27, 2000 at 10:51 UTC
Re: controling the num of fork sessions
by tilly (Archbishop) on Nov 27, 2000 at 04:31 UTC
    Your description doesn't make much sense to me either, but for one way to run a set of commands in parallel, take a glance at Run commands in parallel.
Re: controling the num of fork sessions
by repson (Chaplain) on Nov 27, 2000 at 10:29 UTC
    As I understand it, altering variables from within signal handlers can sometimes go wrong, since perl's code isn't reentrant. Another way of accomplishing your goal that avoids this possible problem would be to give each child a portion of the list.
    Something like this:
    my @list = (1..2000);
    my $kids = 20;
    my $num_per_kid = int(@list / $kids);

    for my $chld (1..$kids) {
        my @tmp = @list[ (($chld - 1) * $num_per_kid)
                         .. ($chld == $kids ? $#list : $chld * $num_per_kid - 1) ];
        unless (fork) {            # each child works on its own slice
            for my $item (@tmp) {
                print "$item\n";
            }
            exit;
        }
    }
Re: controling the num of fork sessions
by AgentM (Curate) on Nov 27, 2000 at 21:21 UTC
    Is there any specific reason that you need to run them all at the same time? I imagine that you are worried about system resources, but I can't imagine that your network would be much happier. Why not try a linear execution embedded with forks (i.e. limiting your processes to 10 or so at a time while going through the list)? You might need to experiment to find which number of processes makes the least intrusive use of system resources and the network, since both are probably doing more important things in the meantime.
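    A minimal sketch of that "N at a time" idea, using nothing but fork and a plain blocking wait(); the box list of 8 items and the limit of 3 are placeholders for experimentation, and the print stands in for the telnet/EXE step:

        #!/usr/bin/perl
        # Keep at most $max children alive: before each fork, if the pool
        # is full, block in wait() until one child exits, then replace it.
        use strict;
        use warnings;

        my @boxes   = (1 .. 8);   # stand-ins for the real hostnames
        my $max     = 3;          # tune this number per the advice above
        my $running = 0;

        for my $box (@boxes) {
            if ($running >= $max) {
                wait();           # block until any one child exits
                $running--;
            }
            my $pid = fork;
            die "fork: $!" unless defined $pid;
            if (!$pid) {
                print "box $box done\n";   # telnet/EXE step would go here
                exit 0;
            }
            $running++;
        }
        wait() while $running-- > 0;   # reap the stragglers

    Because the replacement happens one child at a time, this gives the constant stream the original question asked for, rather than chunks.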
    AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.

Node Type: perlquestion [id://43414]
Approved by root