Re: Monitoring Child Process

by Marshall (Prior)
on Oct 19, 2011 at 19:42 UTC

in reply to Monitoring Child Process

First, the reaping code has a couple of flaws: (1) waitpid() is called in multiple places, (2) the reaper routine can miss some SIGCHLDs, and (3) there is no need to set $SIG{CHLD} within REAPER.

So I would suggest:

sub REAPER {
    my $pid;
    # inner parens needed: otherwise $pid gets the comparison result (1)
    while (($pid = waitpid(-1, WNOHANG)) > 0) {
        print "Process $pid exited.\n";
        push @dead_kids, $pid;
    }
}
You are not guaranteed to get a separate SIGCHLD for each child who exits! Basically interpret SIGCHLD to mean that one or more children have exited. Use a while loop to service everybody who is ready within the signal handler. There is not going to be a "false alarm", so I took that stuff out.

Also, to "reap" a child basically means to read its exit status from the process table. The function that does this read is waitpid(). When you have this in two places, somebody may get reaped and yet not go through the reaper. I recommend letting REAPER do all of the reaping and do not call waitpid() anywhere else. Maybe this issue is why you had the "false alarm" code, a race condition? Anyway, only reap in one place and you will not miss any and won't wind up in the signal handler with somebody mysteriously disappearing!
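To make the "reap in one place" advice concrete, here is a minimal sketch, not the original poster's script. The child count of 3 and the one-second "work" in each child are assumptions chosen only for illustration. The handler is the single caller of waitpid(), and the parent just watches @dead_kids grow.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # provides WNOHANG

my @dead_kids;              # PIDs collected by the handler
my @all_kids;               # PIDs of every child we forked
my $n_kids = 3;             # hypothetical number of children

# The only place in the program where waitpid() is called.
sub REAPER {
    # One SIGCHLD can stand for several exits, so loop until nobody is left.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        push @dead_kids, $pid;
    }
}
$SIG{CHLD} = \&REAPER;

for my $i (1 .. $n_kids) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {        # child: pretend to work, then exit
        sleep 1;
        exit 0;
    }
    push @all_kids, $pid;   # parent: remember the child
}

# sleep can return early when a signal arrives, hence the loop.
sleep 1 while @dead_kids < @all_kids;

print "all kids dead .. @dead_kids\n";
```

Because nothing else calls waitpid(), every child that exits has to pass through REAPER, so the count in @dead_kids can be trusted as the "are we done yet" test.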

Update: temporarily don't have access to a unix box, so can't test, but here is a suggestion..

I see why you had the waitpid in the parent (to see if all children have finished). You need another test instead, perhaps (@dead_kids < 10) with some wait delay. I think this test will be OK in Perl without further ado. In C I would need to protect this critical section with some sigprocmask() voodoo. But I think that, due to the way Perl >= 5.8 delivers deferred signals, this is not necessary to prevent the REAPER and the parent main program from tripping over each other.
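For anyone curious what that signal-mask "voodoo" looks like when done from Perl, here is a hedged sketch using the POSIX module's sigprocmask and POSIX::SigSet. As said above, it is probably not needed with deferred signals. The helper name with_sigchld_blocked and the PIDs in @dead_kids are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);    # SIGCHLD, SIG_BLOCK, SIG_SETMASK, sigprocmask

# Hypothetical PIDs standing in for children already reaped by a handler.
my @dead_kids = (101, 102);

# Run a code block with SIGCHLD blocked, then restore the previous mask.
# (Hypothetical helper; if the block dies, the mask is not restored.)
sub with_sigchld_blocked {
    my ($code) = @_;
    my $block = POSIX::SigSet->new(SIGCHLD);
    my $old   = POSIX::SigSet->new;
    defined sigprocmask(SIG_BLOCK, $block, $old)
        or die "cannot block SIGCHLD: $!";
    my @result = $code->();
    defined sigprocmask(SIG_SETMASK, $old)
        or die "cannot restore signal mask: $!";
    return @result;
}

# The "critical section": read the shared array while no SIGCHLD can arrive.
my ($count) = with_sigchld_blocked(sub { scalar @dead_kids });
print "dead so far: $count\n";
```

Any SIGCHLD that arrives while the mask is in place stays pending and is delivered once the old mask is restored, so the reaper still runs, just a little later.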

Re^2: Monitoring Child Process
by Anonymous Monk on Oct 19, 2011 at 19:59 UTC
    Thank you for your suggestion. I incorporated your suggestion by commenting this out... so that there is only one waitpid in the whole script.
    foreach (@all_kids) { waitpid ($_, 0); }
    And using the REAPER subroutine that you gave. But now two things came up. Under normal conditions (without pressing ctrl-c), the parent script exits without 'waiting for the children to finish'.

    And when I try to put a 'sleep 20' at the end of the main script, this is the output that I am getting.
    Process 1 exited. Process 1 exited.
      Oh, I guess there was a miscommunication. I updated my post with a more detailed suggestion. The foreach (@all_kids) loop is the part that should be deleted!

      The REAPER is part of the parent. So having waitpid() in only one place means having it only in REAPER().

      Some replacement code to this foreach (@all_kids) loop:

      while (@dead_kids < 10) {
          sleep(1);
      }
      print "all kids dead .. @dead_kids\n";
      You can also just sleep(20) and print @dead_kids. The focus right now should be on getting this to work without the Ctrl-C complication and then adding that later.

      Update: Oh, I would also add use warnings; either by that statement, or a -w in the hash bang line. This has nothing to do with your current woes, but there are run time checks with warnings enabled that are useful. I leave them on unless some rare, very rare performance or other reason indicates otherwise.

      Another Update: Was able to test some code... for some reason, when SIGCHLD happens, this causes the sleep to end. I don't know why. So there is a loop to restart the sleep every second. Try this code... have to run to an appointment... oh, the "Process 1 exited." output was caused by missing parens in the while statement in the reaper: it needs to be while (($pid = waitpid(-1, WNOHANG)) > 0). The sleep issue is the real puzzle here.
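On the sleep puzzle: perldoc -f sleep notes that sleep may be interrupted if the process receives a signal, and that it returns the number of seconds actually slept. The sketch below tries to show that with a single child. The 5-second request and the 2-second deadline are arbitrary values picked for illustration.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # provides WNOHANG

my $reaped = 0;
$SIG{CHLD} = sub {
    # Reap everybody who is ready; count them instead of printing.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) { $reaped++ }
};

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {            # child: exit after about a second
    sleep 1;
    exit 0;
}

my $asked = 5;
my $slept = sleep $asked;   # comes back early when SIGCHLD arrives
print "asked for $asked s, slept $slept s, reaped $reaped\n";

# To really wait out an interval, keep sleeping until the deadline passes.
my $deadline = time + 2;
while ((my $left = $deadline - time) > 0) {
    sleep $left;
}
```

This is why a plain sleep(20) in the parent can finish long before 20 seconds when children are exiting, and why restarting the sleep in a loop is a reasonable workaround.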

      update: added readmore tag - updated code in later post

        Thank you very much. The normal condition works now where it is able to monitor the child process.

        But the ctrl-c condition where I press the ctrl-c does not seem to be working. Maybe my testing is just wrong, but when I press ctrl-c, I do not see any other running child process.
        while (@dead_kids < scalar @all_kids) {
            print "dead kids: ", scalar @dead_kids, "\n";
            sleep 1;
            # This is where I want to test the ctrl-c.
            sleep 5 if (@dead_kids > 1);
        }
