PerlMonks  

Re^7: Best way to kill a child process

by Marshall (Canon)
on Oct 15, 2011 at 21:18 UTC ( [id://931707]=note )


in reply to Re^6: Best way to kill a child process
in thread Best way to kill a child process

This is odd and I don't know how to replicate this. If you can replicate this on a Linux system, I'd like to play with it.

use POSIX ":sys_wait_h";   # provides WNOHANG
$SIG{CHLD} = sub { while (waitpid(-1, WNOHANG) > 0) {} };
This shouldn't hang when there are no children to reap, and it reaps every child that has already exited by the time the handler runs, whether that is 0, 1, 20 or 150.
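
To make that concrete, here is a minimal sketch of the same non-blocking reaper. It is only an illustration: the names reap_children, $reaped and $kids, and the count of 5 children, are my own assumptions and not anything from flexvault's program.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # exports WNOHANG

$| = 1;                     # unbuffered STDOUT so children do not re-flush our output

my $reaped = 0;             # hypothetical counter, only for this demonstration

# Non-blocking reaper: loop until waitpid() has nothing more to collect.
sub reap_children {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped++;
    }
}

# With no children at all, waitpid() returns -1 at once, so this cannot hang.
reap_children();
print "reaped with no children: $reaped\n";

$SIG{CHLD} = \&reap_children;

# Fork a few short-lived children (the count 5 is arbitrary).
my $kids = 5;
for (1 .. $kids) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit 0 if $pid == 0;    # child: exit straight away
}

# Give the handler a bounded amount of time to collect every child.
my $tries = 0;
sleep 1 while $reaped < $kids && $tries++ < 30;
print "reaped after forking: $reaped\n";
```

Several children can exit while only one SIGCHLD gets delivered, which is why the handler loops instead of calling waitpid() once.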

Somewhere in the last couple of threads about this, there was a question about grandchildren. There can be "big trouble in Dodge City" if children are creating grandchildren. Suppose a grandchild dies and expects its parent (itself a "child" of the server) to reap it. If that child has already died, I suspect that there could be some race conditions about who reaps the grandchild. Or maybe the "step-child" is still running?

The parent should only be responsible for reaping its own children. I highly recommend against the idea of children making further grandchildren. After a fork(), the child should close the passive socket.

As a general "rule", children should not spawn grandchildren. I mean a child is supposed to do its job of talking to a client and then die. But things can get "out of whack" if that child has its own child. The first "step child" is supposed to die and get reaped, but can that happen if it itself has a child that is supposed to be active? I think there is a problem here: we have a "dead" parent that can't be reaped because it has a child that can't be reaped.

I think things will be OK if you do not allow children to make other children. The first thing that a child should do is close the passive socket.

The first thing that a parent (server) should do after the fork() is close the active socket. Parents (servers) should only listen for new client connections. Children should only deal with their currently active socket (to the client).
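
A minimal sketch of that division of labour, assuming IO::Socket::INET and a single connection. The variable names, the loopback address, the echo reply and the built-in stand-in client are all assumptions made so the example can run by itself. A real server would call accept() in a loop and reap its children from a SIGCHLD handler instead of calling a blocking waitpid().

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Passive (listening) socket; port 0 asks the OS for any free port.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 5,
    Proto     => 'tcp',
) or die "listen failed: $!";
my $port = $listener->sockport;

# Stand-in client (an assumption, only here so the sketch is self-contained).
# The connect completes against the listen backlog before accept() runs.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
    Proto    => 'tcp',
) or die "connect failed: $!";

# Active socket: the connection to this one client.
my $conn = $listener->accept or die "accept failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: close the passive socket first, serve the client, then die.
    close $listener;
    close $client;                  # inherited only because of the stand-in client
    my $line = <$conn>;
    print {$conn} "echo: $line";
    close $conn;
    exit 0;
}

# Parent: close the active socket; only the child talks to this client.
close $conn;

print {$client} "hello\n";
my $reply = <$client>;
close $client;
waitpid($pid, 0);                   # one child only, so a blocking wait is fine here
print "client got: $reply";
```

Note that the child neither forks again nor touches the listener, so the parent only ever has its own direct children to reap.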

Yes, there are models where parents coordinate their children's activities, but that is an advanced topic and I don't think that we are talking about that here. That is into the range of not only complicated, but darn complicated!

Replies are listed 'Best First'.
Re^8: Best way to kill a child process
by flexvault (Monsignor) on Oct 16, 2011 at 13:28 UTC

    Some points about this:

    • The problem happens on Debian and openSUSE with perl5.10.1 and perl5.12.2.
    • There are no grandchildren.
    • It does not seem to happen on a single CPU, only on multiple CPUs.
    • I use a ratio of 4 children per CPU, and the parent only maintains that number.
    • Children exit after 4 hours.
    • It usually takes 20-24 hours for the problem to happen!

    I'm sure it's a timing issue, but I haven't been able to duplicate the problem in testing. I think it hangs in $SIG{CHLD}, because when I set the child's $SIG{CHLD} to 'IGNORE', the problem goes away. It could be Perl, Linux or my program. If I find a way to duplicate it, I'll let you know. Regards.

    Thank you

    "Well done is better than well said." - Benjamin Franklin

      Yes, if you find a way to duplicate this with a fairly simple program, I'd like to run it and see if I can replicate it also.

      I don't have a Linux machine myself, but I do have access to a remote 64-bit Linux machine with a 64-bit ActiveState installation. I can hammer on this machine during the evenings and on the weekends.

      Setting $SIG{CHLD} = 'IGNORE' should cause the low-level sigaction() structure to be set. Once that happens, Perl has nothing at all to do with this signal, as Perl would never even see it. So if this causes the system itself to "reap" the child, I can see why this doesn't cause a problem.
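
A small sketch of what I mean, assuming Linux behaviour (other systems may differ, see perlipc). With SIGCHLD ignored the kernel discards the child's exit status, so there is nothing left for the parent to collect and I would expect waitpid() to come back with -1. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Tell the OS to ignore SIGCHLD; exited children are then cleaned up
# by the system and never become zombies (assumed Linux behaviour).
$SIG{CHLD} = 'IGNORE';

my $pid = fork();
die "fork failed: $!" unless defined $pid;
exit 0 if $pid == 0;            # child: exit immediately

# The parent waits anyway. The call blocks until the child is gone and
# then reports that there is no child status to collect.
my $result = waitpid($pid, 0);
print "waitpid returned $result\n";
```

No Perl-level handler runs in this setup, so whatever is hanging in the handler case never gets a chance to happen here.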

      Anyway, if you can make this happen more often than once every 24 hours, then I have a way to make a couple of runs on the weekend when the college's server is at a low usage level.
