PerlMonks  

Understanding fork + wait

by jplindstrom (Monsignor)
on Oct 29, 2002 at 13:11 UTC ( #208725=perlquestion )
jplindstrom has asked for the wisdom of the Perl Monks concerning the following question:

I wonder if someone could help me understand how fork and wait work. I may be doing the right thing, but I may also have missed something fundamental here.

Consider the following HTTP::Daemon code (simplified from my scenario), pretty much snipped from the synopsis, with forking added.

    my $d = HTTP::Daemon->new(
        LocalPort => $self->port(),
        ReuseAddr => 1,
    );

    $SIG{CLD} = $SIG{CHLD} = 'IGNORE';

    while (my $c = $d->accept()) {
        next if(fork());

        $SIG{CLD} = $SIG{CHLD} = 'DEFAULT';    #Reset child signal handler

        while (my $r = $c->get_request()) {
            # ... do stuff here, including a system() call
        }

        $c->close;
        undef($c);
        exit(0);
    }

At first, the fork left zombies. So I Googled for that, and figured I needed a wait() or an IGNORE handler. So I added that. The zombies went away.

But instead I got problems with a system() call: it failed, and $! was "No child processes". Huh?

My guess is that now it's the child signal handler that's messing this up, so I reset it to 'DEFAULT' after the fork. This seems to solve the problem and the system() works fine again.

Is this correct? Or did I just manage to fix the symptom instead of the problem?


/J
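For comparison, the other option mentioned in the question, a wait()-based handler instead of 'IGNORE', is often written along the following lines. This is only a sketch: the %status hash, the three throwaway children, and the five-second deadline are made up for illustration and are not part of the original server code.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG _exit);

my %status;   # pid => raw wait status of each reaped child

# Reap every child that has already exited. One SIGCHLD can stand
# for several exits, so loop until waitpid() finds nothing more.
$SIG{CHLD} = sub {
    local ($!, $?);   # do not clobber the interrupted code's values
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $status{$pid} = $?;
    }
};

my $children = 3;     # hypothetical number of demo children
for my $n (1 .. $children) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    _exit($n) if $pid == 0;   # child: exit at once with code $n
}

# Parent: stand in for real work until all children are reaped
# (or give up after a few seconds).
my $deadline = time + 5;
while (keys(%status) < $children && time < $deadline) {
    select(undef, undef, undef, 0.05);
}
printf "reaped %d children\n", scalar keys %status;
```

Unlike 'IGNORE', this keeps each child's exit status available to the parent, at the cost of a handler that has to be written carefully.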

Re: Understanding fork + wait
by robartes (Priest) on Oct 29, 2002 at 13:35 UTC
    Hi,

    My gut feeling tells me that this has something to do with the fact that system() is basically a fork and exec, after which the parent (in your case, the child :-) wait()s. That wait() returns with your "No child processes" error.

    Interlude while Robartes checks the man page for wait

    Eureka. The man page for wait() states that wait() fails with ECHILD...

    ...if the process specified in pid does not exist or is not a child of the calling process. (This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN. See also the NOTES section about threads.)

    This is on a Linux system with glibc v2.2, so YMMV.
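The behaviour described in that man page excerpt can be observed from Perl with a small experiment. This is a sketch, not a definitive test: try_system is a made-up helper name, and the result under 'IGNORE' depends on the platform (on Linux system() typically returns -1 with "No child processes"; other systems may differ).

```perl
use strict;
use warnings;

# Run a trivial command (this same perl, exiting 0) and report
# system()'s return value, plus the error text when it is -1.
sub try_system {
    my ($label) = @_;
    my $rc  = system($^X, '-e', 'exit 0');
    my $err = $rc == -1 ? "$!" : '';
    print "$label: rc=$rc", ($err ? " ($err)" : ''), "\n";
    return ($rc, $err);
}

# With SIGCHLD ignored, the kernel may reap the command itself,
# so the wait inside system() can fail with ECHILD.
my @ignored = do {
    local $SIG{CHLD} = 'IGNORE';
    try_system('IGNORE');
};

# With the default disposition, system() can wait for its own child.
my @default = do {
    local $SIG{CHLD} = 'DEFAULT';
    try_system('DEFAULT');
};
```

Using local keeps the change confined to the block, which is one way to protect a single system() call without touching the handler used by the rest of the program.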

    CU
    Robartes-

Re: Understanding fork + wait
by VSarkiss (Monsignor) on Oct 29, 2002 at 16:10 UTC

    How many child processes are you forking off? All systems I know of impose a limit on the number of children a process can have (except if it's owned by root). Some also limit the number of processes a user can have. The specific number varies by system.

    If you are going over the limit, you'll need to impose a throttle. For example, if the number of children exceeds some high water mark, don't go into the accept, but do a blocking wait until the number drops.
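A throttle of that kind might look roughly like the sketch below. The function name run_throttled, the limit of 2, and the six dummy jobs are all assumptions for illustration; in the server, the fork would happen after accept() rather than in a loop over jobs. It also assumes SIGCHLD is left at its default, since a blocking wait() has nothing to collect when the signal is ignored.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Fork one child per job, but never keep more than $max_children
# alive at once. Returns the highest number of live children seen.
sub run_throttled {
    my ($max_children, @jobs) = @_;
    my ($live, $peak) = (0, 0);

    for my $job (@jobs) {
        # At the high-water mark: block until one child exits.
        while ($live >= $max_children) {
            my $pid = wait();
            if ($pid == -1) {     # no children left to wait for
                $live = 0;
                last;
            }
            $live--;
        }

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {          # child: do the work, then exit
            $job->();
            _exit(0);
        }
        $live++;
        $peak = $live if $live > $peak;
    }

    # Reap the remaining children so none are left as zombies.
    1 while wait() != -1;
    return $peak;
}

# Six dummy jobs that each just sleep for a tenth of a second.
my @jobs = map { sub { select(undef, undef, undef, 0.1) } } 1 .. 6;
my $peak = run_throttled(2, @jobs);
print "peak concurrent children: $peak\n";
```

The blocking wait() is the whole throttle: while the parent sits in it, it is not accepting new work, so the number of children cannot grow past the limit.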

      Very light load, but a few connections may come in simultaneously.

      The problems are gone with my "fix". I just wonder if it's the right fix :)


      /J

        ...this puzzled me for a while, but the quick answer is, "yes, you are fixing the right problem." It was robartes' observation that led me to this conclusion.

        Essentially, you are using three process structures in each service event by your server.

        1. Primary server
        2. Client service instance
        3. Process created by system()

        I believe that it was the third of those that was returning ECHILD. By resetting handlers on the second level, you made it possible for the second-level process to call wait() appropriately upon the death of its child.

        I just peeked to see if I could find evidence that ignored signals are reset during exec(), but my notes (for Solaris anyway) say only that signals with handlers are reset to default during exec(). That makes sense because the signal handlers would be part of the parent's process address space, but would not be available to the child. I guess it doesn't matter what that third-level process does with the SIGCHLD because it (presumably) doesn't have any children.

        But you do want the second-level process to clean up its child proc structure (i.e., to harvest its exit code) when the call to system() is complete.
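Harvesting that exit code in the second-level process could be sketched as follows. The helper name run_and_decode and the test command are hypothetical; the decoding of $? follows the usual convention of exit code in the high byte and signal number in the low seven bits.

```perl
use strict;
use warnings;

# Run a command via system() and decode $? into its parts.
# Returns a hash ref holding an error, a signal, or an exit code.
sub run_and_decode {
    my (@cmd) = @_;
    local $SIG{CHLD} = 'DEFAULT';   # let system() wait for its own child
    my $rc = system(@cmd);

    return +{ error  => "$!" }     if $rc == -1;   # could not run or wait
    return +{ signal => $? & 127 } if $? & 127;    # killed by a signal
    return +{ exit   => $? >> 8 };                 # normal exit code
}

# Example: run this same perl and have it exit with code 3.
my $result = run_and_decode($^X, '-e', 'exit 3');
print "exit code: $result->{exit}\n" if exists $result->{exit};
```

Because the handler is reset with local, the caller's SIGCHLD setting is restored as soon as the function returns.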

        I don't know if this helps, but I believe that you did the right thing. Bring us back in here if you see any other odd symptoms from your system.

        ...All the world looks like -well- all the world, when your hammer is Perl.
        ---v
