
Daemon causing zombies under 5.8

by trs80 (Priest)
on Jan 13, 2003 at 19:46 UTC (#226574=perlquestion)
trs80 has asked for the wisdom of the Perl Monks concerning the following question:

I have a daemonized script that is producing zombies under 5.8, but running perfectly fine under 5.6.1. Here is the script:
#!/usr/bin/perl
# report_server
# this script will run in the background and pickup
# .pld files waiting to be run.

# load up our modules
use Storable;
use MyModule::SEO;
use Proc::Fork;
use strict;
use Data::Dumper;
use POSIX;

# set some globals
use vars ('%processing','$seo_master') ;

# this code was borrowed from Net::Daemonize
# this code runs in conjunction with Apache and needs to
# share ownership with it for the files it generates
my $uid = getpwnam('nobody');
$< = $> = $uid;
if( $< != $uid ){
    die "Couldn't become uid \"$uid\"\n";
}
POSIX::setuid( $uid )
    || die "Couldn't POSIX::setuid to \"$uid\" [$!]\n";
# end borrowed code

use Proc::Daemon;
Proc::Daemon::Init;

# create our master object
# should this be in the loop?
# Its only purpose is to log file pickup
# to the master log file, handled by the
# error_to_log method
my $master_seo = MyModule::SEO->new;
$master_seo->_debug(1);

&loop;

sub loop {
    # create our array of temp files to check
    my @pld_file = </web/tmp/*.pld>;
    foreach my $file (@pld_file) {
        # delete a file from the processing hash if it
        # has been in the hash for more than 20 minutes and
        # the file still exists
        if (time > ($processing{$file}[1] + 1200) ) {
            delete $processing{$file};
        }
        # skip file if it is still flagged as being processed
        next if $processing{$file}[0] == 1;

        # log file pickup
        $master_seo->error_to_log("Picked up $file for processing",1);

        # set our values to check against on future requests
        $processing{$file}[0] = 1;
        $processing{$file}[1] = time;

        $file =~ /(\d{2,})\.pld/;
        my $number = $1;
        child {
            exec('/web/bin/',$number);
            # in case it doesn't spawn just exit
            exit;
        };
    }
    # wait 15 seconds and then check again
    sleep 15;
    &loop;
}
The code is run unchanged between the development server (5.8) and the production server (5.6.1). The script has been in place in production and running without a problem for over 46 days. A recent upgrade of the development server to 5.8 exposed the zombie issue.

Is there a better way to daemonize the script and run it as a specific non-root user?

What is a way to debug to find the cause of zombies?

Re: Daemon causing zombies under 5.8
by Helter (Chaplain) on Jan 13, 2003 at 20:45 UTC
    Well, I'm not sure if you already know this, but I didn't when I had a similar problem. A zombie is a process that has already exited but is waiting for its parent to collect its exit status so it can finally die.

    In my case I was using IPC::Open2 to open some new processes and not calling waitpid() to let them die.
    I have never coded a daemon, but I suspect something changed so that in 5.8 you now have to do the reaping yourself, or it is at least stricter about it.

    Hope this helps!
      I was intrigued, so I looked up daemon and fork to see if I could find an issue.
      The fork page answered one of my questions: what does the code
      child { ..... }
      do? That page also has the following code, which is probably what you want to add; leaving it out is probably the source of your zombies:
      use Proc::Fork;

      child {
          # child code goes here.
      }
      parent {
          my $child_pid = shift;
          # parent code goes here.
          waitpid $child_pid, 0;
      }
      error {
          # Error-handling code goes here (if fork() fails).
      };
      # Note the semicolon at the end. Necessary if other statements follow.

      So as I suspected it looks like you need the waitpid() call.

      One question I still have: is it normal for your loop to be recursive? If this code is supposed to run indefinitely, are you not going to run into stack issues? Or is there something else that prevents this?

        I see the error in my ways and have converted it to a while loop similar to the one suggested by virtualsue.

        I don't believe there is a stack issue since I am not concerned with the order in which they are selected at this point. The files are deleted by the program that is execed from the daemon. They are only rerun if they are still there after 20 minutes, which can occur if the system is disconnected from the secondary server.
Re: Daemon causing zombies under 5.8
by virtualsue (Vicar) on Jan 13, 2003 at 21:42 UTC
    What is a way to debug to find the cause of zombies?

    There is only one cause for the creation of a zombie process: the parent process hasn't collected a child process exit status via waitpid or set $SIG{CHLD} = 'IGNORE' (this signals your program's lack of interest in its offspring). In other words, if your program doesn't explicitly ignore SIGCHLD, all its child processes will hang around like the undead until waitpid is called for each one. If you look at the perldoc for Proc::Fork, the synopsis shows the basic format for creating and disposing of child processes, and the waitpid call is clearly shown. If your program really doesn't make zombies under 5.6.1, I'd say something has been fixed in 5.8. :-)
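    To make the reaping step concrete, here is a minimal sketch using plain fork rather than Proc::Fork. The helper name reap_children and the three throwaway children are assumptions made purely for illustration and are not part of the original script. The non-blocking waitpid(-1, WNOHANG) form lets a daemon collect whichever children have already exited without stalling its main loop.

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';    # provides WNOHANG

# Hypothetical helper: reap every child that has already exited,
# without blocking. Returns the list of PIDs that were reaped.
sub reap_children {
    my @reaped;
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        push @reaped, $pid;
    }
    return @reaped;
}

# Demonstration only: fork three short-lived children.
my @kids;
for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exit 0;             # child: exit immediately
    }
    push @kids, $pid;
}

sleep 1;                    # exited children stay zombies until reaped
my @reaped = reap_children();
print "reaped ", scalar(@reaped), " of ", scalar(@kids), " children\n";
```

    If the exit status of the children never matters, setting $SIG{CHLD} = 'IGNORE' before forking is the shorter alternative on most Unix platforms (see perlipc), and no waitpid call is needed at all.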

      Thanks for the second set of eyes, virtualsue. I had been over the $SIG{CHLD} = 'IGNORE' problem before when working on a different solution; I can't believe I didn't notice it was missing from this code.
        You're welcome. I suggest you think about removing the recursion that Helter pointed out, too. I never noticed it when I skimmed your program the first time, because I was only looking for the cause of your zombie problem. Wouldn't it be much cleaner to structure your code like
        while (1) {
            process_files();    # loop(), essentially
            sleep $interval;
        }
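        A slightly fuller sketch of the same shape, with the reaping folded into each pass. It is an illustration under assumptions rather than the poster's actual code: process_files here is a stand-in that forks a single child, and $max_passes is a hypothetical bound added only so the example terminates (a real daemon would use while (1)).

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';    # provides WNOHANG

my $interval   = 1;    # seconds between passes (the original script sleeps 15)
my $max_passes = 2;    # hypothetical bound so this sketch terminates
my $passes     = 0;

# Stand-in for the body of loop(): fork one child per pass.
sub process_files {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit 0 if $pid == 0;    # child: a real worker would exec here
    return $pid;
}

while ($passes < $max_passes) {
    process_files();
    $passes++;
    sleep $interval;
    # Collect any children that finished during the sleep,
    # so no zombies pile up between passes.
    1 while waitpid(-1, WNOHANG) > 0;
}
```

        Because the loop iterates instead of calling itself, the call stack stays flat no matter how long the daemon runs.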
