PerlMonks  

Parallel ForkManager error with run_on_wait()

by sojourn548 (Acolyte)
on May 16, 2013 at 07:37 UTC
sojourn548 has asked for the wisdom of the Perl Monks concerning the following question:

When using Parallel::ForkManager 0.7.5 (and also the latest version, 1.03) with run_on_wait(), I see this error just before the script terminates abnormally:

Use of uninitialized value in block exit at /lib/site_perl/5.8.7/Parallel/ForkManager.pm line 365.
Use of uninitialized value in block exit at /lib/site_perl/5.8.7/Parallel/ForkManager.pm line 365.
Unable to create sub named "" at /lib/site_perl/5.8.7/Parallel/ForkManager.pm line 365.
Good-Bye...

This is a constantly running process that fails with the same message about once every 10 days. The failures started after I updated the script to use run_on_wait(). The run_on_wait() callback runs every second; it keeps track of the forked processes and sends a TERM signal to any child that has been running longer than 4 seconds. I am unsure how to debug this, because the error occurs so rarely. Thank you in advance for taking the time to review this.

our %procs;
use constant LIMIT => 4;    # seconds a child may run before it is terminated

# Call term_process() about once per second while waiting on children.
$pm->run_on_wait(\&term_process, 1);

$pm->run_on_finish(
    sub {
        my ($pid, $exit_code, $ident) = @_;
        my ($check_id, $host) = $ident =~ /^(.*?) on (.*)/s;
        print("run_on_finish: $ident (pid: $pid) exited with code: [$exit_code] host: [$host]\n");
        delete $procs{$pid};
        print("proc_mgmt: $ident: deleting (pid: $pid) from list\n");
    }
);

$pm->run_on_start(
    sub {
        my ($pid, $ident) = @_;
        print("** $ident started, pid: $pid\n");
        $procs{$pid} = time();    # remember when this child started
    }
);

# Send TERM to every tracked child that has exceeded LIMIT seconds.
sub term_process {
    my $debug_time;
    my $total_time;
    while (my ($pid, $started_at) = each %procs) {
        next unless time() - $started_at > LIMIT;
        $debug_time = time();
        $total_time = $debug_time - $started_at;
        print("[$pid] hung. time now: [$debug_time] - [$started_at] = [$total_time] sending TERM.");
        kill TERM => $pid;
        delete $procs{$pid};
    }
}

Looking at ForkManager.pm:

357 sub run_on_wait { my ($s,$code, $period)=@_;
358   $s->{on_wait}=$code;
359   $s->{on_wait_period} = $period;
360 }
361
362 sub on_wait { my ($s)=@_;
363   if(ref($s->{on_wait}) eq 'CODE') {
364     $s->{on_wait}->();
365     if (defined $s->{on_wait_period}) {
366         local $SIG{CHLD} = sub { } if ! defined $SIG{CHLD};
367         select undef, undef, undef, $s->{on_wait_period}
368     };
369   };
370 };

Re: Parallel ForkManager error with run_on_wait() (in 10 year old version)
by Anonymous Monk on May 16, 2013 at 08:57 UTC

    When using Parallel ForkManager 0.7.5 ... and maybe I discovered a bug?

    Try the latest, try  cpan SZABGAB/Parallel-ForkManager-1.03.tar.gz

    0.7.5 is over 10 years old

      That was one of my suspicions, so before I posted on PerlMonks, I downloaded and viewed the source for Parallel-ForkManager-1.03. The subs run_on_wait() and on_wait() haven't changed, and I did not see any bug reports or changes related to the errors that I was seeing. So I decided to post here to see if it was something that I was doing incorrectly.

        I downloaded and viewed the source

        How about you try running the code?

Re: Parallel ForkManager error with run_on_wait()
by Anonymous Monk on Dec 12, 2013 at 10:29 UTC

    After having a look at the code and testing further, I think the problem is the local scope of $SIG{CHLD} in ForkManager's sub on_wait. Evidently the empty subroutine is installed so that an arriving SIGCHLD yanks the process out of the sleep (select) statement that follows; otherwise IGNORE/undef would do just fine. When the process leaves that block in on_wait, $SIG{CHLD} is reset to undef (likely meaning IGNORE). However, if more signals hit the process at just the right time (while it is exiting the block and resetting $SIG{CHLD}), the error is triggered. I can reproduce it outside ForkManager, so it is nothing specific to ForkManager.

    This error should only occur in ForkManager if the user has not set $SIG{CHLD}.
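The scoping described above can be seen in isolation. The following is a minimal sketch, not the module's code: nap_like_on_wait is a hypothetical stand-in that only reuses the conditional "local" pattern from line 366. It shows that the no-op handler exists only while the sub runs and that the previous state comes back on exit. It does not try to reproduce the rare failure itself, which depends on a signal arriving during that restore.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the select() block in on_wait(); this is not
# the module's code, only the same conditional-local pattern.
sub nap_like_on_wait {
    my ($period) = @_;

    # Install a no-op handler only when none is set. Because of "local",
    # Perl restores the previous value when this sub returns.
    local $SIG{CHLD} = sub { } if !defined $SIG{CHLD};

    my $inside = ref $SIG{CHLD};            # handler type while napping
    select undef, undef, undef, $period;    # sleep for $period seconds
    return $inside;
}

my $before = defined $SIG{CHLD} ? 1 : 0;
my $inside = nap_like_on_wait(0.01);
my $after  = defined $SIG{CHLD} ? 1 : 0;

print "defined before: $before, handler inside: '$inside', defined after: $after\n";
```

If $SIG{CHLD} is already set before the call, the "local" never fires and nothing has to be restored on exit, which is presumably why a user-supplied handler avoids the problem.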

    I see two possible solutions: either remove the "local" clause on $SIG{CHLD} in ForkManager's sub on_wait, or simply set the following before you call ForkManager:

    $SIG{CHLD} = sub { };
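As a rough illustration of the second option, here is a self-contained sketch that installs the no-op handler once, before any child is forked. It uses a plain fork/waitpid loop as an assumed stand-in for Parallel::ForkManager so that it runs with core Perl only; the exit codes 0 and 3 and the %exit_of hash are made up for the example. The point is that children are still reaped normally while a process-wide handler is in place.

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';    # provides WNOHANG

# Process-wide no-op handler, set once before forking. With this in place,
# a conditional "local $SIG{CHLD} = ... if !defined $SIG{CHLD}" is skipped.
$SIG{CHLD} = sub { };

my %exit_of;                # pid => exit code, filled in as children are reaped
for my $code (0, 3) {       # illustrative exit codes
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit $code if $pid == 0;    # child: exit immediately with its code
    $exit_of{$pid} = undef;     # parent: track the child
}

my $pending = keys %exit_of;    # number of children not yet reaped
while ($pending > 0) {
    my $pid = waitpid(-1, WNOHANG);
    last if $pid == -1;         # no children left at all
    if ($pid > 0) {
        $exit_of{$pid} = $? >> 8;
        $pending--;
        next;
    }
    # Nothing has exited yet: nap briefly. An arriving SIGCHLD can cut
    # the nap short because a real (no-op) handler is installed.
    select undef, undef, undef, 0.1;
}

print "pid $_ exited with code $exit_of{$_}\n" for sort { $a <=> $b } keys %exit_of;
```

With Parallel::ForkManager itself, the same single assignment would go before the manager object is created.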

    Best regards, /Bjarne
