Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

forked process silently dies without coredump. No clue what's going on

by marcelser (Initiate)
on Oct 27, 2019 at 12:18 UTC ( [id://11108014]=perlquestion: print w/replies, xml ) Need Help??

marcelser has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a script that forks itself into background and after some other work it checks system load with Linux::loadavg. At first I also tried Sys::Statistics::Linux::LoadAVG. Whatever I try when the system load is higher and loop which check it kicks in after 2-3 iterations the process just silently dies. At first I tried wrapping it in an eval block. Then I enabled core dumps on my machine and at least I tried to use Debug::FaultAutoBT to check why the process is just dying. Here's the sub which checks for systemload which just dies after 2-3 iterations in the loop
# config my $avg1_limit = '4.0'; my $avg5_limit = '2.0'; sub check_load { DEBUG("Enabling Auto Backtrace with output directory: '" . $tmp_dir +. "'"); Debug::FaultAutoBT->new(dir => "$tmp_dir")->ready; my $current_dir = getcwd; DEBUG("check load is running from directory: '" . $current_dir . "'" +); my @avgs = loadavg(); my $load = { avg_1 => $avgs[LOADAVG_1MIN], avg_5 => $avgs[LOADAVG_5M +IN], avg_15 => $avgs[LOADAVG_15MIN] }; my $dumper = Data::Dumper->new([$load],[qw($load)]); DEBUG("current system load is: " . $dumper->Dump()); INFO("waiting for loads to drop below '1 min=$avg1_limit, 5 min=$avg +5_limit'"); eval { while ($load->{'avg_1'} > $avg1_limit or $load->{'avg_5'} > +$avg5_limit) { sleep(5); DEBUG('about to call "loadavg"'); @avgs = loadavg(); $load = { avg_1 => $avgs[LOADAVG_1MIN], avg_5 => $avgs[LOA +DAVG_5MIN], avg_15 => $avgs[LOADAVG_15MIN] }; DEBUG("... re-ckeck system load: " . $dumper->Dump()); } }; if ($@) { ERROR "failed checking load, error: $@"; } else { DEBUG "while loop ended nicely"; } INFO("finished waiting, system load is now: " . Data::Dumper->Dump([ +$load],[qw($load)])); }
Maybe also interesting is that I use Log4Perl to print DEBUG messages to logfile. Here's what it looks like when it stops in the middle of nowhere.
2019/10/24 04:56:21 DEBUG [29280] Enabling Auto Backtrace with output +directory: '/tmp/cores' 2019/10/24 04:56:21 DEBUG [29280] check load is running from directory +: '/' 2019/10/24 04:56:21 DEBUG [29280] current system load is: $load = { 'avg_5' => '2.34', 'avg_1' => '2.31', 'avg_15' => '2.28' }; 2019/10/24 04:56:21 INFO [29280] waiting for loads to drop below '1 mi +n=4.0, 5 min=2.0' 2019/10/24 04:56:26 DEBUG [29280] about to call "loadavg" 2019/10/24 04:56:26 DEBUG [29280] ... re-ckeck system load: $load = $l +oad; 2019/10/24 04:56:31 DEBUG [29280] about to call "loadavg" 2019/10/24 04:56:31 DEBUG [29280] ... re-ckeck system load: $load = $l +oad;
What's puzzling me here actuall is the 2nd and following print of the Dumper so the lines starting with "... re-check system load" which pints just $load = $load instead of $load = { .... } but even without this message the script just stops here dead in its tracks without triggerint Debug::FaultAutoBT or producing a coredump. I'm trying to figure why this isn't working since weeks now and I'm really out of ideas. Maybe it has to do with the fact that it's forked background process or that it repeatedly calls loadavg but I think it must be something else. Any help is greatly appreciated cause I'm really starting to question perl here greatly. I've actually 10+ year of pressional perl experience but I never encountered a think like this...

Replies are listed 'Best First'.
Re: forked process silently dies without coredump. No clue what's going on (updated)
by haukex (Archbishop) on Oct 27, 2019 at 12:41 UTC
    What's puzzling me here actuall is the 2nd and following print of the Dumper so the lines starting with "... re-check system load" which pints just $load = $load instead of $load = { .... }

    Data::Dumper remembers which references it's already output, and will just output a "reference" to the previous variabledata structure if you dump it twice. Use its ->Reset method to reset its Seen cache.

    use Data::Dumper; my $load = { foo => 123 }; my $dumper = Data::Dumper->new([$load],[qw($load)]); print $dumper->Dump(); # $load = { 'foo' => 123 }; print $dumper->Dump(); # $load = $load; $dumper->Reset; print $dumper->Dump(); # $load = { 'foo' => 123 };
    eval { ... }; if ($@) { ...

    See Bug in eval in pre-5.14, although I'm not sure if this has anything to do with your issues. Why do you have the eval in the first place? The only thing I see where I would consider dieing realistic is the call to loadavg(), although you don't guard the first call to that function. And since you haven't shown that function, we can't tell if that might be the source of the issues.

    As for the actual problem, this seems like it may be a case for stripping down the code to a minimal SSCCE that reproduces the issue. This will help you narrow down the source of the problems. For example, to make sure that it isn't something strange with the logging, perhaps remove that, and add your own simple print statements to a file (with autoflush turned on). From the code you've shown, I'm not able to see what else the issue might be - maybe another Monk does, but the major advantage of a Short, Self-Contained, Correct Example is that we just can download and run it to reproduce the issue on our end.

    Update: Also, what's happening with STDOUT and especially STDERR of this process, do you have the ability to see that somewhere?

Re: forked process silently dies without coredump. No clue what's going on
by jcb (Parson) on Oct 28, 2019 at 00:15 UTC

    Your logging code is also broken. You are constructing an anonymous hash on line 11 to hold the load average information and storing a reference to that hash. You then store a new hash into the $load lexical on line 22, but the $dumper object still has a reference to the hash that was built on line 11, and will dump that hash again and again, with the old data. Line 32 correctly uses the latest information.

    Either change line 22 to a slice assignment: (untested)

    @{$load}{qw/avg_1 avg_5 avg_15/} = @avgs[LOADAVG_1MIN, LOADAVG_5MIN, L +OADAVG_15MIN];

    ... which will actually change the hash that $dumper knows about. (Also call $dumper->Reset as another monk mentioned.) Or use the same approach on line 23 as you use on line 32.

    No idea why your program is bailing out; do you perhaps have a $SIG{__DIE__} handler active while this code is running that could be calling exit?

      Yes you're perfectly right. I normally don't use the object oriented interface of Dumper Just the procedural one hence I missed that I passed the anonymous hash to the dumper which it still references even if I assign a new hash to $load. I actually changed the code to always use Data::Dumper->Dump() now, that's easier then updating the anonymous hash through a slice assignment As for the die handler. There is a overall die handler like this, could this be the source of the issues?
      $SIG{__DIE__} = sub { LOGDIE $@ };

        What does the sub LOGDIE do?

Re: forked process silently dies without coredump. No clue what's going on
by Anonymous Monk on Oct 27, 2019 at 12:44 UTC
    Are you sure it doesnt simply "exit"? strace
      Yes I am sure because the logging just stops and after this subroutine there is more actions which are also not called. Also I can not strace the process easily as it only happens if the system load rises above a certain level. The question would be how to increase system load above threshold easily without killing the whole machine then I could debug it probably easier

        Yes I am sure because the logging just stops and after this subroutine there is more actions which are also not called.

        exit exits

        POSIX::_exit exits faster

        grep the sources

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://11108014]
Approved by johngg
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others surveying the Monastery: (4)
As of 2024-04-23 22:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found