http://www.perlmonks.org?node_id=1036440


in reply to Re^2: Complex and reliable signal handling.
in thread Complex and reliable signal handling.

I did “read it carefully,” thank you all very much. I meant the term “thread-safe” loosely in this particular case, and in fact I used it inappropriately.

The race that I hypothesize does not concern resources shared by threads within a single process context, but rather resources shared by all of the processes in the login-shell environment. A statement as simple as $ENV{'FOOBAR'} = $ENV{'FOOBAR'} + 1; would exhibit this sort of race condition.

It could also be that you should not exit() within the signal handler, but should instead set a flag which causes the main loop of the child process to end as soon as possible. When this is done, then no matter what the process was doing at the unpredictable instant at which the termination signal arrived, you know that it will end at a predictable point and in a predictable state ... real soon now, but not at this very instant.

Looking now, more closely than I did before, at the relevant perlguts, I see that JUMPENV_PUSH has to do with longjmp() state-saving. Maybe there’s a hole there somewhere, and if so, your task is to avoid it, not to fix it. By delaying the actual termination to “real soon now” instead of “right now,” you’d avoid such a hole.

This is specifically what I would do in the child:

    my $interrupted = 0;
    my @signals = qw/INT TERM USR2 HUP/;
    for my $sig (@signals) {
        $SIG{$sig} = sub { $interrupted = 1; };
    }

    # ... then, throughout the code, test for it ...

    # main loop:
    while (not $interrupted) {
        # ... blah blah ...
        last if $interrupted;
        # ... or ...
        exit(1) if $interrupted;
    }

The only immediate response to the signal is to set a flag indicating that a signal has been received. The child’s processing loop tests this flag frequently, at strategic places, and breaks out of the loop gracefully. The arrival of a signal should also knock the process out of various kinds of voluntary sleep, so that the signal will always be responded to. But it no longer matters precisely what the process was doing at the instant the telephone rings.
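To make the checkpoint idea concrete, here is a small self-contained sketch. It is only an illustration, not the original poster's code: the ten "units" of work, the @completed list, and the self-sent TERM (which stands in for a signal arriving from outside) are all made up for the demo.

```perl
use strict;
use warnings;

my $interrupted = 0;
for my $sig (qw/INT TERM USR2 HUP/) {
    # The handler does nothing except record that a signal arrived.
    $SIG{$sig} = sub { $interrupted = 1 };
}

my @completed;    # hypothetical record of units that finished cleanly

for my $unit (1 .. 10) {
    last if $interrupted;    # checkpoint: leave only between units

    # Demo only: pretend someone sent us SIGTERM while unit 3 was running.
    kill 'TERM', $$ if $unit == 3;

    # Stand-in for real work: a short voluntary sleep.
    select(undef, undef, undef, 0.05);

    push @completed, $unit;  # the unit in progress still finishes
}

print "finished units: @completed\n";
print "interrupted: $interrupted\n";
```

The signal lands in the middle of unit 3, yet unit 3 still completes and the loop stops before unit 4, so the process ends at a known point rather than wherever it happened to be.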

The parent process should also explicitly wait for its children to terminate before finally exiting itself, and before tearing down any data structures that the children might depend on. The second most common place where applications sporadically fail is when they are ending, because the parent “sometimes” jerks the rug out from under the children.
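On the parent side, a minimal sketch of "wait before you tear anything down" might look like the following. The three workers, their exit codes, and the %children / %status bookkeeping are hypothetical; a real parent would probably also want to handle waitpid being interrupted by a signal.

```perl
use strict;
use warnings;

my %children;    # pid => hypothetical worker number

for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: real work would go here; exit with a recognizable status.
        exit($n);
    }
    $children{$pid} = $n;
}

# Reap every child before touching anything they might depend on.
my %status;      # worker number => exit status
while (%children) {
    my $pid = waitpid(-1, 0);          # block until some child exits
    last if $pid == -1;                # no children left to wait for
    next unless exists $children{$pid};
    my $worker = delete $children{$pid};
    $status{$worker} = $? >> 8;        # high byte of $? is the exit code
}

# Only now is it safe to tear down shared resources and exit.
print "worker $_ exited with status $status{$_}\n" for sort keys %status;
```

The point is the ordering: the teardown and the parent's own exit come after the reaping loop, never before it.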