I don't understand why that works for you, unless you have PERL_SIGNALS=unsafe?
PERL_SIGNALS is not set in my environment.
But PERL_SIGNALS does not affect the operating system in any way. Linux delivers SIGALRM as it would for any other program. That means that the flock() system call (deep inside the perl flock() function) will be interrupted and will return EINTR immediately after the signal handler (the one deep inside the perl executable, not $SIG{'ALRM'}) has returned. That signal handler will - according to Deferred Signals (Safe Signals) - just set a flag. So a few CPU cycles after SIGALRM, the flock() system call will return. Standard Unix behaviour. Perl obviously does not attempt to restart the interrupted flock() system call, but makes the perl-level flock() function return false after setting $! to EINTR. Just before the perl-level flock() function returns, perl considers the situation safe for signals, checks the flag set in the real signal handler, finds it set, and calls the perl-level signal handler stored in $SIG{ALRM}.
With unsafe signals, the flock() system call would be interrupted as above, but instead of setting a flag, the signal handler (deep inside the perl executable) would directly invoke the perl-level signal handler stored in $SIG{ALRM}. Everything else would behave the same: If &$SIG{ALRM} returns, control returns to the place where the flock() system call was invoked (i.e. the implementation of the flock perl function), with flock() returning EINTR. That would make the perl function return false after setting $! to EINTR.
The only two differences here are:
- With unsafe signals, perl internals could be damaged by the code in $SIG{ALRM}.
- Safe signals guarantee that the code that handles the return value of the system call flock() inside the perl function flock() is executed, so $! will be set properly. Perl with unsafe signals and with a $SIG{ALRM} that dies or exits perhaps will not reach that code.
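The EINTR behaviour described above is what an alarm-based lock timeout relies on. A minimal sketch, assuming default (deferred) signals and a flock() that fails with EINTR when interrupted; lock_with_timeout is a hypothetical helper name, not part of the demo that follows:

```perl
use strict;
use warnings;
use Fcntl qw( LOCK_EX );
use Errno qw( EINTR );

# Hypothetical helper: try to get an exclusive lock on $fh, giving up after
# $timeout seconds. Returns 1 if locked, 0 if the alarm interrupted flock().
# Assumption: deferred (safe) signals, so flock() fails with EINTR.
sub lock_with_timeout {
    my ($fh, $timeout) = @_;
    # An empty handler is enough: it only has to exist, so that SIGALRM
    # interrupts the system call instead of killing the process.
    local $SIG{ALRM} = sub { };
    alarm($timeout);
    my $ok = flock($fh, LOCK_EX);
    my ($errno, $message) = ($! + 0, "$!");   # save $! right away
    alarm(0);
    return 1 if $ok;
    return 0 if $errno == EINTR;
    die "flock failed: $message\n";
}
```

Called as lock_with_timeout($f, 5), this returns 0 after about five seconds if another process keeps holding the lock. It only works as long as the interrupted flock() is not restarted.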
Another demo:
This code only warns when a signal was caught; all of the Win32 workarounds and the timing code are stripped from the original demo. say() also writes the time and PID.
#!/usr/bin/perl
use strict;
use warnings;
use autodie qw( open close );
use Fcntl qw( LOCK_EX LOCK_UN );
sub say
{
print scalar localtime(),' pid=',$$,': ',@_,"\n";
}
my $mainpid=$$;
my $pid=fork() // die "Can't fork: $!";
# in both parent and child process, install signal handler
$SIG{'ALRM'}=sub { warn 'SIGALRM in ',$$==$mainpid ? 'main' : 'helper' };
if ($pid) {
# parent process
say "main starts";
say "main waiting for helper to lock";
sleep 1;
open my $f,'>','tempfile.tmp';
alarm(5);
say "main flock";
flock($f,LOCK_EX) or say "main can't lock: $! (error code ",0+$!,')';
say "main flock done";
alarm(0);
close $f;
say "main ends";
wait;
} else {
# child process
say "helper starts";
open my $f,'>','tempfile.tmp';
flock($f,LOCK_EX) or die "Helper can't lock: $!";
select(undef,undef,undef,10);
flock($f,LOCK_UN) or die "Helper can't unlock: $!";
close $f;
say "helper ends";
exit(0);
}
Output:
>env - perl interrupted.pl
Sun May 24 20:24:11 2015 pid=31389: main starts
Sun May 24 20:24:11 2015 pid=31389: main waiting for helper to lock
Sun May 24 20:24:11 2015 pid=31390: helper starts
Sun May 24 20:24:12 2015 pid=31389: main flock
SIGALRM in main at interrupted.pl line 16.
Sun May 24 20:24:17 2015 pid=31389: main can't lock: Interrupted system call (error code 4)
Sun May 24 20:24:17 2015 pid=31389: main flock done
Sun May 24 20:24:17 2015 pid=31389: main ends
Sun May 24 20:24:21 2015 pid=31390: helper ends
>env - PERL_SIGNALS=unsafe perl interrupted.pl
Sun May 24 20:24:39 2015 pid=31396: main starts
Sun May 24 20:24:39 2015 pid=31396: main waiting for helper to lock
Sun May 24 20:24:39 2015 pid=31397: helper starts
Sun May 24 20:24:40 2015 pid=31396: main flock
SIGALRM in main at interrupted.pl line 16.
Sun May 24 20:24:49 2015 pid=31397: helper ends
Sun May 24 20:24:49 2015 pid=31396: main flock done
Sun May 24 20:24:49 2015 pid=31396: main ends
alex@enterprise pts/0 20:24:49
/home/alex/tmp/lockdemo>
The interesting effect of unsafe signals is that flock() in main returns true, so there is no "can't lock" message. But as you can see from the timing, flock() in main returns only after the helper has released its lock. So in that case, perl must have restarted the interrupted flock() system call. strace confirms that:
(Note that strace traces only the main process, not the fork()ed child.)
>env - PERL_SIGNALS=unsafe strace -o trace.txt /usr/bin/perl interrupted.pl
Sun May 24 20:39:02 2015 pid=31774: helper starts
Sun May 24 20:39:02 2015 pid=31773: main starts
Sun May 24 20:39:02 2015 pid=31773: main waiting for helper to lock
Sun May 24 20:39:03 2015 pid=31773: main flock
SIGALRM in main at interrupted.pl line 16.
Sun May 24 20:39:12 2015 pid=31774: helper ends
Sun May 24 20:39:12 2015 pid=31773: main flock done
Sun May 24 20:39:12 2015 pid=31773: main ends
>grep -C5 flock trace.txt
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
alarm(5) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:39:03 2015 pid=317"..., 47) = 47
flock(3, LOCK_EX) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigaction(SIGALRM, NULL, {0x7fdcf683dde0, [], SA_RESTORER|SA_RESTART, 0x7fdcf5a2f670}, 8) = 0
write(2, "SIGALRM in main at interrupted.p"..., 43) = 43
rt_sigreturn() = 73
flock(3, LOCK_EX) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:39:12 2015 pid=317"..., 52) = 52
alarm(0) = 0
close(3) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
>env - strace -o trace.txt /usr/bin/perl interrupted.pl
Sun May 24 20:43:14 2015 pid=31859: helper starts
Sun May 24 20:43:14 2015 pid=31858: main starts
Sun May 24 20:43:14 2015 pid=31858: main waiting for helper to lock
Sun May 24 20:43:15 2015 pid=31858: main flock
SIGALRM in main at interrupted.pl line 16.
Sun May 24 20:43:20 2015 pid=31858: main can't lock: Interrupted system call (error code 4)
Sun May 24 20:43:20 2015 pid=31858: main flock done
Sun May 24 20:43:20 2015 pid=31858: main ends
Sun May 24 20:43:24 2015 pid=31859: helper ends
>grep -C5 flock trace.txt
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
alarm(5) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:43:15 2015 pid=318"..., 47) = 47
flock(3, LOCK_EX) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn() = -1 EINTR (Interrupted system call)
rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
rt_sigaction(SIGALRM, NULL, {0x7f43b0eaede0, [], SA_RESTORER, 0x7f43b00a0670}, 8) = 0
write(2, "SIGALRM in main at interrupted.p"..., 43) = 43
>
Unsafe signals seem to change how rt_sigaction is called: SA_RESTART is set only for unsafe signals. So flock() is automatically restarted (by the kernel, not by perl, as I wrote above) only with unsafe signals. Another round of strace also confirms that:
>env - PERL_SIGNALS=unsafe strace -o trace.txt /usr/bin/perl interrupted.pl
Sun May 24 20:51:14 2015 pid=32083: helper starts
Sun May 24 20:51:14 2015 pid=32082: main starts
Sun May 24 20:51:14 2015 pid=32082: main waiting for helper to lock
Sun May 24 20:51:15 2015 pid=32082: main flock
SIGALRM in main at interrupted.pl line 16.
Sun May 24 20:51:24 2015 pid=32083: helper ends
Sun May 24 20:51:24 2015 pid=32082: main flock done
Sun May 24 20:51:24 2015 pid=32082: main ends
>grep -C5 SIGALRM trace.txt
close(3) = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdbc5fc2a10) = 32083
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
rt_sigaction(SIGALRM, {0x7fdbc5ad9de0, [], SA_RESTORER|SA_RESTART, 0x7fdbc4ccb670}, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdbc5fe6000
--
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
alarm(5) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:51:15 2015 pid=320"..., 47) = 47
flock(3, LOCK_EX) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigaction(SIGALRM, NULL, {0x7fdbc5ad9de0, [], SA_RESTORER|SA_RESTART, 0x7fdbc4ccb670}, 8) = 0
write(2, "SIGALRM in main at interrupted.p"..., 43) = 43
rt_sigreturn() = 73
flock(3, LOCK_EX) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:51:24 2015 pid=320"..., 52) = 52
alarm(0) = 0
--
rt_sigaction(SIGKILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSEGV, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR2, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGALRM, NULL, {0x7fdbc5ad9de0, [], SA_RESTORER|SA_RESTART, 0x7fdbc4ccb670}, 8) = 0
rt_sigaction(SIGALRM, {SIG_DFL, [], SA_RESTORER|SA_RESTART, 0x7fdbc4ccb670}, {0x7fdbc5ad9de0, [], SA_RESTORER|SA_RESTART, 0x7fdbc4ccb670}, 8) = 0
rt_sigaction(SIGTERM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTKFLT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCONT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTOP, NULL, {SIG_DFL, [], 0}, 8) = 0
>env - strace -o trace.txt /usr/bin/perl interrupted.pl
Sun May 24 20:51:35 2015 pid=32103: helper starts
Sun May 24 20:51:35 2015 pid=32102: main starts
Sun May 24 20:51:35 2015 pid=32102: main waiting for helper to lock
Sun May 24 20:51:36 2015 pid=32102: main flock
SIGALRM in main at interrupted.pl line 16.
Sun May 24 20:51:41 2015 pid=32102: main can't lock: Interrupted system call (error code 4)
Sun May 24 20:51:41 2015 pid=32102: main flock done
Sun May 24 20:51:41 2015 pid=32102: main ends
Sun May 24 20:51:45 2015 pid=32103: helper ends
>grep -C5 SIGALRM trace.txt
close(3) = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f45f31f1a10) = 32103
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
rt_sigaction(SIGALRM, {0x7f45f2d08de0, [], SA_RESTORER, 0x7f45f1efa670}, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f45f3215000
--
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
alarm(5) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:51:36 2015 pid=321"..., 47) = 47
flock(3, LOCK_EX) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn() = -1 EINTR (Interrupted system call)
rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0
rt_sigaction(SIGALRM, NULL, {0x7f45f2d08de0, [], SA_RESTORER, 0x7f45f1efa670}, 8) = 0
write(2, "SIGALRM in main at interrupted.p"..., 43) = 43
rt_sigprocmask(SIG_UNBLOCK, [ALRM], NULL, 8) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:51:41 2015 pid=321"..., 92) = 92
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "Sun May 24 20:51:41 2015 pid=321"..., 52) = 52
--
rt_sigaction(SIGKILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSEGV, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR2, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGALRM, NULL, {0x7f45f2d08de0, [], SA_RESTORER, 0x7f45f1efa670}, 8) = 0
rt_sigaction(SIGALRM, {SIG_DFL, [], SA_RESTORER, 0x7f45f1efa670}, {0x7f45f2d08de0, [], SA_RESTORER, 0x7f45f1efa670}, 8) = 0
rt_sigaction(SIGTERM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTKFLT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCONT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTOP, NULL, {SIG_DFL, [], 0}, 8) = 0
>
SA_RESTART is only set for unsafe signals. So with unsafe signals, interrupted system calls are restarted if possible; with safe signals, interrupted system calls are not restarted.
That is at least strange, perhaps even a bug. I think that perl should either always or never restart interrupted system calls, regardless of whether safe or unsafe signal handlers are used.
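Code that wants the "restart" behaviour under safe signals can retry by hand. A minimal sketch, assuming that an interrupted flock() fails with $! set to EINTR as in the traces above; flock_retry is a hypothetical helper name:

```perl
use strict;
use warnings;
use Errno qw( EINTR );

# Hypothetical helper: call flock() again whenever it was merely
# interrupted by a signal; die on any other error.
sub flock_retry {
    my ($fh, $operation) = @_;
    while (1) {
        return 1 if flock($fh, $operation);
        my ($errno, $message) = ($! + 0, "$!");
        die "flock failed: $message\n" unless $errno == EINTR;
        # EINTR: a signal interrupted the call, so try again.
    }
}
```

Unlike SA_RESTART, this gives perl a chance to run the deferred $SIG{ALRM} handler between attempts.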
Deferred Signals (Safe Signals) says under "Restartable system calls":
On systems that supported it, older versions of Perl used the SA_RESTART flag when installing %SIG handlers. This meant that restartable system calls would continue rather than returning when a signal arrived. In order to deliver deferred signals promptly, Perl 5.8.0 and later do not use SA_RESTART. Consequently, restartable system calls can fail (with $! set to EINTR) in places where they previously would have succeeded.
The default :perlio layer retries read, write and close as described above; interrupted wait and waitpid calls will always be retried.
So, this starts to look like a bug in unsafe signals. SA_RESTART should not be set in Perl 5.8.0 and later, no matter which type of signal handler is used.
<UPDATE>
Relevant code is in util.c, functions Sighandler_t Perl_rsignal(pTHX_ int signo, Sighandler_t handler) and int Perl_rsignal_save(pTHX_ int signo, Sighandler_t handler, Sigsave_t *save). Both have the following four lines:
#ifdef SA_RESTART
if (PL_signals & PERL_SIGNALS_UNSAFE_FLAG)
act.sa_flags |= SA_RESTART; /* SVR4, 4.3+BSD */
#endif
</UPDATE>
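To take the SA_RESTART decision away from PERL_SIGNALS, a handler can be installed through the POSIX module instead of %SIG. A minimal sketch, assuming POSIX::sigaction and POSIX::SigAction behave as documented in the POSIX module; install_alarm_handler is a hypothetical helper name:

```perl
use strict;
use warnings;
use POSIX qw( SIGALRM SA_RESTART );

# Hypothetical helper: install $handler for SIGALRM with an explicit choice
# of SA_RESTART, instead of leaving that choice to PERL_SIGNALS.
sub install_alarm_handler {
    my ($handler, $restart) = @_;
    my $action = POSIX::SigAction->new(
        $handler,
        POSIX::SigSet->new,           # block no additional signals
        $restart ? SA_RESTART : 0,    # sa_flags handed to sigaction(2)
    );
    $action->safe(1);                 # keep deferred ("safe") delivery
    defined(POSIX::sigaction(SIGALRM, $action))
        or die "sigaction failed: $!\n";
    return;
}
```

With $restart false, a blocked flock() should fail with EINTR as in the safe-signals traces. With $restart true and safe delivery, the perl-level handler would presumably run only after the restarted call finally returns, which is the reason the documentation quoted above gives for dropping SA_RESTART.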
See also: Linux signal handling
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)