Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

linux perl - interrupted system calls not restarted?

by flipper (Beadle)
on Oct 22, 2009 at 12:44 UTC ( #802731=perlquestion: print w/ replies, xml ) Need Help??
flipper has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I've spent some time tracking down a strange problem in a daemon script - ultimately I've no-one to blame but myself, as I wasn't checking my return codes and/or $!, but I've found some surprising behaviour... Much reduced code:

#!/usr/bin/perl -w use strict; use IO::Socket::INET; my $server =IO::Socket::INET->new(LocalPort => 4848, Listen=> 1,ReuseA +ddr => 1) or die "listen: $!"; my $client; USER: while($client= $server->accept()){ if (my $pid = fork()){ #parent close $client; $SIG{CHLD} = sub {}; }else{ print $client "hello, world!\n"; select(undef,undef,undef,2); exit; } } warn "fell out of loop - $!";

The sleep in the child process causes SIGCHLD to be delivered to the parent when it is in blocking in accept(), and sure enough I get fell out of loop - Interrupted system call at /tmp/z.pl line 17.

I understand from chapter 16 of the Camel Book that this could happen with old, horrible Unices, but I'm surprised it happens with This is perl, v5.10.0 built for i486-linux-gnu-thread-multi on Debian 5.0.3

Can anyone shed some light on this?? An strace looks like it's going to work, but doesn't...

bind(3, {sa_family=AF_INET, sin_port=htons(4848), sin_addr=inet_addr(" +0.0.0.0")}, 16) = 0 listen(3, 1) = 0 accept(3, {sa_family=AF_INET, sin_port=htons(23694), sin_addr=inet_add +r("127.0.0.1")}, [16]) = 4 ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffa26c8) = -1 EINVAL (Inval +id argument) _llseek(4, 0, 0xbffa2710, SEEK_CUR) = -1 ESPIPE (Illegal seek) ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffa26c8) = -1 EINVAL (Inval +id argument) _llseek(4, 0, 0xbffa2710, SEEK_CUR) = -1 ESPIPE (Illegal seek) fcntl64(4, F_SETFD, FD_CLOEXEC) = 0 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIG +CHLD, child_tidptr=0xb7d9f908) = 5206 close(4) = 0 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, {0x809a270, [], 0}, {SIG_DFL}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 accept(3, 0xbffa28c8, [4096]) = ? ERESTARTSYS (To be restart +ed) --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 rt_sigaction(SIGCHLD, NULL, {0x809a270, [], 0}, 8) = 0 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 open("/usr/share/locale/locale.alias", O_RDONLY) = 4 fstat64(4, {st_mode=S_IFREG|0644, st_size=2586, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, + 0) = 0xb7f7a000 read(4, "# Locale name alias data base.\n# "..., 4096) = 2586 read(4, ""..., 4096) = 0 close(4) = 0 munmap(0xb7f7a000, 4096) = 0 open("/usr/share/locale/en_GB.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = +-1 ENOENT (No such file or directory) open("/usr/share/locale/en_GB.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = - +1 ENOENT (No such file or directory) open("/usr/share/locale/en_GB/LC_MESSAGES/libc.mo", O_RDONLY) = 4 fstat64(4, {st_mode=S_IFREG|0644, st_size=1474, ...}) = 0 mmap2(NULL, 1474, PROT_READ, MAP_PRIVATE, 4, 0) = 0xb7f7a000 close(4) = 0 open("/usr/share/locale/en.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 +ENOENT (No such file or directory) open("/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 E +NOENT (No such file or directory) open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT + (No such file or directory) write(2, "fell out of loop - Interrupted sy"..., 65) = 65 close(3) = 0 exit_group(0) = ?


Thanks!

Comment on linux perl - interrupted system calls not restarted?
Select or Download Code
Re: linux perl - interrupted system calls not restarted?
by jakobi (Pilgrim) on Oct 22, 2009 at 13:29 UTC
    consider $SIG{CHLD} = 'IGNORE', which indeed avoids both ending the loop and zombies. Search for reaper and IGNORE.

    As you said you need IO::Socket below:

    Possibly IO::Socket gets irritated by some %SIG handlers (couldn't find anything on the quick in the source or docs though). 2 possible workarounds I currently see:

    • perlipc: search for sigaction and SA_RESTART then read the section on EINTR directly below
    • IGNORE as handler and place the 'rc' into a file in the child, then in the mainloop regularly waitpid over the children and check for possible rc files from children died with rc!=0. Missing 'rc' and missing child is indicating a more severe error.

    Seems a known issue: SOLVED: Re: TCP Client-Server: Server exits though it shouldn't loops forever, and retries accept in case of EINTR. But the loop in the reaper looks like it can stop early; and other syscalls may mess up the detection of EINTR, which seems quite far away from the interrupted syscall.

    Maybe just retry accept() upto n times in a row if $client is false.

    Note that's there's a small race of the parent running w/o SIG handler, and later children possibly running with it (if it's inheritable by fork; CHECKED: doesn't seem to be inherited)


    Please also post your updated code when done, thanx, Peter

      I need to use SIG{CHLD} in this case - the parent accepts connections, if there is no child running, it starts a child to service the new connection. If there is an existing child, it tells the new client it can't connect as it is in use from ip:port. The child can exit nonzero - when this happens, the parent needs to exit immediately (not at the next accept().

      So the parent needs to detect new connections, and a child exiting. The issue I'm concerned with is more general - If I'm using signals anywhere (eg sleep()), do I need to wrap every system call in a loop to retry it??

        I need to use SIG{CHLD} in this case

        Then you need to check for EINTR

        use Errno qw( EINTR ); for (;;) { my $client = $server->accept(); if (!$client) { next if $! == EINTR; die("Can't accept: $!\n"); } ... }

        If I'm using signals anywhere (eg sleep()), do I need to wrap every system call in a loop to retry it??

        Well, those that are interruptable, yes. That includes sysread. But sysread should already be in a loop since it's not guaranteed to return as many bytes as you requested.

      Indeed - perldoc perlipc suggests that the info in my camel book is out of date, guess I should update!

      Thanks for your help!

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://802731]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (3)
As of 2015-07-05 09:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (61 votes), past polls