http://www.perlmonks.org?node_id=968602

topher has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to have a process start a couple of child processes, and then continually watch the output from the child processes to monitor status. However, when I try looping through the filehandles associated with the child processes (using IO::Select), the parent reads from the first child, and then seems to (internally) close() the filehandle and waitpid() on the child PID.

I feel like I'm probably missing something small or I have an error I'm not seeing here, but I'm running out of ideas. Here's a stripped down piece of code that shows what I'm seeing:

    use strict;
    use warnings;
    use IO::Select;
    use IO::File;
    use Data::Dumper;

    my @childs;
    foreach my $kid (qw(foo bar baz quux)) {
        # Create and gather child processes/fd's
        push @childs, child_labor($kid);
    }

    my @results;
    my @files_ready;
    my $file_iter = IO::Select->new();
    $file_iter->add(@childs);

    while (@files_ready = $file_iter->can_read(1)) {
        for (@files_ready) {
            my $fd    = shift @{$_};
            my $child = shift @{$_};
            chomp(my $data = <$fd>);
            if ($data) {
                push @results, [ split ":", $data ];
            }
            sleep 10;
        }
    }

    # Printing some results...
    print Dumper(@results);

    sub child_labor {
        my $name = shift;
        my $fd;
        defined (my $pid = open $fd, "-|")    # Indirect fork, read child stdout
            or die "Couldn't fork: $!";

        # Parent...
        if ($pid) {
            return [ $fd, { pid => $pid, name => $name } ];
        }
        # Child...
        else {
            $0 .= " $name";
            my $counter = 0;
            while (1) {
                # Generate some output for our parent to read
                print "$name:" . 100 * (int(rand(100))+1) . "\n";
                sleep int(rand(10));
                last if $counter++ > 13;
            }
            # Important to exit from the child.
            exit 0;
        }
    }

And here's what's happening "under the hood" with the parent process when I run it:

    ioctl(6, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff0f84f2f0) = -1 EINVAL (Invalid argument)
    lseek(6, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
    fcntl(6, F_SETFD, FD_CLOEXEC) = 0
    select(8, [3 4 5 6], NULL, NULL, {1, 0}) = 1 (in [3], left {0, 999994})
    read(3, "foo:7100\n", 8192) = 9
    rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
    rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    nanosleep({10, 0}, 0x7fff0f84f7d0) = 0
    close(3) = 0
    rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTORER, 0x7fdd25eb2030}, {SIG_DFL, [], 0}, 8) = 0
    rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7fdd25eb2030}, {SIG_DFL, [], 0}, 8) = 0
    rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7fdd25eb2030}, {SIG_DFL, [], 0}, 8) = 0
    wait4(10129,

You can see the select, followed by the read. That part works fine. You can also see the sleep call that I put at the end of the loop as a marker. However, as soon as it finishes the sleep and hits the end of the for loop, it closes the filehandle and wait()'s for the child PID to end. I'm just not sure why, or how to fix this.

If there's an overall better way of doing this, I'd love to hear it. However, I'd still like to figure this out so I understand why it's happening.

UPDATE: I think I figured it out. At the least, I've got things working now. For details, see my response below: Re: Select on child output problem

Replies are listed 'Best First'.
Re: Select on child output problem
by topher (Scribe) on May 03, 2012 at 04:13 UTC

    Ok, I think I've solved it. I won't even start on how annoying this is, and how much I wish I'd figured it out sooner. I'm still a little mystified on why it's failing in the original code, though. I have a theory, but I don't know Perl internals well enough to know if it's correct.

    Basically, the part where I grab the filehandle via shift is what blows up the program. It seems to tightly associate my lexical loop variable with the filehandle, and then when the variable (and thus the filehandle) goes out of scope at the end of the loop, it tries to clean up the filehandle (by closing it and waiting on the child).

        my $fd    = shift @{$_};
        my $child = shift @{$_};

    When I replace that with the following:

        my $fd    = $_->[0];
        my $child = $_->[1];

    It works.

    My theory is that because the underlying container where the filehandle is being stored is an array reference, that when I shift on the (dereferenced) array reference, I'm stomping on the original, even though it's in a for loop and using $_.

    Does that sound right? Chalk it up to the care needed for references and reference "copies", I suppose.

      My theory is that because the underlying container where the filehandle is being stored is an array reference, that when I shift on the (dereferenced) array reference, I'm stomping on the original...

      Oh yes, that sounds right. IO::Select probably isn't making a deep copy when you call can_read().

      ... even though it's in a for loop and using $_.

      That's a red herring. Your original code modified the data structure out from under IO::Select and your code. Loop aliasing doesn't apply at all, as it doesn't make deep copies either.

        Thanks for the confirmation.

        Frustrating experience, but at least I learned something. And something tells me I'll not be making this mistake again any time soon.
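
The behaviour discussed in this sub-thread can be reproduced without any child processes. The following is only a minimal sketch: the `Tracked` class is a hypothetical stand-in for a piped filehandle, and its `DESTROY` method marks the point where Perl would implicitly close a real handle (and, for a handle opened with `"-|"`, wait for the child). It compares indexing into the array reference with shifting from it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a piped filehandle: it records when it is
# destroyed, which is when Perl would implicitly close() a real handle.
package Tracked;
my @destroyed;
sub new       { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY   { push @destroyed, $_[0]{name} }
sub destroyed { return @destroyed }

package main;

# Same shape as the data in the thread: [ handle, { metadata } ]
my @entries = map { [ Tracked->new($_), { name => $_ } ] } qw(foo bar);

# Indexing copies the reference out and leaves the stored array intact.
for (@entries) {
    my $fd = $_->[0];
}
my $after_index       = join ',', map { scalar @$_ } @entries;
my @freed_after_index = Tracked->destroyed;

# shift removes the element from the stored array itself. $fd then holds
# the only reference, so the object is destroyed when $fd leaves scope.
for (@entries) {
    my $fd = shift @{$_};
}
my $after_shift       = join ',', map { scalar @$_ } @entries;
my @freed_after_shift = Tracked->destroyed;

print "sizes after indexing: $after_index\n";
print "sizes after shift:    $after_shift\n";
print "destroyed:            @freed_after_shift\n";
```

Nothing is destroyed after the indexing loop, while the shift loop empties each stored array of its handle and destroys it at the end of the iteration. That matches the close(3) and wait4() seen right after the sleep in the trace.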

Re: Select on child output problem
by chromatic (Archbishop) on May 03, 2012 at 02:09 UTC

    What happens if you use eof on the current filehandle? You want to exhaust everything you can read without blocking, right? (You can control that the children always send a newline to terminate each line too.)

    Something like this might get you further:

        while (!eof( $fd )) {
            chomp( my $data = <$fd> );
            if ($data) {
                push @results, [ split ":", $data ];
            }
            else {
                # Other side (child) has closed connection.
                $file_iter->remove($_);
                close $fd;    # Closing fd cleans up child
            }
        }
        sleep 10;

      It looks like that leaves it blocking (and continuing to read) on the first filehandle it processes. I don't see it ever move on through the loop.

      From previous research on this, it seems that there's no real equivalent to EOF for this type of communication without closing the file descriptor.

      Pulling all available data without blocking would be ideal, but my bigger problem is just getting to the next file handle from the group returned by IO::Select without waiting for a child to die first. I can't understand why it's trying to close and wait on the child PID. Even if I comment out the close bit (which isn't reached anyway), I still get the same result with the close on the FD and wait on the child PID.
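
One common way to pull whatever is available without blocking is to do a single sysread per handle that select reports as readable, and to split lines out of a per-handle buffer yourself. The sketch below is illustrative only: a plain pipe stands in for a child's stdout, and `drain_ready`, `%buffer` and `@lines` are hypothetical names, not part of IO::Select. It also avoids mixing buffered readline (`<$fd>`) with select, which perlfunc warns against.

```perl
use strict;
use warnings;
use IO::Select;

# A plain pipe stands in for a child's stdout (assumption for illustration).
pipe(my $reader, my $writer) or die "pipe failed: $!";
syswrite($writer, "foo:100\nfoo:200\nfoo:3") or die "write failed: $!";

my $sel = IO::Select->new($reader);
my %buffer;    # partial data per handle, keyed by fileno
my @lines;     # complete lines collected so far

# Hypothetical helper: one sysread per ready handle, so it does not block
# once select() has reported the handle as readable.
sub drain_ready {
    my ($select, $timeout) = @_;
    for my $fh ($select->can_read($timeout)) {
        my $key = fileno $fh;
        my $n   = sysread($fh, my $chunk, 8192);
        die "read failed: $!" unless defined $n;
        if ($n == 0) {    # EOF: the writer closed its end
            $select->remove($fh);
            close $fh;
            next;
        }
        $buffer{$key} .= $chunk;
        # Peel off every complete line; keep any trailing partial line.
        while ($buffer{$key} =~ s/\A([^\n]*)\n//) {
            push @lines, $1;
        }
    }
}

drain_ready($sel, 1);    # collects the two complete lines
close $writer;
drain_ready($sel, 1);    # sees EOF and removes the handle
```

With the array-reference entries used in the original post, can_read would return the array references instead, so the handle would be taken from `$_->[0]` rather than used directly.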

Re: Select on child output problem
by BrowserUk (Patriarch) on May 03, 2012 at 03:11 UTC

    This is nothing more than a hunch based on some things I saw on a non-*nix system; so treat it with the skepticism it deserves.

    If you are running a version of perl with PerlIO enabled, try it on a version that uses stdio before concluding your code is in error.

