PerlMonks  

Re^7: Forks, Pipes and Exec (file descriptors)

by diabelek (Beadle)
on Nov 07, 2008 at 20:59 UTC (#722297)


in reply to Re^6: Forks, Pipes and Exec (file descriptors)
in thread Forks, Pipes and Exec

I'll answer some of the points you brought up first so you can point out any flaws in my thinking:

  1. I'm using the non-blocking because I have a thread that executes a requested command that should only take a few milliseconds. It then reads the output pipes from all my external applications and stores whatever it read in a buffer. If there isn't anything on the pipe, that doesn't matter; I just continue and service the next request that may or may not be waiting. Select is never used, so I didn't see this as an issue.
  2. The only signal I need is one to tell the process to die. I've been using SIGKILL since I just want the forked executable to die. This seems to work in Windows and Linux.
  3. Thank you for the info on open. I thought the open( fh, "cmd |") ran in the foreground. I've switched to using that and it has cut out quite a bit of code. It also seems to work even better for killing the process. I'm still using SIGKILL since I don't know of a better way to do it in Perl. Suggestions?
  4. The goal of the module, briefly addressed in the first point, is to provide an instance of the module that can start another process/thread/etc. The two processes will communicate with each other in a command/result XML format (all commands come from the instantiated module and are sent to the child process). The child process is then responsible for servicing those requests, which includes starting up additional CLI programs and buffering their output. The buffer will be regularly checked for errors, and when the parent process requests the errors, they are passed back up. I hope that explains things.
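
On the question in point 3 about an alternative to SIGKILL: a minimal sketch, assuming a POSIX-like system, of asking the child to exit with TERM first and falling back to KILL only if it does not go away. The helper name stop_child, the one-second grace period and the idle child started through `$^X -e` are illustrative assumptions, not part of the module discussed here. On Windows, Perl emulates kill and the distinction between the two signals may not carry over.

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';
use Time::HiRes qw(sleep time);

# Hypothetical helper: send TERM, poll for up to $grace seconds,
# then fall back to KILL. Returns the raw wait status of the child.
sub stop_child {
    my ($pid, $fh, $grace) = @_;
    kill 'TERM', $pid;                      # polite request first
    my $deadline = time() + $grace;
    my $status;
    while (1) {
        if (waitpid($pid, WNOHANG) == $pid) {
            $status = $?;                   # child has exited and been reaped
            last;
        }
        if (time() > $deadline) {
            kill 'KILL', $pid;              # last resort
            waitpid($pid, 0);
            $status = $?;
            last;
        }
        sleep 0.05;
    }
    close $fh;    # child is already reaped; close only releases the handle
    return $status;
}

# Start an idle child through a piped open, then stop it.
my $pid = open(my $fh, '-|', $^X, '-e', 'sleep 60')
    or die "cannot start child: $!";
my $status = stop_child($pid, $fh, 1);
printf "child ended by signal %d\n", $status & 127;
```

The low seven bits of the wait status hold the number of the signal that ended the child, which makes it possible to log whether the polite request was enough.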

From what I've read, POE might have been something to look at, but I need this done sooner rather than later and I already have this module working in Linux... Windows is always the problem (jab). Plus I would like to get a better understanding of the inner workings of Perl.

The last problem I seem to be fighting is that when I run the code to set a pipe as nonblocking, it isn't really nonblocking. I seem to get a chunk of data every 15 seconds or so on Win2k3 using this code:

pipetest.pl

use IO::Handle;

print "before the pipe\n";

sub getpipe {
    my $hash2 = shift;
    $hash2->{pid} = open( $hash2->{stdout}, "c:\\perl\\bin\\perl.exe yes.pl |") or die;
    $hash2->{buffer} = "";
    $hash2->{stdout}->blocking(0);
    sleep( 3 );
    return $hash2->{pid};
}

my $hash = {};
getpipe($hash);
my $fh = $hash->{stdout};
print( "pipe created\n" );
while( <$fh> ) {
    print "got from pipe: $_";
    sleep( 1 );
}
print( "done\n" );

yes.pl

while( 1 ) {
    printf "%s yes\n", time();
    sleep( 1 );
}

sample output:

got from pipe: 1226094472 yes
1226094473 yes
1226094474 yes
1226094475 yes
1226094476 yes
1226094477 yes
1226094478 yes
1226094479 yes
1226094480 yes
1226094481 yes
1226094482 yes
1226094483 yes
1226094484 yes
1226094485 yes
1226094486 yes
1226094487 yes

Replies are listed 'Best First'.
Re^8: Forks, Pipes and Exec (file descriptors)
by BrowserUk (Pope) on Nov 08, 2008 at 00:49 UTC
    The last problem I seem to be fighting is that when I run the code to set a pipe as nonblocking, it isn't really nonblocking. I seem to get a chunk of data every 15 seconds or so on Win2k3 using this code:

    That problem has nothing to do with blocking. You are simply suffering from buffering. Add $|++; to the top of yes.pl and you will get your output once per second.

    However, if the connected process is not a Perl script, but some executable that doesn't disable buffering, then there may be nothing that you can do about this unless you can modify and rebuild that executable. If you were using the underlying OS API calls to create the pipe, then you could disable buffering from either end, but Perl doesn't expose that functionality.
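
A minimal sketch of that fix, for illustration only: the yes.pl loop with autoflush switched on, limited to three lines and run inline through `$^X -e` so the example is self-contained. The inline child and the list form of the piped open are assumptions made for the demonstration (on Win32 the list form needs a fairly recent Perl, 5.22 or later as far as I know).

```perl
use strict;
use warnings;

# Child program: the yes.pl loop with autoflush enabled ($| = 1),
# limited to three lines so that the demonstration terminates.
my $child = q{$| = 1; for (1 .. 3) { printf "%s yes\n", time(); sleep 1 }};

# Start the child with its STDOUT connected to our read handle.
my $pid = open(my $fh, '-|', $^X, '-e', $child)
    or die "cannot start child: $!";

my @lines;
while (my $line = <$fh>) {
    # With autoflush on in the child, each line arrives as soon as it
    # is printed, rather than when the child's output buffer fills.
    print "got from pipe: $line";
    push @lines, $line;
}
close $fh;
```

Removing the `$| = 1;` from the child string should reproduce the original symptom on a small scale: all three lines arrive together when the child exits.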

    If there isn't anything on the pipe, that doesn't matter; I just continue and service the next request that may or may not be waiting. Select is never used, so I didn't see this as an issue.

    Hm. Unless I am misunderstanding you, it does matter. If you try to read from a pipe and there is nothing available, then the read will block until there is. Even if the pipe is non-blocking. Which means you won't be able to "continue and service the next request" once you enter a read state on one pipe until something becomes available on that pipe.

    Setting the pipe non-blocking allows you to use select to discover when there is something to be read and only enter the read when you know it will be satisfied immediately. But, as I said above, there is no way to set an anonymous pipe non-blocking on Win32.
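
For reference, a minimal sketch of the select-then-read pattern on a POSIX-like system, using the core IO::Select module. It does not address the Win32 case: there, Perl's select only works on sockets. The inline child that writes "ready" after half a second is an assumption made for the demonstration.

```perl
use strict;
use warnings;
use IO::Select;

# Child writes one line after a short pause, then exits.
my $child = q{$| = 1; select(undef, undef, undef, 0.5); print "ready\n"};
my $pid = open(my $fh, '-|', $^X, '-e', $child)
    or die "cannot start child: $!";

my $sel = IO::Select->new($fh);

# Poll with a zero timeout: returns at once, and the list is empty
# because the child has not written anything yet.
my @ready_now = $sel->can_read(0);

# Wait up to five seconds for data, then read only what is there.
my $buffer = '';
if ($sel->can_read(5)) {
    # select reported the handle readable, so this read returns at once.
    sysread($fh, $buffer, 4096);
}
close $fh;

print "handles ready on first poll: ", scalar(@ready_now), "\n";
print "read after waiting: $buffer";
```

sysread is used rather than <$fh> because select knows nothing about Perl's own read buffer, so mixing it with buffered reads can give misleading results.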


    With respect to an alternative way to code your application so that it will work on Win32, I'm afraid I still find your description insufficiently clear to suggest anything. There are several bits of this latest post that leave me confused. For example,

    • when you say
      "I'm using the non-blocking because I have a thread that executes..."

      Do you mean you are already using threads--explicitly? Or are you using (pseudo-)processes via fork?

    • And
      " is to provide an instance of the module that can start another process/thread/etc."

      Again, which is it? Processes? Threads? Both? And what is "etc."?

      Please don't take that as pedantry. Too often people will use these terms interchangeably, but it is important to distinguish between them.

    • The two processes will communicate with each other in a command/result XML format (all commands come from the instantiated module and are sent to the child process). The child process is then responsible for servicing those requests, which includes starting up additional CLI programs and buffering their output. The buffer will be regularly checked for errors, and when the parent process requests the errors, they are passed back up.

      That implies you are talking about bi-directional communications via pipes--Expect style. I don't think anyone has got that to work from Perl on Win32.

    I hope that explains things.

    Sorry, but no it doesn't. At this point, the overall architecture of your application is about as clear as mud to me. It involves child processes, and pipes, and sending commands to the children, and getting replies back, and running multiple of these concurrently, but I have no overview.

    And quite how you are achieving this on Linux without using select is beyond me.

    If you have code that works on Linux and don't mind letting me see it--either here or via email--that would certainly be the quickest way for me to understand what you are doing and perhaps be able to suggest how to make it work on Win32.

    From what I've understood of what you've said POE certainly sounds like an option, though as you imply it would probably mean re-writing everything you have and involve a pretty steep learning curve. I'm also not sure how portable POE code is to Win32.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      I can definitely change the buffering on the other application and that helps a bit.

      As for the non-blocking, I thought non-blocking allowed a read to complete even when there was no data. If there was no data, I thought either EOF or EAGAIN or something was set. The example above can be changed to this to get it working in Linux and loop as fast as possible.

      use IO::Handle;
      use Time::HiRes qw(usleep);   # the module is Time::HiRes, not Timer::HiRes

      print "before the pipe\n";

      sub getpipe {
          my $hash2 = shift;
          print( "starting yes.pl " . time() . "\n" );
          $hash2->{pid} = open( $hash2->{stdout}, "/usr/bin/perl yes.pl |") or die;
          $hash2->{buffer} = "";
          $hash2->{stdout}->blocking(0);
          sleep( 1 );
          return $hash2->{pid};
      }

      my $hash = {};
      getpipe($hash);
      my $fh = $hash->{stdout};
      print( "pipe created\n" );
      my $count = 0;
      while( 1 ) {
          my $data = <$fh>;
          # <$fh> returns undef when nothing is available; compare as a string, not numerically
          if( defined $data && $data ne "" ) {
              print "got from pipe: $data";
          }
          print( "." );
          usleep( 10000 );
      }
      print( "done\n" );
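
The EAGAIN behaviour mentioned above can be checked directly with sysread. This is a minimal sketch, assuming a POSIX-like system where blocking(0) takes effect on pipes. The silent child started through `$^X -e` is an assumption made for the demonstration.

```perl
use strict;
use warnings;
use IO::Handle;
use Errno qw(EAGAIN EWOULDBLOCK);

# Child stays silent, so the pipe is empty when we try to read from it.
my $pid = open(my $fh, '-|', $^X, '-e', 'sleep 5')
    or die "cannot start child: $!";
$fh->blocking(0);

# On an empty non-blocking pipe sysread returns undef straight away
# and sets $! to EAGAIN (or EWOULDBLOCK on some systems).
my $got   = sysread($fh, my $buffer, 4096);
my $errno = $! + 0;
my $would_block = !defined($got)
    && ($errno == EAGAIN || $errno == EWOULDBLOCK);

print $would_block ? "no data yet\n" : "read returned something else\n";

kill 'TERM', $pid;    # the child would otherwise idle for five seconds
close $fh;
```

A return value of 0 from sysread means end of file, which is a different condition from "no data yet" and is worth handling separately in a polling loop.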

      I will try to post the script externally and provide a link so you understand what I'm trying to do a little better. To answer a couple of your questions about threads/processes, I'm really using both. I start the thread/process by calling fork. So in Linux, I understand that I'm using the true fork command. In Windows, I'm using the pseudo fork that is really a thread when you get to the bottom of things. Which one I use doesn't matter. I just need a separate branch of execution so the parent script can continue doing its normal thing while my branch of execution handles the external processes. That's probably an ignorant statement but that's why I'm posting :)
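
The "separate branch of execution" idea can be sketched with fork and a pipe. This is a minimal illustration, not the module under discussion: the three "result" lines stand in for whatever the child would report back.

```perl
use strict;
use warnings;

# One-way channel from the child back to the parent.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child branch: do the work and report back over the pipe.
    close $reader;
    print {$writer} "result: $_\n" for 1 .. 3;
    close $writer;
    exit 0;
}

# Parent branch: free to do other work here before collecting results.
close $writer;
my @results = <$reader>;    # reads until the child closes its end
close $reader;
waitpid($pid, 0);

print "parent received: $_" for @results;
```

The same source runs under a real fork on Linux and under the thread-based emulation on Windows, which is where the behavioural differences discussed in this thread come from.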

      I've found another Perl script that sounds like what I'm doing: logtail. I believe the basic idea is the same. I need to review the code more to see what they have done.

        I've found another perl script that sounds like what I'm doing: logtail.

        Here's a simplified and somewhat crude equivalent that uses threads and should run anywhere you have a threaded-perl and a tail command. Maybe it'll be useful to you.

        #! perl -slw
        use strict;
        use threads ( stack_size => 4096 );
        use threads::shared;
        use Thread::Queue;
        $|++;

        our $VERBOSE :shared;
        our $REMOTE  :shared;

        my $stop :shared = 0;       ## Set true to terminate threads
        my @logs = map glob, @ARGV; ## expand wildcards
        my $Q = new Thread::Queue;

        ## One thread per log file
        threads->create( \&tail, $Q, $_, 1 )->detach for @logs;

        my $remote; ## Remote watcher socket
        if( $REMOTE ) {
            require IO::Socket;
            $remote = IO::Socket::INET->new( $REMOTE )
                or warn "Couldn't connect to $REMOTE : $!, $^E";
            print $remote "Hi there, Got your ears on?";
        }

        ## Thread to monitor the Q, print locally and/or forward to remote
        my $relay = async {
            for( 1 .. @logs ) { ## Waits for all tails to terminate
                while( my $line = $Q->dequeue ){
                    chomp $line;
                    print $line if $VERBOSE;
                    print $remote $line if $remote;
                }
            }
        };

        ## Local command loop
        while( <STDIN> ) {
            my( $command, $value ) = split;
            if( $command =~ m[^(END|QUIT)]i ) {
                $stop = 1;
                warn "Quiting...\n";
                $relay->join;
                exit 0;
            }
            if( $command =~ m[^VERBOSE]i ) {
                $VERBOSE = $value;
                print "VERBOSE set to $value";
            }
            elsif( $command eq 'qs' ) {
                print $Q->pending;
            }
            else {
                print "Unrecognised command: $command";
            }
        }

        ## Tail threads
        sub tail {
            print threads->tid, ' : ', threads->self->get_stack_size;
            my( $Q, $path, $seconds ) = @_;
            $seconds = 1 unless $seconds;

            ## $! (not $@) holds the reason a failed open reports
            my $pid = open my $log, "tail -Fs $seconds $path |" or die $!;
            printf "Thread %d Following $path\n", threads->tid;
            $Q->enqueue( $_ ) while not $stop and defined( $_ = <$log> );
            kill 3, $pid;
            close $log;
            $Q->enqueue( undef );
        }

        It happily follows 100 logs, simultaneously logging them local and transmitting them to a remote watcher. You can enter a few commands at the local console to turn local verbose on and off, monitor the size of the queue, and quit:

        c:\test>logwatch.pl -REMOTE=localhost:35007 log\log*.txt
        Thread 2 Following log\log0002.txt
        Thread 1 Following log\log0001.txt
        Thread 3 Following log\log0003.txt
        Thread 4 Following log\log0004.txt
        Thread 5 Following log\log0005.txt
        Thread 6 Following log\log0006.txt
        Thread 7 Following log\log0007.txt
        Thread 8 Following log\log0008.txt
        Thread 9 Following log\log0009.txt
        Thread 10 Following log\log0010.txt
        verbose 1
        VERBOSE set to 1
        Sat Nov 8 15:09:45.575 2008 : Message from log 6
        Sat Nov 8 15:09:45.684 2008 : Message from log 0
        Sat Nov 8 15:09:45.701 2008 : Message from log 3
        Sat Nov 8 15:09:45.794 2008 : Message from log 3
        ...
        ver
        ...
        Sat Nov 8 15:09:48.279 2008 : Message from log 9
        ...
        bose 0
        ...
        Sat Nov 8 15:09:48.309 2008 : Message from log 1
        Sat Nov 8 15:09:48.419 2008 : Message from log 9
        VERBOSE set to 0
        qs
        0
        quit
        Quiting...

Re^8: Forks, Pipes and Exec (file descriptors)
by diabelek (Beadle) on Nov 07, 2008 at 23:14 UTC

    Time to go home for the weekend but here's a little bit more I found out.

    I tried to pump various amounts of data through the pipe and found that the pipe seems to complete its first read after 4096 bytes. I'm curious whether there is a timeout as well that will let it complete the read.

    This makes me ask the question though: if I set the file handle to non-blocking and Windows actually understands non-blocking, why can I only read data after 4096 bytes have been written? That tells me that Windows isn't doing non-blocking IO, but supposedly it does support it.
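
One way to test whether the threshold comes from the writer's output buffer rather than from the pipe: a minimal sketch, assuming a POSIX-like system, that starts the same child with and without autoflush and checks whether a single short line reaches the reader while the child is still running. The helper name line_arrives and the inline child are illustrative assumptions. The 4096 figure itself is not reproduced here, only the buffered versus unbuffered difference.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: start a child that prints one short line and then
# idles, and report whether that line reaches us within $wait seconds.
sub line_arrives {
    my ($autoflush, $wait) = @_;
    my $child = ($autoflush ? '$| = 1; ' : '') . 'print "hello\n"; sleep 30';
    my $pid = open(my $fh, '-|', $^X, '-e', $child)
        or die "cannot start child: $!";
    my @ready   = IO::Select->new($fh)->can_read($wait);
    my $arrived = @ready ? 1 : 0;
    kill 'TERM', $pid;    # the child would otherwise idle for 30 seconds
    close $fh;
    return $arrived;
}

printf "buffered child:  line arrived = %d\n", line_arrives(0, 1);
printf "autoflush child: line arrived = %d\n", line_arrives(1, 5);
```

If the buffered child's line only shows up once the child has written a buffer's worth of output or exited, the delay is on the writing side and no setting on the reading handle will change it.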

    I'll play more this weekend but if anyone has any pointers on what to look for, I'd appreciate it. I'm in the dark on this.
