
Re^6: Forks, Pipes and Exec (file descriptors)

by BrowserUk (Pope)
on Nov 06, 2008 at 20:48 UTC (#722076)


in reply to Re^5: Forks, Pipes and Exec (file descriptors)
in thread Forks, Pipes and Exec

I can make a change in the parent to STDOUT that affects the child's STDOUT.

It depends on what change you are making and how you perceive that the child has been affected.

In general, I do not consider Perl's Win32 fork emulation worth the effort of bothering with. There are simply too many differences between the platforms, and too many holes in the emulation, which make *nix techniques fail to work. A few examples:

  • In your snippet above, you create a pipe and set it up as stdout and then you attempt to set it non-blocking, presumably with the intent of using select upon it.

    But win32 anonymous pipes cannot be set non-blocking and so cannot be used in conjunction with select.

    Note: Win32 pipes can be used in a non-blocking fashion--using peeking (polling) or asynchronous IO or overlapped IO--but none of these fit well with the select mode of operations. This is not a limitation, but rather a difference in design philosophy that is hard to reconcile in a portable fashion.

  • If you do a fork followed by exec on *nix, then the pid returned to the 'parent' is a process handle to the newly started process.

    If you do the same thing using the win32 emulation, the 'pid' is actually a disguised thread handle to a thread, and the process you "exec'd" is actually started as an entirely new process (using what is effectively the same as system), with an entirely different pid to that returned by fork--and to which you can never get access.

    Consequently, anything the parent does with that pseudo-pid only affects the thread--not the "exec'd" process. E.g. killing the pseudo-pid (or attempting to send almost any signal) will terminate the thread, but leave the process running.

    Again, it is perfectly possible to set up bi-directional communications between a parent and child process under win32 at the C-level, but far harder to see how to make this available to Perl programs in a platform agnostic way.

  • Signals are not implemented (by the system) on win32, and only a handful (2,3,15,21) are emulated in a way that can be trapped.

    That means you will never receive a SIGCHLD in the parent. Whilst wait and waitpid are emulated, it is done by having the child thread block in a system wait on the child process, which makes for very different and confusing semantics.
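The pseudo-pid point above can be checked with a few lines. This is a minimal sketch, not code from the thread: it relies on the behaviour documented in perlfork that the Win32 emulation hands back negative pseudo-process IDs, while a real *nix fork returns a positive process ID.

```perl
use strict;
use warnings;

# Fork once and look at what the parent gets back.
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child (a real process on *nix, a thread under the Win32 emulation).
    exit 0;
}

# Parent: reap the child, then report which kind of id fork returned.
waitpid( $pid, 0 );
if ( $pid < 0 ) {
    print "fork returned pseudo-process id $pid (emulated, thread-based)\n";
}
else {
    print "fork returned real process id $pid\n";
}
```

On the emulation, signalling that negative id only reaches the thread, which is why killing it leaves any exec'd process running.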

Those are just a few of the limitations that come to mind. I've encountered several more anomalies that don't. IMO these differences make trying to code portable programs using fork unworkable if win32 is a target platform.

If your application calls for single direction piping of data to or from the child, then using the forked-open is effective and far more portable. If you need bi-directional communications between parent and child, then threads can provide an effective solution that is, again, far more portable than fork.
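For the single-direction case, a piped open might look like the sketch below. The helper name read_from_child is hypothetical, and the running perl ($^X) stands in for whatever command the real application would start.

```perl
use strict;
use warnings;

# Hypothetical helper: start a command with a piped open and
# return the lines it writes to its stdout.
sub read_from_child {
    my ($cmd) = @_;
    open( my $fh, "$cmd |" ) or die "cannot start '$cmd': $!";
    my @lines = <$fh>;
    close $fh;    # waits for the child to finish and sets $?
    chomp @lines;
    return @lines;
}

# Stand-in child: the current perl printing three numbers.
my @out = read_from_child(qq{"$^X" -le "print for 1..3"});
print "got from pipe: $_\n" for @out;
```

No explicit fork, exec, or pipe call appears in the parent, which is what makes this form easier to carry between *nix and Win32.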

Your question to date simply asks about forking and pipes, without giving any hint as to the actual application, so it is impossible to suggest alternatives to that approach.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^7: Forks, Pipes and Exec (file descriptors)
by diabelek (Beadle) on Nov 07, 2008 at 20:59 UTC

    I'll answer some of the points you brought up first so you can point out any flaws in my thinking:

    1. I'm using the non-blocking because I have a thread that executes a requested command that should only take a few milliseconds. It then reads the output pipes from all my external applications and stores whatever it read into a buffer. If there isn't anything on the pipe, that doesn't matter; I just continue and service the next request that may/may not be waiting. Select is never used so I didn't see this as an issue.
    2. The only signal I need is to tell the process to die. I've been using SigKill since I just want the forked executable to die. This seems to work in Windows and Linux.
    3. Thank you for the info on open. I thought the open( fh, "cmd |") ran in the foreground. I've switched to using that and it has cut out quite a bit of code. It also seems to work even better for killing the process. I'm still using sigkill since I don't know of a better way to do it in perl. Suggestions?
    4. The goal of the module, briefly addressed in the first point, is to provide an instance of the module that can start another process/thread/etc. The two processes will communicate with each other in a command/result xml format (all commands come from the instantiated module and are sent to the child process). The child process is responsible for servicing those requests, which includes starting up additional cli programs and buffering their output. The buffer will be regularly checked for errors and when the parent process requests the errors, they are passed back up. I hope that explains things.
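Point 3 above (kill the process that a piped open started) can be sketched as follows. This is an illustration under assumptions, not the poster's module: it uses the list form of open so that no shell sits between parent and child, it uses a sleeping perl as a stand-in for the external program, and it has only been reasoned through for *nix, where the low bits of $? carry the signal number.

```perl
use strict;
use warnings;

# List-form piped open: no shell is involved, so $pid is the child itself.
my $pid = open( my $fh, '-|', $^X, '-e', 'sleep 60' )
    or die "cannot start child: $!";

# Ask the OS to terminate the child immediately.
kill 'KILL', $pid or die "kill failed: $!";

# close() reaps the child; it returns false here because the child
# did not exit normally, so its return value is deliberately ignored.
close $fh;

my $signal = $? & 127;    # low 7 bits of the wait status: terminating signal
print "child $pid was ended by signal $signal\n";
```

Checking $? after close is one way to confirm that the child really went away rather than assuming the kill worked.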

    From what I've read, POE might have been something to look at, but I need this done sooner rather than later and I already have this module working in Linux... Windows is always the problem (jab). Plus, I would like to get a better understanding of the inner workings of perl.

    The last problem I seem to be fighting is that when I run the code to set a pipe as nonblocking, it isn't really nonblocking. I seem to get a chunk of data every 15 seconds or so on Win2k3 using this code:

    pipetest.pl

    use IO::Handle;

    print "before the pipe\n";

    sub getpipe {
        my $hash2 = shift;
        $hash2->{pid} = open( $hash2->{stdout}, "c:\\perl\\bin\\perl.exe yes.pl |") or die;
        $hash2->{buffer} = "";
        $hash2->{stdout}->blocking(0);
        sleep( 3 );
        return $hash2->{pid};
    }

    my $hash = {};
    getpipe($hash);
    my $fh = $hash->{stdout};
    print( "pipe created\n" );
    while( <$fh> ) {
        print "got from pipe: $_";
        sleep( 1 );
    }
    print( "done\n" );

    yes.pl

    while( 1 ) {
        printf "%s yes\n", time();
        sleep( 1 );
    }

    sample output:

    got from pipe: 1226094472 yes 1226094473 yes 1226094474 yes 1226094475 yes 1226094476 yes 1226094477 yes 1226094478 yes 1226094479 yes 1226094480 yes 1226094481 yes 1226094482 yes 1226094483 yes 1226094484 yes 1226094485 yes 1226094486 yes 1226094487 yes

      The last problem I seem to be fighting is that when I run the code to set a pipe as nonblocking, it isn't really nonblocking. I seem to get a chunk of data every 15 seconds or so on Win2k3 using this code:

      That problem has nothing to do with blocking. You are simply suffering from buffering. Add $|++; to the top of yes.pl and you will get your output once per second.

      However, if the connected process is not a perl script, but some executable that doesn't disable buffering, then there may be nothing that you can do about this unless you can modify and re-build that executable. If you were using the underlying OS api calls to create the pipe, then you can disable buffering from either end, but Perl doesn't expose that functionality.
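The effect of output buffering in the child can be measured directly. The sketch below is an illustration rather than code from the thread: the helper first_line_delay is a hypothetical name, the child is a one-line perl script started through a piped open, and STDOUT->autoflush(1) is used in place of $|++ only because it survives shell quoting more easily; the two have the same effect on STDOUT.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: start a one-line perl child and return how many
# seconds pass before its first line of output reaches the parent.
sub first_line_delay {
    my ($child_code) = @_;
    my $start = time;
    open( my $fh, qq{"$^X" -MIO::Handle -le "$child_code" |} )
        or die "cannot start child: $!";
    my $first = <$fh>;           # blocks until a line actually arrives
    my $delay = time - $start;
    1 while <$fh>;               # drain the rest so the child can exit
    close $fh;
    return $delay;
}

# Same child twice: print, sleep 2 seconds, print again.
my $buffered = first_line_delay('print 1; sleep 2; print 2');
my $flushed  = first_line_delay('STDOUT->autoflush(1); print 1; sleep 2; print 2');

printf "buffered child:  first line after %.1fs\n", $buffered;
printf "autoflush child: first line after %.1fs\n", $flushed;
```

Without autoflush the first line sits in the child's buffer until the child exits, so the parent sees nothing for about two seconds; with autoflush it arrives almost at once.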

      If there isn't anything on the pipe, that doesn't matter; I just continue and service the next request that may/may not be waiting. Select is never used so I didn't see this as an issue.

      Hm. Unless I am misunderstanding you, it does matter. If you try to read from a pipe and there is nothing available, then the read will block until there is, even if the pipe is non-blocking. That means that once you enter a read state on one pipe, you won't be able to "continue and service the next request" until something becomes available on that pipe.

      Setting the pipe non-blocking allows you to use select to discover when there is something to be read and only enter the read when you know it will be satisfied immediately. But, as I said above, there is no way to set an anonymous pipe non-blocking on win32.
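On *nix, the "ask select first, read only when ready" pattern can be sketched with the core IO::Select module. This is an illustration under assumptions: read_if_ready is a hypothetical helper, the child is a stand-in perl one-liner, and, as noted above, select does not work on anonymous pipes on Win32, so this is not a portable answer.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: return '' if nothing is readable within $timeout
# seconds, the data if some arrived, or undef once the child closed the pipe.
sub read_if_ready {
    my ( $fh, $timeout ) = @_;
    my $sel = IO::Select->new($fh);
    return '' unless $sel->can_read($timeout);
    my $n = sysread( $fh, my $chunk, 4096 );
    die "read failed: $!" unless defined $n;
    return $n ? $chunk : undef;
}

# Stand-in child: waits one second, then prints a single line.
open( my $fh, qq{"$^X" -MIO::Handle -le "STDOUT->autoflush(1); sleep 1; print 42" |} )
    or die "cannot start child: $!";

my ( $idle_polls, $out ) = ( 0, '' );
while ( defined( my $chunk = read_if_ready( $fh, 0.1 ) ) ) {
    $idle_polls++ unless length $chunk;    # free to service other requests here
    $out .= $chunk;
}
close $fh;

print "idle polls: $idle_polls, data: $out";
```

The read is only entered after select reports the handle as readable, so the loop never stalls on an empty pipe and can do other work on each idle pass.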


      With respect to an alternative way to code your application so that it will work on Win32, I'm afraid I still find your description insufficiently clear to suggest anything. There are several bits of this latest post that leave me confused. For example,

      • when you say
        "I'm using the non-blocking because I have a thread that executes..."

        Do you mean you are already using threads--explicitly? Or are you using (pseudo-)processes via fork?

      • And
        " is to provide an instance of the module that can start another process/thread/etc."

        Again, which is it? Processes? Threads? Both? And what is "etc."

        Please don't take that as pedantry. Too often people will use these terms interchangeably, but it is important to distinguish between them.

      • The two processes will communicate with each other in a command/result xml format (all commands come from the instantiated module and are sent to the child process). The child process is the responsible for servicing those requests which includes starting up additional cli programs and buffering their output. The buffer will be regularly checked for errors and when the parent process requests the errors, they are passed back up.

        That implies you are talking about bi-directional communications via pipes--Expect style. I don't think anyone has got that to work from Perl on Win32.

      I hope that explains things.

      Sorry, but no it doesn't. At this point, the overall architecture of your application is about as clear as mud to me. It involves child processes, and pipes, and sending commands to the children, and getting replies back, and running multiple of these concurrently, but I have no overview.

      And quite how you are achieving this on Linux without using select is beyond me.

      If you have code that works on Linux and don't mind letting me see it--either here or via email--that would certainly be the quickest way for me to understand what you are doing and perhaps be able to suggest how to make it work on Win32.

      From what I've understood of what you've said POE certainly sounds like an option, though as you imply it would probably mean re-writing everything you have and involve a pretty steep learning curve. I'm also not sure how portable POE code is to Win32.



        I can definitely change the buffering on the other application and that helps a bit.

        As for the non-blocking, I thought non-blocking allowed a read to complete even when there was no data. If there was no data, I thought either EOF or EAGAIN or something was set. The example above can be changed to this to get it working in linux and loop as fast as possible.

        use IO::Handle;
        use Time::HiRes qw(usleep);    # the module is Time::HiRes, not Timer::HiRes

        print "before the pipe\n";

        sub getpipe {
            my $hash2 = shift;
            print( "starting yes.pl " . time() . "\n" );
            $hash2->{pid} = open( $hash2->{stdout}, "/usr/bin/perl yes.pl |") or die;
            $hash2->{buffer} = "";
            $hash2->{stdout}->blocking(0);
            sleep( 1 );
            return $hash2->{pid};
        }

        my $hash = {};
        getpipe($hash);
        my $fh = $hash->{stdout};
        print( "pipe created\n" );
        my $count = 0;
        while( 1 ) {
            my $data = <$fh>;
            # undef means nothing was available; compare strings with ne, not !=
            if( defined $data && $data ne "" ) {
                print "got from pipe: $data";
            }
            print( "." );
            usleep( 10000 );
        }
        print( "done\n" );
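The EAGAIN behaviour described above can be made explicit with sysread, which reports an empty non-blocking pipe as an undefined return value with $! set. This is a sketch for *nix only, under assumptions: the child is a stand-in perl one-liner rather than yes.pl, and nothing here suggests the same calls behave this way on Win32.

```perl
use strict;
use warnings;
use IO::Handle;
use Errno qw(EAGAIN EWOULDBLOCK);
use Time::HiRes qw(usleep);

# Stand-in child: waits one second, then prints a single line.
open( my $fh, qq{"$^X" -MIO::Handle -le "STDOUT->autoflush(1); sleep 1; print 42" |} )
    or die "cannot start child: $!";
$fh->blocking(0);    # reads now return at once instead of waiting

my ( $empty_reads, $out ) = ( 0, '' );
while (1) {
    my $n = sysread( $fh, my $chunk, 4096 );
    if ( !defined $n ) {
        # No data yet: a non-blocking read fails with EAGAIN/EWOULDBLOCK.
        die "read failed: $!" unless $! == EAGAIN || $! == EWOULDBLOCK;
        $empty_reads++;
        usleep(10_000);    # other work could be done here instead
        next;
    }
    last if $n == 0;       # 0 bytes means the child closed its end (EOF)
    $out .= $chunk;
}
close $fh;

print "empty reads: $empty_reads, data: $out";
```

Distinguishing the three outcomes (error with EAGAIN, zero bytes at EOF, real data) avoids the ambiguity of the line-read operator, which returns undef both when nothing is available and at end of file.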

        I will try to post the script externally and provide a link so you understand what I'm trying to do a little better. To answer a couple of your questions about threads/processes, I'm really using both. I start the thread/process by calling fork. So in Linux, I understand that I'm using the true fork command. In Windows, I'm using the pseudo fork that is really a thread when you get to the bottom of things. Which one I use doesn't matter. I just need a separate branch of execution so the parent script can continue doing its normal thing while my branch of execution handles the external processes. That's probably an ignorant statement but that's why I'm posting :)

        I've found another perl script that sounds like what I'm doing: logtail. I believe the basic idea is the same. I need to review the code more to see what they have done.

      Time to go home for the weekend but here's a little bit more I found out.

      I tried to pump various amounts of data through the pipe and found that the pipe seems to complete its first read after 4096 bytes. I'm curious if there is a timeout as well that will let it complete the read.

      This makes me ask the question though, if I set the file handle to non blocking and Windows actually understands the non blocking, why can I only read data in after 4096 bytes have been written in? That tells me that Windows isn't doing non blocking IO but supposedly it does support it.

      I'll play more this weekend but if anyone has any pointers on what to look for, I'd appreciate it. I'm in the dark on this.
