http://www.perlmonks.org?node_id=903487

steve has asked for the wisdom of the Perl Monks concerning the following question:

I would like to write a wrapper script for a *nix-based command line utility. While running, the utility prints information almost constantly and regularly accepts input about that information. What I would like to do is write a script that meets the following criteria:

  1. Opens some sort of connection with the target program and prints to STDOUT the same information that it receives.
  2. Accepts input from STDIN and prints the command directly to the same bidirectional connection.
  3. Can use the information from the program output to determine automated input based on a predefined set of rules (like Expect).

The first two items here amount to straight passthrough as far as I am concerned. This is essentially the same user experience as running the program by itself outside of perl. The last item is essentially an expect script.

A good test case for this (although practically less useful) would be semi-automated control of some process like top.

Pipe Opens is documented as insufficient to meet this set of criteria. Bidirectional Communication with Another Process has some great information, but cautions that working with buffering and pseudo-ttys will ruin my day. Expect is suggested as the best solution for compatible environments.

With that wonderful documentation it seems that the solution lies with Expect as I had presupposed, but I am having some difficulty formulating a mental map of how this works logically. I have used Expect in the past for many things, but only to automate entire workflows, and not in an environment where user input may happen at any time, and the expect-controlled target never ceases to return results (does not wait for input).

$process->expect(undef); # Forever until EOF

Will always emulate the target until EOF, but does not afford any interactivity.

I fully Expect that the interconnect or interact method will be the solution to address these criteria simultaneously, but the documentation for such is rather sparse. Additionally I am finding the search for information on the Expect module to be very difficult due to the case-insensitive nature of most searching algorithms and the ubiquitous usage of the word expect in entirely unrelated topics.

While I am working on this I would like to ask if anyone else has dealt with this, and if they have any advice, other reference points, or sample code I could review.

Replies are listed 'Best First'.
Re: Bidirectional IPC with Expect and Passthrough
by BrowserUk (Patriarch) on May 07, 2011 at 09:22 UTC

    Nobody has mentioned IPC::Open2 or IPC::Open3 yet.

    The tough part about doing this is dealing with buffering.

    Assume the conversation goes: Supply input -> get output -> supply input dependent on last output -> get output ...

    If the first set of output is insufficient in quantity to cause all the buffers (the program's output buffer and the pipe buffer) to be flushed, you end up in a position where the driven program is waiting for new input, and the driving script is still waiting for the last set of output.

    The former won't produce any more output until it gets more input; and the latter won't supply any more input until it has seen the last output. Result: deadlock.
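The lock-step conversation above only works when both sides flush after every line. A minimal sketch of that case, assuming a cooperative child: the child here is a hypothetical stand-in (a Perl one-liner that upper-cases each line and sets `$| = 1`), not the real utility. A program that block-buffers its output when it is not attached to a terminal would still stall in exactly the way described.

```perl
use strict;
use warnings;
use IPC::Open2;
use IO::Handle;

# Stand-in child (an assumption for illustration): upper-cases each line
# and flushes after every print because it sets $| = 1.
my @child = ( $^X, '-e', '$| = 1; while (<STDIN>) { print uc }' );

my $pid = open2( my $fromKid, my $toKid, @child );

# open2 is documented to enable autoflush on the write handle already;
# it is repeated here only to make the requirement visible.
$toKid->autoflush(1);

my @replies;
for my $line ( "first\n", "second\n" ) {
    print {$toKid} $line;        # supply input ...
    my $reply = <$fromKid>;      # ... then block until the matching output
    push @replies, $reply;
    print "child said: $reply";
}

close $toKid;       # EOF on its STDIN tells the child to exit
waitpid $pid, 0;    # reap it so no zombie is left behind
```

If the `$| = 1` is removed from the child, the first `<$fromKid>` read never returns: the child's reply sits in its output buffer while both processes wait for each other.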


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      This might not be relevant to the OP, but I'm not so sure about Open3, on win32 it can hang and create zombies.

      See Pod::Wrap t/deparse_cmp.t, this test hangs/fails with "An operation was attempted on something that is not a socket." and creates zombies

        The problem with that test is that it tries to use select on pipe handles, which doesn't work on Windows.



      IPC::Open2 may also work. IPC::Open3 is not necessary for my application as described in Bidirectional Communication with Another Process:

      "There's also an open3() for tridirectional I/O so you can also catch your child's STDERR, but doing so would then require an awkward select() loop and wouldn't allow you to use normal Perl input operations."

      The difficulty here is that the conversation follows a slightly different path - user input and program output may occur simultaneously.

      Perhaps a more specific example would be useful. Let's say that I want to make a wrapper for top that shows me top's output (as regularly as possible) and also parses that output: if a particular listing is present, the wrapper automatically sends a keystroke/string to top to switch the view so that the process is more easily noticed.

      So with output such as:

      PID   COMMAND       %CPU  TIME
      6555  top           1.5   00:01.42
      6537  bash          0.0   00:00.04

      no automated action would need to be performed, whereas this output:

      PID   COMMAND       %CPU  TIME
      6555  top           1.5   00:02.42
      6591  nasty_script  9.9   00:01.01
      6537  bash          0.0   00:00.04

      would automatically send a string to top to update the view to be more like:

      PID   COMMAND       %CPU  TIME
      6591  nasty_script  9.7   00:01.11
      6555  top           1.1   00:02.52
      6537  bash          0.0   00:00.04

      The crux of this issue is that user input and program output may occur at any time. I am not aware of a method in perl that allows for receiving data from two sources simultaneously.

      With that being stated, I do not need instant reads/writes from either the program output or the user input. What I do need is a way to not lose any potential data in between. If it takes a few milliseconds to read user input, and the program has output during that time I will need to read that after the user input. I would also like to avoid losing keystrokes for user input while the script is reading/checking program output.

      Perhaps this would also require a fork so one child could always read/write from/to the program, and the other would always read/write from/to the user?

        Perhaps this would also require a fork so one child could always read/write from/to the program, and the other would always read/write from/to the user?

        T'is easier using threads.

        I don't have top or anything I can easily substitute for it in an example, but what you are asking for should be relatively trivial using threads.

        You have two distinct, and extremely simple loops you need to perform:

        1. Read whatever comes down the pipe from the child process and print it to the terminal.
          print while <$fromKid>;
        2. Read anything that the user types and forward it to the child process via the pipe:
          print $toKid while <STDIN>;

        But, as you say, the problem is you need to do both at the same time.

        There are two traditional ways of approaching this problem.

        • A select loop.

          Here you loop around in a busy loop asking "is there anything to do?" and, when something arrives, "if it is this, do that; if it is that, do something else", and so on.

          But select loop processing doesn't really play well with buffered IO, so you have to use sysread to avoid blocking. But that means that you are now responsible for doing your own line buffering. And if you have multiple sources, then you have to juggle multiple line buffers.

          And then there is the problem of dealing with the situation where some of the processing in response to some input takes longer than you can afford to ignore new input, so you have to break that processing up into chunks, and that means storing global state to decide what you've already done and what else needs to be done.

          All very messy and nasty and easy to get wrong.

        • Fork a different process to deal with each source of input and pipe the results back to a parent.

          Except that all you've done is move the goal posts. Ultimately, if the parent is going to control the child, it still needs to monitor and respond to multiple inputs and you are back to needing a select loop with all its inherent problems.
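For comparison, the select-loop approach from the first bullet can be sketched with core modules only. This is an illustration under stated assumptions, not a finished wrapper: the child is a hypothetical echo program, the "user" is a preloaded pipe standing in for `\*STDIN`, and the single rule ("send quit once the child has echoed world") is invented for the demo. It shows the per-source line buffers that sysread forces on you.

```perl
use strict;
use warnings;
use IO::Select;
use IPC::Open2;

# Append whatever is available on $fh to $$buf and return the complete
# lines as an array ref; a trailing partial line stays in the buffer.
# Returns undef at EOF or on a read error.
sub read_lines {
    my ( $fh, $buf ) = @_;
    my $n = sysread $fh, $$buf, 4096, length $$buf;
    return undef unless $n;
    my @lines;
    push @lines, $1 while $$buf =~ s/\A([^\n]*\n)//;
    return \@lines;
}

# Hypothetical stand-in child: echoes each line with a prefix, exits on "quit".
my @child = ( $^X, '-e',
    '$| = 1; while (<STDIN>) { last if /^quit/; print "kid: $_" }' );
my $pid = open2( my $fromKid, my $toKid, @child );

# Stand-in for the user: a pipe preloaded with two lines.
# A real wrapper would watch \*STDIN instead.
pipe( my $user, my $typist ) or die "pipe: $!";
print {$typist} "hello\n", "world\n";
close $typist;

my $sel = IO::Select->new( $user, $fromKid );
my ( $userBuf, $kidBuf ) = ( '', '' );    # one line buffer per source
my $kidOpen = 1;
my @seen;

while ($kidOpen) {
    for my $fh ( $sel->can_read ) {
        if ( $fh == $user ) {
            my $lines = read_lines( $fh, \$userBuf );
            if ($lines) { syswrite $toKid, $_ for @$lines }
            # User EOF. A real wrapper would probably close $toKid here;
            # the demo keeps it open so the rule below can still reply.
            else { $sel->remove($fh) }
        }
        else {
            my $lines = read_lines( $fh, \$kidBuf );
            if ( !$lines ) { $sel->remove($fh); $kidOpen = 0; next }
            for my $line (@$lines) {
                print $line;          # passthrough to the terminal
                push @seen, $line;
                # Demo rule: once the child has echoed "world", tell it to quit.
                syswrite $toKid, "quit\n" if $line =~ /world/;
            }
        }
    }
}

close $toKid;
waitpid $pid, 0;
```

Even in this small form the bookkeeping is visible: two buffers, EOF handling per handle, and a rule that has to run inside the read loop.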

        Now consider a threaded solution:

        #! perl -sw
        use strict;
        use threads;
        use IPC::Open2;

        my( $toKid, $fromKid );
        my $pid = open2( $fromKid, $toKid, 'not-top' );

        async{
            print while <$fromKid>;
        }->detach;

        print $toKid while <STDIN>;

        kill 9, $pid

        And that is pretty much it. Some extra stuff to handle closing things down and possible errors. But essentially, exactly what you'd like to use. Two loops that run concurrently. Simplicity incarnate.

        Note: That almost certainly won't work with top as is. The reason is that top uses non-blocking, delimiter-less input for its commands, and that doesn't work via a pipe. Basically, pipes buffer their input and only pass it on to the other end once they've accumulated a buffer full. With a (typically) 4096 character buffer, you'd have to send 4096 single character commands before top would see the first one, and then it would see all 4096 in quick succession and the result would be very confused.

        So, if the program you actually want to control is top, or any program that does delimiter-less input for commands, then you are probably out of luck--using open2 or expect or anything else. But, if you want to control a program that takes normal line-oriented input for commands, then threading is far simpler than other approaches.


Re: Bidirectional IPC with Expect and Passthrough
by John M. Dlugosz (Monsignor) on May 07, 2011 at 03:55 UTC
    How is using pipes connected to standard input and output of the target process not good enough to solve this problem?

      Thank you for the quick response.

      From Pipe Opens:

      "If you would like to open a bidirectional pipe, the IPC::Open2 library will handle this for you."

      From Bidirectional Communication with Another Process:

      "While this works reasonably well for unidirectional communication, what about bidirectional communication? The obvious thing you'd like to do doesn't actually work:"
      open(PROG_FOR_READING_AND_WRITING, "| some program |")
      "and if you forget to use the use warnings pragma or the -w flag, then you'll miss out entirely on the diagnostic message:"
      Can't do bidirectional pipe at -e line 1.

      Am I misinterpreting what that says? I may need to take a step back and revisit this in a bit to get a clearer perspective.

        You need to start "some program" and make two pipes: one feeding to its standard input, and one reading from its standard output. I think there is even a pipe3 version, but don't recall if that's the right thing.

        Without the fancy syntax, you can create a named pipe, and then as a separate step use the redirection syntax ">" to send the output of the program to the pipe you previously opened. You have the other end and can read from that. You can use the pipe syntax for the other direction (only) or use "<" and ">" in the `system` construct to do them both with named pipes.
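The named-pipe variant for the output direction might look roughly like this. It is a sketch under assumptions: the "program" is a shell `printf` standing in for the real utility, and the FIFO name `from_prog` and temporary directory are made up for the example.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

# Create the named pipe in a scratch directory that is removed on exit.
my $dir  = tempdir( CLEANUP => 1 );
my $fifo = "$dir/from_prog";
mkfifo( $fifo, 0600 ) or die "mkfifo $fifo: $!";

# Start the stand-in program in the background with ">" sending its
# output into the FIFO. The shell blocks on the redirect until a reader
# opens the other end.
system("printf 'one\\ntwo\\n' > '$fifo' &") == 0
    or die "could not start program";

# Open our end for reading; this pairs up with the writer above.
open my $fromProg, '<', $fifo or die "open $fifo: $!";
my @lines = <$fromProg>;    # read until the writer closes its end
close $fromProg;

print "got: $_" for @lines;
```

The input direction can be handled the same way with a second FIFO and "<", but the buffering caveats discussed elsewhere in the thread apply unchanged.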