PerlMonks  

When starting a process, at what point does "open()" return?

by tid (Beadle)
on Aug 18, 2003 at 00:56 UTC ( #284462=perlquestion )

tid has asked for the wisdom of the Perl Monks concerning the following question:

Greetings enlightened monks,

At what point does the open call return, when used to open a process using the pipe symbol ("|")? Obviously, open is asynchronous (as perl knows nothing about how long the process will hang around when it starts it), but what criteria does Perl have to satisfy before the open call returns?

Our problem comes from the fact that we have a somewhat flakey executable which we use for testing. Sometimes it hangs on start up and manages to hang the test scripts as well. As we're using about 30 of these processes for testing, we would like to determine which executable is causing the problem. The script that spawns the process could write the process IDs of processes it has started to a log file, but if the parent process is blocked on the open call, we're stuck.....

Many thanks,
Mike

Replies are listed 'Best First'.
Re: When starting a process, at what point does "open()" return?
by esh (Pilgrim) on Aug 18, 2003 at 02:17 UTC

    I'm not speaking from a personal knowledge of the Perl source code, but I have always been under the assumption that open calls with leading or trailing pipes do a fork/exec to start the subprocess.

    If the exec fails (e.g., can't find the command) then the open returns a failure (in the parent process). If the exec succeeds, then the open returns success (in the parent process) and your program can start reading or writing to the sub-process.

    You may get more clues about behavior you're seeing from the fork and exec documentation.

    perldoc -f fork
    perldoc -f exec

    (All of my posts should be implicitly prefaced with the fact that I only speak of Unix/Linux/Solaris/... In my world I tend to forget that there are environments like Windows/OS2/VMS...)

    -- Eric Hammond

Re: When starting a process, at what point does "open()" return?
by graff (Chancellor) on Aug 18, 2003 at 03:11 UTC
    Our problem comes from the fact that we have a somewhat flakey executable which we use for testing. Sometimes it hangs on start up and manages to hang the test scripts as well.

    And you're using a flakey executable because...? What makes you sure that you're getting stuck in the open statement? (Showing us some code around the open statement might help.)

    As we're using about 30 of these processes for testing, we would like to determine which executable is causing the problem.

    Do you mean you have about 30 different flakey executables, or that you are trying to run 30 instances of the same piece of crap? Are you talking about looping 30 times over an "open; do something; close" type of block, or are you trying to have 30 pipeline file handles open at once? (Don't some OS's have a problem with opening too many file handles?)

    If you haven't tried this yet, you could do something like:

    $|++;  # turn off output buffering on STDOUT
    my @pipeproc = ("flakey_writer |","| flakey_reader",...);
    for (0..29) {
        my $start = time();
        my $pid = open( FH, $pipeproc[$_] );
        my $now = time();
        print "opening $pipeproc[$_] returned $pid in ",$now-$start," sec\n";
        close FH;
    }
    (or, some variant that would be more relevant to your needs). If all the processes start and you see how long each one took, then it must be some other stage in your script where you hang (i.e. a read or write, or maybe something unrelated to the pipeline process). If the loop doesn't complete, you'll at least know where the problem starts. Then you'll want to see whether the problem always happens at the same iteration.

    Maybe you've been over this ground already, but your post didn't really give enough information.

      Thanks for your response. My apologies for the lack of completeness of my post. I was trying to pose a more general question, rather than focus on the specifics of my situation. In response to your queries:

      And you're using a flakey executable because...?

      From bitter experience, I would say that complex software projects tend to require moderately complex tools to test some aspects of them. The unfortunate part is that it is rare to find the same effort put into testing tools as it is into the original project.

      I am currently on a short term contract (6 weeks total), in which I have not had the opportunity to spend the time debugging their test executables. I have access to the scripts, and I do what I can.

      What makes you sure that you're getting stuck in the open statement? (Showing us some code around the open statement might help.)

      The following code is a cut from the script:

      open(STDOUT, "> $l_cqr_std_output_dir\\cqr_out_$l_counter_x.txt");
      $l_proc_id = open ($handleName, "| $command ");
      print STDOUT "CQRD Process ID: [".$l_proc_id."]\n";

      When it fails (which happens only occasionally), the output log file opened by the first line is created, but the print line is never executed, and the script hangs. If you find the rogue process created by the open command and terminate it using the Task Manager, the script recovers, and the output log file shows the process ID of the CQRD that you just terminated.

      or that you are trying to run 30 instances of the same piece of crap

      Bingo.

      Don't some OS's have a problem with opening too many file handles?

      It seems to work fine unless the errant process has a problem on startup. While I can't completely discount the possibility that Windows is jamming its head firmly into something unholy, I think it's unlikely this time.

      Cheers
      Mike

        Ah. This makes it very clear. Thanks (and ++!!)

        So, while I don't understand what could cause the behavior that you observe (let's hope someone else looks more closely at this thread and can explain it), there might be some way to write another perl script that automates the recovery procedure you've been doing manually -- but I'm only guessing, because my exposure to process-control details on Windows is nil.

        But consider... if you make the outputs to the log file "atomic" (add the extra overhead to open/write/close each time you append a message to the log, so some other process has a chance to read it while this test script is running), you might be able to run a separate script that loops on "check the log file; sleep". Make sure the problem script (which is trying to launch the flakey executable) logs when it's about to start a process (including the iteration number or some other distinct id), as well as when the open call returns.

        Now, the separate monitor script could figure out when the problem script is hanging; it knows the pids of the jobs that have been opened successfully (they're in the log file), and now it just needs to find a pid in the Windows process table that is associated with the flakey executable, but isn't in the log file. The monitor script kills that "outlier" pid (could even append to the log file to report that), and the problem/test script would move on.

        As I said, I'm only guessing that something like this would work -- I don't know how you would actually implement it. And of course it doesn't really answer your initial question or solve the real problem. But if it works as intended, you could at least move on towards whatever your "true" objective may be.
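For what it's worth, the logging half of the idea above could look roughly like the sketch below. The sub names (`log_event`, `pending_launches`) and the log line layout are made up for illustration; nothing here comes from the original test scripts. Each message is appended with its own open/print/close, so a separate monitor process can read the file while the test script is still running.

```perl
use strict;
use warnings;

# Hypothetical log format, one event per line: "<epoch> <event> <id> [pid]".
# Opening, appending, and closing on every call makes each line visible
# to another process as soon as it has been written.
sub log_event {
    my ($logfile, $event, $id, $pid) = @_;
    open(my $log, '>>', $logfile) or die "Cannot append to $logfile: $!";
    print {$log} join(' ', time(), $event, $id, defined $pid ? $pid : ()), "\n";
    close($log) or die "Cannot close $logfile: $!";
}

# Return the ids that logged "starting" but never logged "started",
# i.e. launches whose open() call has not returned yet.
sub pending_launches {
    my ($logfile) = @_;
    my %pending;
    open(my $log, '<', $logfile) or die "Cannot read $logfile: $!";
    while (my $line = <$log>) {
        my (undef, $event, $id) = split ' ', $line;
        next unless defined $id;
        if    ($event eq 'starting') { $pending{$id} = 1 }
        elsif ($event eq 'started')  { delete $pending{$id} }
    }
    close $log;
    return sort keys %pending;
}
```

The test script would call `log_event($file, 'starting', $n)` just before the piped open and `log_event($file, 'started', $n, $pid)` right after it returns. The monitor would loop on `pending_launches($file)` and `sleep`. Matching a stuck launch to an entry in the process table is left out, since that part depends on whatever process-listing tool the platform provides.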

Re: When starting a process, at what point does "open()" return?
by MarkM (Curate) on Aug 18, 2003 at 05:41 UTC

    On UNIX systems, open("...|") works similarly to the C function popen(). It returns after calling fork(). There is no guarantee that the process can actually be loaded.

    On WIN32 systems, open("...|") creates a new process using CreateProcess(). CreateProcess() will fail immediately if the process cannot be initiated, therefore open() will fail as well. Portable code should not rely on this behaviour.

    UPDATE: Perl does have some magic implemented for most UNIX platforms that attempts to pass the error code from child process to parent before open() returns. This should catch most cases involving permissions or the executable not existing.

      On UNIX systems, open("...|") ... returns after calling fork().

      If this is true, how does open return an error if the program is not found, not executable, or has a bad shebang spec?

      perl -e 'open F, "nosuchprogram |" or die "open: $!\n"; print "done\n";'
      outputs:
      open: No such file or directory

      I had assumed that this was because it was only returning after the exec() but I'll admit this seems to take a bit more inter-process communication.

      -- Eric Hammond

        I did some extra checking for you. I hate unexplained behaviour as well... :-)

        With open("program |") Perl opens two pipes on systems that provide fork() (UNIX). The first pipe is for capturing STDOUT, and remains open after execve(). The second pipe is for passing an errno value from the new child to the parent, and is automatically closed as part of execve(). If execve() fails, the value of errno is written in native binary format over the second pipe. The parent reads from this second pipe and waits until it closes: if it closes with nothing written, open() succeeds; if an errno value arrives, open() fails with that error.

        I wasn't able to quickly determine when this code was introduced into Perl. It does make things convenient. :-)

        Cheers,
        mark
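The two-pipe mechanism described above can be imitated in plain Perl. The sketch below is Unix-only and is not Perl's internal implementation (that lives in C). `spawn_reader` is a made-up name, and the exit code 127 is an arbitrary choice. Perl already marks descriptors above `$^F` close-on-exec, so the explicit `fcntl` call is only there to make the trick visible.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use POSIX ();

# Hypothetical helper: run @cmd with its STDOUT attached to a pipe.
# Returns ($pid, $read_handle) on success, or an empty list with $! set
# if the child could not exec the command.
sub spawn_reader {
    my @cmd = @_;

    pipe(my $out_r, my $out_w) or die "pipe: $!";   # carries the child's STDOUT
    pipe(my $err_r, my $err_w) or die "pipe: $!";   # carries errno if exec fails

    # The write end of the errno pipe must close by itself on a successful exec.
    my $flags = fcntl($err_w, F_GETFD, 0) or die "fcntl: $!";
    fcntl($err_w, F_SETFD, $flags | FD_CLOEXEC) or die "fcntl: $!";

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {                                # child
        close $out_r;
        close $err_r;
        if (open(STDOUT, '>&', $out_w)) {
            no warnings 'exec';
            exec { $cmd[0] } @cmd;                  # only returns on failure
        }
        # Still here, so something failed: report errno in native binary form.
        syswrite($err_w, pack('i', $! + 0));
        POSIX::_exit(127);
    }

    # parent
    close $out_w;
    close $err_w;
    # Blocks until the child either execs (EOF, 0 bytes) or writes an errno.
    my $n = sysread($err_r, my $buf, length pack('i', 0));
    close $err_r;
    if ($n) {
        waitpid($pid, 0);                           # reap the failed child
        $! = unpack('i', $buf);
        return;
    }
    return ($pid, $out_r);
}
```

Note that the parent only waits for the exec itself. Once the new program has been loaded, the errno pipe closes and the call returns, no matter how long the child then runs.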

Re: When starting a process, at what point does "open()" return?
by jasonk (Parson) on Aug 18, 2003 at 02:09 UTC

    The open call will return when the process exits; if the process hangs and does not exit, then the open will not return unless you arrange for it to be interrupted early (such as by setting an alarm()).

    Update: Doh! Of course open doesn't wait for the process to exit, how could you read from the pipe if it did? Apparently I need more sleep...


    We're not surrounded, we're in a target-rich environment!

      The open call will return when the process exits
      This is not true on operating systems I'm familiar with and I doubt it's true on Windows, either. The open call returns when the process starts, not when it exits. You can then have the child process talking with the parent process while both are running at the same time.

      Here's proof:

      perl -e 'open F, "sleep 60 |" or die "$!\n"; print "done\n"'

      I know that "sleep 60" takes about 60 seconds to run, but my top-level perl program prints "done" right away and exits. It does not wait for the "sleep 60" to exit.

      The related system call, by contrast, does wait for the child: it returns only when the child process exits.

      -- Eric Hammond

      That's not how it works. If it worked that way, you could never read from a subprocess or write to it.
Re: When starting a process, at what point does "open()" return?
by tid (Beadle) on Aug 21, 2003 at 00:00 UTC

    Thanks all for your responses. The result in my specific situation is that I can't fix it unless I have access to the test tool code (I'm on Windows, so I can't easily time out the calls - no access to signals etc etc).

    However, I have learned more about perl from the exchange, which is always valuable.

    Cheers!
    Mike
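For readers on Unix, where signals are available, the kind of timeout mentioned in this thread can be sketched with alarm() wrapped around a blocking read from the pipe. The helper name `read_line_with_timeout` is made up for illustration, and this is not expected to help on Windows, for the reason given above.

```perl
use strict;
use warnings;

# Hypothetical helper: read one line from $fh, giving up after $timeout seconds.
# Returns the line, or undef if the read timed out (or reached end of file).
sub read_line_with_timeout {
    my ($fh, $timeout) = @_;
    my $line;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };  # newline keeps the message exact
        alarm $timeout;
        $line = <$fh>;
        alarm 0;                                     # cancel the pending alarm
        1;
    };
    unless ($ok) {
        alarm 0;
        die $@ unless $@ eq "timeout\n";             # re-throw unrelated errors
        return;
    }
    return $line;
}
```

A caller that gets undef back would typically kill the child's pid (the value returned by the piped open) before calling close on the handle, because close on a piped handle waits for the child to exit.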

Node Type: perlquestion [id://284462]
Approved by blokhead
Front-paged by broquaint