PerlMonks  

what are the common causes of "system" command failure on Vista

by Workplane (Novice)
on Apr 03, 2013 at 10:22 UTC ( #1026815=perlquestion )
Workplane has asked for the wisdom of the Perl Monks concerning the following question:

Hello all. Thanks in advance for your help.

I have a Perl programme which makes "system" calls.

The problem is that the "system" call sometimes fails with a return value of -1.

What are the reasons that a "system" command might fail?

More detail:

I am running on Vista64 SP2.

Perl is revision 5 version 12 subversion 1

I am logging in remotely to the Vista machine, and I believe no other significant processes are running.

I'm only running one instance of my programme on each machine.

Task Manager says all the CPUs are at 0% before I start.

Apart from the few files which are opened and closed before the "system" everything is local.

The machine has 12 GB of physical memory and more processors than you can shake a stick at (OK, 6 dual core).

It is a Dell T5500.

The programme opens and closes a few files and then runs the "system" command.

The external process started by "system" writes a lock file to C:\temp\ and deletes it at the end of its run.

The Perl programme waits for the lock file to exist, then to not exist (i.e. the external process has finished), then starts another instance of the external programme with "system" again.

The real external programme is a big CAD system running a macro, but I have replaced it with a local batch file which waits a random number of seconds (between 1 and 20) and then returns.

This dummy external programme does the same create and delete of a lock file.

I capture the output of running the external batch file to local files.

In general the process works!

But at some point I always get the system command returning -1 instead of the PID (of something) which it does normally.

If I get a -1, I sleep for 5 seconds and try the system again.

If I get 5 of these failures in a row, I give up and die.

I have never seen just one "system" command return -1. If I get a -1, then I get 5 in a row (and die).

So far this has happened after as few as 3 "system" commands.

The most I have seen is 63 successful system commands before one (and then 5) fails.

I'm doing this on 10 machines. All of them are displaying the same problem. They should all be the same in terms of hardware and software.

The .pl is held centrally (as opposed to locally).

I can't compile it because we don't have a compiler loaded and there is 0% chance of getting one or any module that I don't already have.

I have to use "system" rather than Win32::Process::Create because I couldn't get that to work with a bat file (only an exe).

I don't have Win32::Process::Info.

---> What are the common causes of "system" failing?

---> What can I do in my programme to capture things which will tell me what is causing it?

---> Is it safe to ignore these "system" failures and just keep sleeping and trying again?

The code around the system command is:

    my $pid = -999;
    while ( $pid < 0 ) {
        # childLocalF is something like C:\temp\81.bat
        my $runMeLine = $childLocalF . ' ' . $runMeArgC;
        $runMeLine = $runMeLine . ' ' . id();
        $runMeLine = $runMeLine . ' ' . '>C:\Temp\81.out 2>C:\Temp\81.err';
        print "runMeLine = $runMeLine\n";
        $pid = system( 1, $runMeLine );
        print "XXX child PID = $pid\n";
        if ( $pid < 0 ) {
            # something went wrong with the system command.
            print "ERROR: System call failed\n";
            print "ERROR: return value from system was $pid\n";
            print "ERROR: runMeLine = $runMeLine\n";
            $countSystemCallFailed++;
            if ( $countSystemCallFailed > $cfg->{'maxSystemCallFailed'} ) {
                die "System call failed $countSystemCallFailed times\n";
            }
            print "Sleeping 5 and will try again\n";
            sleep( 5 );
        }
    }

Re: what are the common causes of "system" command failure on Vista
by Anonymous Monk on Apr 03, 2013 at 10:26 UTC

    What are the reasons that a "system" command might fail?

    because the program that is executed, next question

Re: what are the common causes of "system" command failure on Vista
by BrowserUk (Pope) on Apr 03, 2013 at 10:31 UTC

    I'll bet 1 to 1p that you are able to run 64 external processes before getting a failure. Can you confirm this?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: what are the common causes of "system" command failure on Vista
by hdb (Prior) on Apr 03, 2013 at 11:23 UTC

    Please excuse my ignorance, but what does the first argument to system do? Should this not be the command? See system.

    $pid = system( 1, $runMeLine );
Re: what are the common causes of "system" command failure on Vista
by BrowserUk (Pope) on Apr 03, 2013 at 11:42 UTC

    (Assuming the answer to my question above is yes.)

    The problem is that you are creating asynchronous processes, but never gathering their return codes via wait or waitpid.

    In order to provide a way to retrieve the exit codes from asynchronous processes, the Windows implementation has an internal limit of 64 un-waited asynchronous processes. Call wait or waitpid once each process has finished, and you will "fix" the problem.
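    A minimal sketch of the wait/waitpid fix described above, not the original poster's code. The helper names spawn_async and run_and_reap are hypothetical, the child is Perl itself rather than the batch file, and the fork/exec branch is an assumption added only so the sketch also runs on non-Windows systems. It starts 70 children one after another, more than the 64 mentioned above, and reaps each one before starting the next.

```perl
use strict;
use warnings;

# Hypothetical helper: start a command without waiting for it.
# On Windows, system(1, ...) spawns the process and returns its PID.
# The fork/exec branch is an assumption so the sketch runs elsewhere too.
sub spawn_async {
    my @cmd = @_;
    if ( $^O eq 'MSWin32' ) {
        return system( 1, @cmd );
    }
    my $pid = fork();
    return -1 unless defined $pid;    # fork failed
    if ( $pid == 0 ) {
        exec(@cmd) or exit 127;       # child: replace ourselves, or bail out
    }
    return $pid;
}

# Hypothetical helper: spawn a child, then reap it with waitpid so that
# it no longer counts as an un-waited asynchronous process.
sub run_and_reap {
    my @cmd = @_;
    my $pid = spawn_async(@cmd);
    die "spawn failed: $!\n" if $pid < 0;

    # (The lock-file polling from the question would go here.)

    waitpid( $pid, 0 );               # blocks until the child has exited
    return $? >> 8;                   # the child's exit code
}

# $^X is the running perl binary; the child simply exits with status 0.
my @codes = map { run_and_reap( $^X, '-e', 'exit(0)' ) } 1 .. 70;
print "all ", scalar(@codes), " children reaped\n";
```

    Because waitpid blocks until the child exits, the lock-file polling becomes optional in this arrangement.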

    That said, if your code is starting one process at a time, and then hanging around polling a 'lock' file before continuing; why on earth are you using asynchronous processes and lock files?

    If you just called the synchronous variant of system, you wouldn't need the lock files, or to call wait/waitpid.
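    A hedged sketch of the synchronous variant, which also records the diagnostics Perl makes available when a command cannot be started. The helper name run_sync and the command name no-such-program-81 are hypothetical. The checks on -1, $? & 127 and $? >> 8 follow the pattern in the perlfunc entry for system. How a missing program is reported can differ by platform, so the second call is illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command synchronously and describe the outcome.
sub run_sync {
    my @cmd = @_;
    no warnings 'exec';    # silence Perl's own "Can't exec" warning
    my $rc = system(@cmd); # returns only after the child has finished
    if ( $rc == -1 ) {
        # The command could not be started at all. $! holds the OS error;
        # $^E may carry a more specific Windows error.
        return { started => 0, error => "$!", os_error => "$^E" };
    }
    return {
        started => 1,
        signal  => $? & 127,    # non-zero if the child died from a signal
        exit    => $? >> 8,     # the child's exit code
    };
}

# $^X is the running perl binary; the child exits with status 3.
my $ok = run_sync( $^X, '-e', 'exit(3)' );
print "exit code: $ok->{exit}\n";

# A command that should not exist, to exercise the failure path.
my $bad = run_sync('no-such-program-81');
print "failed to start: $bad->{error}\n" unless $bad->{started};
```

    Logging $! and $^E next to the command line at the moment of failure gives more to go on than the bare -1.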



      Thanks BrowserUK.

      For some reason I thought it was the synchronous system calls that I needed to wait for, but obviously not!

      I added the relevant "wait" (and fixed a few other things) and it works now. Thanks again.
