Win32 limit to number of calls to system()?

by Limbic~Region (Chancellor)
on Sep 12, 2011 at 14:56 UTC (#925488)
Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

All,
I am working on a project on Windows XP 32-bit using ActiveState Perl 5.12. The project involves converting PDF files to text using an external application that has no command-line variant. For this reason, I am using Win32::GuiTest++.

The code screams along until it has converted 64 PDFs, then fails. It fails rather silently (it just doesn't open the application). It took me quite a while to discover that it was failing after the same number of PDFs each time, but once I did, I began to wonder: why the limit? I am closing the third-party conversion application using alt+f4, if that matters.

I intend to work around the limit by not closing the application. I didn't do this originally because the application provides no menu or shortcut keys and has to be automated by moving the mouse (absolute pixel positions). I was just wondering: is this limit documented somewhere? Also, is there a way to work around it?

Cheers - L~R

Replies are listed 'Best First'.
Re: Win32 limit to number of calls to system()?
by BrowserUk (Pope) on Sep 12, 2011 at 15:49 UTC

    The underlying cause is Perl's internal use of the Win32 API call WaitForMultipleObjects(), which can wait on at most 64 objects (MAXIMUM_WAIT_OBJECTS) at any given time.

    From what I remember, the limitation is 64 concurrent forks. Once one completes, you can initiate another.

    I forget the details, but if you show the basic layout of your code, I'll probably remember what you need to do to alleviate the limit.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      BrowserUk,
      The code looks something like this:
      for my $pdf (glob('*.pdf')) {
          my $txt = convert_pdf_to_text($pdf);
          next if ! interesting($txt);
          # ...
      }

      sub convert_pdf_to_text {
          my ($file) = @_;
          my $abs_file = rel2abs(catfile(curdir(), $file));

          ## Start Simpo PDF To Text
          system(1, 'C:\Program Files\Simpo PDF to Text\PDF2Text.exe');

          # Locate the window
          my $wid = WaitWindow('Simpo PDF to Text', 5);
          die "Couldn't find 'Simpo PDF to Text' window" if ! defined $wid;

          ## Make sure it is on top
          SetForegroundWindow($wid);

          # Convert PDFs
          add_pdf($abs_file);
          convert();

          # Close the application
          SendKeys('%{F4}');

          my $txt_file = construct_txt_file($file);
          return '' if ! -r $txt_file;
          my $data = read_file($txt_file);
          unlink $txt_file or die $!;
          return $data;
      }
      As you can see, I use system(1, $app) to start the application and alt+f4 to close the application (before starting the new one). I have already worked around the problem by leaving the app open. Just not sure why this doesn't work as I would expect.

      Cheers - L~R

        You don't reap the child processes. Use waitpid($pid, 0), where $pid is the value returned by system(1, ...).

        Either of these will work:

        1. Use a synchronous system and the start command. This will synchronously run a copy of cmd.exe to run the start command, and it starts the program asynchronously.

          As cmd.exe returns immediately and the synchronous system gathers its exit code, it avoids the accumulation of zombies and the WaitForMultipleObjects() problem:

          for ( 1 .. 100 ) {
              print "spawning job $_";
              system 'start \\Windows\\system32\\notepad.exe';
              my $wid = WaitWindow( 'Notepad', 1 );
              SetForegroundWindow( $wid );
              SendKeys( '%{F4}' );
          }

        2. Use the asynchronous system and obtain the pid of the started instance from the returned value.

          Use waitpid to gather the exit code thus avoiding the accumulation of the zombies:

          for ( 1 .. 100 ) {
              print "spawning job $_";
              my $pid = system 1, '/Windows/system32/notepad.exe';
              my $wid = WaitWindow( 'Notepad', 1 );
              SetForegroundWindow( $wid );
              SendKeys( '%{F4}' );
              waitpid $pid, 0;    # flags 0: block until this instance has exited, then reap it
          }
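
Generalised beyond Notepad, the second approach amounts to keeping the pid from every asynchronous launch and waiting on it before starting the next one. The following is only a minimal sketch under stated assumptions: `spawn_async` and `run_and_reap` are hypothetical helper names (not part of Win32::GuiTest), the GUI automation is stood in for by a callback, and the fork/exec branch exists purely so the sketch can be tried on a non-Windows system.

```perl
use strict;
use warnings;

# Hypothetical helper: start a command without waiting for it and
# return its pid. system(1, ...) is the Win32 asynchronous form used
# in this thread; the fork/exec branch is an assumption added so the
# sketch also runs elsewhere.
sub spawn_async {
    my @cmd = @_;
    if ($^O eq 'MSWin32') {
        my $pid = system(1, @cmd);
        # Defensive check; the exact failure value is an assumption.
        die "could not start $cmd[0]" if !$pid || $pid < 0;
        return $pid;
    }
    my $pid = fork();
    die "fork failed: $!" if !defined $pid;
    if ($pid == 0) {
        exec { $cmd[0] } @cmd;
        exit 127;    # only reached if exec itself failed
    }
    return $pid;
}

# Launch the command, let the caller drive it, then block in waitpid
# so the child is reaped before the next one is started.
sub run_and_reap {
    my ($drive, @cmd) = @_;
    my $pid = spawn_async(@cmd);
    $drive->($pid);      # e.g. WaitWindow(...) and SendKeys('%{F4}')
    waitpid($pid, 0);    # flags 0: wait until this child has exited
    return $? >> 8;      # exit code of the reaped child
}

# More than 64 launches in a row; each child is reaped before the next.
my @codes = map { run_and_reap(sub { }, $^X, '-e', 'exit 0') } 1 .. 70;
print scalar(@codes), " children reaped\n";
```

Because every child is collected before the next launch, the number of unreaped children never grows, which is the condition the 64-object limit depends on.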

        Finally, 'Simpo PDF to Text' has a 'batch mode' that would be possible, if awkward, to drive programmatically, and it might be substantially more efficient.

        I guess you've already looked at command line driven alternatives?


Re: Win32 limit to number of calls to system()?
by chrestomanci (Priest) on Sep 12, 2011 at 15:48 UTC

    It appears there is a limit on the number of children perl can have under Windows:

    http://code.activestate.com/lists/perl-win32-users/12064

    However, that thread says the limit applies only to the concurrent threads that perl can wait on. Perhaps your code is creating a child process to convert each PDF and then, instead of waiting on each process, is leaving zombies.

Re: Win32 limit to number of calls to system()?
by Anonymous Monk on Sep 12, 2011 at 15:26 UTC
      Anonymous Monk,
      That certainly seems plausible but it would mean shame on Windows for not actually releasing resources when the program is shut down (unless using alt-f4 is a bad way to close the program). Perhaps before I write all the code to keep the app open, I will try closing it through the X button.

      Cheers - L~R

Re: Win32 limit to number of calls to system()?
by cdarke (Prior) on Sep 12, 2011 at 16:15 UTC
    Since it appears to be a per-process limit, the way I have got over the limit in the past is to split the workload over a number of other processes. For example, with 128 jobs to run, spawn 4 programs which just run 32 jobs each.

    Not elegant, I know.
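
The partitioning step of that workaround can be sketched as follows. This is an illustration only: `split_jobs` is a hypothetical helper name, the file names are made up, and handing each batch to a separate perl process (each of which then stays below 64 outstanding children) is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: split a job list into consecutive batches of
# at most $size jobs each, one batch per worker process.
sub split_jobs {
    my ($size, @jobs) = @_;
    die "batch size must be positive" if $size < 1;
    my @batches;
    # splice removes up to $size jobs from the front on each pass.
    push @batches, [ splice @jobs, 0, $size ] while @jobs;
    return @batches;
}

# 128 jobs in batches of 32, as in the example above.
my @batches = split_jobs(32, map { "job$_.pdf" } 1 .. 128);
printf "%d batches of %d jobs\n", scalar @batches, scalar @{ $batches[0] };
```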
Re: Win32 limit to number of calls to system()?
by sundialsvc4 (Abbot) on Sep 12, 2011 at 16:42 UTC

    If you are using an external program to convert the PDFs, I suggest that you simply loop, spawning one instance of that program and waiting for it to exit, making very sure that the entire cleanup is done before looping around again. Most of the time, there is no advantage in spawning multiple copies of a highly I/O-bound process.

    In any case, the number of workers should not be tied to the size of the workload. A certain number of “cooks in the kitchen” should wait for work to arrive in some queue, and should remove just one order at a time and cook it. They repeat this until the manager comes along and turns off the lights. You should be able to easily tune such a system until you “find the sweet-spot” for your chosen hardware. You want to set the knob to the level which achieves the maximum sustainable throughput, and ignore the number of entries that might be in the queue from moment to moment.

    Your metric, then, would not be queue size or completions per second, but the mean and standard deviation of the elapsed time from the moment a new request enters the queue to the point when the final output is returned.
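
As a rough illustration of that metric, the sketch below computes the mean and sample standard deviation of a list of per-request elapsed times. `elapsed_stats` is a hypothetical helper name and the sample timings are invented for the example; how the times are collected is not shown.

```perl
use strict;
use warnings;

# Hypothetical helper: mean and sample standard deviation of a list
# of per-request elapsed times (seconds from enqueue to final output).
sub elapsed_stats {
    my @t = @_;
    die "need at least two samples" if @t < 2;
    my $mean = 0;
    $mean += $_ / @t for @t;
    my $var = 0;
    $var += ($_ - $mean) ** 2 / (@t - 1) for @t;
    return ($mean, sqrt $var);
}

# Invented timings, for illustration only.
my ($mean, $sd) = elapsed_stats(2, 4, 4, 4, 5, 5, 7, 9);
printf "mean %.2fs, std dev %.2fs\n", $mean, $sd;
```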

      More utter crap.



        Yes, because clearly an unbounded number of processes is the fastest way to do things. :P

        Also, short negative comments that explain nothing are the best way to get your point across.

Node Type: perlquestion [id://925488]
Approved by Corion
Front-paged by chrestomanci