PerlMonks  

Threads and file descriptors

by markseger (Beadle)
on Dec 20, 2010 at 16:20 UTC
markseger has asked for the wisdom of the Perl Monks concerning the following question:

I think this is a threads question but will defer to your wisdom...

I have a script that needs to execute commands on multiple remote systems and process their output. I reduced it to a very simple script that I hope makes my point, using a plain 'cat' command and a test file in /tmp.

#!/usr/bin/perl -w

my @FD;
my @hosts=('poker', 'poker');
for (my $i=0; $i<@hosts; $i++) {
    my $a="ssh $hosts[$i] cat /tmp/test";
    open $FD[$i], "$a|" or die;
}

for (my $i=0; $i<@hosts; $i++) {
    my $fd=$FD[$i];
    my $line=<$fd>;
    print "LINE: $line\n";
}

As you can see, I first loop through all the node names, which I've hard-coded to be the same system for the sake of demonstration, and execute the command by opening it in a pipe. In the second section I'm simply looking at the first line of output, but the real processing would be much more complex.

This all works just fine, but I'm concerned with scaling. I tried running it on a couple of hundred systems and each ssh command executed serially. I thought that if I threw in an & it would make things asynchronous, but I think the open is waiting on a socket connection to be established with the pipe.

It seemed to me that if I could fire off each open in a separate thread, they'd all run in parallel and finish much faster. The thing is, from my reading and playing around with threads, I think one can only share simple arrays and hashes, and I believe file descriptors are more complex.

So the question is do I really need threads to solve this problem and if so how, OR is there just some faster way to get the pipes to open? Ultimately I'd like to see this be able to run on several thousand machines and not take forever to get past the initial opens.

-mark

Re: Threads and file descriptors
by JavaFan (Canon) on Dec 20, 2010 at 16:43 UTC
    I think the open is waiting on a socket connection to be established with the pipe
    What makes you think that? Perl's open is smart, but I do not think it's so smart it actually knows the command it's executing is going to open a socket, and is going to wait for that.
    So the question is do I really need threads to solve this problem and if so how, OR is there just some faster way to get the pipes to open?
    No, you don't need threads. The classical solution is a select loop (which you can do easily in Perl, and there are also some CPAN modules for it; POE for instance). Alternatively, you can fork.

    If it's just a matter of opening a bunch of pipes and reading a single line from each of them, I'd opt for a select loop. One caveat: don't use readline together with select; use an unbuffered read (sysread) instead.
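A minimal sketch of such a select loop, using the core IO::Select module and sysread. The helper name first_lines and the shell commands standing in for the ssh calls are assumptions made for illustration; in the original script each command would be something like "ssh $host cat /tmp/test". Error handling is deliberately thin.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: start every command at once and return the first
# line each one prints, in the same order as the commands were given.
sub first_lines {
    my @cmds = @_;
    my $sel  = IO::Select->new;
    my (%index_of, %buf, %first);

    # Open all the pipes first; nothing is read in this loop.
    for my $i (0 .. $#cmds) {
        open(my $fh, '-|', $cmds[$i]) or die "cannot start '$cmds[$i]': $!";
        $index_of{ fileno $fh } = $i;
        $buf{ fileno $fh }      = '';
        $sel->add($fh);
    }

    # Service whichever pipes have data until each has produced a line.
    while ($sel->count) {
        for my $fh ($sel->can_read) {
            my $fd = fileno $fh;
            # Unbuffered read, appended to what we already have for this pipe.
            my $n = sysread($fh, $buf{$fd}, 4096, length $buf{$fd});
            die "read error: $!" unless defined $n;

            my $got_line = $buf{$fd} =~ /\A([^\n]*)\n/;
            if ($got_line || $n == 0) {    # complete line, or end of file
                $first{ $index_of{$fd} } = $got_line ? $1 : $buf{$fd};
                $sel->remove($fh);
                close $fh;
            }
        }
    }
    return map { $first{$_} } 0 .. $#cmds;
}

# The slow command does not hold up the fast one.
my @out = first_lines('sleep 1; echo slow', 'echo fast');
print "LINE: $_\n" for @out;
```

Each pipe keeps its own buffer because a single sysread may return only part of a line, or more than one line. This sketch closes a pipe as soon as its first line is complete; a script that needs the full output would keep reading until sysread returns 0.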

      Bingo... I discovered there was some code executing between the open calls, and it WAS trying to read the next record on the pipe, so naturally it had to stall while waiting for that record to appear. It's so obvious after you've heard the explanation. Thanks... -mark
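The effect described above can be reproduced with a small timing sketch. It assumes a POSIX shell and uses "sleep 1; echo ..." as a stand-in for a slow ssh command; the name timed_run is hypothetical. Reading inside the open loop serializes the commands, while opening every pipe before the first read lets them run side by side.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Stand-ins for slow remote commands: each prints one line after a second.
my @cmds = ('sleep 1; echo first', 'sleep 1; echo second');

# Start every command, reading one line either right after each open
# ($interleave true) or only once all pipes are open.
# Returns the elapsed wall-clock time in seconds.
sub timed_run {
    my ($interleave) = @_;
    my $start = time();
    my @fh;

    for my $cmd (@cmds) {
        open(my $fh, '-|', $cmd) or die "cannot start '$cmd': $!";
        if ($interleave) {
            my $line = readline $fh;    # blocks until this command prints
        }
        push @fh, $fh;
    }

    unless ($interleave) {
        for my $fh (@fh) {
            my $line = readline $fh;    # all commands are already running
        }
    }

    close $_ for @fh;
    return time() - $start;
}

my $serial  = timed_run(1);
my $batched = timed_run(0);
printf "read between opens: %.1f s\n", $serial;
printf "read after opens:   %.1f s\n", $batched;
```

With two one-second commands, the first figure should come out at roughly two seconds and the second at roughly one, which shows that the piped open itself returns without waiting for any output.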
