
threads, forks and SSH

by ralph2014 (Initiate)
on Dec 15, 2015 at 09:08 UTC (#1150347=perlquestion)

ralph2014 has asked for the wisdom of the Perl Monks concerning the following question:

Morning, Perl Monks! I'm just testing the waters here; I'm fairly new to Perl. What I need to do is this: I will have a text file with 4000 IPs in it, and I need to go through this list as quickly as possible. For each IP I will start an SSH connection, run several commands, and capture the stdout of the last command, which returns comma-separated output. That output, together with the IP, needs to go into a CSV file, and the last command's output for all of the IPs needs to end up in one CSV file.

I was thinking of loading the IP list into memory rather than cycling through it. I was then thinking of forking rather than using full-on threads, temporarily writing each fork's output to an in-memory file (because forks accessing the same CSV would cause problems), and finally writing the output to the CSV. I also intend to use Log4perl to keep track of things, plus another file for the hosts where SSH failed to connect. The machine this will run on is an HP 4-core server with RAID 5 drives. So, guys, I would be grateful for some input on how you would do this.
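
For illustration, the fork-and-merge plan described above could be sketched roughly as follows. This is a minimal sketch, not a tested solution for the real job: `collect_parallel` and the `$worker` callback are hypothetical names, the callback stands in for whatever actually does the SSH work, and per-child temporary files on disk are used in place of in-memory files. It also assumes a Unix-like system with a real fork; on Windows, Perl emulates fork with threads (see perlfork), and this sketch has not been checked there.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX ();

# Run $worker->($host) for every host, with at most $max_procs children alive
# at once. Each child writes its line to its own temporary file, so no two
# processes ever share an output handle; the parent merges the files at the end.
# Returns (\@lines, \@failed_hosts).
sub collect_parallel {
    my ($hosts, $max_procs, $worker) = @_;
    my $dir = tempdir(CLEANUP => 1);
    my (%running, @failed);    # %running maps child pid => host

    # Block until one child exits; record a failure if it exited non-zero.
    my $reap_one = sub {
        my $pid = waitpid(-1, 0);
        die "waitpid failed: $!" if $pid < 0;
        my $host = delete $running{$pid};
        push @failed, $host if defined $host && $? != 0;
    };

    for my $i (0 .. $#$hosts) {
        $reap_one->() while keys(%running) >= $max_procs;

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: do the work, write one file, and leave without running
            # the parent's END blocks or temp-dir cleanup.
            my $ok = eval {
                my $line = $worker->($hosts->[$i]);
                open my $out, '>', "$dir/$i.csv" or die "open: $!";
                print {$out} "$line\n";
                close $out or die "close: $!";
                1;
            };
            POSIX::_exit($ok ? 0 : 1);
        }
        $running{$pid} = $hosts->[$i];
    }
    $reap_one->() while %running;

    # Merge the per-host files in the original host order.
    my @lines;
    for my $i (0 .. $#$hosts) {
        open my $in, '<', "$dir/$i.csv" or next;    # failed hosts have no file
        chomp(my @got = <$in>);
        push @lines, @got;
        close $in;
    }
    return (\@lines, \@failed);
}
```

A caller would pass the list of IPs, a process limit, and a callback that returns one CSV line per host or dies on failure; the returned failed-host list could then be written to the separate "SSH failed" file.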

Replies are listed 'Best First'.
Re: threads, forks and SSH
by salva (Canon) on Dec 15, 2015 at 10:03 UTC
    You can use Net::OpenSSH::Parallel for that:
    use Net::OpenSSH::Parallel;

    my @hosts = (...);
    my $pssh = Net::OpenSSH::Parallel->new;

    # register the list of hosts:
    $pssh->add_host($_) for @hosts;

    # declare the commands you want to run in every host:
    $pssh->push('*', cmd => 'do.this');
    $pssh->push('*', cmd => 'do.that');
    $pssh->push('*', { stdout_file => '%HOST%.csv' }, # %HOST% is replaced by the host name
                cmd => 'make.csv');

    # let Net::OpenSSH::Parallel take care of everything:
    $pssh->run;

    # finally, collect the errors and the temporary CSV files.
    for my $host (@hosts) {
        my $error = $pssh->get_error($host);
        if ($error) {
            print STDERR "Failed to generate CSV from host '$host': $error\n";
        }
        else {
            system cat => "$host.csv";
            unlink "$host.csv";
        }
    }

    update: code adjusted according to some of the comments by serf below.

      Meh about:
      system 'cat $host.csv'; system 'rm $host.csv';

      You have open and print, you have unlink, and you have ways of checking that what you wanted to do worked correctly.

      Perl isn't a shell script.

      I would suggest that it's more reliable, more efficient, more cross-OS portable, and generally better practice to do these things natively in Perl than shelling out via system.
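
      As a rough illustration of doing this natively, the sketch below appends each per-host file to one combined CSV with open and print, then removes it with unlink, checking each step. The name merge_host_csvs and its $dir argument are hypothetical; it only assumes the per-host files are named "<host>.csv".

```perl
use strict;
use warnings;

# Append every "$dir/<host>.csv" to $combined, then delete the per-host file.
# Returns the hosts whose file could not be read (hypothetical helper).
sub merge_host_csvs {
    my ($combined, $dir, @hosts) = @_;
    my @missing;

    open my $out, '>>', $combined or die "Cannot open $combined: $!";
    for my $host (@hosts) {
        my $file = "$dir/$host.csv";
        my $in;
        unless (open $in, '<', $file) {
            warn "Cannot read $file: $!\n";
            push @missing, $host;
            next;
        }
        while (my $line = <$in>) {
            print {$out} $line;
        }
        close $in;
        unlink $file or warn "Cannot remove $file: $!\n";
    }
    close $out or die "Cannot close $combined: $!";

    return @missing;
}
```

      No extra processes are spawned, and every failure is reported rather than silently ignored.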

      I get grief at my $JOB from "scripts" written in Perl by people who don't know Perl: they should be Perl, but they are full of awk, sed, grep and wc pipes and other badness. These scripts spawn many more processes than they need to, use more memory, and end up doing gnarly things like jamming up, failing silently, or causing excessive I/O over NFS, bringing down our systems.

      When we post code on perlmonks, people who don't know as much as we do will take what we've written without understanding it and use it thinking it's the *right* way to do it. I think we should try and be responsible and steer them in the right direction.

      /me dismounts soapbox

        Morning again, Monks! Thank you so much for the replies. This module sounds like just what I'm looking for. I'm not too bothered about things being a little dirty as long as it gets the job done quickly and I get decent output. I should add that it's a Windows Server 2003 machine. ralph
Re: threads, forks and SSH
by mr_mischief (Monsignor) on Dec 16, 2015 at 15:30 UTC

    I'm not saying not to do this in Perl. However, there are many tools already for this sort of task.

    Are you building a distributed application? Something like RabbitMQ or some other information queue or job queue may help. There's OpenMPI and the Parallel::MPI or Parallel::MPI::Simple library wrapping it if you need that sort of parallel interaction.

    Are you doing configuration of the systems? Is this a one-off thing or will it grow to manage more things? Have you looked at GNU Parallel? How about multissh? CFEngine may work for you, too. On the really comprehensive end there are Saltstack, Ansible, Chef, or Puppet. They'll manage almost every aspect of every system across a whole server farm and report status back about everything they've done. Rex is a configuration management system that supports Windows, Linux, the BSDs, Solaris, and OS X and is written in Perl even. I'll admit I don't have any experience with Rex myself, but if you're wanting something in Perl you don't have to write from scratch it may be worth a long, hard look.

    There's a comparison page for open source configuration management systems on Wikipedia.
