Net::OpenSSH fastest way to reconnect to a rebooted machine?

by Anonymous Monk
on May 17, 2011 at 01:19 UTC
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm using Net::OpenSSH to execute commands on a few remote servers. After the first command completes, I need to reboot the machine and reconnect after it comes back up, then execute more commands. Any advice on an efficient way to reconnect to the rebooted server via SSH as soon as it's back up? i.e. what's a good way to test and see when sshd is once again accepting connections? Here's what I have so far:

    my %ssh;
    my @hosts = qw/ server1.example.com server2.example.com /;
    my @cmd = "list of shell commands ending with reboot";

    for my $host (@hosts) {
        $ssh{$host} = Net::OpenSSH->new($host, master_opts => [-i => "/path/to/ssh_key"], async => 1);
        $ssh{$host}->error and die "SSH connection to $host failed: " . $ssh{$host}->error;
    }

    for my $host (@hosts) {
        $ssh{$host}->system("@cmd");
    }

This gets me to the point where all of the initial commands have run and the servers are rebooting. From here, I can't think of a good way to quickly determine when sshd is running again so I can reconnect and continue with the post-reboot commands. I tried variations using $ssh{$host}->test("some shell command"); but I'd like to come up with a better solution. Any ideas?
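One cheap way to check whether sshd is accepting connections is to probe its TCP port before creating a new Net::OpenSSH object. The following is a minimal sketch using the core IO::Socket::INET module; the helper names port_open and wait_for_port, the 2-second probe timeout, and the assumption that sshd listens on port 22 are illustrative and not part of the original post.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Return 1 if a TCP connection to $host:$port succeeds within $timeout
# seconds, 0 otherwise. (Hypothetical helper, not part of Net::OpenSSH.)
sub port_open {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    );
    return 0 unless $sock;
    close $sock;
    return 1;
}

# Probe repeatedly until the port opens or $max_wait seconds have passed.
# Returns 1 on success, 0 if the deadline expired first.
sub wait_for_port {
    my ($host, $port, $max_wait, $interval) = @_;
    my $deadline = time() + $max_wait;
    while (time() < $deadline) {
        return 1 if port_open($host, $port, 2);
        sleep $interval;
    }
    return 0;
}

# Example call (assumes sshd on the default port 22):
# wait_for_port('server1.example.com', 22, 600, 5)
#     or die "server1.example.com did not come back within 10 minutes\n";
```

Two caveats: right after the reboot command is issued the old sshd may still answer for a few seconds, so it is probably worth waiting for the port to close before waiting for it to open again; and an open port only shows that something is listening, not that authentication will succeed.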

Re: Net::OpenSSH fastest way to reconnect to a rebooted machine?
by thewebsi (Beadle) on May 17, 2011 at 05:31 UTC
Re: Net::OpenSSH fastest way to reconnect to a rebooted machine?
by patcat88 (Deacon) on May 17, 2011 at 07:01 UTC
    How to do this without polling, and with an unmodified target machine?

    This might be very complicated to do in Perl (there is hope: Net::Pcap exists), but you could listen on the Ethernet port (sniffing, pcap, etc.) for the broadcast DHCP packet that the rebooting machine will put out on the wire. What about SNMP against the DHCP server?

    If you can modify the target machine, have it run wget at boot against a dynamic DNS address that points at your machine. On your side, run a small Perl web server based on HTTP::Daemon: when the server boots it connects to your mini-server, and your Perl script wakes up and continues with the rest of the SSH commands. Remember to consider timeouts and retries by wget; you can deadlock if wget can't wake up your Perl SSH script for some reason. The keywords "watchdog" and "heartbeat" are relevant. Also consider what happens if two servers try to connect to the mini-server at once (assuming you don't serialize the SSH command execution and rebooting, which you probably do). In my experience the OS just queues the second attempt, but if you don't call accept() again fast enough, the remote side will time out.
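The callback idea can be sketched roughly as follows. To keep the example dependent on core modules only, it swaps HTTP::Daemon for a plain IO::Socket::INET listener that answers any request with an empty HTTP 200 response; the function name wait_for_callback and the port number in the usage comment are assumptions made for illustration.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Wait up to $timeout seconds for one incoming connection on $listener.
# Reads the first request line, answers with an empty HTTP 200 response,
# and returns the caller's IP address. Returns undef on timeout.
# Note: a client that connects but never sends a line would block the read.
sub wait_for_callback {
    my ($listener, $timeout) = @_;
    $listener->timeout($timeout);            # makes accept() give up after $timeout
    my $conn = $listener->accept or return;
    my $peer = $conn->peerhost;
    my $request_line = <$conn>;              # e.g. "GET /booted HTTP/1.0"
    print {$conn} "HTTP/1.0 200 OK\r\nContent-Length: 0\r\n\r\n";
    close $conn;
    return $peer;
}

# Usage sketch (port 8080 is an arbitrary choice):
# my $listener = IO::Socket::INET->new(
#     Listen => 5, LocalPort => 8080, Proto => 'tcp', ReuseAddr => 1,
# ) or die "cannot listen: $@";
# while (my $peer = wait_for_callback($listener, 600)) {
#     print "$peer reports it has booted\n";
# }
```

The Listen backlog is what lets a second server's connection wait in the kernel queue while the first one is being handled, which matches the queuing behaviour described above.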

    On some networks you might get an ICMP unreachable packet if the Ethernet port of the server is down, rather than a plain old timeout. You might want to separate a plain timeout from an ICMP unreachable in code.
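Telling those cases apart can be done by inspecting errno after a failed connect. This is a minimal sketch assuming a Unix-like system where an ICMP unreachable surfaces as EHOSTUNREACH or ENETUNREACH; the function name classify_connect and its return labels are made up for this example.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Errno ();    # provides the %! hash of errno names

# Try one TCP connection and report how it ended:
# 'open', 'refused', 'unreachable', 'timeout' or 'other'.
sub classify_connect {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    );
    if ($sock) {
        close $sock;
        return 'open';
    }
    # IO::Socket::INET leaves the connect error in $! on failure.
    return 'refused'     if $!{ECONNREFUSED};                   # host up, port closed
    return 'unreachable' if $!{EHOSTUNREACH} || $!{ENETUNREACH};
    return 'timeout'     if $!{ETIMEDOUT};                      # no answer at all
    return 'other';
}
```

During a reboot one would typically expect 'timeout' or 'unreachable' while the machine is down, then 'refused' once the network stack is up but sshd has not started, and finally 'open'.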
Re: Net::OpenSSH fastest way to reconnect to a rebooted machine?
by salva (Monsignor) on May 17, 2011 at 10:21 UTC
    After the reboot, just keep trying to establish a new connection over and over until it succeeds.

      What you've described is exactly what I'm trying to figure out how to do. The question is how do I test that the connection has either succeeded or failed? I tried this, since the Net::OpenSSH documentation says that a failure returns undef or an empty list:

      $ssh{$host} = Net::OpenSSH->new($host, master_opts => [-i => "/path/to/key"]);
      if (defined $ssh{$host}) {
          print STDOUT "SSHD is back up. Continuing...\n";
      }
      else {
          print STDOUT "Still waiting for SSH. Retrying in 5 seconds...\n";
          sleep 5;
      }

      This fails though. Apparently the "no route to host" message that comes back from the failed connection attempt is enough to define $ssh{$host}. What's a good way to test to see if the connection is successful?

        Use the error method to see if some error happened.
        for my $host (...) {
            $ssh{$host} = Net::OpenSSH->new($host, ...);
            if ($ssh{$host}->error) {
                print STDOUT "Still waiting for SSH. Retrying in 5 seconds...\n";
                sleep 5;
                redo;
            }
            else {
                print STDOUT "SSHD is back up. Continuing...\n";
            }
        }
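The redo loop above retries forever if a host never comes back. A bounded variant might look like the sketch below; retry_until and try_connect are hypothetical helper names, and the only Net::OpenSSH calls used are the constructor and the error method already shown in this thread.

```perl
use strict;
use warnings;

# Call $try->() until it returns a true value or $max_tries attempts
# have been made, sleeping $delay seconds between attempts.
# Returns the true value, or undef if every attempt failed.
sub retry_until {
    my ($try, $max_tries, $delay) = @_;
    for my $attempt (1 .. $max_tries) {
        my $result = $try->();
        return $result if $result;
        sleep $delay if $attempt < $max_tries;
    }
    return;
}

# Hypothetical wrapper: returns a connected Net::OpenSSH object,
# or undef if the constructor reported an error.
sub try_connect {
    my ($host, @opts) = @_;
    require Net::OpenSSH;    # loaded only when a connection is attempted
    my $ssh = Net::OpenSSH->new($host, @opts);
    return $ssh->error ? undef : $ssh;
}

# Usage sketch: up to 60 attempts, 5 seconds apart.
# my $ssh = retry_until(
#     sub { try_connect($host, master_opts => [-i => "/path/to/key"]) },
#     60, 5,
# ) or die "gave up waiting for $host\n";
```

Keeping the retry logic separate from the connection code makes it easy to change the limits per host or to reuse the loop for other checks.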

        If you are starting several connections in parallel, you can also use the wait_for_master method.
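For the parallel case, one pass over the pending connections could be sketched like this. It assumes the objects were created with async => 1 and relies on my reading of the Net::OpenSSH documentation: wait_for_master called with a true argument does not block and returns true once the master is ready, and a false return with error set means the attempt failed. The function name poll_masters and the three-hash layout are illustrative only.

```perl
use strict;
use warnings;

# Make one non-blocking pass over the connections in %$pending.
# Hosts whose master is ready move to %$ready; hosts whose attempt
# failed move to %$failed; the rest stay in %$pending.
sub poll_masters {
    my ($pending, $ready, $failed) = @_;
    for my $host (keys %$pending) {
        my $ssh = $pending->{$host};
        if ($ssh->wait_for_master(1)) {      # true argument: do not block
            $ready->{$host} = delete $pending->{$host};
        }
        elsif ($ssh->error) {
            $failed->{$host} = delete $pending->{$host};
        }
    }
    return;
}

# Usage sketch: recreate failed connections and poll again.
# while (%pending) {
#     poll_masters(\%pending, \%ready, \%failed);
#     for my $host (keys %failed) {
#         delete $failed{$host};
#         $pending{$host} = Net::OpenSSH->new($host, async => 1);
#     }
#     sleep 1;
# }
```

Hosts that land in %failed while the machine is still down need a fresh object, since a failed Net::OpenSSH object is not reused here.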

        Passing in parameters works for me, without having to loop & check the error condition:
        my $ssh = Net::OpenSSH->new(
            $username . "\@" . $host,
            key_path    => $key_path,
            master_opts => [-o => "ConnectionAttempts=30", -o => "ConnectTimeout=10"],
        );
