
Re^2: IO::Socket doesn't detect lost TCP connections

by tjdmlhw (Acolyte)
on Sep 04, 2004 at 02:34 UTC ( [id://388472]=note )


in reply to Re: IO::Socket doesn't detect lost TCP connections
in thread IO::Socket doesn't detect lost TCP connections

The receiving system is a vendor package, and keepalive heartbeats are not part of it. Once this script is functioning at a production level, variations of it will be used for multiple systems. Of the ones I am familiar with, only one uses a heartbeat. It would be prohibitively expensive to pay all of the vendors to modify their systems.

Re^3: IO::Socket doesn't detect lost TCP connections
by DaveH (Monk) on Sep 04, 2004 at 16:55 UTC

    I have implemented similar gateway interfaces in the past, and one thing to bear in mind is that a "keep alive" message does not need to be an explicit feature of the protocol you are running over TCP/IP. Anything that tests whether or not a connection is alive is sufficient. For example, you may be able to send some sort of benign transaction which is valid in the currently defined protocol but has no effect on the running backend system (e.g. some sort of read-only query). This would serve the same purpose as an explicitly written keep-alive packet. Obviously, you will need to investigate whether such a query exists and would be suitable for this purpose.

    The other thing which I would do is to not use the buffered input and output functions for doing production socket work. I prefer using sysread/syswrite for socket reads and writes because you can detect things like dropped connections and end of file conditions more easily. You can also see how much data was read or written to the socket with each operation, so you can detect short read/write conditions. Also, I prefer using IO::Select to see whether my sockets can be read from or written to, and I use this to implement my own timeout mechanism.
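    As a rough illustration of the point about sysread and IO::Select, here is a minimal sketch that classifies the outcome of a single read. The helper name read_status and its return strings are made up for this example, and a local socketpair stands in for a real TCP connection so the snippet runs on its own (on a Unix-like system).

```perl
use strict;
use warnings;
use IO::Select;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Hypothetical helper: report what one timed sysread tells us about a socket.
sub read_status {
    my ($sock, $timeout) = @_;

    # can_read() returns an empty list if nothing arrives within $timeout.
    return 'timeout' unless IO::Select->new($sock)->can_read($timeout);

    my $n = sysread($sock, my $buf, 4096);
    return 'error'  unless defined $n;   # read failed; $! holds the reason
    return 'closed' if $n == 0;          # end of file: the peer closed
    return "data:$buf";                  # $n bytes were read
}

# A connected pair of local sockets stands in for client and server.
socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

print read_status($left, 0.1), "\n";     # timeout (nothing sent yet)
syswrite($right, "hello") or die "syswrite: $!";
print read_status($left, 1), "\n";       # data:hello
close($right);
print read_status($left, 1), "\n";       # closed
```

    The three outcomes (timeout, zero-length read, undefined return) are exactly the cases that buffered reads tend to blur together.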

    However, I would recommend searching CPAN for TCP, since this turns up lots of interesting higher-level modules which should hide some of the low-level socket guts from your program. I would also always recommend looking at POE for any socket programming work, since many socket-based programs fall into the event-driven category (i.e. "wait for X condition, then do Y"), and POE is one of the best frameworks for achieving event-based programming quickly. It is worth having a look at the POE website in addition to the POD documentation on CPAN.

    You also mentioned that you are using MQSeries for your message queueing. Are you aware that there is an MQSeries Perl module available on CPAN? This may be more suitable than calling out to an external program to retrieve messages.

    Find below the sort of code I have used in the past for socket operations: use this at your own risk, and bear in mind that this is not working code. You will need to customise it. Provided in the hope that it is useful.

    I hope that this helps.

    Cheers,

    -- Dave :-)


    $q=[split+qr,,,q,~swmi,.$,],+s.$.Em~w^,,.,s,.,$&&$$q[pos],eg,print

      Thanks for all the useful information. I am taking Labor Day off, but will definitely try some of your suggestions when I get back to the office Tuesday.

      I am aware of the MQSeries Module in CPAN and have used it in variations of my script on my PC. Unfortunately, to load the module, you need a C compiler and there wasn't one on my AIX Node. Since this is a production box, the systems group wouldn't let me load a compiler.

      A search of Monks turned up the q and qc programs furnished by IBM and I started using those. The systems people later agreed to install the compiler, but qc has been working well for me and I haven't bothered to switch back.

      You, sir, get bonus points for mentioning POE ;)

      I tried the above program, but it doesn't seem to do what I need. I added a few prints to show some of the return values and a sleep to give me time to bounce the receiving test tool. I've shown the test results and the code as used below.

      The first test was just connecting, sending, and receiving data without any interruptions. This worked without any problems.

      In sock_write - sel = IO::Select=ARRAY(0x201651cc)
      Results of syswrite - 11
      wrote TESTING 123 to the socket
      In sock_read - sel = IO::Select=ARRAY(0x200272b8)
      Results of syswrite - 5
      read AAAAA from the socket

      Next I established the connection and used the sleep time to kill it before a send was attempted. The code did not recognize that the connection was dead on the write, but did on the following read.

      In sock_write - sel = IO::Select=ARRAY(0x20164c0c)
      Results of syswrite - 11
      wrote TESTING 123 to the socket
      In sock_read - sel = IO::Select=ARRAY(0x20164c9c)
      Use of uninitialized value in concatenation (.) at /appl/tst/eng/local/bin/IELmq1 line 63.
      Results of syswrite -
      sock_read: socket read error (A connection with a remote socket was reset by that socket.) at /appl/tst/eng/local/bin/IELmq1 line 28.

      The last test was stopping and starting the receiving system before a transaction was attempted. The results were exactly the same as for the killed-connection test above.

      In sock_write - sel = IO::Select=ARRAY(0x20164c0c)
      Results of syswrite - 11
      wrote TESTING 123 to the socket
      In sock_read - sel = IO::Select=ARRAY(0x20164c9c)
      Use of uninitialized value in concatenation (.) at /appl/tst/eng/local/bin/IELmq1 line 63.
      Results of syswrite -
      sock_read: socket read error (A connection with a remote socket was reset by that socket.) at /appl/tst/eng/local/bin/IELmq1 line 28.

      What I was hoping for was some way of detecting, on or before the write, that the connection had been lost. I am currently modifying my script to trigger a reconnect when the read fails. This will work for the current interface, but may not for future interfaces that don't send an ACK back.
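      One possible approach, sketched here under assumptions rather than offered as a guaranteed fix: a syswrite normally succeeds as soon as the kernel accepts the bytes into its send buffer, which is why the write in the tests above reported 11 bytes even though the peer was gone. If the peer's close or reset has already reached the local machine, though, the socket becomes readable, and a non-consuming peek will show end of file or an error. The helper name looks_alive is invented for this example, and a local socketpair stands in for the TCP connection so the snippet runs on its own (on a Unix-like system).

```perl
use strict;
use warnings;
use IO::Select;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC MSG_PEEK);

# Hypothetical helper: returns 1 if the socket still looks usable,
# 0 if the peer's close or reset has already been seen locally.
sub looks_alive {
    my ($sock) = @_;

    # Poll with a zero timeout. Not readable means no data, close,
    # or reset has arrived yet, so there is nothing to object to.
    return 1 unless IO::Select->new($sock)->can_read(0);

    # Peek at one byte without removing it from the receive queue.
    my $rv = recv($sock, my $peek, 1, MSG_PEEK);
    return 0 unless defined $rv;        # error, e.g. connection reset
    return 0 if length($peek) == 0;     # end of file: peer closed
    return 1;                           # unread data is waiting
}

# A connected pair of local sockets stands in for client and server.
socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

print looks_alive($left), "\n";         # 1 (peer still open)
close($right);
print looks_alive($left), "\n";         # 0 (peer has closed)
```

      Calling a check like this just before sock_write would allow a reconnect before the request is sent. It only reports a closure the local TCP stack already knows about, so a pulled cable or a crashed host with no reset sent would still go unnoticed until a later write or read fails.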

      If you know of anything else that I can try, I would appreciate the suggestion.

      use IO::Socket::INET;
      use IO::Select;
      use Errno qw(EAGAIN);

      my $timeout = 20;

      my $socket = IO::Socket::INET->new(
          PeerAddr => "eng1tst",
          PeerPort => "9903",
          Proto    => 'tcp',
      ) or die "Cannot create new socket: $!";

      my $request = "TESTING 123";

      sleep 10;    # time to bounce the receiving test tool

      unless ( sock_write($socket, \$request, length($request)) ) {
          die "sock_write: socket write error ($!)";
      }
      print "wrote $request to the socket\n";

      my $len = 5;
      my $response = "";
      unless ( sock_read($socket, \$response, $len) ) {
          die "sock_read: socket read error ($!)";
      }
      print "read $response from the socket\n";

      exit;

      # sock_read(
      #   $socket - IO::Socket::INET socket to read from
      #   \$msg   - Reference to hold the read data
      #   $len    - number of bytes to read from the socket
      # )
      # returns: original reference to the data, or undef
      sub sock_read {
          my $socket = shift;
          my $buf    = shift;
          my $len    = shift;
          my $offset = 0;
          my $n      = 0;
          my $sel    = IO::Select->new($socket);
          print "In sock_read - sel = $sel\n";

          while ( $offset < $len ) {
              unless ( $socket ) {
                  warn "Socket became undef during read!";
                  return undef;
              }
              unless ($sel->can_read($timeout)) {
                  warn "Socket read timed out";
                  return undef;
              }

              $n = $socket->sysread($$buf, $len-$offset, $offset);
              # The label says syswrite, but this is the sysread result.
              print "Results of syswrite - $n\n";

              # Check for "Resource temporarily unavailable" error, and clea
              # This just means that we can't write to the socket "just now"
              # buffer is full.
              if ($!{EAGAIN}) {
                  warn "Socket would read block!";
                  $! = 0;    # Clear the error
                  $n = 0;    # "Define" $n
              }
              unless ( defined($n) ) {
                  return undef;
              }
              if ((not $!{EAGAIN}) && ($n == 0)) {
                  warn "Socket read returned no data: unknown comms error!";
                  return undef;
              }
              $offset += $n;
          }
          return $buf;
      }

      # sock_write(
      #   $socket - IO::Socket::INET socket to write to
      #   \$msg   - Reference to hold the data to write
      #   $len    - number of bytes to write to the socket
      # )
      # returns: original reference to the data, or undef
      sub sock_write {
          my $socket = shift;
          my $buf    = shift;
          my $len    = shift;
          my $offset = 0;
          my $n      = 0;
          my $sel    = IO::Select->new($socket);
          print "In sock_write - sel = $sel\n";

          while ( $offset < $len ) {
              unless ($socket) {
                  warn "Socket became undef during write!";
                  return undef;
              }
              unless ($sel->can_write($timeout)) {
                  warn "Socket write timed out";
                  return undef;
              }

              $n = $socket->syswrite($$buf, $len-$offset, $offset);
              print "Results of syswrite - $n\n";

              # Check for "Resource temporarily unavailable" error, and clea
              # This just means that we can't write to the socket "just now"
              # buffer is full.
              if ($!{EAGAIN}) {
                  warn "Socket write would block!";
                  $! = 0;    # Clear the error
                  $n = 0;    # "Define" $n
              }
              unless ( defined($n) ) {
                  return undef;
              }
              if ((not $!{EAGAIN}) && ($n == 0)) {
                  warn "socket failed to write any data!";
                  return undef;
              }
              $offset += $n;
          }
          return $buf;
      }
