PerlMonks  

Socket hang. (Windows or Perl? Solutions?) (Updated)

by BrowserUk (Pope)
on Apr 05, 2011 at 20:01 UTC ( #897591=perlquestion )
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

This creates server and client sockets in separate threads--though the threading is not a factor in the problem--and the client repeatedly connects, sends a packet, receives a packet, and then disconnects before repeating.

Initially on my system, this runs smoothly at around 500 connects/second, but somewhere usually between 8000 and 16000 cycles, it just grinds to a halt. And stays that way for an extended period before suddenly starting to run again and then freezing again.

This appears to be a problem with the tcpip subsystem as there are lots of open connections hanging around during the freeze periods that are in the SYN_SENT state.

Questions:

  1. Is this unique to my 5.10.1 64-bit Perl, Vista 64-bit system?
  2. Unique to windows?
  3. Caused by something I am doing?
  4. Or something I'm not doing?
  5. Is there a cure?

Thanks for any pointers.

Updated code to remove Win32 dependency; and ditched the stack size arg (just in case).

    #! perl -slw
    use strict;
    use Time::HiRes qw[ time usleep ];
    use threads;
    use threads::shared;
    use IO::Socket;

    our $port //= 12345;

    my $svrN    :shared = 0;
    my $clientN :shared = 0;
    my $start = time;

    async {
        my $svr = IO::Socket::INET->new(
            Listen    => SOMAXCONN,
            Reuse     => 1,
            LocalPort => $port,
            Timeout   => 0.1,
        ) or die $!;

        while( my $client = $svr->accept ) {
            my $in = <$client>;
            print $client "echod:$in";
            $client->shutdown( 2 );
            close $client;
            ++$svrN;
        }
    }->detach;

    async {
        while( 1 ) {
            my $svr = IO::Socket::INET->new(
                PeerHost => 'localhost',
                PeerPort => $port,
                Reuse    => 1,
                Timeout  => 0.1,
            ) or usleep( 10_000 ), next;
            sleep 0;
            print $svr ++$clientN;
            my $echo = <$svr>;
            sleep 0;
            $svr->shutdown( 2 );
            close $svr;
            sleep 0;
        }
    }->detach;

    $|++;
    while( usleep 100_000 ) {
        printf "\rserver:$svrN client:$clientN cycles: %.3f/sec",
            $svrN / ( time() - $start );
    }
    __END__
    c:\test>junk79 -port=12347
    server:9565 client:9565 cycles: 503.421/sec
    ## some time later
    server:16305 client:16305 cycles: 397.683/sec
    ## some time later still
    server:16305 client:16305 cycles: 267.295/sec

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re: Socket hang. (Windows or Perl? Solutions?)
by sundialsvc4 (Monsignor) on Apr 05, 2011 at 21:51 UTC

    Just to cover all bases, how are you determining the sockets that are in SYN_SENT (or whatever...) state?

    At the time that this anomalous behavior occurs, what’s being recorded in the system logs?

    Are you intending to create a system stress-test?

      Just to cover all bases,

      Duh? FF.

      how are you determining the sockets that are in SYN_SENT (or whatever...) state?

      Using a tool designed for that job. In my case, TCPview.

      At the time that this anomalous behavior occurs, what’s being recorded in the system logs?

      What system logs? This isn't a webserver.

      Are you intending to create a system stress-test?

      Of course I am. The posted code's sole purpose is to demonstrate the problem as quickly as possible.


Re: Socket hang. (Windows or Perl? Solutions?)
by ikegami (Pope) on Apr 05, 2011 at 21:53 UTC

    Segfaulted on 5.12.2 on linux

    $ perl a.pl -port=12347
    Using minimum thread stack size of 16384 at /home/eric/usr/perlbrew/perls/perl-5.12.2t/lib/5.12.2/i686-linux-thread-multi/threads.pm line 49.
    Segmentation fault

    #0 __res_vinit (statp=0xb7ab2df4, preinit=0) at res_init.c:176
    #1 0xb7ec12c5 in *__GI___res_ninit (statp=0xb7ab2df4) at res_init.c:142
    #2 0xb7ec2330 in *__GI___res_maybe_init (resp=0xb7ab2df4, preinit=0) at res_libc.c:125
    #3 0xb7ec4243 in *__GI___nss_hostname_digits_dots (name=0x8477d60 "localhost", resbuf=0xb7f2ada8, buffer=0xb7f289ec, buffer_size=0xb7f2adbc, buflen=0, result=0xb7ab2098, status=0x0, af=2, h_errnop=0xb7ab2094) at digits_dots.c:46
    #4 0xb7ec897a in gethostbyname (name=0x8477d60 "localhost") at ../nss/getXXbyYY.c:109
    #5 0xb7fc364c in XS_Socket_inet_aton () from /home/eric/usr/perlbrew/perls/perl-5.12.2t/lib/5.12.2/i686-linux-thread-multi/auto/Socket/Socket.so
    #6 0x080e0122 in Perl_pp_entersub ()
    #7 0x080de559 in Perl_runops_standard ()
    #8 0x0807a588 in Perl_call_sv ()
    #9 0xb7ac413a in S_ithread_run () from /home/eric/usr/perlbrew/perls/perl-5.12.2t/lib/5.12.2/i686-linux-thread-multi/auto/threads/threads.so
    #10 0xb7f31955 in start_thread (arg=0xb7ab2b70) at pthread_create.c:300
    #11 0xb7eb1e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

    Got no time for more now.

      Ditto on 5.10.1 on debian sid x64.
      (gdb) run /tmp/blah.pl
      Starting program: /usr/bin/debugperl /tmp/blah.pl
      [Thread debugging using libthread_db enabled]
      Using minimum thread stack size of 16384 at /usr/lib/perl/5.10/threads.pm line 49.
      [New Thread 0xb7c84b70 (LWP 10226)]

      Program received signal SIGSEGV, Segmentation fault.
      [Switching to Thread 0xb7c84b70 (LWP 10226)]
      Perl_sv_gets (my_perl=0x83efb18, sv=0x84f6bac, fp=0x840afcc, append=0) at sv.c:6620
      6620 sv.c: No such file or directory.
              in sv.c
      (gdb) bt
      #0 Perl_sv_gets (my_perl=0x83efb18, sv=0x84f6bac, fp=0x840afcc, append=0) at sv.c:6620
      #1 0x080949d0 in Perl_filter_read (my_perl=0x83efb18, idx=0, buf_sv=0x84f6bac, maxlen=0) at toke.c:2955
      #2 0x08094dcd in S_filter_gets (my_perl=<value optimized out>, sv=0x84f6bac, fp=<value optimized out>, append=0) at toke.c:2997
      #3 0x080a34ec in Perl_yylex (my_perl=0x83efb18) at toke.c:3757
      #4 0x080b6d36 in Perl_yyparse (my_perl=0x83efb18) at perly.c:409
      #5 0x08179991 in S_doeval (my_perl=0x83efb18, gimme=0, startop=0x0, outside=0x0, seq=886) at pp_ctl.c:2981
      #6 0x0819030c in Perl_pp_require (my_perl=0x83efb18) at pp_ctl.c:3573
      #7 0x080e91a8 in Perl_runops_debug (my_perl=0x83efb18) at dump.c:1968
      #8 0x0807fde2 in Perl_call_sv (my_perl=0x83efb18, sv=0x84f676c, flags=4) at perl.c:2717
      #9 0xb7ca26aa in S_ithread_run (arg=0x82c0220) at threads.xs:440
      #10 0xb7f9c7b0 in start_thread () from /lib/libpthread.so.0
      #11 0xb7f1c8fe in clone () from /lib/libc.so.6
      Edit: Taking out the stack_size option to the use statement lets the script run:

      cthulhu:/tmp# ./blah.pl
      server:278652 client:278652 cycles: 1774.854/sec

        Thanks guys. I've removed the stack_size option and the windows dependency in the OP code.

        Any chance you could try it again?


Re: Socket hang. (Windows or Perl? Solutions?)
by wind (Priest) on Apr 05, 2011 at 22:09 UTC

    After 30 secs, the server stops incrementing. Waited 5 minutes for it to continue to no avail:

    perl temp.pl -port=12347
    server:3853 client:3853 cycles: 112.596/secc

    Strawberry perl (v5.12.2) built for MSWin32-x86-multi-thread

      If you ^C it when it stops and try to re-run the script on the same port immediately, does it block/lock/hang much more quickly?

      If you then immediately run it using a different port number, does it then demonstrate the original behaviour again?

      Thanks.



        Same port worked for just 3 servers before rehanging. New port worked like a fresh instance.

        >perl temp.pl -port=12345
        server:3911 client:3911 cycles: 88.886/secTerminating on signal SIGINT(2)

        >perl temp.pl -port=12345
        server:3 client:3 cycles: 0.250/secTerminating on signal SIGINT(2)

        >perl temp.pl -port=12346
        server:3912 client:3912 cycles: 97.800/secTerminating on signal SIGINT(2)
Re: Socket hang. (Windows or Perl? Solutions?) (Updated)
by Khen1950fx (Canon) on Apr 05, 2011 at 23:29 UTC
    I replaced the deprecated Reuse with ReuseAddr, and I added Blocking => 0, to the IO::Socket::INET constructors.

    FYI, on my Fedora system, there's a stack_size minimum of 16384.

      I replaced the deprecated Reuse with ReuseAddr, and I added Blocking => 0, to the IO::Socket::INET constructors.

      You don't say what effect that had?

      Replacing Reuse with ReuseAddr shouldn't make any difference, as they are the same thing. Only the keyname changed.

      Setting blocking off might have some effect (on *nix), but is (or was: it might have changed in recent versions) a no-op on Windows. But in any case, the change would break the code as it is designed to use blocking reads and writes.

      FYI, on my Fedora system, there's a stack_size minimum of 16384.

      That appears to be a bug in the *nix implementation. The docs suggest that if you attempt to set it too low, it will be rounded up to the minimum: "Some platforms have a minimum thread stack size. Trying to set the stack size below this value will result in a warning, and the minimum stack size will be used."

      I've removed it from the OP code.


Re: Socket hang. (Windows or Perl? Solutions?) (Updated)
by ikegami (Pope) on Apr 06, 2011 at 00:20 UTC
    $ perl -v | grep version
    This is perl 5, version 12, subversion 2 (v5.12.2) built for i686-linux-thread-multi

    $ time perl a.pl -port=12347
    server:274522 client:274522 cycles: 1960.793/sec^C

    real 2m20.175s
    user 1m40.466s
    sys 0m44.603s

    It didn't block. I interrupted it when I got tired.

    I'll try on Windows 7 later.

      For me, it also ran fine, and port re-use was not an issue:

      root@hlg64:~# ./foo.pl
      server:160302 client:160302 cycles: 3121.894/sec^C
      root@hlg64:~# ./foo.pl
      server:222203 client:222203 cycles: 3153.737/sec^C

      This was perl 5.10, 64-bit 2.6.29.3 Linux kernel
      3Ghz core2-duo Xeon

      fnord

Re: Socket hang. (Windows or Perl? Solutions?) (Updated)
by ikegami (Pope) on Apr 06, 2011 at 03:52 UTC
    >perl -v
    This is perl 5, version 12, subversion 3 (v5.12.3) built for MSWin32-x86-multi-thread
    ...
    Binary build 1204 [294330] provided by ActiveState http://www.ActiveState.com
    ...

    >perl a.pl -port=12347
    server:41930 client:41930 cycles: 597.742/secTerminating on signal SIGINT(2)

    This is where it first froze. I immediately ran the following:

    >netstat /a

    Active Connections

      Proto  Local Address      Foreign Address    State
      ...
      TCP    127.0.0.1:12347    tribble:49157      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49159      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49161      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49162      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49163      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49164      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49165      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49166      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49167      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49168      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49169      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49170      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49171      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49172      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49173      TIME_WAIT
      TCP    127.0.0.1:12347    tribble:49174      TIME_WAIT
      ...

    I presume there are 41930 of those in total.

    This would account for the pause. My question isn't why I get these in Windows. My question is why didn't linux show a similar pause. I'll see what netstat shows in linux tomorrow.
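A back-of-envelope sketch of why a pile of TIME_WAIT sockets could stall a client that reconnects this fast. The port range (49152-65535) and the 120-second TIME_WAIT period below are assumptions for illustration, not values measured in this thread, and `ephemeral_port_budget` is a hypothetical helper name. Check the real figures on your own system before drawing conclusions.

```perl
use strict;
use warnings;

# Each closed connection keeps one client port tied up for the TIME_WAIT
# period, so the sustainable connect rate is bounded by ports / seconds.
sub ephemeral_port_budget {
    my( $first_port, $last_port, $time_wait_secs ) = @_;
    my $ports = $last_port - $first_port + 1;
    return ( $ports, $ports / $time_wait_secs );
}

# ASSUMED values: dynamic port range 49152-65535, TIME_WAIT of 120 seconds.
my( $ports, $rate ) = ephemeral_port_budget( 49152, 65535, 120 );
printf "%d ports, roughly %.1f connects/sec sustainable\n", $ports, $rate;
```

Under those assumptions a burst can run well above the sustainable rate until the pool is used up, after which new connects would have to wait for old entries to expire. That would fit a run-then-stall pattern, though it does not by itself explain the Linux result.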

      In Linux, in order to avoid running out of sockets you can use the following as root:

      echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle 

      I'm not sure how your Linux install is set, just wanted to mention that I once wrote a bot to monitor svn commits and it had this problem and netstat would uncover many TIME_WAIT connections, still open but unused.

      Here are some docs that describe what tcp_tw_recycle does (taken from ip-sysctl.txt):

      tcp_tw_recycle - BOOLEAN
      	Enable fast recycling TIME-WAIT sockets. Default value is 0.
      	It should not be changed without advice/request of technical
      	experts.
      
      

        In Linux, in order to avoid running out of sockets you can use the following as root:

        Backwards. I'm not running out of sockets in Debian, even though I'm creating sockets 3x faster than on Windows.

        /proc/sys/net/ipv4/tcp_tw_recycle contains a zero. netstat -a shows a bajillion sockets in TIME_WAIT.

        Why is it not running out of sockets?
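To put a number on "a bajillion", the netstat output can be tallied by state. This is a minimal sketch; `count_tcp_states` is a hypothetical helper, and it assumes the state is the last whitespace-separated field on each TCP line (true for plain `netstat -an` on Linux and Windows, but not when extra columns such as PIDs are requested). The LISTENING row in the sample is made up for illustration.

```perl
use strict;
use warnings;

# Tally TCP connection states from netstat-style text.
sub count_tcp_states {
    my( $text ) = @_;
    my %count;
    for my $line ( split /\n/, $text ) {
        # Only lines whose first field starts with "tcp" (any case).
        next unless $line =~ /^\s*tcp\S*\s/i;
        my @fields = split ' ', $line;
        ++$count{ $fields[-1] };    # ASSUMPTION: state is the last field
    }
    return \%count;
}

my $sample = <<'END';
  TCP    127.0.0.1:12347    tribble:49157    TIME_WAIT
  TCP    127.0.0.1:12347    tribble:49159    TIME_WAIT
  TCP    0.0.0.0:12347      tribble:0        LISTENING
END

my $states = count_tcp_states( $sample );
printf "%-12s %d\n", $_, $states->{$_} for sort keys %$states;

# Against a live system (not run here):
# my $live = count_tcp_states( scalar qx(netstat -an) );
```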

Re: Socket hang. (Windows or Perl? Solutions?) (Updated)
by Corion (Pope) on Apr 06, 2011 at 09:07 UTC

    Searching for TIME_WAIT turned up this Winsock Programmer's FAQ, which claims that sockets in the TIME_WAIT state are to be expected. IBM suggests some registry keys to change the Windows network stack behaviour to allow for _more_ connections.

    Completely unfounded speculation: This might work on Linux/Unix but runs out of sockets on Windows maybe because Windows does not keep track of which TCP connections have both ends local, while Unix keeps track. Then Windows would need to keep those sockets in the waiting state, while Unix knows that it can ignore the issue of outstanding packets as all transfers were local to the machine anyway.

      Hm. I've tried setting SO_DONTLINGER on the sockets at both ends, and adjusting the registry entries, but they still persist far longer than is any good for high frequency connections.

      I think that the real solution may be to use UDP rather than TCP. UDP should be reliable enough within the same box that I shouldn't need to re-engineer a low latency protocol on top.
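For comparison, a minimal sketch of the same send-one, echo-one exchange over UDP on loopback. It runs both ends in a single process for brevity rather than in threads as the original code does; `udp_round_trip` is a hypothetical name, and port 0 is used so the OS picks a free port. With no connection there is no handshake and no TIME_WAIT entry per exchange, but nothing here handles a lost datagram.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# One UDP request/reply round trip over loopback, in a single process.
sub udp_round_trip {
    my( $msg ) = @_;

    # LocalPort => 0 asks the OS for any free port.
    my $server = IO::Socket::INET->new(
        Proto     => 'udp',
        LocalAddr => '127.0.0.1',
        LocalPort => 0,
    ) or die "server socket: $!";

    my $client = IO::Socket::INET->new(
        Proto    => 'udp',
        PeerAddr => '127.0.0.1',
        PeerPort => $server->sockport,
    ) or die "client socket: $!";

    $client->send( $msg ) or die "client send: $!";

    # recv returns the sender's packed address, used to address the reply.
    my $peer = $server->recv( my $in, 1024 );
    defined $peer or die "server recv: $!";
    $server->send( "echod:$in", 0, $peer ) or die "server send: $!";

    defined $client->recv( my $echo, 1024 ) or die "client recv: $!";
    return $echo;
}

print udp_round_trip( 1 ), "\n";
```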


        I hesitate to provide a windows reference, given my lack of experience in TCP programming in that environment. That said, however, this page may have something worth trying. According to this, you actually don't want to set SO_DONTLINGER, but rather SO_LINGER, and set the l_linger field to 0. This will force a close via an RST rather than wait for full FIN processing. Maybe you have already tried this. This *could* also result in data not being received, based on timing, but I would not expect that to be the case on a local system.
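A sketch of what setting SO_LINGER with l_linger = 0 might look like from Perl. The pack template is the uncertain part: the layout of struct linger is platform-specific, and the choice below (two ints on Unix-like systems, two unsigned shorts under Winsock) is an assumption to verify against your platform's headers. `set_linger_zero` and `get_linger` are hypothetical helpers. The option would go on whichever end closes first, since that is the end that otherwise lands in TIME_WAIT.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Socket qw( SOL_SOCKET SO_LINGER );

# ASSUMPTION: struct linger is two C ints on Linux/BSD and two unsigned
# shorts under Winsock.
my $LINGER_FMT = $^O eq 'MSWin32' ? 'SS' : 'ii';

# Request an abortive close: l_onoff = 1, l_linger = 0.
sub set_linger_zero {
    my( $sock ) = @_;
    return $sock->setsockopt( SOL_SOCKET, SO_LINGER, pack( $LINGER_FMT, 1, 0 ) );
}

# Read the setting back as ( l_onoff, l_linger ).
sub get_linger {
    my( $sock ) = @_;
    return unpack $LINGER_FMT, $sock->getsockopt( SOL_SOCKET, SO_LINGER );
}

my $listener = IO::Socket::INET->new(
    Listen => 5, LocalAddr => '127.0.0.1', LocalPort => 0,
) or die "listen: $!";

my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1', PeerPort => $listener->sockport,
) or die "connect: $!";

set_linger_zero( $client ) or die "setsockopt: $!";
my( $on, $secs ) = get_linger( $client );
print "l_onoff=", ( $on ? 1 : 0 ), " l_linger=$secs\n";

close $client;    # with linger 0, this close is expected to reset rather than FIN
```

As the reply above notes, a reset can discard data still in flight, so this only suits exchanges where the reply has already been read before closing.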

        fnord

        If you are in search of a fast IPC facility, maybe ZeroMQ (also written 0MQ) is of interest. It claims to provide quick connections, both in- and inter-process, and even across machines.

        The test failures are somewhat weird. It is developed by tsee, so I presume there is an actual use case behind it.

Node Type: perlquestion [id://897591]
Approved by toolic
Front-paged by Corion