sdyates has asked for the wisdom of the Perl Monks concerning the following question:

Okay folks. I want to force the close of sockets that are in close-wait status. Basically, I have some 100 connections which are in close-wait status and will not get a formal close confirmation from the source. I need to close these sockets or the server will eventually stop performing, as too many sockets will be open. These are all TCP connections.

I have not been able to find a utility that can do this and am not sure of the best approach. I want to be able to grep all sockets attached to a specific port.

That part is easy with a system() call: first run nbtstat or netstat to grab a socket report, then grep the report for the given port. However, I am not sure how to kill the socket; netstat does not appear to support this.
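A minimal sketch of the filtering step, assuming Linux-style netstat -an output (protocol, two queue counters, local address, foreign address, state). The column positions differ on other systems, and the helper name close_wait_on_port and port 8080 are only illustrations. This finds the sockets; it does not close them.

```perl
use strict;
use warnings;

# Return the netstat lines whose local address ends in :$port and whose
# state is CLOSE_WAIT. Assumes the six-column Linux "netstat -an" layout.
sub close_wait_on_port {
    my ($port, @lines) = @_;
    my @hits;
    for my $line (@lines) {
        my @f = split ' ', $line;
        next unless @f >= 6 && $f[0] =~ /^tcp/i;
        my ($local, $state) = @f[3, 5];
        next unless $state eq 'CLOSE_WAIT';
        next unless $local =~ /:\Q$port\E$/;
        push @hits, $line;
    }
    return @hits;
}

# Hypothetical usage (port 8080 is an arbitrary example):
# my @stuck = close_wait_on_port(8080, `netstat -an`);
# print scalar(@stuck), " sockets in CLOSE_WAIT\n";
```

Backticks are used in the usage line rather than system() because system() returns the exit status, not the command's output.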

Has anyone had any experience with forcing the close of a socket before?
Are there any pitfalls to this?
Should I be trying something else?

Thanks,
Simon

Replies are listed 'Best First'.
Re: CLOSE-WAIT sockets
by beernuts (Pilgrim) on Apr 02, 2002 at 02:35 UTC
    Simon,

    You didn't mention the operating system under which your code is running. If it's Solaris, you can tweak the time_wait interval with ndd (add it to an /etc/init.d script if you'd like) like this:

    ndd -set /dev/tcp tcp_time_wait_interval 30000

    Where 30000 is the interval in milliseconds. Everything I've read seems to show this as the bare minimum you can get away with without generating problems. If you need a few more filehandles, you can set that with plimit:

    plimit -n 4096,1024 $PID

    where $PID is the process id of your parent. Or, you can add this to /etc/system (and reboot) to affect all processes:

    set rlim_fd_max=4096
    set rlim_fd_cur=1024


    Check out the Tunable TCP/IP Parameters section or the General I/O section of the Tunable Parameters Reference Manual for more info. While neither of these will close your sockets, they'll give you a bit more headroom so that your sockets can time out on their own without causing you grief. Best of luck.

    -beernuts

    Edit (4/1) - removed redundant 'Reference Manual' text
Re: CLOSE-WAIT sockets
by jepri (Parson) on Apr 01, 2002 at 22:48 UTC
    At the last ISP I worked at, the C programmers spent a few days trying to figure out how to kill off open connections, and couldn't find a way. This doesn't mean it ain't there, but it's well hidden.

    As for lowering the close timeout, you can recompile your kernel (assuming you have the source :) and change the socket timeout to anything you want. I don't know the effects of changing the timeout, but I feel a great inertia when it comes to monkeying with TCP.

    It is actually fairly easy to fake the responses by monitoring the connections, and then 'injecting' a fake close response at the right time. However, this is an unappealing practice.

    ____________________
    Jeremy
    I didn't believe in evil until I dated it.

Re: CLOSE-WAIT sockets
by xeh007 (Sexton) on Apr 02, 2002 at 02:11 UTC
    You can't do this with Perl, at least not without spoofing packets or some similar hack. If you're running a recent version of Linux, some TCP timeouts are configurable under the /proc/sys/net directory. Specifically, I think the file is tcp_fin_timeout (in /proc/sys/net/ipv4), but someone else may want to check me on that: as far as I know it governs the FIN-WAIT-2 state rather than CLOSE-WAIT.
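To see what that setting currently is, here is a minimal sketch, assuming a Linux box where the tunable lives at /proc/sys/net/ipv4/tcp_fin_timeout. The full path and the helper name read_tunable are assumptions added for illustration. It only reads the value, which the kernel reports in seconds.

```perl
use strict;
use warnings;

# Read a single-line kernel tunable from a /proc-style file.
# Returns undef if the file is missing or unreadable (e.g. not on Linux).
sub read_tunable {
    my ($path) = @_;
    open my $fh, '<', $path or return;
    my $value = <$fh>;
    close $fh;
    return unless defined $value;
    chomp $value;
    return $value;
}

# Assumed location of the tunable on Linux.
my $path = '/proc/sys/net/ipv4/tcp_fin_timeout';
my $secs = read_tunable($path);
print defined $secs ? "tcp_fin_timeout is $secs seconds\n"
                    : "could not read $path\n";
```

Changing the value needs root and affects every connection on the machine, so it is worth reading it first.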
Re: CLOSE-WAIT sockets
by traveler (Parson) on Apr 03, 2002 at 21:32 UTC
    To oversimplify the TCP state diagram, CLOSE-WAIT means that the other end of the socket is closed and your application needs to close its end. I do not know what OS you are using, but your OS is supposed to "notify" you of this condition. The way that is generally done is for you to receive an EOF if you try to read from one of these sockets. If you do the read, receive the EOF and then close the socket, the socket shutdown (graceful close) will complete and your sockets will "disappear."
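The read-then-close step can be sketched as follows. This is only an illustration: the helper name close_if_peer_gone is made up, and a local socketpair stands in for a real TCP connection so the example runs on its own.

```perl
use strict;
use warnings;
use Socket;

# Read from $sock; if the peer has closed, close our end too.
# Returns 1 if we closed the socket, 0 if data arrived, undef on read error.
sub close_if_peer_gone {
    my ($sock) = @_;
    my $n = sysread($sock, my $buf, 4096);
    return undef unless defined $n;   # read error; inspect $!
    return 0 if $n > 0;               # real data; a real server would use $buf
    close $sock;                      # $n == 0 means EOF: the peer has closed
    return 1;
}

# Demonstration with a local socket pair standing in for a TCP connection.
socketpair(my $ours, my $theirs, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
close $theirs;                        # simulate the remote end closing
print "closed our end\n" if close_if_peer_gone($ours);
```

Note that sysread blocks when the peer is still open and has sent nothing, so in a real server you would call this only on handles that select (or IO::Select) reports as readable. A socket whose peer has closed is reported as readable, and the read then returns 0.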

    If the sockets in question do not return EOF, you will probably need to do something radical like setting the TCP keepalive parameters such that a connection to a non-responding peer is closed. I know Linux and BSD can do that, but IIRC it is on a system-wide basis. You may not want that.
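For what it's worth, the keepalive on/off switch can be set on an individual socket with the standard SO_KEEPALIVE option; it is the probe timing that defaults to system-wide settings. A minimal sketch, where the helper name enable_keepalive is made up and the socket is a fresh, unconnected one used only for demonstration:

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOL_SOCKET SO_KEEPALIVE IPPROTO_TCP);

# Turn on TCP keepalive probes for one socket.
sub enable_keepalive {
    my ($sock) = @_;
    # An integer option value is shorthand for pack("i", 1).
    setsockopt($sock, SOL_SOCKET, SO_KEEPALIVE, 1)
        or die "setsockopt(SO_KEEPALIVE): $!";
    return;
}

socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP)
    or die "socket: $!";
enable_keepalive($sock);

# getsockopt returns a packed integer; non-zero means the option is set.
my $flag = getsockopt($sock, SOL_SOCKET, SO_KEEPALIVE);
print "keepalive is ", (unpack('i', $flag) ? "on" : "off"), "\n";
```

In a server you would call this on each handle returned by accept. How long the kernel waits before probing, and how many probes it sends, still comes from the OS defaults unless your platform offers per-socket overrides.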

    HTH, --traveler