Folks,
I'm looking for suggestions on how I might improve the efficiency of a program I use which does non-blocking HTTP I/O, often with 1000+ open sockets.
The central action of the program is characterized by the simplified code snippet that follows later. My thanks to liverpole for reminding me of the Perl module IO::Select, which I had used before but had not used in this code, which I inherited from its original author.
I suspect that much of the time is taken by checking socket availability too often. I am hoping for a way to limit my calls to IO::Select::can_read so that they are made only when there is pending I/O. I have been unsuccessful in finding such a mechanism.
Is there any mechanism to implement the following pseudocode more efficiently than just calling IO::Select's can_read() every time one needs to check whether any socket I/O is pending?
$SIG{INTERUPT_ON_PENDING_SOCKET_IO} = \&ckSockets;
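For comparison with the signal idea above, one alternative is to let can_read() block with a nonzero timeout, so the process sleeps until some socket is readable instead of polling with 0. This is only a minimal sketch, assuming the main loop can afford to wait for up to the timeout; wait_and_process and its $on_ready callback are hypothetical names, not part of my real code.

```perl
use strict;
use warnings;
use IO::Select;

# Wait up to $timeout seconds for any handle registered in $sel to become
# readable, pass each ready handle to the $on_ready callback, and return
# the number of ready handles (0 if the timeout expired first).
sub wait_and_process {
    my ($sel, $timeout, $on_ready) = @_;
    my @ready = $sel->can_read($timeout);
    $on_ready->($_) for @ready;
    return scalar @ready;
}
```

Called as wait_and_process($io_select_obj, 0.25, \&some_handler), it would return as soon as any socket has data, or after a quarter second with 0 if nothing arrived. Whether blocking like this is acceptable depends on what else the loop has to do.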
Another optimization possibility
Even though we may have 1000+ open sockets, the activity at any one time is sparse. I've been speculating about going back to the bit vector version of select and scanning the 1000+ bit vector 32 bits at a time, skipping words that are all zero. I'm not optimistic about this approach. For all I know, the implementer of IO::Select may already do this.
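To make the bit vector idea concrete, here is a rough sketch using the four-argument select and vec. The helper names build_rvec and ready_fds are made up for illustration, and it assumes every handle has a real file descriptor. It polls once with a timeout of 0, then walks the result vector 32 bits (4 bytes) at a time and only inspects individual bits in words that are not all zero.

```perl
use strict;
use warnings;

# Build a read bit vector with one bit set per handle's file descriptor.
sub build_rvec {
    my @handles = @_;
    my $rvec = '';
    vec($rvec, fileno($_), 1) = 1 for @handles;
    return $rvec;
}

# Poll once (timeout 0) and return the ready file descriptor numbers.
# All-zero 32-bit words of the result vector are skipped without
# testing their individual bits.
sub ready_fds {
    my ($rvec) = @_;
    my $rout = $rvec;                      # select() modifies its argument
    my $nfound = select($rout, undef, undef, 0);
    return () unless defined $nfound && $nfound > 0;

    # Pad to a whole number of 4-byte words so substr always sees 4 bytes.
    my $nwords = int((length($rout) + 3) / 4);
    $rout .= "\0" x ($nwords * 4 - length $rout);

    my @ready;
    for my $w (0 .. $nwords - 1) {
        next if substr($rout, $w * 4, 4) eq "\0\0\0\0";
        for my $fd ($w * 32 .. $w * 32 + 31) {
            push @ready, $fd if vec($rout, $fd, 1);
        }
    }
    return @ready;
}
```

The caller would still need its own map from file descriptor number back to socket object. Note that this only speeds up the scan of the result; the select call itself still has to examine every descriptor in the vector.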
Highly simplified version of my current code
use IO::Select;
...
# Check for and process any pending socket input.
# Avoid stepping on toes by keeping a running list of ready
# sockets and processing it until empty.
sub ckSockets { # Returns: # of ready sockets, so we can tell activity
    ...
    my @breadys = $io_select_obj->can_read(0);
    foreach my $fd_key (@breadys) {
        # ...read and process data from this socket...
    }
    ...
}
Looking at the top of the profiled run below, we see that the time appears to be dominated by calls to the socket test:
[root@ibm-blade-blade0 testbuddy]# time dprofpp
Total Elapsed Time = 1790.060 Seconds
User+System Time = 1315.990 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
55.9 736.8 753.00 317833 0.0002 0.0002 IO::Select::can_read
9.05 119.1 256.27 363021 0.0000 0.0001 BuddyUsers::log
5.25 69.14 137.12 350999 0.0000 0.0000 tsprint::ts
5.17 67.97 67.979 350999 0.0000 0.0000 POSIX::strftime
Update: Updated to correct spelling on IO::Select, add '0' to can_read to reflect actual code.