A 1 Gbps connection was saturated by downloading only 32 web pages at a time?
Instantaneously, yes. And for a substantial proportion of the time, assuming the servers we were connected to, and their connections, were able to supply their data at the required rate.
Obviously, if the mix of servers at any given point in time were all 386-class machines in people's back bedrooms, connected via 14.4k modems--not so uncommon back then--then throughput falls off. But usually you had a random mix of good and bad servers, and that was sufficient to max out the available bandwidth.
Remember, I mentioned the 1Gbps was shared--if I remember correctly, by 20 other hosts. Mostly they seemed to be using very little of the bandwidth. Probably low-volume websites running "Mum&Pop's Potpourri Emporium Inc." or "HaKzOr23's CRucIal SeCurITy SiGht". We weren't party to what they were, or what bandwidth they were using, but the hoster's ControlPanel app showed us our usage, for which we were billed.
By way of a convincer: the following two trivial scripts run as two servers and two clients on my 4-CPU machine. I set the affinities so that two cores run the two server threads, and two the two client threads. All they do is connect to each other and shovel large lumps of data from server to client as fast as they can:
Server:
#! perl -slw
use strict;
use threads;
use threads::shared;
use IO::Socket;

$|++;

my $status1 :shared = 0;
my $status2 :shared = 0;

my $server1 = async {
    my $lsn = IO::Socket::INET->new(
        Listen => 5, LocalPort => 12345,
    ) or die "Failed to open listening port: $!\n";
    my $data = 'x' x 1024**2;            # 1 MB lump
    while( my $c = $lsn->accept ) {
        while( 1 ) {
            last unless print $c $data;  # stop once the client goes away
            ++$status1;
        }
        print "client disconnected";
    }
};

my $server2 = async {
    my $lsn = IO::Socket::INET->new(
        Listen => 5, LocalPort => 12346,
    ) or die "Failed to open listening port: $!\n";
    my $data = 'x' x 1024**2;            # 1 MB lump
    while( my $c = $lsn->accept ) {
        while( 1 ) {
            last unless print $c $data;  # stop once the client goes away
            ++$status2;
        }
        print "client disconnected";
    }
};

while( Win32::Sleep 100 ) {              # Win32::Sleep is built in to perl on Windows
    printf "\r$status1 : $status2";
}
Clients:

#! perl -slw
use strict;
use threads;
use threads::shared;
use IO::Socket;

$|++;

my $bytes1 :shared = 0;
my $bytes2 :shared = 0;

my $client1 = async {
    my $svr = IO::Socket::INET->new(
        'localhost:12345'
    ) or die "Failed to connect to port: $!\n";
    while( 1 ) {
        my $buffer = <$svr>;             # each lump arrives newline-terminated (-l)
        $bytes1 += length( $buffer );
    }
};

my $client2 = async {
    my $svr = IO::Socket::INET->new(
        'localhost:12346'
    ) or die "Failed to connect to port: $!\n";
    while( 1 ) {
        my $buffer = <$svr>;
        $bytes2 += length( $buffer );
    }
};

my( $last1, $last2 ) = ( 0, 0 );
while( sleep 1 ) {
    my( $latest1, $latest2 ) = ( $bytes1, $bytes2 );
    printf "\rc1:%5d (%.3f Megabytes/second) c2:%5d (%.3f Megabytes/second)",
        $latest1, ( $latest1 - $last1 ) / 1024**2,
        $latest2, ( $latest2 - $last2 ) / 1024**2;
    ( $last1, $last2 ) = ( $latest1, $latest2 );
}
That data doesn't go via the internet (my broadband connection is 300kbps at best); but it does go via the TCP stack, and is therefore subject to all the handshaking, coalescing and buffering that a proper IP connection goes through.
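As an aside: the clients above get away with readline (<$svr>) only because the -l switch makes the servers' print append a newline to each 1 MB lump, giving readline a record terminator. For an arbitrary byte stream you'd read fixed-size blocks with sysread instead. A minimal, self-contained sketch of that idea (the ephemeral-port trick and the 16 MB transfer size are my choices, not from the scripts above; it uses fork rather than threads just to keep it to one file):

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listener on an ephemeral loopback port; the OS picks the port number.
my $lsn = IO::Socket::INET->new(
    Listen    => 5,
    LocalAddr => 'localhost',
    LocalPort => 0,
) or die "Failed to open listening port: $!";
my $port = $lsn->sockport;

defined( my $pid = fork ) or die "fork: $!";
if( $pid == 0 ) {
    # Child: accept one connection, shovel 16 x 1 MB lumps, then close.
    my $c = $lsn->accept or die "accept: $!";
    my $data = 'x' x 1024**2;
    print {$c} $data for 1 .. 16;
    close $c;
    exit 0;
}

# Parent: read fixed-size blocks until EOF; no record separator needed.
my $svr = IO::Socket::INET->new( PeerAddr => "localhost:$port" )
    or die "Failed to connect: $!";
my $bytes  = 0;
my $buffer;
$bytes += length $buffer while sysread $svr, $buffer, 64 * 1024;
waitpid $pid, 0;
print "received $bytes bytes\n";   # 16 * 1024**2 == 16777216
```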
The main thread in the clients script monitors the throughput on a per-second basis. Here's a typical snapshot of that:

C:\test>junk62
c1:12559855306 (54.000 Megabytes/second) c2:12407811641 (53.000 Megabytes/second)
The CPU usage whilst all that data is flying about is about 12% each for the servers, and 5% each for the clients. The throughput varies up and down a bit, between say 50 MBytes/s and 58 MBytes/s, but 53/54 is the norm.
Remember, for 32 threads to sustain a combined throughput of 1Gbps (~100 MBytes/s), each thread has only to achieve a little over 3 MBytes/s.
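That arithmetic is trivial, but worth seeing laid out (the 100 MBytes/s figure is the rough post-overhead number used above, not the raw 125 MBytes/s of a 1 Gbps link):

```perl
use strict;
use warnings;

# Per-thread rate needed to saturate the link, using the rough figure
# of 100 MBytes/s for the (shared) 1 Gbps connection after overheads.
my $aggregate_MBps = 100;     # assumed aggregate target
my $threads        = 32;      # concurrent downloads
my $per_thread     = $aggregate_MBps / $threads;

printf "each of %d threads needs %.3f MBytes/s\n", $threads, $per_thread;
# prints: each of 32 threads needs 3.125 MBytes/s
```

Even a thoroughly mediocre server of that era could supply 3 MBytes/s.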
Obviously, overall throughput at any given point will depend upon the mix of large and small files; good and bad servers; general network load; number of hops; and myriad other factors. But throwing ever more threads at the problem has rapidly diminishing returns. 4 threads per CPU seemed optimal on that system at that time. 8 per CPU sometimes improved overall throughput, but that was mostly negated by the effects of thrashing the disks harder by writing to twice as many files concurrently.
That's why I say that you have to consider the complete system. And also why async DNS doesn't make much difference.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.