A 1 Gbps connection was saturated by downloading only 32 web pages at a time?
Instantaneously, yes. And for a substantial proportion of the time, assuming the servers we were connected to, and their connections, were able to supply their data at the required rate.
Obviously, if the mix of servers at any given point in time were all 386-class machines in people's back bedrooms, connected via 14.4k modems--not so uncommon back then--then throughput falls off. But usually you had a random mix of good and bad servers, and that was sufficient to max out the available bandwidth.
Remember, I mentioned the 1Gbps was shared--if I remember correctly, by 20 other hosts. Mostly they seemed to be using very little of the bandwidth. Probably low-volume websites running "Mum&Pop's Potpourri Emporium Inc." or "HaKzOr23's CRucIal SeCurITy SiGht". We weren't party to what they were, or what bandwidth they were using, but the hoster's ControlPanel app showed us our usage, for which we were billed.
By way of a convincer: the following two trivial scripts run as two servers and two clients on my 4-CPU machine. I set the affinities so that two cores run the two server threads, and two the two client threads. All they do is connect to each other and shovel large lumps of data from server to client as fast as they can:
Server:
#! perl -slw
use strict;
use threads;
use threads::shared;
use IO::Socket;

$|++;

my $status1 :shared = 0;
my $status2 :shared = 0;

my $server1 = async {
    my $lsn = IO::Socket::INET->new(
        Listen => 5, LocalPort => 12345,
    ) or die "Failed to open listening port: $!\n";
    my $data = 'x' x 1024**2;            # 1 MB lump
    while( my $c = $lsn->accept ) {
        while( 1 ) {
            last unless print $c $data;  # stop once the client goes away
            ++$status1;
        }
        print "client disconnected";
    }
};

my $server2 = async {
    my $lsn = IO::Socket::INET->new(
        Listen => 5, LocalPort => 12346,
    ) or die "Failed to open listening port: $!\n";
    my $data = 'x' x 1024**2;            # 1 MB lump
    while( my $c = $lsn->accept ) {
        while( 1 ) {
            last unless print $c $data;  # stop once the client goes away
            ++$status2;
        }
        print "client disconnected";
    }
};

while( Win32::Sleep 100 ) {              # Win32::Sleep is built in to perl on Windows
    printf "\r$status1 : $status2";
}
Clients:

#! perl -slw
use strict;
use threads;
use threads::shared;
use IO::Socket;

$|++;

my $bytes1 :shared = 0;
my $bytes2 :shared = 0;

my $client1 = async {
    my $svr = IO::Socket::INET->new(
        'localhost:12345'
    ) or die "Failed to connect to port: $!\n";
    while( 1 ) {
        my $buffer = <$svr>;             # each lump arrives newline-terminated (-l)
        $bytes1 += length( $buffer );
    }
};

my $client2 = async {
    my $svr = IO::Socket::INET->new(
        'localhost:12346'
    ) or die "Failed to connect to port: $!\n";
    while( 1 ) {
        my $buffer = <$svr>;
        $bytes2 += length( $buffer );
    }
};

my( $last1, $last2 ) = ( 0, 0 );
while( sleep 1 ) {
    my( $latest1, $latest2 ) = ( $bytes1, $bytes2 );
    printf "\rc1:%5d (%.3f Megabytes/second) c2:%5d (%.3f Megabytes/second)",
        $latest1, ( $latest1 - $last1 ) / 1024**2,
        $latest2, ( $latest2 - $last2 ) / 1024**2;
    ( $last1, $last2 ) = ( $latest1, $latest2 );
}
That data doesn't go via the internet (my broadband connection is 300kbps at best); but it does go via the TCP stack, and is therefore subject to all the handshaking, coalescing and buffering that a proper IP connection goes through.
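As an aside: the clients above get away with readline (<$svr>) only because the -l switch makes the servers' print append a newline to each 1 MB lump, giving readline a record terminator. For an arbitrary byte stream you'd read fixed-size blocks with sysread instead. A minimal, self-contained sketch of that idea (the ephemeral-port trick and the 16 MB transfer size are my choices, not from the scripts above; it uses fork rather than threads just to keep it to one file):

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listener on an ephemeral loopback port; the OS picks the port number.
my $lsn = IO::Socket::INET->new(
    Listen    => 5,
    LocalAddr => 'localhost',
    LocalPort => 0,
) or die "Failed to open listening port: $!";
my $port = $lsn->sockport;

defined( my $pid = fork ) or die "fork: $!";
if( $pid == 0 ) {
    # Child: accept one connection, shovel 16 x 1 MB lumps, then close.
    my $c = $lsn->accept or die "accept: $!";
    my $data = 'x' x 1024**2;
    print {$c} $data for 1 .. 16;
    close $c;
    exit 0;
}

# Parent: read fixed-size blocks until EOF; no record separator needed.
my $svr = IO::Socket::INET->new( PeerAddr => "localhost:$port" )
    or die "Failed to connect: $!";
my $bytes  = 0;
my $buffer;
$bytes += length $buffer while sysread $svr, $buffer, 64 * 1024;
waitpid $pid, 0;
print "received $bytes bytes\n";   # 16 * 1024**2 == 16777216
```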
The main thread in the clients script monitors the throughput on a per-second basis. Here's a typical snapshot of that:

C:\test>junk62
c1:12559855306 (54.000 Megabytes/second) c2:12407811641 (53.000 Megabytes/second)
The CPU usage whilst all that data is flying about is about 12% each for the servers, and 5% each for the clients. The throughput varies up and down a bit, between say 50 MBytes/s and 58 MBytes/s, but 53/54 is the norm.
Remember, for 32 threads to sustain a combined throughput of 1Gbps (~100 MBytes/s), each thread has only to achieve a little over 3 MBytes/s.
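That arithmetic is trivial, but worth seeing laid out (the 100 MBytes/s figure is the rough post-overhead number used above, not the raw 125 MBytes/s of a 1 Gbps link):

```perl
use strict;
use warnings;

# Per-thread rate needed to saturate the link, using the rough figure
# of 100 MBytes/s for the (shared) 1 Gbps connection after overheads.
my $aggregate_MBps = 100;     # assumed aggregate target
my $threads        = 32;      # concurrent downloads
my $per_thread     = $aggregate_MBps / $threads;

printf "each of %d threads needs %.3f MBytes/s\n", $threads, $per_thread;
# prints: each of 32 threads needs 3.125 MBytes/s
```

Even a thoroughly mediocre server of that era could supply 3 MBytes/s.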
Obviously, overall throughput at any given point will depend upon the mix of large and small files; good and bad servers; general network load; number of hops; and myriad other factors. But throwing ever more threads at the problem has rapidly diminishing returns. 4 threads per CPU seemed optimal on that system at that time. 8 per CPU sometimes improved overall throughput, but that was mostly negated by the effects of thrashing the disks harder by writing to twice as many files concurrently.
That's why I say that you have to consider the complete system. And also why async DNS doesn't make much difference.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.