ryantate has asked for the wisdom of the Perl Monks concerning the following question:
If that weren't enough, the dang thing can download 27 Web pages in less than three seconds.
But there's one weakness to POE::Component::Client::HTTP no one seems to mention, which is that it cannot time out Web connections. In fact, while timeouts are seemingly utilized in many of the above examples, the current http client docs explicitly state, under a BUGS section at the very end:
The following spawn() parameters are accepted but not yet implemented: Timeout.
Instead, what the POE http client does is trigger the "response" event for each session within the timeout period, but does not close out a session until the http connection is actually closed. So $poe_kernel->run will not return until all http sessions are done, even if the timeout is well past.
My question: What is the best way to enforce a timeout on POE::Component::Client::HTTP such that $poe_kernel->run will return at or very near the timeout period, instead of many seconds later?
The only solution I have is to call $kernel->stop within the response handler when the last session response is done. But the stop method is experimental and has some serious caveats.
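One alternative that might be worth trying is sketched below. It assumes the installed version of the component handles a 'shutdown' event, which newer releases document but which I have not verified for the version discussed here. The helper name fetch_with_deadline and the %status bookkeeping are hypothetical, made up for illustration. Instead of stopping the whole kernel, the session asks only the 'ua' component to shut down once the last response has arrived:

```perl
use strict;
use warnings;
use Time::HiRes qw( time );
use HTTP::Request;
use POE qw( Component::Client::HTTP );

# Hypothetical helper: request every host, then ask the 'ua' component
# to shut down once the last response (real or timed out) has arrived.
sub fetch_with_deadline {
    my ( $timeout, @hosts ) = @_;
    my $start = time;
    my %status;    # request URI => HTTP status code

    POE::Component::Client::HTTP->spawn(
        Alias   => 'ua',
        Timeout => $timeout,
    );

    POE::Session->create(
        inline_states => {
            _start => sub {
                my ( $kernel, $heap ) = @_[ KERNEL, HEAP ];
                $heap->{left} = scalar @hosts;
                for my $host (@hosts) {
                    $kernel->post( 'ua', 'request', 'response',
                        HTTP::Request->new( GET => "http://$host/" ) );
                }
            },
            response => sub {
                my ( $kernel, $heap ) = @_[ KERNEL, HEAP ];
                my $request  = $_[ARG0]->[0];
                my $response = $_[ARG1]->[0];
                $status{ $request->uri } = $response->code;

                # Assumption: this version of the component understands
                # 'shutdown'; if it does not, the event is simply ignored.
                $kernel->post( 'ua', 'shutdown' ) if --$heap->{left} <= 0;
            },
        },
    );

    POE::Kernel->run;
    return ( time - $start, \%status );
}
```

Calling fetch_with_deadline( 5, 'www.google.com', 'www.yahoo.com' ) returns the elapsed time and a hash reference of status codes. Whether run returns promptly after the shutdown request depends on how the installed component tears down its open connections, so this needs to be timed rather than trusted.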
Some examples of POE::Component::Client::HTTP timeout in action:
The script (poe_delay.pl):
    use strict;
    use warnings;
    use Time::HiRes qw( time );
    use HTTP::Request;
    use POE qw(Component::Client::HTTP);

    my $start     = time;
    my $urls_left = 12;

    POE::Component::Client::HTTP->spawn(
        Alias           => 'ua',
        Timeout         => shift || 10,
        FollowRedirects => 2,
        Streaming       => 0,
    );

    POE::Session->create(
        inline_states => {
            _start   => \&client_start,
            response => \&response_handler,
        }
    );

    $poe_kernel->run;
    print 'Run done in: ', time - $start, " seconds.\n";
    exit 0;

    sub client_start {
        my $kernel = $_[KERNEL];
        while (<DATA>) {
            chomp;
            $kernel->post(
                'ua',          # posts to the 'ua' alias
                'request',     # posts to ua's 'request' state
                'response',    # which of our states will receive the response
                HTTP::Request->new('GET', "http://$_")
            );
        }
    }

    sub response_handler {
        my ($request_packet, $response_packet, $kernel, $heap) = @_[ARG0, ARG1, KERNEL, HEAP];
        $urls_left--;
        my $request_object  = $request_packet->[0];
        my $response_object = $response_packet->[0];
        if ($urls_left <= 0) {
            print 'Downloads done in: ', time - $start, " seconds.\n";
            # $kernel->stop;
        }
    }

    __DATA__
    www.google.com
    www.yahoo.com
    www.amazon.com
    www.ebay.com
    news.yahoo.com
    news.google.com
    www.msn.com
    www.slashdot.org
    www.indymedia.org
    www.sfgate.com
    www.nytimes.com
    www.cnn.com
Now if we run the above script with a timeout of 5 seconds, $poe_kernel->run does not return until well after the responses are all in:
    ryantate [507] perl -w poe_delay.pl 5
    Downloads done in: 5.41008615493774 seconds.
    Run done in: 16.4065310955048 seconds.
If we increase the timeout to 20, we see that the responses and $poe_kernel->run finish at the same time. I believe this is because all HTTP connections are closed within 20 seconds:
    ryantate [508] perl -w poe_delay.pl 20
    Downloads done in: 20.4675362110138 seconds.
    Run done in: 20.4807510375977 seconds.
Finally, if we uncomment the $kernel->stop line inside sub response_handler in the above script, we get:
    ryantate [513] perl -w poe_delay.pl 5
    Downloads done in: 5.35386991500854 seconds.
    Run done in: 5.36102390289307 seconds.
Re: Timing out POE http client
by rcaputo (Chaplain) on Mar 30, 2006 at 06:03 UTC
by ryantate (Friar) on Mar 30, 2006 at 23:47 UTC
by rcaputo (Chaplain) on Mar 31, 2006 at 00:38 UTC