PerlMonks  

LWP::Parallel::UserAgent timeouts

by perlmonkey2 (Beadle)
on Nov 17, 2006 at 15:41 UTC (#584740, perlquestion)
perlmonkey2 has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks, I've been using LWP::Parallel::UserAgent for quite some time now for an academic webminer, but I have a problem I just cannot get around. After the application has been running for some time, all the open sockets fail with a (timeout). I'm not sure what is going on or how to correct it. My application works by implementing a callback for on_return, which parses the HTML and registers any new URLs that might be of interest. I have my own DNS caching server, and I pull at most one file per domain every five minutes. Here is my constructor:
my $ua = Spider::LWP->new($depth, $path, $max_sockets, $ignore, $exclude);
$ua->duplicates(0);           # don't ignore duplicates here as this is done in the subclass a gazillion times more efficiently
$ua->cookie_jar({});          # where else would you store cookies?
$ua->redirect(1);             # follow redirects. HACKED BASE LIBRARY to make this work with the subclass.
$ua->in_order(1);             # do the urls in order, as we randomize their entry into the queue.
$ua->remember_failures(0);    # don't remember failures here as the lib stores the entire object. This is done in the subclass
$ua->max_hosts($max_sockets); # max open requests at any given moment
$ua->max_req(1);              # max requests per host
$ua->nonblock(1);             # don't block on LWP::UserAgent socket reads
Then I register the starting URLs and wait:
$ua->wait(300); # block until we are all finished or until everything has stopped for 5 minutes
Can anyone see anything wrong with this? My guess, and maybe I'm way off course here, is that for some reason one socket gets blocked and times out, and that this timeout causes all the other sockets to time out as well.
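
For readers following along: the per-domain rule mentioned in the question (at most one fetch per domain every five minutes) could be sketched roughly as below. This is only an illustration under assumptions, not the poster's actual Spider::LWP code. The HostThrottle package, its allow and host_of methods, and the min_gap parameter are all hypothetical names, and the host extraction is a simple regex rather than a full URL parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP::Parallel::UserAgent): remembers the
# last fetch time per host and reports whether a new fetch is allowed yet.
package HostThrottle;

sub new {
    my ($class, %args) = @_;
    my $self = {
        min_gap => $args{min_gap} // 300,   # seconds required between fetches per host
        last    => {},                      # host => epoch time of last allowed fetch
    };
    return bless $self, $class;
}

# Pull the lowercased host out of an http(s) URL; undef if there is none.
sub host_of {
    my ($self, $url) = @_;
    return $url =~ m{^https?://(?:[^/@]*@)?([^/:?#]+)}i ? lc $1 : undef;
}

# Returns 1 and records the fetch if the host has not been fetched within
# min_gap seconds; returns 0 otherwise. $now is optional (defaults to time).
sub allow {
    my ($self, $url, $now) = @_;
    $now //= time;
    my $host = $self->host_of($url);
    return 0 unless defined $host;
    my $last = $self->{last}{$host};
    return 0 if defined $last && $now - $last < $self->{min_gap};
    $self->{last}{$host} = $now;
    return 1;
}

package main;

my $throttle = HostThrottle->new(min_gap => 300);
for my $url ('http://example.com/a', 'http://example.com/b', 'http://example.org/') {
    printf "%-24s %s\n", $url, $throttle->allow($url) ? 'fetch now' : 'defer';
}
```

A check like this would presumably sit in front of the call that registers a new URL, so that a deferred URL goes back into the queue instead of opening a socket.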
