PerlMonks  

Re: Proper way to thread this in PERL. (mech lwp asynchronous)

by Anonymous Monk
on Dec 25, 2013 at 23:32 UTC ( #1068377=note )


in reply to Proper way to thread this in PERL.

The idea is to spawn the maximum number of threads once :) and to use proper scoping and argument passing.

Re^9: Async DNS with LWP/Re^13: Async DNS with LWP, Re^4: Perl/Tk vs Win32::SerialPort (stash, dispatch, queue), Re: Challenge: Perl 5: lazy sameFringe()?, Re^5: Consumes memory then crashs
threads::Q

This code is untested, but I've used this pattern before.

    #!/usr/bin/perl --
    use threads;
    use threads::Q;   ## the queue module linked above

    Main( @ARGV );
    exit( 0 );

    sub Main {
        ...; ## GetOpt::Long / GetOpt::Declare ...
        UrlFile_FetchingThreads( '/f/o/o/bar.txt', 4 );
        ## UrlFile_FetchingThreads( $filename, $maxthreads );
    }

    sub UrlFile_FetchingThreads {
        my( $filename, $maxthreads ) = @_;
        my @urls = GetUrls( $filename );
        my $qin  = threads::Q->new();
        my $qout = threads::Q->new();
        $qin->nq( @urls );

        ## spawn all workers once; each one detaches itself in tryHTTP,
        ## so it must not be detached a second time here
        for ( 1 .. $maxthreads ){
            threads->create( \&tryHTTP, $qin, $qout );
        }

        my $quitter = 0;
        while( 1 ){
            if( my $res = $qout->dq_nb ){
                doSomething( $res );
            }
            sleep 1;

            ### this part not used before, completely untested, probably too complex
            if( not $qin->cq ){
                $quitter++;
            } else {
                $quitter = 0;
            }
            if( not $qout->cq ){
                if( $quitter > 5 ){
                    die "FINISHED, no more urls to process, no more responses to process\n";
                }
            }
        }
    }

    sub tryHTTP {
        threads->detach(); ## can't join me :)
        my( $qin, $qout ) = @_;
        while( 1 ){
            my( $url ) = $qin->dq;
            ...; ## elided; $lwp has to be set up somewhere before this point
            my $res = $lwp->get( $url );
            $qout->nq( $res );
        }
        return;
    }
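
If you don't have threads::Q handy, here is a minimal sketch of the same spawn-once, pass-the-queues-as-arguments pattern using the core Thread::Queue module instead. That swap is my assumption, not what the code above uses. fetch_url and fetch_all are hypothetical names, and fetch_url is only a stand-in that returns a string so the sketch runs without a network. Unlike the detached workers above, these workers are joined, and an undef in the input queue serves as the stop marker.

```perl
#!/usr/bin/perl --
use strict;
use warnings;
use threads;
use Thread::Queue;

## Hypothetical stand-in for the real fetch; swap in something like
## LWP::UserAgent->new->get( $url )->status_line
sub fetch_url {
    my( $url ) = @_;
    return "fetched $url";
}

sub worker {
    my( $qin, $qout ) = @_;
    ## dequeue blocks until an item arrives; undef is our stop marker
    while( defined( my $url = $qin->dequeue ) ){
        $qout->enqueue( fetch_url( $url ) );
    }
    return;
}

sub fetch_all {
    my( $maxthreads, @urls ) = @_;
    my $qin  = Thread::Queue->new;
    my $qout = Thread::Queue->new;

    ## spawn the pool once, handing each worker both queues as arguments
    my @pool = map { threads->create( \&worker, $qin, $qout ) } 1 .. $maxthreads;

    $qin->enqueue( @urls );
    $qin->enqueue( (undef) x $maxthreads );   ## one stop marker per worker

    $_->join for @pool;    ## every worker has exited, so $qout is complete

    my @results;
    while( defined( my $res = $qout->dequeue_nb ) ){
        push @results, $res;
    }
    return @results;
}

my @results = fetch_all( 4, map { "http://example.com/$_" } 1 .. 10 );
print "$_\n" for sort @results;
```

Results come back in whatever order the workers finish, hence the sort before printing. Joining before draining $qout is fine for a short URL list; for a long one you would probably want to drain $qout while the workers are still running, as the loop above does.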

Re: simple multithreading with curl
Re: Perl crashing with Parallel::ForkManager and WWW::Mechanize
Re^3: Using LWP instead of wget?
Re^3: Fast fetching of HTML response code
Re^2: Need help with Perl multi threading
LWP::Parallel::UserAgent
LWP::Concurrent
Re^10: Consumes memory then crashs
Re^9: Consumes memory then crashs
Re: Are there any memory-efficient web scrapers?
Your main event may be another's side-show.
Re: Perl threads to open 200 http connections
Re^3: trying to get timeout to work (easier with threads)

