Re: Proper way to thread this in PERL. (mech lwp asynchronous)

by Anonymous Monk
on Dec 25, 2013 at 23:32 UTC (#1068377)

in reply to Proper way to thread this in PERL.

The idea is to spawn the maximum number of threads once :) and to use proper scoping and argument passing.

Re^9: Async DNS with LWP/Re^13: Async DNS with LWP, Re^4: Perl/Tk vs Win32::SerialPort (stash, dispatch, queue), Re: Challenge: Perl 5: lazy sameFringe()?, Re^5: Consumes memory then crashs

This code is untested, but I've used this pattern before.

#!/usr/bin/perl --
use threads;
## threads::Q (nq/dq/dq_nb/cq) is not a core module; load it however you have it

Main( @ARGV );
exit( 0 );

sub Main {
    ... ; ## Getopt::Long / Getopt::Declare ...
    UrlFile_FetchingThreads ( '/f/o/o/bar.txt', 4 );
    ## UrlFile_FetchingThreads ( $filename, $maxthreads );
}

sub UrlFile_FetchingThreads {
    my( $filename, $maxthreads ) = @_;
    my @urls = GetUrls( $filename );
    my $qin  = threads::Q->new();
    my $qout = threads::Q->new();

    $qin->nq( @urls );

    ## spawn the workers once; each gets both queues as arguments
    for ( 1 .. $maxthreads ){
        my $tt = threads->create( \&tryHTTP, $qin, $qout );
        $tt->detach; ## can't join me :)
    }

    my $quitter = 0;
    while( 1 ){
        if( my $res = $qout->dq_nb ){
            doSomething( $res ); ## was $ret, which is never set
        }
        sleep 1;

        ### this part not used before, completely untested, probably too complex
        ## count consecutive passes in which the input queue was empty
        if( not $qin->cq ){
            $quitter++;
        } else {
            $quitter = 0;
        }
        if( not $qout->cq ){
            if( $quitter > 5 ){
                die "FINISHED, no more urls to process, no more responses to process\n";
            }
        }
    }
}

sub tryHTTP {
    ## already detached by the caller; detaching again here would die
    my( $qin, $qout ) = @_;
    while( 1 ){
        my( $url ) = $qin->dq;
        ... ## elided in the original; $lwp is expected to be set up here
        my $res = $lwp->get( $url );
        $qout->nq( $res );
    }
    return;
}
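For comparison, here is a minimal sketch of the same spawn-once worker pattern using the core Thread::Queue module in place of threads::Q. This is a swapped-in technique, not the code I have actually run in anger. It assumes a perl built with thread support. fake_fetch() is a hypothetical stand-in for the real LWP call, and run_pool() and worker() are names made up for this illustration. Instead of detaching and polling, the workers are told to stop with one undef sentinel each and are then joined.

```perl
#!/usr/bin/perl --
use strict;
use warnings;
use threads;
use Thread::Queue;

## Hypothetical stand-in for the real fetch (e.g. $lwp->get( $url ));
## it only measures the string so the sketch runs without a network.
sub fake_fetch {
    my( $url ) = @_;
    return length $url;
}

## Each worker gets both queues as arguments, no globals.
sub worker {
    my( $qin, $qout ) = @_;
    ## an undef item is the "no more work" sentinel
    while( defined( my $url = $qin->dequeue ) ){
        $qout->enqueue( join "\t", $url, fake_fetch( $url ) );
    }
    return;
}

## Spawn $maxthreads workers once, feed them @urls, collect the results.
sub run_pool {
    my( $maxthreads, @urls ) = @_;
    my $qin  = Thread::Queue->new();
    my $qout = Thread::Queue->new();

    my @workers = map {
        threads->create( \&worker, $qin, $qout );
    } 1 .. $maxthreads;

    $qin->enqueue( @urls );
    $qin->enqueue( (undef) x $maxthreads ); ## one sentinel per worker

    $_->join for @workers; ## every result is queued once all have exited

    my %result;
    while( defined( my $item = $qout->dequeue_nb ) ){
        my( $url, $res ) = split /\t/, $item, 2;
        $result{ $url } = $res;
    }
    return \%result;
}

my $result = run_pool( 4, map { "http://example.com/$_" } 1 .. 10 );
printf "%s => %s\n", $_, $result->{ $_ } for sort keys %$result;
```

Joining instead of detaching means the main thread knows exactly when the work is done, so the $quitter counting is not needed. The price is that results are only read after all workers finish; to handle responses as they arrive you would go back to polling the output queue in the main loop.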

Re: simple multithreading with curl
Re: Perl crashing with Parallel::ForkManager and WWW::Mechanize
Re^3: Using LWP instead of wget?
Re^3: Fast fetching of HTML response code
Re^2: Need help with Perl multi threading
Re^10: Consumes memory then crashs
Re^9: Consumes memory then crashs
Re: Are there any memory-efficient web scrapers?
Your main event may be another's side-show.
Re: Perl threads to open 200 http connections
Re^3: trying to get timeout to work (easier with threads)
