
Re: Re: Local::SiteRobot - a simple web crawling module

by rob_au (Abbot)
on Nov 25, 2001 at 05:17 UTC (#127335)


in reply to Re: Local::SiteRobot - a simple web crawling module
in thread Local::SiteRobot - a simple web crawling module

The problem I encountered with WWW::SimpleRobot was that the traverse function did not return traversal results via the $object->pages and $object->urls methods. The cause appeared to be the way the author iterates through the constructed queue with shift: this empties the @pages array, which is also used to hold the results, before it is returned at the end of the function.

Rather than just reporting this to the author, I have submitted a patch which corrects this behaviour by pushing results into an array separate from the queue.

115a116
> my @results;
150a152
> push (@results, $page);
165,166c167,168
< $self->{pages} = \@pages;
< $self->{urls} = [ map { $_->{url} } @pages ];
---
> $self->{pages} = \@results;
> $self->{urls} = [ map { $_->{url} } @results ];
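To illustrate the behaviour the patch addresses, here is a minimal standalone sketch. It is not the actual WWW::SimpleRobot source: the function names traverse_shared and traverse_separate, the seed URLs, and the omitted fetch step are all made up for the example. It only shows why draining a queue with shift leaves nothing to return, and how a separate results array avoids that.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the unpatched loop: the queue and the results
# live in the same array, so shift consumes the results as it iterates.
sub traverse_shared {
    my @pages = map { +{ url => $_ } } @_;
    while ( my $page = shift @pages ) {
        # ... fetching and processing of $page->{url} would go here ...
    }
    # @pages has been emptied by shift, so this list is always empty
    return [ map { $_->{url} } @pages ];
}

# Hypothetical stand-in for the patched loop: each processed page is
# pushed onto a separate @results array before the queue moves on.
sub traverse_separate {
    my @pages = map { +{ url => $_ } } @_;
    my @results;
    while ( my $page = shift @pages ) {
        push @results, $page;
    }
    return [ map { $_->{url} } @results ];
}

my @seed = ( 'http://example.com/', 'http://example.com/a' );
printf "shared:   %d urls\n", scalar @{ traverse_shared(@seed) };
printf "separate: %d urls\n", scalar @{ traverse_separate(@seed) };
```

Run as is, the first line reports 0 urls and the second reports 2, which matches the symptom described above and the effect of the patch.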

Note that I never meant for my little piece of code to be viewed as a fork of WWW::SimpleRobot, but rather as just another available option.

 

Ooohhh, Rob no beer function well without!

