my @links = $mech->links();
foreach my $link ( @links ) {
    my $temp_mech = WWW::Mechanize->new();
    $temp_mech->get( $link->url_abs() );    # url_abs() resolves relative links
    # Do whatever you want now...
}
Cool, thanks!
My problem is that I can't figure out how to recurse this so that I visit every link on the site. Can you offer any pointers there?
# mechanize fetching of first page
my %seen;
my @links = $mech->links();
while ( @links && @links < 1_000 ) {
    my $link = shift @links;
    my $url  = $link->url_abs();    # absolute URL, so relative links dedupe correctly
    next if $seen{$url}++;
    $mech->get( $url );             # mechanize fetch of $url
    push @links, $mech->links();
    sleep 1;                        # be polite to the server
}
This prevents you from fetching the same URL twice, and it stops either when there are no more links to visit or when the site turns out to have far more links than you intended to follow. The 1_000 is an arbitrary limit and need not be there at all. You can switch between depth-first and breadth-first traversal by adjusting the push/unshift and shift/pop combination.
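To make the push/unshift point concrete, here is a small self-contained sketch (no network, no Mechanize) where each "page" is just a hash entry listing the links it contains. The `%page_links` map and the `crawl_order` helper are illustrative inventions, not part of any module; the only thing being demonstrated is that shift+push gives a FIFO queue (breadth-first) while shift+unshift gives a LIFO stack (depth-first):

```perl
use strict;
use warnings;

# Toy "site": each page maps to the links found on it.
my %page_links = (
    '/'  => [ '/a', '/b' ],
    '/a' => [ '/a1', '/a2' ],
    '/b' => [ '/b1' ],
);

sub crawl_order {
    my ($mode) = @_;                    # 'bfs' or 'dfs'
    my ( %seen, @order );
    my @queue = @{ $page_links{'/'} };
    while (@queue) {
        my $url = shift @queue;        # always take from the front
        next if $seen{$url}++;
        push @order, $url;
        my @new = @{ $page_links{$url} || [] };
        if ( $mode eq 'bfs' ) {
            push @queue, @new;         # append: FIFO queue, breadth-first
        }
        else {
            unshift @queue, @new;      # prepend: LIFO stack, depth-first
        }
    }
    return @order;
}

print join( ' ', crawl_order('bfs') ), "\n";    # /a /b /a1 /a2 /b1
print join( ' ', crawl_order('dfs') ), "\n";    # /a /a1 /a2 /b /b1
```

In the real crawler above, the same switch is just a matter of replacing `push @links, $mech->links()` with `unshift @links, $mech->links()`.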