PerlMonks  

Re^2: Robust Handling of Broken Links in Mechanize?

by pat_mc (Pilgrim)
on Nov 20, 2009 at 12:58 UTC (#808418)


in reply to Re: Robust Handling of Broken Links in Mechanize?
in thread Robust Handling of Broken Links in Mechanize?

Wolfgang -

This is great stuff ... it looks like this fixes the problem:

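    use WWW::Mechanize;

    sub download {
        my $doc = shift;

        # onerror => undef keeps Mechanize from dying on a failed request.
        my $mech = WWW::Mechanize->new( onerror => undef );

        # get() returns a response object even on failure, so test success().
        $mech->get($doc);
        return unless $mech->success;

        my $link = $mech->find_link( url_regex => qr/\.pdf/ );
        return unless defined $link;
        my $url = $link->url_abs;

        $mech->get($url);    # This is the GET operation which fails.
        return unless $mech->success;

        # List-context match instead of "my $name = $1 if ...".
        my ($name) = $url =~ m{.+/(.+\.pdf)};
        return unless defined $name;
        $mech->save_content($name);
    }
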
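For anyone who wants to try the file-name step on its own, here is a minimal sketch that needs only core Perl. The helper name pdf_name_from_url and the example.com URLs are made up for illustration, and the pattern is a stricter variant that assumes the URL ends in the PDF file name with no query string. It uses a list-context match, so the name is simply undef when nothing matches.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize): return the trailing
# "something.pdf" component of a URL, or undef if there is none.
sub pdf_name_from_url {
    my ($url) = @_;

    # In list context a match returns its captures, or an empty list on
    # failure, so $name stays undef when the URL has no PDF file name.
    my ($name) = $url =~ m{/([^/?#]+\.pdf)$}i;
    return $name;
}

# Made-up URLs, one that matches and one that does not.
for my $url ( 'http://example.com/papers/report.pdf',
              'http://example.com/papers/' ) {
    my $name = pdf_name_from_url($url);
    print defined $name
        ? "save as $name\n"
        : "no PDF name in $url\n";
}
```

Checking the result with defined before calling save_content avoids writing a file under an empty or stale name.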
Thanks for your help! It made my day.

Cheers -

Pat

