PerlMonks  

Re: WWW::Mechanize follow meta refreshes

by mhi (Friar)
on Aug 28, 2005 at 20:46 UTC (#487286)


in reply to WWW::Mechanize follow meta refreshes

Thanks for posting this. It was just right to get me on my way to becoming a first-time user of WWW::Mechanize.

I've put together a little script that does some pre-fetching on a web application. The app has a result cache and emits a meta refresh whenever it hits a cache miss and has to do its calculations.

Instead of parsing the refresh URLs or some such, I simply limited the number of refreshes the script will perform. Perhaps some monk will find a useful snippet of code herein.

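#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

my $maxrefreshs = 5;   # maximum number of fetches per URL
my $debug       = 0;   # set to 1 to print each fetched page

my @urls = (
    "http://www.whatever.xy/cgi-bin/cgi.pl?ACTION=surnames",
    "http://www.whatever.xy/cgi-bin/cgi.pl?ACTION=unisearch&MATCHSTRING=foo",
    "http://www.whatever.xy/cgi-bin/cgi.pl?ACTION=path&STARTNODE=I1&ENDNODE=I1257",
);

my $mech = WWW::Mechanize->new;

foreach my $url (@urls) {
    # Reset the counter for each URL; with a single shared counter the
    # remaining URLs would be skipped once the limit had been reached.
    my $refreshs = 0;

    while ($refreshs < $maxrefreshs) {
        $mech->get($url);
        my $c = $mech->content;
        $debug and print $c;

        # If the page carries a meta refresh, fetch its target next.
        if ($c =~ /<meta\s+http-equiv="refresh"\s+content="\d+;\s*url=([^"]*)"/i) {
            $url = ($1 or $url);
            ++$refreshs;
        } else {
            last;   # no refresh: the result is cached, move on
        }
    }
}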
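The regex in the script expects http-equiv before content and double quotes around both values. If the app ever emits the tag differently, something along these lines might be more forgiving. This is only a sketch: refresh_target is a made-up helper name, not part of WWW::Mechanize, and it is still regex-based rather than a real HTML parser, so unusual markup can fool it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize): return the target URL
# of the first <meta http-equiv="refresh"> tag in $html, or undef if the
# page has no such tag or the tag names no URL.
sub refresh_target {
    my ($html) = @_;

    # Look at each <meta ...> tag in turn, whatever its attribute order.
    while ($html =~ /<meta\b([^>]*)>/gi) {
        my $attrs = $1;

        # Only refresh tags are of interest; quotes here are optional.
        next unless $attrs =~ /\bhttp-equiv\s*=\s*["']?refresh\b/i;

        # Grab the quoted content value, e.g. "5; url=/some/where".
        next unless $attrs =~ /\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)')/i;
        my $content = defined $1 ? $1 : $2;

        # Skip the delay and keep whatever follows "url=".
        if ($content =~ /^\s*\d+\s*;\s*url\s*=\s*['"]?([^'"]+?)['"]?\s*$/i) {
            return $1;
        }
        return;    # a refresh without a URL just reloads the same page
    }
    return;
}

# Example page (made up for illustration) with the attributes reversed.
my $page = q{<html><head><META CONTENT="3; URL=/cgi-bin/cgi.pl?ACTION=surnames" HTTP-EQUIV="Refresh"></head></html>};
my $target = refresh_target($page);
print defined $target ? "refresh to $target\n" : "no refresh\n";
```

In the loop above, the if-condition could then become something like "defined(my $next = refresh_target($c))", with $url set to $next. A relative target should be fine, since get() resolves relative URLs against the current page.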

