http://www.perlmonks.org?node_id=487286


in reply to WWW::Mechanize follow meta refreshes

Thanks for posting this. It was just right to get me on my way to becoming a first-time user of WWW::Mechanize.

I've put together a little script that does some pre-fetching on a web application. The app has a result cache and outputs a meta refresh whenever it hits a cache miss and has to do its calculations.

Instead of parsing the refresh URLs or some such, I simply limited the number of refreshes the script will follow. Perhaps some monk will find a useful snippet of code herein.

#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

my $maxrefreshs = 5;   # give up on a URL after this many refreshes
my $debug       = 0;   # set to 1 to print every fetched page

my @urls = (
    "http://www.whatever.xy/cgi-bin/cgi.pl?ACTION=surnames",
    "http://www.whatever.xy/cgi-bin/cgi.pl?ACTION=unisearch&MATCHSTRING=foo",
    "http://www.whatever.xy/cgi-bin/cgi.pl?ACTION=path&STARTNODE=I1&ENDNODE=I1257",
);

my $mech = WWW::Mechanize->new;

foreach my $url (@urls) {
    my $refreshs = 0;   # counted per URL, so every URL gets fetched
    while ($refreshs < $maxrefreshs) {
        $mech->get($url);
        my $c = $mech->content;
        $debug and print $c;
        # A cache miss answers with a meta refresh; follow it and retry.
        if ($c =~ /<meta\s+http-equiv="refresh"\s+content="\d+;\s*url=([^"]*)"/mi) {
            $url = ($1 or $url);
            ++$refreshs;
        } else {
            last;
        }
    }
}
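
For anyone who would rather pull the refresh target out in a separate step, here is a minimal sketch of a slightly more tolerant matcher. The name refresh_target is made up for illustration and is not part of WWW::Mechanize. It assumes the content attribute is quoted and has the usual "seconds; url=..." form, and it is a regex sketch, not a real HTML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: return the target URL of the first meta refresh
# tag found in $html, or nothing (undef in scalar context) if there is none.
sub refresh_target {
    my ($html) = @_;

    # Walk over every <meta ...> tag, whatever order its attributes are in.
    while ($html =~ /<meta\b([^>]*)>/gi) {
        my $attrs = $1;

        # Skip meta tags that are not refreshes.
        next unless $attrs =~ /http-equiv\s*=\s*["']?refresh\b/i;

        # Assumed form: content="5; url=..." with single or double quotes.
        # A refresh without a url part simply does not match.
        if ($attrs =~ /content\s*=\s*(["'])\s*\d+\s*;\s*url\s*=\s*(.*?)\s*\1/i) {
            return $2 if length $2;
        }
    }
    return;
}
```

In a fetch loop this could stand in for the inline regex, for example by calling refresh_target($mech->content) and leaving the loop when it returns nothing.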