<?xml version="1.0" encoding="windows-1252"?>
<node id="447836" title="Re: WWW::Mechanize follow meta refreshes" created="2005-04-14 11:19:50" updated="2005-07-10 22:33:40">
<type id="11">
note</type>
<author id="33345">
Kanji</author>
<data>
<field name="doctext">
&lt;blockquote&gt;I've used a regex as my refresh template is fixed and very, very simple. However, if yours isn't/aren't then you should replace the regex with a call to something like HTML::TokeParser.&lt;/blockquote&gt;

&lt;p&gt;This is actually built into &lt;tt&gt;WWW::Mechanize&lt;/tt&gt; (well, LWP) for you: META http-equiv tags in the document head are parsed and exposed as response headers, so you can do something like:&lt;/p&gt;

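&lt;code&gt;# A META refresh shows up as a "Refresh" response header.
if ($mech-&gt;response and my $refresh = $mech-&gt;response-&gt;header('Refresh'))
{
    # Value looks like "5; url=http://example.com/"; tolerate optional whitespace.
    my($delay, $uri) = split /\s*;\s*url\s*=\s*/i, $refresh, 2;

    $uri ||= $mech-&gt;uri; # No URL; reload current URL.

    sleep $delay;

    $mech-&gt;get($uri);
}&lt;/code&gt;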

&lt;p&gt;&lt;tt&gt;$delay&lt;/tt&gt; should probably be validated to protect against malformed META refresh tags, and there's a whole other headache about potential loops if you hack WWW::Mechanize to follow refreshes automatically.&lt;/p&gt;
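
&lt;p&gt;For what it's worth, here is a rough sketch of what that validation and a loop guard might look like. The names &lt;tt&gt;parse_refresh&lt;/tt&gt; and &lt;tt&gt;follow_refreshes&lt;/tt&gt; are made up for illustration (they are not part of WWW::Mechanize or LWP), and the 30-second delay cap and 5-hop limit are arbitrary assumptions you would want to tune:&lt;/p&gt;

&lt;code&gt;use strict;
use warnings;

# Hypothetical helper: split a Refresh header value into (delay, url).
# Returns an empty list when the delay is not a plain non-negative integer.
sub parse_refresh {
    my ($refresh, $max_delay) = @_;
    $max_delay = 30 unless defined $max_delay;   # assumed cap, in seconds

    my ($delay, $uri) = split /\s*;\s*url\s*=\s*/i, $refresh, 2;

    return unless defined $delay and $delay =~ /^\s*(\d+)\s*$/;
    $delay = $1;
    $delay = $max_delay if $delay &gt; $max_delay;

    # Some pages quote the URL; strip one matching pair of quotes.
    $uri =~ s/^\s*(['"]?)(.*?)\1\s*$/$2/s if defined $uri;

    return ($delay, $uri);
}

# Hypothetical helper: follow refreshes by hand, giving up after
# $max_hops so a page that refreshes to itself cannot trap us forever.
sub follow_refreshes {
    my ($mech, $max_hops) = @_;
    $max_hops = 5 unless defined $max_hops;      # assumed limit

    for (1 .. $max_hops) {
        my $res = $mech-&gt;response or last;
        my $refresh = $res-&gt;header('Refresh');
        last unless defined $refresh;

        # An empty list means the header was malformed; stop following.
        my ($delay, $uri) = parse_refresh($refresh) or last;

        $uri = $mech-&gt;uri unless defined $uri and length $uri;

        sleep $delay;
        $mech-&gt;get($uri);
    }

    return $mech-&gt;response;
}&lt;/code&gt;

&lt;p&gt;You would call &lt;tt&gt;follow_refreshes($mech)&lt;/tt&gt; right after your own &lt;tt&gt;$mech-&gt;get(...)&lt;/tt&gt;. A hop counter is the bluntest possible loop guard; remembering which URLs you have already visited would be smarter, but this at least keeps a self-refreshing page from spinning forever.&lt;/p&gt;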

&lt;div class="pmsig"&gt;&lt;div class="pmsig-33345"&gt;
&lt;p&gt;&amp;nbsp; &amp;nbsp; --k.&lt;/p&gt;&lt;br&gt;
&lt;/div&gt;&lt;/div&gt;</field>
<field name="root_node">
447314</field>
<field name="parent_node">
447314</field>
</data>
</node>
