PerlMonks  

How to get back the url of a page

by sulfericacid (Deacon)
on May 16, 2007 at 15:40 UTC (#615814=perlquestion)
sulfericacid has asked for the wisdom of the Perl Monks concerning the following question:

How do you get the URL of the page after your last request via WWW::Mechanize? I was reading the docs and it looks like $mech->uri() should hold the URL, but all it does is error out with
Use of uninitialized value in string at bot.pl line 194.
I have a feeling this site occasionally redirects to an IP address when the traffic is at high tide, and I need to test whether or not my URLs are still accessible.
    my $mech = WWW::Mechanize->new();
    $mech->get($url);
    print $mech->uri();


"Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

sulfericacid

Re: How to get back the url of a page
by cormanaz (Chaplain) on May 16, 2007 at 16:29 UTC
    Hmm, this is more of a challenge than I expected.

    You can get the response object from the mech request with $mech->{res}. HTTP::Response has a method, $response->is_redirect, that is supposed to tell you whether it was a redirect, but when I tried that (i.e. $mech->{res}->is_redirect()) with a known redirect, it returned a false value. Maybe some other monk knows why it doesn't work.

    Anyway, assuming the redirect is done with a refresh header, you can tell by checking $mech->{res}->header('refresh'). I don't have a JavaScript redirect to test it with, but I suspect it might not work for that.
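
    One likely reason is_redirect came back false: WWW::Mechanize inherits from LWP::UserAgent, which by default follows GET redirects itself, so the response you end up holding is the final one rather than the 3xx. The earlier hops stay reachable through the redirects and previous methods of HTTP::Response. Here is a minimal sketch of reading them. describe_redirects is a hypothetical helper, the URLs are made up, and the chain is built by hand so the script runs without a network. With a live object you would pass $mech->response instead.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Summarise where a request ended up, given the final HTTP::Response.
# (Hypothetical helper, not part of WWW::Mechanize.)
sub describe_redirects {
    my ($res) = @_;
    my @hops = $res->redirects;    # earlier 3xx responses, oldest first
    return {
        final_uri    => $res->request ? $res->request->uri->as_string : undef,
        hops         => [ map { scalar( $_->header('Location') ) } @hops ],
        was_redirect => @hops ? 1 : 0,
        refresh      => scalar( $res->header('Refresh') ),
    };
}

# Hand-built chain standing in for a live fetch (made-up URLs).
my $first = HTTP::Response->new( 302, 'Found',
    [ Location => 'http://192.0.2.10/pic.jpg' ] );
$first->request( HTTP::Request->new( GET => 'http://images.example.com/pic.jpg' ) );

my $final = HTTP::Response->new( 200, 'OK' );
$final->request( HTTP::Request->new( GET => 'http://192.0.2.10/pic.jpg' ) );
$final->previous($first);

my $info = describe_redirects($final);
print "final: $info->{final_uri}\n";
print "via:   $_\n" for @{ $info->{hops} };
```

    A Refresh header or a JavaScript redirect is not followed by LWP, so neither shows up in the hops list. The refresh field above only reports the header if the final response carries one.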

    Cheers

    Steve

Re: How to get back the url of a page
by xhunter (Sexton) on May 16, 2007 at 17:11 UTC

    $mech->uri works as you would like for me.

    Would you like to provide more details on the error? What does line 194 of bot.pl look like?

      I tried $mech->uri() and it gave the original page, not the one redirected to. Sulfericacid wants the URI of the page mech winds up on after processing any redirects.
        cormanaz is right. This image site sometimes redirects images to an IP address when the servers are getting high traffic. I need to be able to load the original image URL and detect whether or not it's going to redirect me. If so, I need to know what the address of the image is.

        This is my current setup.

        foreach my $pic (@pics) {
            my $mech = WWW::Mechanize->new();
            $mech->agent_alias( 'Windows IE 6' );
            $mech->get($pic);
            my $location = $mech->uri();
            push(@final_pics, "$location");
            print "\ntesting $pic with location $location";
        }
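
        A rough variant of the same loop that also reports whether each picture was redirected might look like the following. It is only a sketch under some assumptions: final_url is a hypothetical helper, the URLs are made up, autocheck is turned off so a failed fetch returns a response instead of dying, and the request_send handler feeds canned responses so the script runs offline. Remove that handler to hit a real site.

```perl
use strict;
use warnings;
use HTTP::Response;
use WWW::Mechanize;

# Return the URL a picture is finally served from, or undef on failure.
# (Hypothetical helper.)
sub final_url {
    my ($mech, $url) = @_;
    my $res = $mech->get($url);
    return undef unless $res->is_success;
    return $mech->uri->as_string;    # URI of the last request in the chain
}

# One object is reused for every picture; autocheck off so get() does not die.
my $mech = WWW::Mechanize->new( autocheck => 0 );
$mech->agent_alias('Windows IE 6');

# Canned responses standing in for the image site (assumption for the demo).
$mech->add_handler( request_send => sub {
    my ($request) = @_;
    return HTTP::Response->new( 302, 'Found',
        [ Location => 'http://192.0.2.10/a.jpg' ] )
        if $request->uri->host eq 'images.example.com';
    return HTTP::Response->new( 200, 'OK',
        [ 'Content-Type' => 'image/jpeg' ], 'fake image bytes' );
} );

my @pics = ( 'http://images.example.com/a.jpg', 'http://192.0.2.10/b.jpg' );
my @final_pics;
for my $pic (@pics) {
    my $location = final_url( $mech, $pic );
    next unless defined $location;    # skip pictures that could not be loaded
    push @final_pics, $location;
    print "testing $pic with location $location",
        ( $location ne $pic ? ' (redirected)' : '' ), "\n";
}
```

        Comparing the final location with the URL that was requested is what flags the redirect here; the helper returns undef for anything that did not load, which also avoids printing an uninitialized value.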


        sulfericacid
