PerlMonks  

Using LWP::Simple to read a redirected page

by MorayJ (Acolyte)
on Nov 13, 2012 at 17:43 UTC ( #1003678=perlquestion )
MorayJ has asked for the wisdom of the Perl Monks concerning the following question:

Hi

The UK government has changed its website, and I'm trying to check whether the links I have still work under the new structure.

I'm using LWP::Simple for this

If I put in the web address https://www.insolvencydirect.gov.uk/isolv, it very kindly returns http://www.bis.gov.uk/insolvency when I use $request->uri, where $request is obtained with:

    my $browser  = LWP::UserAgent->new;
    my $response = $browser->get( $url );
    my $request  = $response->request();

This is where the site now sends you if you go to that URL.

I run into difficulty with other links, such as www.direct.gov.uk/en/Motoring/OwningAVehicle/TaxationClasses/DG_4022042, which takes you to https://www.gov.uk/vehicle-exempt-from-car-tax in a browser, but for which $request->uri returns the original URL I put in.

What are they doing differently? What do I need to do differently? I guess it's probably more of a web question than just a Perl one.

Thanks for any advice

MorayJ

Re: Using LWP::Simple to read a redirected page
by Jukari (Initiate) on Nov 13, 2012 at 19:36 UTC
    Might be DNS related... have you tried using the IP addresses directly?

      I haven't. But I'll see if I can work that out, and see if it makes a difference.

      Thanks for the suggestion

Re: Using LWP::Simple to read a redirected page
by zentara (Archbishop) on Nov 13, 2012 at 20:12 UTC
      The request object contains the url you've been redirected to

      Maybe I'm missing a subtlety, but this appears to be saying that my uri taken from the request should be the final url. But it doesn't reflect what I see for the final url in Chrome.
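One way to see what LWP actually did is to print every hop it followed rather than only the final request. The sketch below is an illustration, not something posted in the thread: `describe_hops` is a made-up helper name, and it relies on `HTTP::Response`'s `redirects`, `request`, `code` and `header` methods. Note that LWP only follows HTTP-level (3xx) redirects; a meta refresh or JavaScript redirect that a browser would honour will not show up here.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: describe each redirect LWP followed (oldest first),
# then the request that produced the final response.
sub describe_hops {
    my ($response) = @_;
    my @lines;
    for my $hop ( $response->redirects ) {
        push @lines, sprintf '%s %s -> %s',
            $hop->code, $hop->request->uri,
            $hop->header('Location') // '(no Location header)';
    }
    push @lines, sprintf '%s %s (final)',
        $response->code, $response->request->uri;
    return @lines;
}

# Only touches the network when URLs are given on the command line.
my $browser = LWP::UserAgent->new;
for my $url (@ARGV) {
    print "$_\n" for describe_hops( $browser->get($url) );
}
```

If the only line printed is the original URL with an error status, the request probably never left the machine, which is worth ruling out before blaming the server.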

Re: Using LWP::Simple to read a redirected page
by Anonymous Monk on Nov 13, 2012 at 22:34 UTC

    Hi

    OK, long story short: I think the URL should have http:// in front of it. LWP just works with what it's given and doesn't complain.

    I tried again using WWW::Mechanize and it demanded an absolute url. I put in http - it then said it couldn't deal with https, and instructed me to install LWP-Protocol-https.

    I went back to LWP and fed it the absolute url, and it resolved properly giving me the forwarded url as it ought. Out of interest I removed LWP-Protocol-https and that didn't seem to bother it.
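    For anyone checking a list of links the same way, a minimal sketch of that fix might look like the following. It assumes that links without a scheme should default to http://, and `ensure_scheme` is a hypothetical helper name, not part of LWP. Checking `is_success` and printing `status_line` should also surface the kind of failure that otherwise goes unnoticed when only `$response->request->uri` is inspected.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: prepend http:// when the link has no scheme.
# Assumption: scheme-less links are meant to be plain http.
sub ensure_scheme {
    my ($url) = @_;
    return $url =~ m{^[a-z][a-z0-9+.-]*://}i ? $url : "http://$url";
}

# Links to check are taken from the command line.
my $browser = LWP::UserAgent->new;
for my $link (@ARGV) {
    my $response = $browser->get( ensure_scheme($link) );
    if ( $response->is_success ) {
        # URI of the request that produced the final response
        print "$link => ", $response->request->uri, "\n";
    }
    else {
        print "$link failed: ", $response->status_line, "\n";
    }
}
```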

    Thanks for the help

    MorayJ
