
Different answers for script and browser (LWP)

by Sly_G (Novice)
on Jun 18, 2013 at 18:43 UTC (#1039627=perlquestion)
Sly_G has asked for the wisdom of the Perl Monks concerning the following question:

The site that I'm parsing with a Perl script recently moved to human-readable URLs. I'm trying to get the redirects from the old "id" requests to the current addresses. For example, when I go to "" in a browser, the server redirects it to "".

But when I try to get this new location in my script, I don't get a 301 response; it returns "200 OK" for some reason.


use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Headers;

$ua = LWP::UserAgent->new;

# Hyphenated header names must be quoted; a bare User-Agent parses as a subtraction.
$hh = HTTP::Headers->new(
    'User-Agent'      => 'Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20100101 Firefox/21.0',
    Accept            => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language' => 'en-us,en;q=0.7,ru;q=0.3',
    'Accept-Encoding' => 'gzip, deflate',
    Connection        => 'keep-alive',
);
$ua->default_headers( $hh );

$cookie_jar = HTTP::Cookies->new( );
$ua->cookie_jar($cookie_jar);

@rename = (
    294 ,
    9806 ,
    9807 ,
);

for $ren (@rename) {
    $res = $ua->get("$ren");
    print $res->header('Location')."\n";
}

I used an HTTP sniffer to see what's going on with the browser, and there's nothing special, really:

GET /show.php?id=294 HTTP/1.1
Host:
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20100101 Firefox/21.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.7,ru;q=0.3
Accept-Encoding: gzip, deflate
Cookie: __utma=83753984.1287093182.1370328704.1371539232.1371576574.7; __utmz=83753984.1370328704.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmb=83753984.10.10.1371576574; _ym_visorc=w; PHPSESSID=4p0ql1mitskhkbg3os47v1hc11; __utmc=83753984
Connection: keep-alive

By accident I stumbled on this: if I use $ua->get("$ren 0"), i.e. a space and some extra characters after the URL string, I get a completely different response, and there it is: "301 moved" and the new location.

I can't understand what's happening.

Replies are listed 'Best First'.
Re: Different answers for script and browser (LWP)
by rnewsham (Chaplain) on Jun 18, 2013 at 21:39 UTC

    LWP follows the 301 automatically, so the response you get back is the 200 from the new location. The details of the redirect chain it followed are available through the response's previous method. I have modified your code to read the Location header from previous, and it should do what you want. I have also added use strict and use warnings, as that is always sensible.

    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Cookies;
    use HTTP::Headers;

    my $ua = LWP::UserAgent->new;
    my $hh = HTTP::Headers->new(
        'User-Agent'      => 'Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20100101 Firefox/21.0',
        Accept            => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language' => 'en-us,en;q=0.7,ru;q=0.3',
        'Accept-Encoding' => 'gzip, deflate',
        Connection        => 'keep-alive',
    );
    $ua->default_headers( $hh );

    my $cookie_jar = HTTP::Cookies->new( );
    $ua->cookie_jar($cookie_jar);

    my @rename = (
        294 ,
        9806 ,
        9807 ,
    );

    for my $ren (@rename) {
        my $res = $ua->get("$ren");
        # previous holds the redirect response that led to the final 200
        print $res->previous->header('Location')."\n";
    }
    Output

    /catalog/amulets/
    /catalog/amulets/the_cult/
    /catalog/amulets/aztek/
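If a request might pass through more than one redirect, or none at all, HTTP::Response also offers redirects, which walks the previous chain and returns the intermediate responses oldest first. Below is a minimal offline sketch of that idea. The helper name redirect_locations is made up for illustration, and the two responses are built by hand to stand in for what LWP would return after following a single 301, so nothing here touches the network.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: return the Location header of every redirect
# response that led up to $res, oldest first (empty list if none).
sub redirect_locations {
    my ($res) = @_;
    return map { scalar $_->header('Location') } $res->redirects;
}

# Hand-built stand-in for a followed 301 (an assumption, not real server data
# beyond the path shown in the output above).
my $first = HTTP::Response->new(301, 'Moved Permanently',
                                [ Location => '/catalog/amulets/' ]);
my $final = HTTP::Response->new(200, 'OK');
$final->previous($first);

print "$_\n" for redirect_locations($final);
```

With a real response you would pass the result of $ua->get(...) instead of $final. Unlike calling $res->previous->header(...) directly, this does not die when the server answered without redirecting.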
      Alternatively, there is $ua->simple_request(), which does not follow redirects.
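A rough sketch of the simple_request route, in case it helps: simple_request takes an HTTP::Request object rather than a bare URL and returns the first response as-is, so the 301 and its Location header are visible directly. The helper name fetch_without_redirect is hypothetical, and the URL you pass in is whatever full address your script already builds.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Hypothetical helper: send a single GET without following redirects and
# return the status code plus the Location header (undef if there is none).
sub fetch_without_redirect {
    my ($ua, $url) = @_;
    my $req = HTTP::Request->new(GET => $url);
    my $res = $ua->simple_request($req);
    return ($res->code, scalar $res->header('Location'));
}
```

Called as my ($code, $location) = fetch_without_redirect($ua, $url); this should give back 301 and the new path for a moved page, and 200 with an undefined location for a page that has not moved.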
      Wow, thanks a lot! I wouldn't have figured it out by myself.
