PerlMonks  

Re: Cookie protected web page and file downloading

by reTard (Sexton)
on Jun 03, 2005 at 00:44 UTC (#463071)


in reply to Cookie protected web page and file downloading

Hi again
I've made many of the suggested changes and the script now looks like:

#!/usr/bin/perl
use Data::Dumper;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new(
    cookie_jar        => {},
    agent             => "WWW-Mechanize/0.01",
    protocols_allowed => ['http'],
    autocheck         => 1,
);

$url = 'http://www.ibm.com/servers/eserver/support/pseries/aixfixes.html';

$mech->proxy('http','172.17.1.248');
$mech->get( $url );
print Dumper $mech;
$a=<STDIN>;
`clear`;

$mech->follow_link( text_regex => qr/More fix services/) or die;
print Dumper $mech;
$a=<STDIN>;
`clear`;

$mech->follow_link( text_regex => qr/AIX 5.3/) or die;
print Dumper $mech;
$a=<STDIN>;
`clear`;

$mech->follow_link( text_regex => qr/Data file for AIX 5.3/) or die;
print Dumper $mech;

Now I'm getting the following error:
Error GETing http://www.ibm.com/servers/eserver/support/pseries/aixfixes.html: Access to 'http' URIs has been disabled at aixfixes.pl line 15
It's dying on the $mech->get( $url ); line.
If I remove the protocols_allowed => ['http'], bit I get a different error:
Error GETing http://www.ibm.com/servers/eserver/support/pseries/aixfixes.html: Protocol scheme '' is not supported at lwp.pl line 15.
I can manually access this page with lynx so it must be something in the script I'm doing wrong. Any more ideas?
Thanks


Re^2: Cookie protected web page and file downloading
by dave0 (Friar) on Jun 03, 2005 at 01:51 UTC
    It looks like you're specifying your proxy incorrectly. $mech->proxy() takes a URL as its second argument, not an IP address.

    Try $mech->proxy('http', 'http://172.17.1.248/') and see if that works.
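    As a small illustration of the point above, here is a sketch of a helper that turns a bare host or IP address into a proxy URL before it is handed to $mech->proxy(). The helper name proxy_url and the example host proxy.example.com are made up for this sketch; only the 172.17.1.248 address comes from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize or LWP):
# return the argument unchanged if it already starts with a scheme,
# otherwise wrap it as an http:// URL.
sub proxy_url {
    my ($addr) = @_;
    return $addr if $addr =~ m{^[A-Za-z][A-Za-z0-9+.\-]*://};
    return "http://$addr/";
}

print proxy_url('172.17.1.248'), "\n";            # http://172.17.1.248/
print proxy_url('http://172.17.1.248/'), "\n";    # http://172.17.1.248/
print proxy_url('proxy.example.com:8080'), "\n";  # http://proxy.example.com:8080/
```

    The result would then be passed along as $mech->proxy('http', proxy_url('172.17.1.248')).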

      Yay! Thank you!!

      This almost works now!

      I can see the link to the file I want to download now, but I'm not sure how to reference it.

      'last_uri' => 'http://www-912.ibm.com/eserver/support/fixinfo/download?file=LatestFixData53',
      'uri' => 'http://www-912.ibm.com/eserver/support/fixinfo/download?file=LatestFixData53',

      How do I save this? I've looked at lwp-download but that only confused me more.
      Thanks
        I know it's bad form to be replying to my own posts but this is fixed now thanks to a work mate.

        The final code is:

        use Data::Dumper;
        use WWW::Mechanize;

        my $mech = WWW::Mechanize->new(
            cookie_jar          => {},
            agent               => "WWW-Mechanize/0.01",
            protocols_allowed   => ['http','https'],
            protocols_forbidden => [undef],
            autocheck           => 1,
        );

        $url = 'http://www.ibm.com/servers/eserver/support/pseries/aixfixes.html';

        $mech->proxy('http','http://PROXY/');
        $mech->get( $url );
        #print Dumper $mech;

        $mech->follow_link( text_regex => qr/More fix services/) or die;
        #print Dumper $mech;

        $mech->follow_link( text_regex => qr/AIX 5.3/) or die;
        #print Dumper $mech;

        $mech->follow_link( text_regex => qr/Data file for AIX 5.3/) or die;
        #print Dumper $mech;

        $mech->get('http://www-912.ibm.com/eserver/support/fixinfo/download?file=LatestFixData53');
        print $mech->content;
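        The final script prints the downloaded data to STDOUT, so it can be redirected to a file from the shell. For writing the file from inside the script instead, a minimal sketch might look like this. The helper name save_bytes and the stand-in data are assumptions for illustration; in the script above the bytes would come from $mech->content.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write raw bytes to a file without newline translation.
sub save_bytes {
    my ($path, $bytes) = @_;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    binmode $fh;    # the downloaded file may not be plain text
    print {$fh} $bytes or die "Cannot write $path: $!";
    close $fh or die "Cannot close $path: $!";
    return length $bytes;
}

# Stand-in data; in the real script this would be $mech->content.
my $content = "example fix data\n";

# A temporary file keeps the demo from touching the working directory.
my ($tmp_fh, $path) = tempfile();
close $tmp_fh;

my $written = save_bytes($path, $content);
print "wrote $written bytes to $path\n";
```

        Current versions of WWW::Mechanize also document $mech->save_content($filename) and $mech->get($url, ':content_file' => $filename), which may be simpler if the installed version supports them.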
