PerlMonks  

Re^4: Cookie protected web page and file downloading

by reTard (Sexton)
on Jun 02, 2005 at 05:41 UTC (#462768=note)


in reply to Re^3: Cookie protected web page and file downloading
in thread Cookie protected web page and file downloading

So far I have:

#!/usr/bin/perl
use Data::Dumper;
use LWP::UserAgent;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new(cookie_jar => {}, agent => "WWW-Mechanize/0.01");
$url = 'http://www.ibm.com/servers/eserver/support/pseries/aixfixes.html';
my $ua = new LWP::UserAgent;
$ua->proxy('http','192.168.1.248');
$mech->get( $url );
$mech->follow_link( text_regex => qr/More fix services/);
$mech->follow_link( text_regex => qr/AIX 5.3/);
$mech->follow_link( text_regex => qr/Data file for AIX 5.3/);
print Dumper $mech;

But this fails because it is not going through the proxy.
Thanks

UPDATE: the Dumper output shows:

'status' => 500,
'content' => '500 Can\'t connect to www.ibm.com:80 (Bad hostname \'www.ibm.com\')


Re^5: Cookie protected web page and file downloading
by tlm (Prior) on Jun 02, 2005 at 05:58 UTC

    I would not expect what you have to work. You have created two user agent objects, $ua and $mech (yes, the latter is a user agent object too, because WWW::Mechanize is a subclass of LWP::UserAgent). One ($ua) is configured to use a proxy but is never used to make a request; the other ($mech) makes all the requests but is not configured to use a proxy. I think what you want is something more like this:

    #!/usr/bin/perl
    use Data::Dumper;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new(cookie_jar => {}, agent => "WWW-Mechanize/0.01");
    $url = 'http://www.ibm.com/servers/eserver/support/pseries/aixfixes.html';
    # LWP expects a full proxy URL; append :port if the proxy is not on port 80
    $mech->proxy('http', 'http://192.168.1.248/');
    $mech->get( $url );
    $mech->follow_link( text_regex => qr/More fix services/);
    $mech->follow_link( text_regex => qr/AIX 5.3/);
    $mech->follow_link( text_regex => qr/Data file for AIX 5.3/);
    print Dumper $mech;

    Note that what I have done is treat $mech as an LWP::UserAgent object. (If it's not clear what's going on, take a look at perltoot.)
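
The inheritance point can be checked without touching the network. The following is only a minimal sketch, assuming WWW::Mechanize is installed. The proxy address is the one from the thread, and writing it as a full URL with a scheme is an assumption about what your LWP version wants (recent releases reject a bare address).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 0 );

# WWW::Mechanize inherits from LWP::UserAgent, so LWP::UserAgent
# methods such as proxy() can be called directly on $mech.
print "\$mech is an LWP::UserAgent\n" if $mech->isa('LWP::UserAgent');

# Configure the proxy on the object that actually makes the requests.
# The address comes from the thread; the URL form is an assumption.
$mech->proxy( 'http', 'http://192.168.1.248/' );

# Called with only a scheme, proxy() returns the current setting.
my $proxy = $mech->proxy('http');
print "http proxy: $proxy\n";
```

If the second print shows the proxy URL, requests made through $mech for http URLs should be routed through it.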

    BTW, you should get into the habit of checking for the success of requests made through the $mech object; you can do this with the is_success method of the response that get returns, or with $mech->success.
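
A check of that kind might look like the sketch below. It assumes autocheck is switched off so that a failed request returns a response instead of dying, and it uses a hypothetical file: URL that does not exist, so the failure branch can be seen without a network connection.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck => 0: a failed request returns normally instead of dying
my $mech = WWW::Mechanize->new( autocheck => 0 );

# Hypothetical URL, chosen so that the request fails locally
my $url = 'file:///no/such/page.html';

# get() returns the HTTP::Response object for the request
my $response = $mech->get($url);

if ( $response->is_success ) {
    print "fetched ", $mech->uri, "\n";
}
else {
    # status_line is the code plus message, like the
    # "500 Can't connect ..." text in the dump above
    print "GET $url failed: ", $response->status_line, "\n";
}
```

The same test after each follow_link call would show which step of the chain is the one that breaks.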

    the lowliest monk

      Instead of manually checking for success or failure after each step, I found it more convenient to have the Mechanize object die on error. This is very convenient for quick development. Later, if you want a robust program, you should disable that feature again.

      ...
      my $mech = WWW::Mechanize->new( autocheck => 1 );
      # cookie jar and user agent are set implicitly
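
With autocheck on, a failure can still be caught when needed by wrapping the call in eval. This is a rough sketch, not the poster's code; the file: URL is hypothetical and is used only to force an error without a network connection.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck => 1: get(), follow_link() and friends die on failure
my $mech = WWW::Mechanize->new( autocheck => 1 );

# Hypothetical URL that does not exist, so the request fails locally
my $url = 'file:///no/such/page.html';

# eval traps the die; the trailing 1 makes $ok true only on success
my $ok = eval { $mech->get($url); 1 };

if ($ok) {
    print "fetched ", $mech->uri, "\n";
}
else {
    # $@ holds the message Mechanize died with
    print "request died: $@";
}
```

Without the eval, the script would simply stop at the first failed request, which is the behaviour described above for quick development.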
