PerlMonks  

Simulating --keep-session-cookies wget option with LWP ?

by i5513 (Monk)
on Sep 27, 2011 at 09:32 UTC (#928053=perlquestion)
i5513 has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

First, I cannot install LWP::Parallel::UserAgent (it is not installable in Debian sid). I would like to solve this problem without any other module (WWW::Mechanize ..., or the other modules that turn up when searching the monastery).

How can I reproduce the following code in Perl, replacing only the wget command with the LWP module?
    use threads;

    my $user     = "xxx";
    my $password = "yyy";

    sub my_thread_sub {
        my ($url) = @_;
        my $cookie = to_string($url);    # to_string converts the url to an acceptable filename
        my $WGET;
        if ( -e $cookie ) {
            open $WGET, "wget -q -O - -w 1 -T 1 -t 1 --load-cookies $cookie --user $user --password $password $url |";
        }
        else {
            open $WGET, "wget -q -O - -w 1 -T 1 -t 1 --save-cookies $cookie --keep-session-cookies --user $user --password $password $url |";
        }
        local $/;
        my $output = <$WGET>;
        ...
        return 0;
    }

    my @thrs;
    my @urls = qw(http://example1/ http://example2/);

    dontgo:
    my $i = 0;
    foreach my $url (@urls) {
        $thrs[$i] = threads->new("my_thread_sub", $url);
        $i++;
    }
    foreach my $thr (@thrs) {
        $thr->join;
    }
    sleep 3;
    goto dontgo;
    exit 0;

I tried to simulate it with LWP, but it fails when I try to load the previous cookie file (LWP always saves an empty cookie file).

I also tried loading the cookie file written by wget, but LWP wipes its content. I would like to keep the HTTP session for each URL, and I would rather not have threads running all the time (that is the other option).

So, without pointing me to another module (e.g. Threads::Pool), how can I s/wget/lwp/ in this case?

Thank you very much !
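
The code above calls to_string() without showing it. For readers who want to run it, here is a minimal sketch of one possible implementation. It is an assumption, not the poster's actual helper: the name matches the call above, but the mapping rules and the ".cookies" suffix are invented for illustration. Distinct URLs can collapse to the same name under this scheme, so treat it as a starting point only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the undefined to_string() helper: map a URL to a
# name that is safe to use as a cookie file in the current directory.
sub to_string {
    my ($url) = @_;
    (my $name = $url) =~ s{^[a-z][a-z0-9+.-]*://}{}i;    # drop the scheme
    $name =~ s/[^A-Za-z0-9._-]+/_/g;                      # replace runs of unsafe characters
    $name =~ s/^_+|_+$//g;                                # trim leading/trailing underscores
    return "$name.cookies";
}

print to_string($_), "\n" for qw(http://example1/ http://example2/);
```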

Re: Simulating --keep-session-cookies wget option with LWP ?
by Corion (Pope) on Sep 27, 2011 at 09:36 UTC

    As you don't show the code using LWP::UserAgent, it's hard for me to guess what you're doing wrong (besides that).

    My guess is that you are directly overwriting the cookie file in your unshown program. I use WWW::Mechanize (not WWW::Mechandize) with (persistent) cookies and it works well. As WWW::Mechanize just uses LWP::UserAgent, both should work well with cookies.

    I doubt that the cookies file that wget writes is meant to work with LWP::UserAgent (or rather HTTP::Cookies), but maybe they use the same format. I haven't seen this documented anywhere.

      I read in "Re: Passing a cookie with LWP::UserAgent" that session cookies are deleted when the browser (the UserAgent, in LWP) is closed, so every thread I start will begin with a new cookie.

      My last try with HTTP::Cookies was:

          # try to simulate wget:
          my $cookie_jar = HTTP::Cookies->new(
              file     => "$cookie",
              autosave => 1,
          );
          my $ua = LWP::UserAgent->new;
          $ua->timeout(2);
          $ua->cookie_jar($cookie_jar);
          $ua->credentials("$host:$port", "Tomcat Manager Application", $user, $password);
          my $request = HTTP::Request->new(GET => "http://$host:$port/manager/status?XML=true");
          my $status = $ua->request($request);
          if ($status->is_success) {
              $page = $status->decoded_content();
          }
          else {
              print STDERR $status->status_line, "\n";
              return 1;
          }

      Thanks!

        Your code works for me, once I fix it to actually become a program:

            use strict;
            use HTTP::Cookies;
            use LWP::UserAgent;

            # try to simulate wget:
            my $cookie_jar = HTTP::Cookies->new(
                file     => "mycookie.cookie",
                autosave => 1,
            );
            my $ua = LWP::UserAgent->new;
            $ua->timeout(2);
            $ua->cookie_jar($cookie_jar);
            $ua->env_proxy;

            if (!@ARGV) {
                warn "Requesting page";
                my $request = HTTP::Request->new(GET => "http://www.google.com/webhp?hl=en");
                my $status = $ua->request($request);
                if ($status->is_success) {
                    print $status->decoded_content();
                }
                else {
                    print STDERR $status->status_line, "\n";
                    exit 1;
                }
            };

            use Data::Dumper;
            warn "Cookies in jar:";
            $cookie_jar->scan(sub { warn Dumper \@_ });

        It (re)generates the cookie file if run without any command line argument, and if run with a command line argument, it just dumps the cookies from the file.

        Whatever your problem seems to be, I would suspect that it lies elsewhere.
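
One likely cause of the empty cookie file, offered as a sketch rather than a confirmed diagnosis: HTTP::Cookies normally skips cookies carrying the discard flag (session cookies) when it saves a jar, and its ignore_discard option turns that off, which is roughly what wget's --keep-session-cookies does. The example below works offline. The cookie name, domain, and scratch file are made up for illustration, and the cookie is inserted by hand with set_cookie() instead of coming from a real server.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use File::Temp qw(tempdir);

# Hypothetical scratch location standing in for the per-URL cookie file.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/session.cookies";

# Put one session cookie (no max-age, discard flag set) in a jar and save it.
sub save_session_cookie {
    my (%jar_opts) = @_;
    my $jar = HTTP::Cookies->new(%jar_opts);
    # Arguments: version, key, value, path, domain, port,
    #            path_spec, secure, maxage, discard
    $jar->set_cookie(0, 'JSESSIONID', 'abc123', '/', 'example1.local',
                     undef, 1, 0, undef, 1);
    $jar->save($file);
    return;
}

# Load the saved file into a fresh jar and count the cookies found in it.
sub count_saved_cookies {
    my $jar = HTTP::Cookies->new(ignore_discard => 1);
    $jar->load($file);
    my $count = 0;
    $jar->scan(sub { $count++ });
    return $count;
}

save_session_cookie();                       # jar with default settings
my $default_count = count_saved_cookies();

save_session_cookie(ignore_discard => 1);    # jar that keeps session cookies
my $kept_count = count_saved_cookies();

print "saved with default settings: $default_count\n";
print "saved with ignore_discard:   $kept_count\n";
```

In the threaded program, the same idea would mean constructing the jar as HTTP::Cookies->new(file => $cookie, autosave => 1, ignore_discard => 1) and handing it to $ua->cookie_jar, so that each new thread reloads the session cookie saved by the previous one.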

Re: Simulating --keep-session-cookies wget option with LWP ?
by armstd (Friar) on Sep 28, 2011 at 20:17 UTC

    Why not install LWP::Parallel::UserAgent by hand? I don't see what Debian has to do with anything other than not providing the package?

    --Dave

      Debian keeps the system clean: it lets you uninstall packages.

      Seems like installing cpan modules could not do that, because there is no "universal" way to uninstall modules the way apt does.

        Seems like installing cpan modules could not do that

        That is the worst reason not to install cpan modules -- cpanp u Module::Name -- Yes, even you can use CPAN

        Nothing stops you from installing the large majority of modules into custom paths you set up yourself, even without cpan. Most modules have pretty straightforward build systems that make that job easy. You can distribute any dependencies you want with the stuff you're writing as well, and use relative "use lib" paths.

        --Dave
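
A minimal sketch of the relative "use lib" approach described above, using only core modules. The lib/ directory name is an assumption; any directory shipped next to the script would do.

```perl
use strict;
use warnings;
use FindBin qw($Bin);    # directory containing the running script
use lib "$Bin/lib";      # hypothetical directory holding the bundled modules

# Modules copied under lib/ next to the script are now found via @INC,
# without touching the system-wide Perl directories managed by apt.
my $bundled_dir_active = grep { $_ eq "$Bin/lib" } @INC;
print "bundled module directory in \@INC: ", ($bundled_dir_active ? "yes" : "no"), "\n";
```

Removing the bundled modules is then just a matter of deleting that directory, which sidesteps the uninstall concern.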

Node Type: perlquestion [id://928053]
Approved by GrandFather