PerlMonks  

Maintaining sessions with WWW::Mechanize across script instances?

by EvanK (Chaplain)
on Apr 21, 2011 at 18:46 UTC ( [id://900690]=perlquestion )

EvanK has asked for the wisdom of the Perl Monks concerning the following question:

I'm utilizing WWW::Mechanize to get a page and return the content of that page, and then in a different instance of the script, submit a form on that same page with the original session. The way I intended to do this was to store the mech object via Storable or YAML in the first instance, then retrieve the mech object in the second instance.

Note: for the purpose of these tests, I'm requesting "http://localhost/mech-form.php", which is just a page with a form that submits to itself and prints out its current session id.

    use strict;
    use WWW::Mechanize;
    use Storable qw(store retrieve freeze thaw);

    # init a mech instance
    my $mech = WWW::Mechanize->new();

    # on first instance...
    if(! -f 'stored_mech.dat') {
        # create, get, freeze, and store
        $mech->get('http://localhost/mech-form.php');
        store([freeze($mech)], 'stored_mech.dat');
        exit;
    }
    # on subsequent instance...
    else {
        # retrieve, thaw, and submit form
        my $stored = retrieve('stored_mech.dat');
        $mech = thaw($stored->[0]);
        $mech->submit_form(form_number=>1);
    }
Storable dies complaining that it can't store CODE items, so I'm using YAML's freeze/thaw and DumpFile/LoadFile instead:
    use strict;
    use WWW::Mechanize;
    use YAML qw(DumpFile LoadFile freeze thaw);

    # init a mech instance
    my $mech = WWW::Mechanize->new();

    # on first instance...
    if(! -f 'stored_mech.dat') {
        # get, freeze, and store
        $mech->get('http://localhost/mech-form.php');
        DumpFile('stored_mech.dat', [freeze($mech)]);
        exit;
    }
    # on subsequent instance...
    else {
        # retrieve, thaw, and submit form
        my $stored = LoadFile('stored_mech.dat');
        $mech = thaw($stored->[0]);
        $mech->submit_form(form_number=>1);
    }
But after I load the mech object and try to submit the form, I get the following error:
    Can't locate object method "scheme" via package "URI::http" at /usr/share/perl5/URI.pm line 52.
I decided to try saving and loading the cookies as well. That way, after I load the mech object, I can re-get the page, then just reload the saved cookie jar, which should continue the same session:
    use strict;
    use WWW::Mechanize;
    use YAML qw(DumpFile LoadFile freeze thaw);

    # init a mech instance
    my $mech = WWW::Mechanize->new(cookie_jar => {ignore_discard => 0}); # save even browser-lifetime cookies

    # on first instance...
    if(! -f 'stored_mech.dat') {
        # get, freeze, save cookie jar, and store
        $mech->get('http://localhost/mech-form.php');
        $mech->cookie_jar->save('cookies.dat');
        DumpFile('stored_mech.dat', [freeze($mech)]);
        exit;
    }
    # on subsequent instance...
    else {
        # retrieve, thaw
        my $stored = LoadFile('stored_mech.dat');
        $mech = thaw($stored->[0]);
        # re-get page and reload cookies
        $mech->get('http://localhost/mech-form.php');
        $mech->cookie_jar->clear();
        $mech->cookie_jar->load('cookies.dat');
        # submit form with (presumably) original session
        $mech->submit_form(form_number=>1);
    }
However, it still seems to be using a new session after I reload the cookie jar, which I confirmed by dumping the response headers and page content (which contains the server-side session id):
    use strict;
    use WWW::Mechanize;
    use YAML qw(DumpFile LoadFile freeze thaw);

    # init a mech instance
    my $mech = WWW::Mechanize->new(cookie_jar => {ignore_discard => 0}); # save even browser-lifetime cookies

    # on first instance...
    if(! -f 'stored_mech.dat') {
        # get, freeze, save cookie jar, and store
        $mech->get('http://localhost/mech-form.php');
        $mech->cookie_jar->save('cookies.dat');
        printf "Before freeze & save: \%s\n\n", $mech->response()->headers()->as_string();
        printf "Page session id: \%s\n\n", $mech->content();
        DumpFile('stored_mech.dat', [freeze($mech)]);
        exit;
    }
    # on subsequent instance...
    else {
        # retrieve, thaw
        my $stored = LoadFile('stored_mech.dat');
        $mech = thaw($stored->[0]);
        # re-get page and reload cookies
        $mech->get('http://localhost/mech-form.php');
        $mech->cookie_jar->clear();
        $mech->cookie_jar->load('cookies.dat');
        # submit form with (presumably) original session
        $mech->submit_form(form_number=>1);
        printf "After thaw & submit: \%s\n", $mech->response()->headers()->as_string();
        printf "Page session id: \%s\n\n", $mech->content();
    }
This results in the output below (after I've stripped the page's markup and the non-related headers):
    ~$ perl ./mech-test.pl
    Before freeze & save: Set-Cookie: PHPSESSID=p9ogedv2qf1r5h664la79is1j2; path=/
    Page session id: p9ogedv2qf1r5h664la79is1j2

    ~$ perl ./mech-test.pl
    After thaw & submit: Set-Cookie: PHPSESSID=tq3devkvch2t5fsq6qbp707de4; path=/
    Page session id: tq3devkvch2t5fsq6qbp707de4
So even after clearing and reloading the cookie jar, it's not keeping the same session across multiple instances of this script...and I have no clue where to go from here. Is there something I'm missing? Am I going to have to go mucking about in the Mechanize internals?

Replies are listed 'Best First'.
Re: Maintaining sessions with WWW::Mechanize across script instances?
by Eliya (Vicar) on Apr 21, 2011 at 20:28 UTC
    ...which I confirmed by dumping the response headers

    Have you also checked the request headers for the second invocation (e.g. using wireshark) to see if the old session ID is actually being sent? This might help to narrow down where/when it is "getting lost".

      That was it! Saving and loading the cookie jar, apparently, wasn't actually sending the saved cookie in the next request. I'm not sure why, but I'm just manually setting the cookie header before the form submit and that solves the session problem.
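
A packet capture is one way to inspect the outgoing request. As a rough in-process alternative, the sketch below asks an HTTP::Cookies jar which Cookie header it would attach to a request for the test URL. The helper name cookie_header_for and the hand-built jar are illustrative assumptions, not code from this thread; the cookie is filed under "localhost.local" because HTTP::Cookies appends ".local" to host names that contain no dot.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;

# Hypothetical helper: return the Cookie header that $jar would attach
# to a GET of $url, or undef if the jar has nothing matching to send.
sub cookie_header_for {
    my ($jar, $url) = @_;
    my $request = HTTP::Request->new(GET => $url);
    $jar->add_cookie_header($request);
    return $request->header('Cookie');
}

# Stand-in for the jar reloaded in the second run (assumption: it holds
# the session cookie from the first run).
# set_cookie args: version, key, value, path, domain, port,
#                  path_spec, secure, maxage, discard
my $jar = HTTP::Cookies->new;
$jar->set_cookie(0, 'PHPSESSID', 'p9ogedv2qf1r5h664la79is1j2',
                 '/', 'localhost.local', undef, 1, 0, undef, 1);

my $sent = cookie_header_for($jar, 'http://localhost/mech-form.php');
print defined $sent ? "Cookie: $sent\n"
                    : "no Cookie header would be sent\n";
```

Running the same helper against `$mech->cookie_jar` right after the `load` call should show whether the reloaded jar holds anything to send. After a real request, `$mech->response->request->as_string` should also show the headers that went out, including any Cookie header.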
Re: Maintaining sessions with WWW::Mechanize across script instances?
by trwww (Priest) on Apr 22, 2011 at 00:54 UTC

    I don't think I'd try to serialize/deserialize a complicated data structure unless I knew it like the back of my hand. There are some things in Perl that just can't be serialized like that. If you need it to persist, I suggest a job server like POE or a mod_perl process and storing the object in global memory. Before that though, I'd try to figure out how to store the object state and then try to reinitialize a new instance of the object. Either of those is going to be faster than dumping the raw object and reading it back.

    As far as the cookies, did you look at the cookie file to make sure it looks like you want it to? You may have to muck with the WWW::Mechanize internals. But it's just a wrapper around LWP, so it is pretty simple to do. I'm sure a glance at the source for those methods and maybe a couple of debugging sessions will show you what the problem is.
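
One thing worth checking in the cookie file: the question's code passes `ignore_discard => 0`, although its comment says the intent is to save browser-lifetime cookies. According to the HTTP::Cookies documentation, cookies marked to be discarded are written by `save` only when `ignore_discard` is true. The sketch below is a network-free illustration of that behaviour, not a confirmed diagnosis of the original problem. The helper name round_trip is hypothetical, and the cookie is built by hand on the assumption that the PHP session cookie, which arrives without an expiry, is flagged as discardable.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use File::Temp qw(tempfile);

# Hypothetical helper: save one session-style cookie (no max-age, discard
# flag set) using the given ignore_discard setting, reload the file into
# a fresh jar, and return whatever survived as text.
sub round_trip {
    my ($ignore_discard) = @_;
    my ($fh, $file) = tempfile(UNLINK => 1);
    close $fh;

    my $jar = HTTP::Cookies->new(ignore_discard => $ignore_discard);
    # set_cookie args: version, key, value, path, domain, port,
    #                  path_spec, secure, maxage, discard
    $jar->set_cookie(0, 'PHPSESSID', 'p9ogedv2qf1r5h664la79is1j2',
                     '/', 'localhost.local', undef, 1, 0, undef, 1);
    $jar->save($file);

    my $reloaded = HTTP::Cookies->new;
    $reloaded->load($file);
    return $reloaded->as_string;
}

printf "ignore_discard => 0 keeps: [%s]\n", round_trip(0);
printf "ignore_discard => 1 keeps: [%s]\n", round_trip(1);
```

If this is the cause, a jar created with `HTTP::Cookies->new(file => 'cookies.dat', autosave => 1, ignore_discard => 1)` and passed as `cookie_jar` to `WWW::Mechanize->new` may be enough to carry the session between runs without serializing the mech object at all.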
