Mechanize and Cookie Confusion

by Anonymous Monk
on Aug 25, 2011 at 18:03 UTC ( #922420=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to understand how cookies work. Please forgive my ignorance.
    #!/usr/bin/perl
    use strict;
    use WWW::Mechanize;
    use Data::Dumper;

    my $url = 'http://www.google.com';
    my $m = WWW::Mechanize->new( autocheck => 0 );
    $m->cookie_jar( HTTP::Cookies->new() );
    my $response = $m->get( $url );

    ## Do I need to do this?
    my $extract_cookie = $m->cookie_jar->extract_cookies( $response );
    $m->cookie_jar->add_cookie_header( $extract_cookie );

    my $request = $m->get( $url );
    print Dumper \$request;
Questions:
  • When I create a cookie jar, does it mean that Mechanize will be using the cookie throughout the session?
  • Do I even need to extract the cookie information and try to add it in the request header? If yes, the above code does not work when I try to add the cookie into the header.
  • What do you use to see what you are sending in an HTTP request through Mechanize? For example, do you use Wireshark? Is there something simpler?
    Re: Mechanize and Cookie Confusion
    by Anonymous Monk on Aug 25, 2011 at 18:28 UTC
      No need to set up a cookie jar; Mechanize has a built-in one.
      #!/usr/bin/env perl
      use strict;
      use warnings FATAL => 'all';
      use WWW::Mechanize qw();
      use Data::Dumper qw(Dumper);

      my $url = 'http://www.google.com';
      my $m = WWW::Mechanize->new; # do not disable autocheck, it's useful
      my $response = $m->get($url);
      my $cookie_jar = $m->cookie_jar; # returns a HTTP::Cookies object
      $cookie_jar->scan(sub { print Dumper \@_ });

      # cookies are sent with subsequent requests
      $m->get($url);

      When I create a cookie jar, does it mean that Mechanize will be using the cookie throughout the session?
      Yes.
      Do I even need to extract the cookie information and try to add it in the request header?
      No, it's automatic, like the Mechanize documentation says.
      do you use WireShark
      Wireshark is fine with me.
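
To make the "it's automatic" answer concrete, here is a minimal offline sketch of the two steps the jar performs on Mechanize's behalf. The host, the cookie name and the canned response are assumptions made up for illustration; no network traffic is involved. Note that add_cookie_header is given a fresh request object, whereas the code in the question passes it whatever extract_cookies returned.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

# Hypothetical host and cookie, used only for illustration.
my $url = 'http://www.example.com/';
my $jar = HTTP::Cookies->new;

# Simulate the first exchange: a request plus a canned response with Set-Cookie.
my $first_request = HTTP::Request->new( GET => $url );
my $response = HTTP::Response->new( 200, 'OK',
    [ 'Set-Cookie' => 'session=abc123; path=/' ] );
$response->request($first_request);    # the jar needs the URL the response came from

# Step 1: store the cookies found in the response.
$jar->extract_cookies($response);

# Step 2: attach the matching cookies to a new outgoing request.
my $second_request = HTTP::Request->new( GET => $url );
$jar->add_cookie_header($second_request);

my $cookie_header = $second_request->header('Cookie');
print "Cookie header on second request: ",
    ( defined $cookie_header ? $cookie_header : '(none)' ), "\n";
```

With WWW::Mechanize neither step has to be written by hand; the sketch only shows what happens between two calls to get.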
    Re: Mechanize and Cookie Confusion
    by bichonfrise74 (Vicar) on Aug 25, 2011 at 22:34 UTC
      You can use something like this to see what you are sending and receiving:
      $m->add_handler("request_send", sub { shift->dump; return });
      $m->add_handler("response_done", sub { shift->dump; return });
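
The same handler mechanism can be used to watch the Cookie header appear without touching the network. The following is a sketch under stated assumptions: the host is a placeholder, and the request_send handler doubles as a stand-in server by returning a canned response (a handler in that phase that returns a response stops LWP from sending the request any further).

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Response;

my $m = WWW::Mechanize->new;    # comes with its own in-memory cookie jar

my @cookie_headers_seen;        # one entry per outgoing request

# Stand-in "server" (an assumption for this demo): dump each outgoing
# request, remember its Cookie header, then answer with a canned page.
$m->add_handler( request_send => sub {
    my ($request) = @_;
    push @cookie_headers_seen, scalar $request->header('Cookie');
    $request->dump;             # show what would go over the wire
    return HTTP::Response->new( 200, 'OK',
        [ 'Content-Type' => 'text/html',
          'Set-Cookie'   => 'session=abc123; path=/' ],
        '<html><body>ok</body></html>' );
} );

$m->get('http://www.example.com/');    # first request: jar is still empty
$m->get('http://www.example.com/');    # second request: jar supplies the cookie

for my $i ( 0 .. $#cookie_headers_seen ) {
    my $seen = $cookie_headers_seen[$i];
    printf "request %d Cookie header: %s\n", $i + 1,
        defined $seen ? $seen : '(none)';
}
```

Against a real site, drop the canned response and end the handler with a bare return, as in the two-line version above.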
    Re: Mechanize and Cookie Confusion
    by sundialsvc4 (Abbot) on Aug 26, 2011 at 14:04 UTC

      One important thing to remember about cookies is that the host must actually send the initial cookie to the client (in a Set-Cookie response header), so that the client stores a copy and can return it with the subsequent requests that it makes. (It is, unfortunately, “annoyingly easy” to set up a cookie in your local database (or “jar”) and to forget to actually send the thing to the client.) The host must thereafter know that the cookie is there and must not, for example, constantly replace the cookie (a behavior that would imply that the host has amnesia... a very serious host-side bug).

      In the case of setting up Mechanize scripts, of course, the situation is a little different because you are “the client,” but even so, cookies require some care and feeding on your part. There will be some initial request, issued early on, which you expect (indeed, which you must require) to send a cookie value back to you. You need to make sure that the server does not thereafter try to replace that cookie with some other value in a way that you know to be wrong.

      The “cookie jar” is simply the Mechanize way of capturing the cookies and cookie values that are involved in the exchange. Mechanize will handle them in the way that the HTTP protocols prescribe, as any browser would in the same situation. My point is, though, that when you are writing automation scripts, especially testing scripts, you probably need to validate the host’s observed behavior. But otherwise, no, you don’t have to tell Mechanize when to reach into its jar and grab a chocolate-chip goodie. When I said “care and feeding” above, I did not mean that you must intervene in some way to cause Mechanize to Do The Right Thing.™
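
One way to validate the host's behavior in a test script is to read the jar after each response and compare values. The sketch below is only an illustration: cookie_value and receive are hypothetical helpers, the host and cookie name are placeholders, and the server is simulated with canned responses so the script runs offline.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: value of the named cookie in the jar, or undef.
sub cookie_value {
    my ( $jar, $name ) = @_;
    my $found;
    # scan() calls the callback once per cookie: (version, key, value, ...)
    $jar->scan( sub { $found = $_[2] if $_[1] eq $name } );
    return $found;
}

# Hypothetical helper: feed one simulated server response into the jar.
sub receive {
    my ( $jar, $url, $set_cookie ) = @_;
    my $response = HTTP::Response->new( 200, 'OK',
        [ 'Set-Cookie' => $set_cookie ] );
    $response->request( HTTP::Request->new( GET => $url ) );
    $jar->extract_cookies($response);
    return;
}

my $url = 'http://www.example.com/';    # placeholder host
my $jar = HTTP::Cookies->new;

# The initial response must hand out the cookie.
receive( $jar, $url, 'session=abc123; path=/' );
my $initial = cookie_value( $jar, 'session' );
die "server never set a session cookie\n" unless defined $initial;

# A later response that hands out a different value (the "amnesia" case).
receive( $jar, $url, 'session=zzz999; path=/' );
my $current = cookie_value( $jar, 'session' );

my $replaced = $current ne $initial ? 1 : 0;
print $replaced
    ? "session cookie was replaced: $initial -> $current\n"
    : "session cookie unchanged: $current\n";
```

In a Mechanize script the same check would read cookie_value( $m->cookie_jar, 'session' ) after each get, with no need for the receive helper.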
