PerlMonks  

Re: Requesting webpages which use cookies and session ids. (rev)

by PodMaster (Abbot)
on Aug 05, 2002 at 14:36 UTC ( [id://187677]=note )


in reply to Requesting webpages which use cookies and session ids. (rev)

I got bored and played with your code (and your dilemma) a little. Hopefully you can learn something from the code below; I'm not going to try to explain all of it (it may look overwhelming, but it's pretty much self-explanatory). It works for me, messy with debug output as it is. Why does it work? It's the simplest approach I could think of: since GET requests always worked when I copied the URL by hand, that's what I stuck to.
#!/usr/bin/perl -w
# /tell baz oy vey, you're abusing as_string in [id://187513], seriously abusing it.
# /tell baz also, you're parsing html by hand, i don't like that ;)
use strict;
use Data::Dumper;
use HTML::TokeParser;
use URI;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Headers;
use HTTP::Response;
use HTTP::Cookies;
use HTML::LinkExtor;
use HTTP::Request::Common qw(GET POST);

# A search URL copied by hand from the browser; this one is known to work
my $WHATWORKS = 'http://www.bt.co.uk/directory-enquiries/dq_home.jsp?QRY=res&BV_SessionID=@@@@0472129835.1028555271@@@@&BV_EngineID=ccccadcflifjlhkcflgcefkdffndfki.0&new_search=true&NAM=A*&GIV=&LOC=London&STR=&PCD=&limit=25&CallingPage=Homepage&Search.x=17&Search.y=13';
$WHATWORKS = URI->new($WHATWORKS);
warn Dumper( { $WHATWORKS->query_form } );

my $cookie_file = "cookies.txt";
my $cookie_jar  = HTTP::Cookies->new(
    file           => $cookie_file,
    autosave       => 1,
    ignore_discard => 1,    # IMPORTANT!!!!!!!!!!!!
);

my $url_home   = "http://www.bt.co.uk/directory-enquiries/dq_home.jsp";
my $url_search = "http://www.bt.co.uk/directory-enquiries/dq_locationfinder.jsp";

my $ua = LWP::UserAgent->new;
$ua->agent( "Mozilla/8.0(${^O};retmaspod)" );
$ua->cookie_jar( $cookie_jar );

#
# Get a session ID first
my $req = GET $url_home;
my $res = $ua->request( $req );
print $res->status_line, "\n";
# die Dumper $res; # as you need

# Collect the hidden form fields (session ID, engine ID) from the home page
my %FORMOLA;
my $html = $res->content;
ParseIt( \$html );

# cause http://www.bt.co.uk/directory-enquiries/dq_locationfinder.jsp
# requires javascript, and there is no way in hell i'm going to use it
# so you gotta do that one on your own Baz, shouldn't be hard
# considering I show you how, here

# Swap the fresh session values into the known-good query.
# query_form( key => value ) replaces the WHOLE query string, so merge
# into the existing parameters instead of setting the keys one at a time.
my %query = $WHATWORKS->query_form;
for my $key ( qw(BV_SessionID BV_EngineID) ) {
    $query{$key} = $FORMOLA{$key} if defined $FORMOLA{$key};
}
$WHATWORKS->query_form( %query );
warn $FORMOLA{BV_EngineID};
warn $FORMOLA{BV_SessionID};
warn Dumper( { $WHATWORKS->query_form } );

# Build the request only after the URL has been updated
$url_search = $WHATWORKS;
$req        = GET $url_search;
$res        = $ua->request( $req );
print $res->content();

my $p = HTML::LinkExtor->new( undef, $url_search );
$p->parse( $res->content );
$p->eof;
print Dumper $p->links;
die Dumper $res;    # debug: dump the whole response and stop

sub ParseIt {
    my $p = HTML::TokeParser->new( $_[0] );
    while ( my $t = $p->get_token() ) {
        # ["S", $tag, $attr, $attrseq, $text]
        # ["E", $tag, $text]
        # ["T", $text, $is_data]
        # ["C", $text]
        # ["D", $text]
        # ["PI", $token0, $text]
        # print Dumper $$t[2]
        next unless $$t[0] eq 'S' and $$t[1] eq 'input';
        my $attr = $$t[2];
        # Guard against inputs without a type or name (avoids -w warnings)
        $FORMOLA{ $attr->{name} } = $attr->{value}
            if defined $attr->{name}
            and lc( $attr->{type} || '' ) eq 'hidden';
    }
}
__END__
stuff I noticed/got from the first page

document.dqform.CallingPage.value="locationfinder";
document.dqform.action="/directory-enquiries/dq_locationfinder.jsp";
document.dqform.submit();}

<input type=hidden name="QRY" value="res">
<input type=hidden name="BV_SessionID" value="@@@@1590200227.1028551469@@@@">
<input type=hidden name="BV_EngineID" value="cccjadcflifjlhlcflgcefkdffndfkh.0">

E:\dev>get -x -U -s -S -e "http://www.bt.co.uk/directory-enquiries/dq_home.jsp?QRY=res&BV_SessionID=@@@@0472129835.1028555271@@@@&BV_EngineID=ccccadcflifjlhkcflgcefkdffndfki.0&new_search=true&NAM=A*&GIV=&LOC=London&STR=&PCD=&limit=25&CallingPage=Homepage&Search.x=17&Search.y=13">g.html
LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://www.bt.co.uk/directory-enquiries/dq_home.jsp?QRY=res&BV_SessionID=@@@@0472129835.1028555271@@@@&BV_EngineID=ccccadcflifjlhkcflgcefkdffndfki.0&new_search=true&NAM=A*&GIV=&LOC=London&STR=&PCD=&limit=25&CallingPage=Homepage&Search.x=17&Search.y=13
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 1360 bytes
LWP::Protocol::collect: read 976 bytes
LWP::Protocol::collect: read 384 bytes
LWP::Protocol::collect: read 1360 bytes
[... the same "read 1360 bytes" line repeats many more times ...]
LWP::Protocol::collect: read 266 bytes
LWP::UserAgent::request: Simple response: OK

E:\dev>get -v
This is lwp-request version 2.01 (libwww-perl-5.64)

Copyright 1995-1999, Gisle Aas.

This program is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.
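
For anyone who wants to try the idea without hitting the live site, here is a minimal offline sketch: pull the hidden form fields out of a page, then drop the fresh session values into a URL that is already known to work. The HTML snippet, the example.com URL, and the helper names hidden_fields and merge_query are all made up for illustration; they are not from the BT site or the script above. One thing worth knowing: calling URI's query_form with key/value pairs replaces the entire query string, so the sketch reads the existing parameters first and writes the merged set back.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;
use HTML::TokeParser;

# Hypothetical stand-in for the first page's HTML (values are invented).
my $html = <<'HTML';
<form name="dqform">
<input type=hidden name="QRY" value="res">
<input type=hidden name="BV_SessionID" value="@@@@1111111111.2222222222@@@@">
<input type=hidden name="BV_EngineID" value="example-engine-id.0">
<input type=text name="NAM">
</form>
HTML

# Return name => value pairs for every hidden <input> in the document.
sub hidden_fields {
    my ($html_ref) = @_;
    my %field;
    my $p = HTML::TokeParser->new($html_ref)
        or die "cannot create parser";
    # get_tag returns [ $tag, $attr, $attrseq, $text ] for start tags
    while ( my $t = $p->get_tag('input') ) {
        my $attr = $t->[1];
        next unless lc( $attr->{type} || '' ) eq 'hidden';
        next unless defined $attr->{name};
        $field{ $attr->{name} } = $attr->{value};
    }
    return %field;
}

# Overwrite selected query parameters while keeping all the others.
# (Going through a hash drops parameter order and repeated keys,
# which is acceptable for this illustration.)
sub merge_query {
    my ( $uri, %new ) = @_;
    my %query = $uri->query_form;
    @query{ keys %new } = values %new;
    $uri->query_form(%query);
    return $uri;
}

my %hidden = hidden_fields( \$html );

# Hypothetical "known good" URL carrying stale session values.
my $uri = URI->new(
    'http://www.example.com/search.jsp?QRY=res&BV_SessionID=stale'
  . '&BV_EngineID=stale&NAM=A*&LOC=London'
);
merge_query( $uri, map { $_ => $hidden{$_} } qw(BV_SessionID BV_EngineID) );

# Show the resulting parameters, one per line.
my %final = $uri->query_form;
print "$_=$final{$_}\n" for sort keys %final;
```

With a real site you would feed hidden_fields the content of the first response and hand the merged URI to LWP::UserAgent, with the cookie jar attached as in the script.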

____________________________________________________
** The Third rule of perl club is a statement of fact: pod is sexy.
