Re: Handling Javascript with LWP::UserAgent

by shmem (Canon)
on Jul 09, 2006 at 23:14 UTC (#560029)


in reply to Handling Javascript with LWP::UserAgent

Just to show how it can be done...

Some time ago I installed JavaScript::SpiderMonkey, but had not played with it so far - so your question is an opportunity ;-)

As others pointed out, LWP::UserAgent doesn't parse or evaluate JavaScript. The code below extracts the JavaScript with HTML::Parser and evaluates it with JavaScript::SpiderMonkey.

For obvious reasons, this cruft works only for the link you provided: the JavaScript objects and functions it sets up are hard-coded for the scripts on that page.

#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use HTML::Parser;
use JavaScript::SpiderMonkey;
use Data::Dumper;

my ($js_flag, $eval, $js_text);
my $base = 'http://www.GIDEONonline.net';

my $js = JavaScript::SpiderMonkey->new();
$js->init();

# create all necessary objects and functions
# for the javascript engine. These are the minimum
# for a working version, and are demanded by the infamous
# browser_check.js from www.webreference.com (which is
# what's behind SRC="js_lib/browser_check.js")
#
# how to set these automatically for an arbitrary
# javascript file is left as an exercise to the reader.
$js->property_by_path("document.location.href");
$js->property_by_path("window");
$js->property_by_path("navigator.userAgent");
$js->property_by_path("navigator.appVersion");
$js->function_set("toLowerCase", sub { return lc($_[0]); });
$js->function_set("javaEnabled", sub { undef });

# The OP's code, slightly modified
{
    my $ua = LWP::UserAgent->new();
    my $search_address = "$base/loginx.php?user=metalib";

    # creating the request object
    my $req = HTTP::Request->new('POST', $search_address);

    # sending the request
    my $res = $ua->request($req);
    if (!($res->is_success)) {
        warn "Warning:" . $res->message . "\n";
    }

    # headers plus body, fed to the parser as one string
    my $response = $res->headers_as_string();
    $response .= $res->content();

    my $p = HTML::Parser->new(
        default_h => [\&extract_js, "tag,attr,text"],
    );
    $p->parse($response);
    $p->eof;    # flush any text still buffered by the parser

    # run the collected scripts, then the body's onload handler
    if ($eval) {
        my $code = $js_text . "\n" . $eval . ";\n";
        my $rc = $js->eval($code);
        die $@ if $@;
    }

    # if the script set a new location, follow it
    my $url = $js->property_get("document.location.href");
    if ($url) {
        $response = $ua->get($base . '/' . $url);
        print $response->content if $response;
    }
}

# HTML::Parser handler: remembers the body's onload attribute and
# collects inline and external (src=...) script text in $js_text
sub extract_js {
    my ($tag, $attr, $text) = @_;
    if ($tag eq 'body') {
        $eval = $attr->{onload};
    }
    $js_flag = 0 if $tag eq '/script';
    if ($js_flag) {
        $js_text .= $text;
    }
    if ($tag eq 'script') {
        if ($attr->{src}) {
            my $ua  = LWP::UserAgent->new();
            my $res = $ua->get($base . '/' . $attr->{src});
            $js_text .= $res->content;
        }
        $js_flag++;
    }
}
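
The extraction step can be tried on its own, without the network or SpiderMonkey. What follows is only a minimal sketch: the sample page, the go() function and the variable names are made up for illustration, and it uses separate start/end/text handlers instead of the single default handler above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample page standing in for a fetched response.
my $html = <<'HTML';
<html><head>
<script type="text/javascript">
function go() { document.location.href = "next.php"; }
</script>
</head>
<body onload="go()">Loading...</body></html>
HTML

my ($in_script, $onload, $js_text) = (0, '', '');

my $p = HTML::Parser->new(
    api_version => 3,
    # start tags: remember <body onload=...>, note when a <script> opens
    start_h => [ sub {
        my ($tagname, $attr) = @_;
        $onload    = $attr->{onload} if $tagname eq 'body' && defined $attr->{onload};
        $in_script = 1               if $tagname eq 'script';
    }, "tagname,attr" ],
    # end tags: note when the <script> closes
    end_h  => [ sub { $in_script = 0 if $_[0] eq 'script' }, "tagname" ],
    # text: keep it only while inside a <script> element
    text_h => [ sub { $js_text .= $_[0] if $in_script }, "text" ],
);
$p->parse($html);
$p->eof;

print "onload: $onload\n";
print "script:\n$js_text\n";
```

The collected script text followed by the onload handler is what would then be handed to the JavaScript engine's eval.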

--shmem

_($_=" "x(1<<5)."?\n".q/)Oo.  G\        /
                              /\_/(q    /
----------------------------  \__(m.====.(_("always off the crowd"))."
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
