PerlMonks  

Re: Handling Javascript with LWP::UserAgent

by shmem (Canon)
on Jul 09, 2006 at 23:14 UTC (#560029=note)


in reply to Handling Javascript with LWP::UserAgent

Just to show how it can be done...

Some time ago I installed JavaScript::SpiderMonkey, but I hadn't played with it until now, so your question is a good opportunity ;-)

As others pointed out, LWP::UserAgent doesn't parse or evaluate JavaScript. The script in this post uses JavaScript::SpiderMonkey for that and pulls the JavaScript out of the page with HTML::Parser.
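To show the extraction step on its own, here is a minimal sketch that runs HTML::Parser over a made-up page and collects the inline script text plus the body's onload attribute. The HTML string and the go() function are hypothetical stand-ins, not the real page from the thread, and nothing is evaluated here; this only covers the "find the JavaScript" half.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample page, NOT the page from the original question.
my $html = <<'HTML';
<html><head>
<script type="text/javascript">
function go() { document.location.href = "next.php"; }
</script>
</head>
<body onload="go()"><p>Loading...</p></body></html>
HTML

my ($in_script, $onload, $js_text) = (0, undef, '');

my $p = HTML::Parser->new(
    api_version => 3,
    # Start tags: remember the onload handler and note when a script opens.
    start_h => [ sub {
        my ($tag, $attr) = @_;
        $onload    = $attr->{onload} if $tag eq 'body';
        $in_script = 1               if $tag eq 'script';
    }, 'tagname,attr' ],
    # End tags: leave script mode again.
    end_h  => [ sub { $in_script = 0 if $_[0] eq 'script' }, 'tagname' ],
    # Text: keep it only while inside a script element.
    text_h => [ sub { $js_text .= $_[0] if $in_script }, 'text' ],
);
$p->parse($html);
$p->eof;

print "onload handler: $onload\n";
print "script text:\n$js_text\n";
```

The collected script text followed by the onload call is what would then be handed to the JavaScript engine.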

The full script is cruft and, for obvious reasons, works only for the link you provided:

#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use HTTP::Request;
use HTML::Parser;
use JavaScript::SpiderMonkey;

my ($js_flag, $eval, $js_text);
my $base = 'http://www.GIDEONonline.net';

my $js = JavaScript::SpiderMonkey->new();
$js->init();

# Create all necessary objects and functions
# for the JavaScript engine. These are the minimum
# for a working version, and are demanded by the infamous
# browser_check.js from www.webreference.com (which is
# what's behind SRC="js_lib/browser_check.js").
#
# How to set these automatically for an arbitrary
# JavaScript file is left as an exercise to the reader.
$js->property_by_path("document.location.href");
$js->property_by_path("window");
$js->property_by_path("navigator.userAgent");
$js->property_by_path("navigator.appVersion");
$js->function_set("toLowerCase", sub { return lc($_[0]); });
$js->function_set("javaEnabled", sub { undef });

# The OP's code, slightly modified
{
    my $ua = LWP::UserAgent->new();
    my $search_address = "$base/loginx.php?user=metalib";

    # creating the request object
    my $req = HTTP::Request->new('POST', $search_address);

    # sending the request
    my $res = $ua->request($req);
    if (!($res->is_success)) {
        warn "Warning:" . $res->message . "\n";
    }

    # headers plus body (append to the same variable; no second "my")
    my $response = $res->headers_as_string();
    $response .= $res->content();

    # collect the script text and the body's onload handler
    my $p = HTML::Parser->new(
        default_h => [\&extract_js, "tag,attr,text"],
    );
    $p->parse($response);
    $p->eof;

    # run the collected JavaScript, then the onload handler
    if ($eval) {
        my $code = $js_text . "\n" . $eval . ";\n";
        my $rc = $js->eval($code);
        die $@ if $@;
    }

    # follow the location the JavaScript set, if any
    my $url = $js->property_get("document.location.href");
    if ($url) {
        $response = $ua->get($base . '/' . $url);
        print $response->content if $response;
    }
}

# HTML::Parser callback: $tag is 'script' for a start tag and
# '/script' for an end tag.
sub extract_js {
    my ($tag, $attr, $text) = @_;
    if ($tag eq 'body') {
        $eval = $attr->{onload};
    }
    $js_flag = 0 if $tag eq '/script';
    if ($js_flag) {
        $js_text .= $text;
    }
    if ($tag eq 'script') {
        # external script: fetch it and append its source
        if ($attr->{src}) {
            my $ua  = LWP::UserAgent->new();
            my $res = $ua->get($base . '/' . $attr->{src});
            $js_text .= $res->content;
        }
        $js_flag++;
    }
}
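One fragile spot in the script is that it builds follow-up URLs by plain concatenation ($base.'/'.$url), which only works while the JavaScript (or a SRC attribute) hands back a path relative to the site root. A minimal sketch of a more forgiving approach, assuming the URI module (normally installed alongside LWP) is available: resolve whatever comes back against the URL of the page that was fetched. The target values below are hypothetical examples, not values observed from the real site.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

# URL of the page that was fetched (same address the script requests).
my $page_url = 'http://www.GIDEONonline.net/loginx.php?user=metalib';

# Hypothetical values that document.location.href or a SRC attribute might hold.
my @targets = (
    'main.php',                           # relative to the current page
    '/js_lib/browser_check.js',           # relative to the site root
    'http://www.example.com/other.html',  # already absolute
);

for my $target (@targets) {
    # new_abs() leaves absolute URLs alone and resolves relative ones.
    my $abs = URI->new_abs($target, $page_url);
    print "$target -> $abs\n";
}
```

The resulting URI object can be passed straight to $ua->get().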

--shmem

_($_=" "x(1<<5)."?\n".q/)Oo.  G\        /
                              /\_/(q    /
----------------------------  \__(m.====.(_("always off the crowd"))."
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

