PerlMonks  

WWW::Scripter eval of JQuery

by ait (Friar)
on Mar 05, 2012 at 21:52 UTC (#957977, perlquestion)
ait has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am trying to learn WWW::Scripter to scrape some JS-intensive pages. I tried evaling the jquery.js from a page fetched with $mech->get. Since that didn't work, I tried a simple example, and this doesn't work either. Maybe JQuery needs a loaded DOM for this to work?

#!/usr/bin/env perl
use warnings;
use strict;
use HTTP::Cookies;
use WWW::Scripter;

my $cookie_jar = HTTP::Cookies->new();

my $mech = WWW::Scripter->new(
    agent      => 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0',
    cookie_jar => $cookie_jar,
);

$mech->get("https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js");
my $jquery = $mech->content;

$mech->use_plugin(
    JavaScript => engine => 'SpiderMonkey',
);

$mech->eval($jquery);

I get "c has no properties at WWW::Scripter::Plugin::JavaScript::SpiderMonkey line 165"

Any help greatly appreciated...

Update 20120306

To further expand on the comment below by tobyink: the fact that it can't parse JQuery is by no means a limitation of WWW::Scripter; it just means that I was trying to use a great tool for the wrong job.

In my particular case I need a full-blown rendering engine, so there is no getting around the use of something like Gecko, whether it's using Xvfb, Crowbar, offscreen or something along those lines. I would definitely try to keep everything within WWW::Mechanize; after all, the final scraping is very effective with the tools provided in that namespace, so WWW::Mechanize::Firefox with Xvfb is probably the way to go. Expect another update when I get one of these working...
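As a rough sketch of that plan (not something tested in this thread): the script below assumes an X11 system where Firefox is already running on a display, possibly a virtual one served by Xvfb, with the mozrepl extension that WWW::Mechanize::Firefox relies on enabled. The helper name rendered_html and the example URL are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return a page's HTML as seen by a running Firefox.
sub rendered_html {
    my ($url) = @_;

    # Assumption: Firefox runs on an X11 display (e.g. one served by Xvfb).
    die "DISPLAY is not set; start Firefox on a (virtual) X display first\n"
        unless $ENV{DISPLAY};

    # Loaded at run time so a missing module gives a readable message.
    eval { require WWW::Mechanize::Firefox; 1 }
        or die "WWW::Mechanize::Firefox is not installed: $@";

    # Connects to the already running Firefox through mozrepl.
    my $mech = WWW::Mechanize::Firefox->new();
    $mech->get($url);

    # The content comes from the live browser document, so changes made by
    # scripts that have run by this point should be included.
    return $mech->content;
}

# Usage: perl rendered.pl http://example.com/
print rendered_html( $ARGV[0] ) if @ARGV;
```

Pages that keep modifying the DOM after loading may need an extra wait before reading the content; that part is left out here.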

Re: WWW::Scripter eval of JQuery
by Anonymous Monk on Mar 05, 2012 at 22:37 UTC

    Maybe JQuery needs a loaded DOM for this to work?

    What happens when you try that?

Re: WWW::Scripter eval of JQuery
by tobyink (Abbot) on Mar 05, 2012 at 23:06 UTC

    jQuery has a lot of stuff that assumes it's running inside a traditional desktop browser. WWW::Scripter emulates that environment pretty well, but still not quite well enough to run jQuery. It might get there eventually, but it's not there yet.

    There are plenty of actual desktop browsers (e.g. Konqueror - though I haven't tried its WebKit module) that can't run jQuery fully.

    Looked at WWW::Mechanize::Firefox?

      Thanks for the thorough explanation... I had the feeling that evaling JQuery was probably overkill for what I really need. Yep, I looked at WWW::Mechanize::Firefox, but it seems to require Firefox running with a GUI. Do you know if there is any way to run Firefox without a GUI, or to run the Gecko engine as a library, much like SpiderMonkey? Maybe with Xvfb, or maybe Firefox itself can run GUI-less like other GUI-intensive apps, for example Inkscape, which has a console/batch mode (e.g. to convert SVG to PDF)?

        Not tried this with Firefox, but many years ago I did something along these lines with OpenOffice.org to generate PDF files from various other formats. I used TightVNC Server to create a spare X11 display and ran OpenOffice.org on that.
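The spare-display idea can be sketched with Xvfb instead of TightVNC Server (a substitution, not what was used above). The xvfb-run wrapper that ships with Xvfb on many Linux distributions starts a temporary X server and runs a command on it. The helper name headless_command and the screen geometry are illustrative assumptions, and whether Firefox behaves well this way was not tested in this thread.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build an argument list that runs @program under xvfb-run.
# -a picks a free display number; --server-args is handed on to Xvfb.
sub headless_command {
    my ( $geometry, @program ) = @_;
    die "no program given\n" unless @program;
    return ( 'xvfb-run', '-a', "--server-args=-screen 0 $geometry", @program );
}

# Usage: perl headless.pl firefox
if (@ARGV) {
    my @cmd = headless_command( '1280x1024x24', @ARGV );
    print STDERR "running: @cmd\n";

    # List-form exec bypasses the shell, so no quoting is needed.
    exec @cmd or die "cannot exec xvfb-run: $!\n";
}
```

Keeping the command construction in its own function makes it easy to inspect the exact arguments before anything is launched.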
