http://www.perlmonks.org?node_id=1024436


in reply to how to access HTML within a javascript

JavaScript can create content for the browser dynamically. A page that depends heavily on JavaScript can be difficult to scrape or automate, because you often have to execute the JavaScript first to see what content it produces.

While you're not going to find a Perl module with an embedded JavaScript interpreter, you can find tools that will help bail you out of a difficult situation. One is Corion's WWW::Mechanize::Firefox. Another is Selenium, paired with one of the CPAN modules that drive it. These are two totally different approaches, and both require a bit of work on your part as a programmer, but they are reasonable answers to the JavaScript problem.
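
As a rough illustration of the WWW::Mechanize::Firefox route, here is a minimal sketch. It assumes a running Firefox with the MozRepl extension enabled (which the module needs in order to drive the browser), and the URL is a made-up placeholder, not the page from the original question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

# Hypothetical URL, used only for illustration.
my $url = 'http://example.com/page-built-by-javascript';

# Connects to an already running Firefox through MozRepl.
my $mech = WWW::Mechanize::Firefox->new();

# Firefox loads the page and runs its scripts, just as it would for a user.
$mech->get($url);

# content() returns the page as the browser currently sees it,
# so markup generated by JavaScript should be included.
print $mech->content;
```

Because a real browser does the work, this tends to handle whatever the page's scripts do, at the cost of needing Firefox installed and running.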


Dave

Re^2: how to access HTML within a javascript
by Anonymous Monk on Mar 20, 2013 at 07:00 UTC

      Indeed. WWW::Scripter is powered by JE, a very good pure-Perl JavaScript implementation. Other JavaScript implementations for Perl include JavaScript::SpiderMonkey and JavaScript::V8, which are generally faster but offer poorer integration between the JavaScript code and the Perl code.
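
      For what it's worth, a minimal sketch of the WWW::Scripter approach might look like the following. It assumes WWW::Scripter::Plugin::JavaScript (and with it JE) is installed, and the URL is a hypothetical placeholder. Whether a given page's scripts run cleanly under JE is something you would have to try.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Scripter;

# Hypothetical URL, used only for illustration.
my $url = 'http://example.com/page-built-by-javascript';

my $w = WWW::Scripter->new;

# Load the JavaScript plugin; it uses the pure-Perl JE engine by default.
$w->use_plugin('JavaScript');

# Fetch the page. Scripts found in the page are run as it is parsed.
$w->get($url);

# Serialise the DOM as it stands after the scripts have run.
print $w->document->innerHTML, "\n";
```

      The point of the design is that the scripts are not run by hand: the plugin executes them while the page loads, and the DOM is then inspected from Perl.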

      package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name

        I'm reading the documentation for WWW::Scripter::Plugin::JavaScript and it's not immediately clear to me how to use it to access the HTML that is generated by the html.js script that runs on the webpage. Do I somehow use WWW::Scripter::Plugin::JavaScript to force the script to run on the website and then capture the output?