PerlMonks  

How do I perform web automation with sites that use Javascript?

by whakka (Hermit)
on May 02, 2009 at 18:58 UTC ( [id://761527]=perlquestion )

whakka has asked for the wisdom of the Perl Monks concerning the following question: (http and ftp clients)

I need to write a script that crawls web pages, but these pages rely on JavaScript/AJAX for form processing and the like. LWP and WWW::Mechanize do not execute JavaScript, so they don't handle this case well. What can I do?

Originally posted as a Categorized Question.


Replies are listed 'Best First'.
Re: How do I perform web automation with sites that use Javascript?
by jettero (Monsignor) on May 02, 2009 at 19:12 UTC
Re: How do I perform web automation with sites that use Javascript?
by jdporter (Paladin) on May 04, 2009 at 18:10 UTC

    Here are some modules which give you a way around the issue:

    Other things to try:

    • Disable Javascript in your browser and see if the site still functions as you want. If so, then you don't actually have a problem :)
    • Figure out what the scripts are doing on the wire, i.e. which HTTP requests they send and what comes back, and re-implement those transactions in your own program. The Firefox add-on Live HTTP Headers is well suited for capturing that traffic.
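
    The second suggestion can be sketched roughly as follows. This is a minimal sketch, not a recipe for any particular site: the URL, the form field names, and the two extra headers are placeholders, and build_ajax_request and send_request are hypothetical helper names. Copy the real values from whatever traffic you captured.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;

# Build the POST that the page's script would have sent.
# Assumption: the script submits an ordinary URL-encoded form and
# marks the request as AJAX via X-Requested-With. Your site may differ.
sub build_ajax_request {
    my ($url, $fields) = @_;
    return POST(
        $url, $fields,
        'X-Requested-With' => 'XMLHttpRequest',    # placeholder header
        'Accept'           => 'application/json',  # placeholder header
    );
}

# Send the request and return the decoded response body.
# The cookie jar keeps any session cookies between calls on the same agent.
sub send_request {
    my ($req) = @_;
    my $ua  = LWP::UserAgent->new(cookie_jar => {});
    my $res = $ua->request($req);
    die 'Request failed: ' . $res->status_line . "\n" unless $res->is_success;
    return $res->decoded_content;
}

# Placeholder URL and fields, for illustration only.
my $req = build_ajax_request(
    'http://www.example.com/search',
    [ q => 'perl', page => 2 ],
);

# Inspect what would go over the wire and compare it with the capture.
print $req->as_string;
```

    Printing the request first lets you compare it line by line with the captured headers before anything is sent; once they match, pass $req to send_request and parse the result.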

    This info provided by the OP.

Re: How do I perform web automation with sites that use Javascript?
by planetscape (Chancellor) on May 04, 2009 at 05:31 UTC
Re: How do I perform web automation with sites that use Javascript?
by ninuzzo (Novice) on Mar 12, 2011 at 23:16 UTC

    A recent addition of mine is WWW::HtmlUnit::Spidey. This module uses the Java library HtmlUnit, a headless browser with pretty good JavaScript support. Do not worry, you won't have to write any Java code :D

    It is good for large-scale web scraping, where screen scraping does not scale and may be unstable.

    There is a tutorial here that scrapes data from a form that does not work without JavaScript support.

    Btw I am just a Perl beginner. Any Perl guru interested in co-developing Spidey?

      If you wish to recruit, try the Perl News section; it is much more visible.
Re: How do I perform web automation with sites that use Javascript?
by jdporter (Paladin) on Jun 08, 2009 at 18:35 UTC

    Re: How do I perform web automation with sites that use Javascript?

    Originally posted as a Categorized Answer.
