http://www.perlmonks.org?node_id=1066851

bretelle has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone,

in this piece of script

use WWW::Scripter;
($w = new WWW::Scripter)->use_plugin(JavaScript);
$w->get('http://www.immoweb.be/FR/Rent.Estate.cfm?IdBien=2805206&xpage=1');
The get() function gives the expected results but it takes more than 60 seconds and produces these warnings:
Argument "\x{b}\x{31}" isn't numeric in addition (+) at /usr/local/share/perl/5.14.2/JE/Number.pm line 93.
Unquoted string "inf" may clash with future reserved word at (eval 6462) line 2.
Unquoted string "inf" may clash with future reserved word at (eval 6637) line 2.
Unquoted string "inf" may clash with future reserved word at (eval 6640) line 2.
... which I've not been able to interpret so far. Using get() with other URLs produces other kinds of warnings. I know Scripter can be very slow (cf. #936386), but maybe if I manage to understand the warnings I'll be able to find a workaround?
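Both kinds of warning can be reproduced in plain Perl, which may help in reading them. The sketch below is an illustration only, not the code JE actually runs: the string "1x" and the one-line eval body are made-up stand-ins. The first warning fires when a string that Perl does not consider a clean number is used in an addition. The second fires when code compiled by a string eval (hence the "(eval NNNN)" location) contains a lowercase bareword such as inf while strict subs is off.

```perl
use strict;
use warnings;

my @warnings;

{
    # Collect warnings in an array instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # 1. A string that is not a clean number, used in numeric context.
    #    "1x" is a hypothetical stand-in for the value JE was handed.
    my $str = "1x";
    my $sum = $str + 1;

    # 2. A lowercase bareword compiled inside a string eval.
    #    With strict subs off, Perl treats inf as the string "inf" and warns.
    no strict 'subs';
    my $ok = eval 'my $v = inf; 1';
}

print $_ for @warnings;
```

If the same mechanisms are at work inside JE, the warnings would be noise from the JavaScript engine rather than the cause of the slowness, but that is only a guess.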

Replies are listed 'Best First'.
Re: WWW::Scripter performance and warnings
by PerlSufi (Friar) on Dec 12, 2013 at 15:57 UTC
    I'm not totally clear on why you are using WWW::Scripter as opposed to WWW::Mechanize, except that WWW::Mechanize doesn't work with JavaScript very well. However, I was able to connect to that page just by doing the following:
    use WWW::Mechanize;
    use strict;
    use warnings;
    my $mech = WWW::Mechanize->new();
    $mech->get('http://www.immoweb.be/FR/Rent.Estate.cfm?IdBien=2805206&xpage=1');
    $mech->dump_text;
    Additionally, if you want to crawl the site AND use some of the JavaScript, you can either:
    A) use WWW::Mechanize::Firefox
    B) inspect the various HTML elements of the page with Firefox's Firebug extension and use $mech->get() much as I did above
    UPDATE:
    C) Or Go with Laurent_R's response ;)
      Hi PerlSufi,

      One element I need in the web page is written there by a JavaScript script, so I need WWW::Scripter (or something else) to execute that script.

      I guess Scripter is slow because there is a lot of JavaScript in this page.

      When I inspect this particular element with Firebug I can see the name of the script. The question now is whether I can use that information in my Perl script so that WWW::Scripter executes only that one script. I'll first have to understand a bit more about JavaScript and WWW::Scripter.

      Thanks a lot for your answers

      UPDATE: I also tried solution A, WWW::Mechanize::Firefox, which does the job but is no faster: get() still takes more than one minute.

        I guess Scripter is slow because there is a lot of JavaScript in this page.

        Or an infinite loop, memory leaks, and so on. Even the browsers (Firefox, Chrome, ...) do very little to protect against this.

Re: WWW::Scripter performance and warnings
by Laurent_R (Canon) on Dec 12, 2013 at 15:46 UTC

    Hello,

    Maybe you could try with this syntax:

    use WWW::Scripter;
    $w = new WWW::Scripter;
    $w->use_plugin('JavaScript');
    Especially the single quotes around the word JavaScript may be important.
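For what the quoting changes: when strict subs is off, as in the original snippet, Perl turns a bareword it cannot resolve into the string of the same name, so use_plugin(JavaScript) and use_plugin('JavaScript') should pass the same argument. Under use strict the bareword is a compile-time error instead. A small sketch in plain Perl (no WWW::Scripter needed; the variable names are made up for illustration):

```perl
use strict;
use warnings;

# With strict subs off, the bareword becomes the string 'JavaScript'.
my $loose = eval q{ no strict 'subs'; my $name = JavaScript; $name };

# Under strict, the same bareword fails to compile and eval returns undef.
my $strict = eval q{ my $name = JavaScript; $name };
my $error  = $@;

print "without strict: $loose\n";
print "with strict:    $error";
```

So the quotes are good practice and become necessary once use strict is added, but on their own they would not be expected to change how the script behaves.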
      Hello and thank you,

      Using this syntax doesn't seem to help. The get() function still takes ages and I get the warning about Argument "\x{b}\x{31}".