PerlMonks  

SpiderMonkey and JS

by Anonymous Monk
on May 01, 2009 at 13:22 UTC ( [id://761285]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm currently working on a script for my band to help promote our music on Myspace. In the US they have turned on an option to disable captchas, so I've verified my cell phone to take advantage of this feature. I must now pass this "SMSToken" value to my login script, but the value is set via JavaScript. I've stumbled upon JavaScript-SpiderMonkey, which seems to be the answer, but I'm still having trouble retrieving the token using my script below:
    use warnings;
    use strict;
    use LWP::UserAgent;
    use HTML::Parser;
    use JavaScript::SpiderMonkey;
    use Data::Dumper;

    my $base = 'http://www.myspace.com';

    my $js = JavaScript::SpiderMonkey->new();
    $js->init();
    $js->function_set("SMSTokenValue", sub { print "@_\n"; });
    $js->property_by_path("document.getElementById");

    {
        my $ua  = new LWP::UserAgent();
        my $req = new HTTP::Request ('GET', $base);
        my $res = $ua->request($req);
        if (!($res->is_success)){
            warn "Warning:".$res->message."\n";
        } else {
            print "Successful\n";
            my $rc = $js->eval("document.getElementById('ctl00_ctl00_cpMain_cpMain_LoginBox_SMSVerifiedCookieToken');");
            print Dumper($rc);
            $js->destroy();
        }
    }
Any idea what I'm missing? I've eliminated the call to my function after initialization and tried to bring it down to the basics, but I can't seem to retrieve the JavaScript value.

Any help would be greatly appreciated.

Replies are listed 'Best First'.
Re: SpiderMonkey and JS
by spx2 (Deacon) on May 01, 2009 at 15:40 UTC

    The js object must know about the page structure in order to traverse the DOM and apply getElementById to it. There should be a method of the SpiderMonkey object that does just that. From reading the docs, the only way I see to load a particular page is to eval document.location.href = that_page with the SpiderMonkey object. Only after that has finished, and has hopefully made an HTTP request (you should check that with ngrep, for example), will you be able to do anything with the DOM.

    As an alternative, more DIY option, I suggest you write a Firefox extension (XUL) that communicates through AJAX with a Perl script and automate your task that way. You'll have complete control over what you're doing, and instead of having to deal with the potential bugs of SpiderMonkey or its counterpart CPAN module, you'll only have to deal with Firefox bugs (and Firefox is pretty stable, IMHO).

      well...I think I'm still missing something...
          use warnings;
          use strict;
          use LWP::UserAgent;
          use HTML::Parser;
          use JavaScript::SpiderMonkey;
          use Data::Dumper;

          my $base = 'http://www.myspace.com';

          my $js = JavaScript::SpiderMonkey->new();
          $js->init();
          $js->function_set("SMSTokenValue", sub { print "@_\n"; });
          $js->property_by_path("document.getElementById");

          {
              my $ua  = new LWP::UserAgent();
              my $req = new HTTP::Request ('GET', $base);
              my $res = $ua->request($req);
              if (!($res->is_success)){
                  warn "Warning:".$res->message."\n";
              } else {
                  print "Successful\n";
                  $js->property_by_path("document.location.href");
                  my $rc = $js->eval(q!
                      document.location.href = append("http://", "www.myspace.com");
                      document.getElementById = append("ctl00_ctl00_cpMain_cpMain_LoginBox_SMSVerifiedCookieToken");
                      SMSTokenValue("Token: ", document.getElementById);
                      function append(first, second) {
                          return first + ' = ' + second;
                      }
                  !);
                  $js->destroy();
              }
          }
      The value comes back as undefined...Am I missing something else?
          $js->property_by_path("document.location.href");

        Here you're not actually setting anything; please read here how property_by_path is properly used.

            document.location.href = append("http://", "www.myspace.com");
            document.getElementById = append("ctl00_ctl00_cpMain_cpMain_LoginBox_SMSVerifiedCookieToken");

        If you had used property_by_path correctly above, there would have been no need to set location.href again inside JavaScript. However, if you decide to do it in JS, keep in mind that it won't take effect immediately, and you must somehow wait until the page is loaded.

        P.S. see this tutorial by Limbic~Region where he describes how WWW::Selenium is used in a similar situation.
Re: SpiderMonkey and JS
by Crackers2 (Parson) on May 01, 2009 at 14:35 UTC

    As far as I can tell there's no connection between the SpiderMonkey object and your LWP request. In other words your document.getElementById is being applied to an empty DOM tree, NOT to the document you retrieved with LWP.

    I had a brief look through the SpiderMonkey documentation you linked, and I don't immediately see a way to feed it an HTML document.
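
    To illustrate the gap: whatever LWP fetched only exists as a string in $res->content, so anything you want from the page has to be pulled out of that string (or handed to the JS engine explicitly). Below is a rough sketch using only core Perl. It assumes, purely for illustration, that the page assigns the token in an inline script of the form document.getElementById('<id>').value = '<token>'; the sample HTML, the token value 'abc123', and the helper names inline_scripts and find_token are all hypothetical, and the real Myspace markup may look quite different.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for $res->content from the LWP request.
# The real page markup and the way the token is assigned are assumptions.
my $html = <<'HTML';
<html><body>
<input type="hidden" id="ctl00_ctl00_cpMain_cpMain_LoginBox_SMSVerifiedCookieToken" />
<script type="text/javascript">
document.getElementById('ctl00_ctl00_cpMain_cpMain_LoginBox_SMSVerifiedCookieToken').value = 'abc123';
</script>
</body></html>
HTML

# Return the bodies of all inline <script> elements found in an HTML string.
sub inline_scripts {
    my ($page) = @_;
    return $page =~ m{<script\b[^>]*>(.*?)</script>}gis;
}

# Search the inline scripts for an assignment of the assumed form
#   document.getElementById('<id>').value = '<token>';
# and return the token, or undef if no script contains one.
sub find_token {
    my ($page, $id) = @_;
    for my $script (inline_scripts($page)) {
        if ($script =~ m{getElementById\(\s*(['"])\Q$id\E\1\s*\)\.value\s*=\s*(['"])(.*?)\2}s) {
            return $3;
        }
    }
    return;
}

my $token = find_token($html, 'ctl00_ctl00_cpMain_cpMain_LoginBox_SMSVerifiedCookieToken');
print defined $token ? "Token: $token\n" : "Token not found\n";
```

    A regex is fragile against real-world HTML; it is used here only to keep the sketch dependency-free. If the token is computed by the script rather than written as a literal, the extracted script bodies would instead have to be run in a JS engine that has been given a stand-in document object first.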

Node Type: perlquestion [id://761285]
Approved by jettero