PerlMonks  

Re: Mech and Javascript

by runrig (Abbot)
on Jun 13, 2008 at 19:28 UTC ( [id://691990] )


in reply to Mech and Javascript

Mech does not execute JavaScript, so you'll have to work out the URL yourself. Alternatively, you can try Mozilla::Mechanize, WWW::Mechanize::Plugin::JavaScript, or Win32::IE::Mechanize. I don't know JavaScript either, but so far it has been easy enough to read that I can tell what it's doing to the URL, so I haven't had to resort to any of those other modules yet.
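
As an illustration of working out the URL by hand, here is a minimal sketch that pulls a target path out of an inline JavaScript handler with a regex. The HTML snippet, the `openReport` function name, and the `/reports/view` path are all made up for the example; a real page will need its own pattern.

```perl
use strict;
use warnings;

# Hypothetical markup: the real target is hidden inside a JavaScript call.
my $html = q{<a href="#" onclick="openReport('/reports/view?id=42'); return false;">Report</a>};

# Return the first single-quoted argument of the (made-up) openReport() call,
# or undef if the pattern does not occur in the text.
sub js_target {
    my ($text) = @_;
    return $text =~ /openReport\(\s*'([^']+)'/ ? $1 : undef;
}

my $path = js_target($html);
print defined $path ? "$path\n" : "no match\n";
```

The extracted path can then be handed to Mech's `get`, which, as far as I know, resolves a relative URL against the page currently loaded.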

Replies are listed 'Best First'.
Re^2: Mech and Javascript
by jcdento (Novice) on Jun 13, 2008 at 19:50 UTC
    OK, here is the page source (or at least the relevant parts):
    <tr><td class="xsmall" nowrap="1"><img src="/images/spacer.gif" height="1" width="12" /><a class="nobold" href="/gs/portal/services/cas/">Global Client Access</a></td></tr>
    This is the link I want to follow. The code preceding it:
    <table border="0" width="100%" cellpadding="0" cellspacing="0">
      <!-- NOTE: this row enforces the minimum width of 280px per column for site map -->
      <tr>
        <td width="280"><img alt="" src="/images/spacer.gif" width="280" height="1" /></td>
        <td width="1"><img alt="" src="/images/spacer.gif" width="1" height="1" /></td>
        <td width="280"><img alt="" src="/images/spacer.gif" width="280" height="1" /></td>
        <td width="1"><img alt="" src="/images/spacer.gif" width="1" height="1" /></td>
        <td width="280"><img alt="" src="/images/spacer.gif" width="280" height="1" /></td>
      </tr>
      <tr>
        <td valign="top" width="34.0%">
          <table width="100%" border="0" cellspacing="0" cellpadding="2">
            <!-- display the site map header and site map -->
            <tr class="sitemapTopNodesHeader">
              <td nowrap>
                <img src="/images/spacer.gif" height="1" width="5" />
                <b><span class="medium">Home</span></b>
                <img src="/images/spacer.gif" height="1" width="5" />
              </td>
            </tr>
            <!-- display the site map -->
            <tr><td class="xsmall" nowrap="1"><img src="/images/spacer.gif" height="1" width="12" /><a class="nobold" href="/gs/portal/home/">Home</a></td></tr>
            <tr>
              <td><img src="/images/spacer.gif" height="10" width="1" /></td>
            </tr>
            <!-- display the site map header and site map -->
            <tr class="sitemapTopNodesHeader">
              <td nowrap>
                <img src="/images/spacer.gif" height="1" width="5" />
                <b><span class="medium">Services</span></b>
                <img src="/images/spacer.gif" height="1" width="5" />
              </td>
            </tr>
            <!-- display the site map -->
    I have tried everything I can, but I haven't been able to get the link with $mech->follow_link, click, click_button, submit, or get. Any advice on how I can even get the link? It appears as a normal link in the browser, but when I save $res to an output file, the link appears as file:///tmp/foo.html/gs/portal/services/cas instead of the actual link.
      I don't see any javascript in what you posted. But if you do post a bunch of javascript, I'm not going to look at it. A simple Mech form submit is just not going to cut it here. Figure out what you need to submit and submit it (URL/form and hidden fields, etc.). Or maybe try one of the other modules I mentioned (but don't ask me how to use them).
      You're probably going to get much better responses if you show the Perl code you tried, not just the HTML you're trying to work with.


      Revolution. Today, 3 O'Clock. Meet behind the monkey bars.

      I would love to change the world, but they won't give me the source code

        Sorry, the JavaScript comes later (after I click that link, I get a JavaScript table). I cannot find the link, even after viewing the source and then referring to it by name. Here is the code I am using:
        my $agent = WWW::Mechanize->new(
            autocheck => 1,
            agent     => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14',
            requests_redirectable => [ 'GET', 'HEAD', 'POST' ],
        );
        my $response = $agent->get($url2);
        $response = $agent->get($url);
        my $form = HTML::Form->parse($response->{_content}, $response->base());
        $form->param("username", "usrname");
        $form->param("password", "pass");
        $response = $agent->request($form->click);
        $response->{_content} =~ /(https:\/\/\S+)"/ or die;
        $url = $1;
        $response = $agent->get($url);
        print $response->decoded_content;
        $url = 'https://linkfromsite.com';
        # Note: This link is in the form of /abc/words/main/ and although it goes
        # to site.com/abc/words/main, when mousing over the link in Firefox it
        # shows only the /abc/etc. part
        $response = $agent->get($url);    # code fails at this point, response is a website-generated error
        #$response = $agent->click("URL");
        $response = $agent->reload;
        Please note I am using made up site names since the actual name of the site cannot be disclosed.
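
On the link form noted in the code comment: an href such as `/gs/portal/services/cas/` is root-relative, so it only becomes a full URL once it is resolved against the address of the page it came from. That is why the mouse-over shows just the path. A minimal sketch of that resolution with the URI module follows; `site.example` is a placeholder host and `absolutize` is a helper name invented for the example, neither comes from the real site.

```perl
use strict;
use warnings;
use URI;    # CPAN module; normally present wherever WWW::Mechanize is installed

# Resolve a (possibly relative) href against the URL of the page that
# contained it, and return the absolute URL as a plain string.
sub absolutize {
    my ($href, $base) = @_;
    return URI->new_abs($href, $base)->as_string;
}

# Placeholder base URL standing in for the page that holds the link.
my $base = 'https://site.example/gs/portal/home/';
my $href = '/gs/portal/services/cas/';

print absolutize($href, $base), "\n";
```

Within Mech itself, `$agent->find_link( url_regex => qr{/services/cas/} )` should return a link object whose `url_abs` method gives the same kind of absolute URL, assuming the link is actually present in the HTML that was fetched.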
