PerlMonks  

WWW::Mechanize / javascript pointer

by spstansbury (Monk)
on Nov 12, 2009 at 04:07 UTC ( [id://806641]=perlquestion )

spstansbury has asked for the wisdom of the Perl Monks concerning the following question:

I've got a Mechanize script that logs on to a website, navigates through a couple of pages and prepares a report.

The resulting report is listed in a table with a download button in the last cell. Running links in WWW::Mechanize::Shell lists the download button as a link, but running open on that link does nothing.

The relevant code snippet is:

<a href="MySurvey_SummaryDownload.aspx?sm= yUsTZIyAVJQqSsTu3TSbdT0ZMzsfbk0cbSdAOVkNUDY%3d" class="itBtn">

Running HTTP::Recorder only shows:

$agent->follow_link(n => '25');

for this step.

Wireshark shows a:

GET http://www.surveymonkey.com/MySurvey_SummaryDownload.aspx?sm=yUsTZIyAVJQqSsTu3TSbdT0ZMzsfbk0cbSdAOVkNUDY%3d

What's the syntax for getting Mechanize to parse the "sm" so I can construct this GET?

As always, thanks for your input!

Replies are listed 'Best First'.
Re: WWW::Mechanize / javascript pointer
by Corion (Patriarch) on Nov 12, 2009 at 08:15 UTC

    Why would you want to "parse the sm"?

    The link you showed and the resulting URL are identical unless the whitespace in the link you showed actually exists:

    <a href="MySurvey_SummaryDownload.aspx?sm= yUsTZIyAVJQqSsTu3TSbdT0ZMzsfbk0cbSdAOVkNUDY%3d" class="itBtn">

    /MySurvey_SummaryDownload.aspx?sm=yUsTZIyAVJQqSsTu3TSbdT0ZMzsfbk0cbSdAOVkNUDY%3d

    Anyway, the general rule of thumb when automating things is to compare what your script does against what the reference does. So, compare what Wireshark gets from your browser against what it gets when you use WWW::Mechanize. Also test the site with Javascript disabled.
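
One way to see what the script actually sends, as a minimal sketch: WWW::Mechanize inherits add_handler from LWP::UserAgent, so a request_send handler can record each outgoing request for a side-by-side comparison with the Wireshark capture. The helper name log_requests is made up for illustration.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: collect every request the agent sends.
# Returns an array reference that fills up as requests go out.
sub log_requests {
    my ($mech) = @_;
    my @sent;
    $mech->add_handler(
        request_send => sub {
            my ($request) = @_;
            # as_string gives the request line followed by the headers
            push @sent, $request->as_string;
            return;    # returning nothing lets the request proceed normally
        }
    );
    return \@sent;
}
```

Call log_requests($mech) before the first get, run the script as usual, then print the collected entries and compare them with what the browser sent (URL, cookies, Referer and so on).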

      Thanks for the response.

      I want to be able to parse the "sm=" value because it is generated for each report. Since this is javascript, I don't have a button to click, so I need to parse the report identifier and build a GET request from it.

        If the site uses Javascript, you'll have to parse it, even though your HTML looks good enough. But personally, I'd use my module WWW::Mechanize::FireFox, which understands Javascript by virtue of automating FireFox.
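
If the href really is present in the HTML that Mechanize receives, a sketch along these lines might be enough to pick up the per-report link and its sm value without driving a browser. It assumes the download link is the first one on the page pointing at MySurvey_SummaryDownload.aspx; find_download is a hypothetical helper name, not part of WWW::Mechanize.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use URI;

# Hypothetical helper: locate the download link on the current page.
# Returns (absolute URL as a URI object, decoded sm value),
# or an empty list if no matching link is found.
sub find_download {
    my ($mech) = @_;
    my $link = $mech->find_link( url_regex => qr/MySurvey_SummaryDownload\.aspx/ )
        or return;
    # Resolve the relative href against the page's base URL
    my $url   = URI->new( $link->url_abs );
    # query_form returns decoded key/value pairs, so %3d comes back as "="
    my %query = $url->query_form;
    return ( $url, $query{sm} );
}
```

With the URL in hand, $mech->get($url) should issue the same GET that Wireshark showed, provided the session cookies from the login are still in the agent's cookie jar. If the link is only written into the page by JavaScript, find_link will not see it and this approach will not work.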
