http://www.perlmonks.org?node_id=693223


in reply to Re: Mechanize, Forms, Links, problem from Javascript?
in thread Mechanize, Forms, Links, problem from Javascript?

Thanks guys. HTTP::Recorder looks very cool and useful, but it doesn't deal with Javascript either. It just pumps out the code that I'd already tried.

So I tried Live HTTP with Firefox. Oh man. I'm not super well versed in this stuff. I've attached my Live HTTP output below. The only thing I've been able to think of is to get the ../sipp2005/form_A3.php page "manually" by appending the ses_id line to the full form_A3 URL after a ?, both with the +'s and with them replaced by %20, since they appear as spaces in the page itself. That doesn't work even in my browser, already logged in to the site. Any help is GREATLY appreciated...

There's stuff before this, but none of it, like the login, is Javascript dependent, so I can get there fine. I can get to the http://sipp.pu.go.id/sipp/sipp.php?thn=2008&kdprop=01 page just by requesting that URL after logging in (through Mechanize). The thn and kdprop values come from two drop-down select controls that use Javascript, so I can't use the controls properly, but just getting the URL works fine. There was statcounter.com HTTP content after the first request below, but I've deleted that. The second request, the POST, is what I need and can't get to work...

----------------------------------------------------------
http://sipp.pu.go.id/sipp/sipp.php?thn=2008&kdprop=01

GET /sipp/sipp.php?thn=2008&kdprop=01 HTTP/1.1
Host: sipp.pu.go.id
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://sipp.pu.go.id/sipp/sipp.php?PHPSESSID=51dfb45045363353c6280cae9c591498

HTTP/1.x 200 OK
Date: Fri, 20 Jun 2008 20:30:03 GMT
Server: Apache/2.0.63 (Win32) PHP/4.4.2
X-Powered-By: PHP/4.4.2
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 5580
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
----------------------------------------------------------
http://sipp.pu.go.id/sipp2005/form_A3.php

POST /sipp2005/form_A3.php HTTP/1.1
Host: sipp.pu.go.id
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://sipp.pu.go.id/sipp/sipp.php?thn=2008&kdprop=01
Content-Type: application/x-www-form-urlencoded
Content-Length: 281

ses_id=sid&fiscal=2008&thnang=&propinsi=01&proyektemp=0905497004-110037040+-Ir.+Bambang+Erianto%2CMM++++++++&nmpinpro=-Ir.+Bambang+Erianto%2CMM++++++++&nippin=110037040&nmproyek=PUSAT-SEKRETARIAT-SNVT+PENANGANAN+MENDESAK+DAN+TANGGAP+DARURAT&proyek=0905497004&nmpropinsi=DKI+Jakarta

HTTP/1.x 200 OK
Date: Fri, 20 Jun 2008 20:30:49 GMT
Server: Apache/2.0.63 (Win32) PHP/4.4.2
X-Powered-By: PHP/4.4.2
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 2359
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
----------------------------------------------------------
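The POST body captured above is ordinary application/x-www-form-urlencoded data, where a + stands for a space and %2C for a comma. As a rough sketch (not part of the original post), the body can be decoded into name/value pairs with core Perl only. The helper name decode_form_body is made up for illustration, and $body is a shortened copy of the captured body with the trailing padding dropped.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a form-urlencoded body into ordered
# [name, value] pairs. Handles single-byte %XX escapes only.
sub decode_form_body {
    my ($body) = @_;
    my @pairs;
    for my $field (split /&/, $body) {
        my ($name, $value) = split /=/, $field, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                               # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one byte
        }
        push @pairs, [ $name, $value ];
    }
    return @pairs;
}

# Shortened copy of the captured body (assumption: padding removed).
my $body = 'ses_id=sid&fiscal=2008&thnang=&propinsi=01'
         . '&nmpinpro=-Ir.+Bambang+Erianto%2CMM'
         . '&nippin=110037040&nmpropinsi=DKI+Jakarta';

printf "%-12s = '%s'\n", @$_ for decode_form_body($body);
```

Seeing the fields decoded this way makes it easier to compare them against the names and values of the form controls on the page.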

Re^3: Mechanize, Forms, Links, problem from Javascript?
by Cody Pendant (Prior) on Jun 22, 2008 at 06:24 UTC
    Well, the output there says that there was a POST request to http://sipp.pu.go.id/sipp2005/form_A3.php with the content
    ses_id=sid&fiscal=2008&thnang=&propinsi=01&proyektemp=0905497004-110037040+-Ir.+Bambang+Erianto%2CMM++++++++&nmpinpro=-Ir.+Bambang+Erianto%2CMM++++++++&nippin=110037040&nmproyek=PUSAT-SEKRETARIAT-SNVT+PENANGANAN+MENDESAK+DAN+TANGGAP+DARURAT&proyek=0905497004&nmpropinsi=DKI+Jakarta

    Assuming the server accepts GET requests as well as POST requests, that might be equivalent to requesting this URL:

    http://sipp.pu.go.id/sipp2005/form_A3.php?ses_id=sid&fiscal=2008&thnang=&propinsi=01&proyektemp=0905497004-110037040+-Ir.+Bambang+Erianto%2CMM++++++++&nmpinpro=-Ir.+Bambang+Erianto%2CMM++++++++&nippin=110037040&nmproyek=PUSAT-SEKRETARIAT-SNVT+PENANGANAN+MENDESAK+DAN+TANGGAP+DARURAT&proyek=0905497004&nmpropinsi=DKI+Jakarta

    Now, I've gone to that URL and ... I can't read Bahasa Indonesia, so I don't know if that's the information you want or an error message.
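    For anyone who wants to build such a URL from the individual fields rather than pasting the captured body, here is a minimal sketch using core Perl only. The helper name form_urlencode is hypothetical, it handles single-byte characters only, and whether the server treats the resulting GET the same as the POST is still an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: encode a flat name/value list as a
# form-urlencoded query string (spaces become '+', other
# reserved characters become %XX).
sub form_urlencode {
    my (@pairs) = @_;
    my @fields;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        for ($name, $value) {
            $_ = '' unless defined $_;
            s/([^A-Za-z0-9\-._~ ])/sprintf('%%%02X', ord $1)/ge;
            tr/ /+/;
        }
        push @fields, "$name=$value";
    }
    return join '&', @fields;
}

# A few of the fields from the captured request, for illustration.
my $query = form_urlencode(
    ses_id     => 'sid',
    fiscal     => '2008',
    propinsi   => '01',
    nmpropinsi => 'DKI Jakarta',
);

print "http://sipp.pu.go.id/sipp2005/form_A3.php?$query\n";
```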



    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...

      Just for anyone googling a similar problem: I got this working. In the end it was just a simple POST from Mechanize. Using the same $br = WWW::Mechanize->new() object throughout, I went to the login page, logged in, followed the link to the trouble page, and then went directly to the URL that the first two drop-downs would send you to. To reproduce what happens when you select from the 3rd drop-down and then click one of the data table links, I did this:

      # Send the same fields the browser sent in the captured POST.
      $br->post(
          'http://sipp.pu.go.id/sipp2005/form_A3.php',
          [
              'ses_id'     => 'sid',
              'fiscal'     => $year,
              'thnang'     => '',
              'propinsi'   => $prop,
              'proyektemp' => $satker->{'proyektemp'},
              'nmpinpro'   => $satker->{'nmpinpro'},
              'nippin'     => $satker->{'nippin'},
              'nmproyek'   => $satker->{'nmproyek'},
              'proyek'     => $satker->{'proyek'},
              'nmpropinsi' => $state_names{$prop},
          ]
      );
      my $resp = $br->content();

      All the values in the POST come from the names and values of the options in the 3rd drop-down select input. Comparing them with the HTTP capture above, it's easy to see how to parse them into the individual fields. I got the names and values like this:

      my $form      = $br->form_name('sipp');
      my $input     = $form->find_input('proyektemp');
      my @pt_values = $input->possible_values;   # the option values
      my @pt_names  = $input->value_names;       # the option labels

      See the HTML::Form documentation on CPAN for more on that...
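      As an illustration of that parsing step, here is a minimal sketch that splits one proyektemp option value into the separate fields seen in the capture. The layout (project code, a hyphen, the NIP, a space, then the name with its padding) is inferred from the single request shown in this thread, so treat the pattern as an assumption. The helper name split_proyektemp is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: split a proyektemp option value into the
# fields sent alongside it in the POST. Returns a hash reference,
# or nothing if the value does not match the assumed layout.
sub split_proyektemp {
    my ($value) = @_;
    my ($proyek, $nippin, $nmpinpro) = $value =~ /^(\d+)-(\d+) (.*)$/s
        or return;
    return {
        proyektemp => $value,
        proyek     => $proyek,
        nippin     => $nippin,
        nmpinpro   => $nmpinpro,   # keeps the leading '-' and trailing padding
    };
}

# Decoded value taken from the captured request ('+' decoded to spaces).
my $satker = split_proyektemp('0905497004-110037040 -Ir. Bambang Erianto,MM        ');

printf "%-10s = '%s'\n", $_, $satker->{$_} for sort keys %$satker;
```

      The nmproyek field does not appear inside the option value, so it presumably comes from the matching entry in the option labels.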

Re^3: Mechanize, Forms, Links, problem from Javascript?
by Anonymous Monk on Jun 21, 2008 at 10:35 UTC
    Um, HTTP::Recorder is easier than Live HTTP Headers. You use a browser to do what you want (JS or no), and HTTP::Recorder records the conversation, which you then duplicate using Mechanize. Otherwise there really is no way without learning HTTP/CGI....
      I worked with HTTP::Recorder and went through the browser. The code it gave me back was exactly what I'd written myself in mechanize, and didn't work. Can I read up on HTTP/CGI and figure out what's going on from live/headers? I guess that's my only option now...
      From the docs:
      WWW::Mechanize can't play back Javascript actions, and HTTP::Recorder doesn't record them.