PerlMonks  

WWW::Mechanize Webcrawler

by Thoery55 (Initiate)
on Aug 29, 2014 at 06:43 UTC (#1098948, perlquestion)
Thoery55 has asked for the wisdom of the Perl Monks concerning the following question:

I'm new to WWW::Mechanize and am using it to build a web scraping tool that pulls course data from a school website, so that the website I'm building can recognize conflicts. This is the Perl code I have so far:

#!/usr/bin/perl
use warnings;
use strict;
use WWW::Mechanize;

my $browser = WWW::Mechanize->new;
$browser->get('https://registrar.ucdavis.edu/courses/search/index.cfm');
$browser->form_number(3);    # Search form
$browser->select('subject', 'AAS');
$browser->submit();
print $browser->content();

On the website, you select a subject area and then click "Search". A table is then populated based on what you selected in the form.

For now I'm focused on getting my script to select one option from the dropdown menu, click "Search", and copy the results. However, I'm not sure it's actually working (it gives no errors, but I can't tell whether it's doing anything), and I don't know how to view the data that appears in the table. Any help would be appreciated!

Replies are listed 'Best First'.
Re: WWW::Mechanize Webcrawler
by Anonymous Monk on Aug 29, 2014 at 09:16 UTC
    JavaScript magic is used on this page, so you need to make the POST request manually. Here is the POST request (captured with Firefox's HttpFox extension):
    termYear           2014
    term               10
    course_number
    multiCourse
    course_title
    instructor
    subject            AAS
    course_start_eval  -
    course_start_time  -
    course_end_eval    -
    course_end_time    -
    course_status      -
    course_level       -
    course_units       -
    virtual            -
    termCode           201410
    runMe              1
    clearMe            1
    reorder
    gettingResults     0
    search             Search
    _cf_nodebug        true
    _cf_nocache        true
    And here is the code:
    $mech->post(
        "https://registrar.ucdavis.edu/courses/search/course_search_results_mod8.cfm",
        Content => {
            termYear => "2014",
            term     => "10",
            ...
            subject  => "AAS",
            ...
        },
    );
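To tie the POST above back to the original question (how to tell whether the request worked, and how to read the table that comes back), here is a minimal sketch rather than a tested solution. It assumes the results arrive as an ordinary HTML table and that the endpoint still behaves as it did when the request was captured. The helpers `table_rows` and `fetch_results` are hypothetical names introduced for illustration, and only the three fields shown in the reply are sent; the fields elided there as "..." are left out and may well be required by the server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use HTML::TreeBuilder;

# URL taken from the reply above; it may no longer be valid.
my $RESULTS_URL =
    'https://registrar.ucdavis.edu/courses/search/course_search_results_mod8.cfm';

# Hypothetical helper: return one array ref of cell texts per <tr> in $html.
# Note: with nested tables, an outer row also picks up the inner cells.
sub table_rows {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @rows;
    for my $tr ( $tree->look_down( _tag => 'tr' ) ) {
        my @cells = map { $_->as_trimmed_text }
                    $tr->look_down( _tag => qr/^t[dh]$/ );
        push @rows, \@cells if @cells;
    }
    $tree->delete;    # free the parse tree
    return @rows;
}

# Hypothetical helper: POST the search and return the response body.
sub fetch_results {
    my ($subject) = @_;

    # autocheck => 0 so that failures can be reported explicitly below.
    my $mech = WWW::Mechanize->new( autocheck => 0 );
    $mech->post(
        $RESULTS_URL,
        Content => {
            termYear => '2014',
            term     => '10',
            subject  => $subject,
            # Assumption: add the remaining captured fields here if needed.
        },
    );
    die 'POST failed: ' . $mech->response->status_line . "\n"
        unless $mech->success;
    return $mech->content;
}

# Only touch the network when a subject code is given on the command line.
if ( my $subject = shift @ARGV ) {
    my @rows = table_rows( fetch_results($subject) );
    print scalar(@rows), " table rows found\n";
    print join( "\t", @$_ ), "\n" for @rows;
}
```

Run it as `perl script.pl AAS` (the script name is arbitrary). If it reports zero rows, printing the raw content is a quick way to see what the server actually returned. A dedicated module such as HTML::TableExtract may be a better fit once the real table layout is known.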
