PerlMonks  

WWW::Mechanize Webcrawler

by Thoery55 (Initiate)
on Aug 29, 2014 at 06:43 UTC ( #1098948=perlquestion )
Thoery55 has asked for the wisdom of the Perl Monks concerning the following question:

I'm new to WWW::Mechanize and am using it to build a web scraping tool that visits a school website and pulls course data, so that the website I'm building can recognize conflicts. This is the Perl code I have so far:

#!/usr/bin/perl
use warnings;
use strict;
use WWW::Mechanize;

my $browser = WWW::Mechanize->new;
$browser->get('https://registrar.ucdavis.edu/courses/search/index.cfm');
$browser->form_number(3);    # Search form
$browser->select('subject', 'AAS');
$browser->submit();
print $browser->content();

The way the website works, you select a subject area and then click "Search". A table is then populated based on what you selected in the form above.

For now I'm focused on getting my script to select one option from the dropdown menu, click "Search", and copy the results. I have two problems: first, I'm not sure the script is actually working (it gives no errors, but I can't tell whether it's doing anything), and second, I don't know how to view the data that appears in the table. Any help would be appreciated!
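One way to find out whether a request did anything is to ask the WWW::Mechanize object what it currently sees. The sketch below is only an illustration: `report_page` is a hypothetical helper name, and the script takes the page URL on the command line. It prints the HTTP status, the final URL, and a dump of every form on the page, which also helps confirm that `form_number(3)` really is the search form.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: print what Mechanize sees on the current page
# (HTTP status, final URL, and every form with its fields).
# Returns the number of forms found.
sub report_page {
    my ($mech) = @_;
    printf "HTTP %s, success: %s\n", $mech->status,
        ( $mech->success ? 'yes' : 'no' );
    printf "URL: %s\n", $mech->uri;

    my @forms = $mech->forms;
    printf "%d form(s) found\n", scalar @forms;
    $mech->dump_forms;    # writes each form's fields and values to STDOUT
    return scalar @forms;
}

# Usage: perl check_form.pl <url>
if (@ARGV) {
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get( $ARGV[0] );
    report_page($mech);
}
```

Calling `report_page($browser)` once after `get` and again after `submit` shows whether the URL or the page's forms changed between the two requests.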

Re: WWW::Mechanize Webcrawler
by Anonymous Monk on Aug 29, 2014 at 09:16 UTC
    This page uses JavaScript to load the results, so you need to make the POST request manually. Here is the POST request (captured with Firefox's HttpFox extension):
    termYear            2014
    term                10
    course_number       (empty)
    multiCourse         (empty)
    course_title        (empty)
    instructor          (empty)
    subject             AAS
    course_start_eval   -
    course_start_time   -
    course_end_eval     -
    course_end_time     -
    course_status       -
    course_level        -
    course_units        -
    virtual             -
    termCode            201410
    runMe               1
    clearMe             1
    reorder             (empty)
    gettingResults      0
    search              Search
    _cf_nodebug         true
    _cf_nocache         true
    And here is the code:
    $mech->post(
        "https://registrar.ucdavis.edu/courses/search/course_search_results_mod8.cfm",
        Content => {
            termYear => "2014",
            term     => "10",
            ...
            subject  => "AAS",
            ...
        },
    );
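For anyone who wants to try this end to end, here is a rough sketch that fills in the whole field list from the captured request and then pulls the rows out of the returned table. Several things here are assumptions rather than anything confirmed in this thread: the fields shown without a value in the capture are sent as empty strings, the last field name is taken to be `_cf_nocache`, the response is assumed to contain an ordinary HTML table, and HTML::TableExtract (a separate CPAN module) is just one way to read it. `search_fields` and `table_rows` are hypothetical helper names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use HTML::TableExtract;

# Results URL as given in the reply above.
my $RESULTS_URL =
    'https://registrar.ucdavis.edu/courses/search/course_search_results_mod8.cfm';

# Form fields copied from the captured POST request.
# Assumption: fields captured without a value are sent as empty strings.
sub search_fields {
    my ($subject) = @_;
    return {
        termYear          => '2014',
        term              => '10',
        course_number     => '',
        multiCourse       => '',
        course_title      => '',
        instructor        => '',
        subject           => $subject,
        course_start_eval => '-',
        course_start_time => '-',
        course_end_eval   => '-',
        course_end_time   => '-',
        course_status     => '-',
        course_level      => '-',
        course_units      => '-',
        virtual           => '-',
        termCode          => '201410',
        runMe             => '1',
        clearMe           => '1',
        reorder           => '',
        gettingResults    => '0',
        search            => 'Search',
        _cf_nodebug       => 'true',
        _cf_nocache       => 'true',    # assumed spelling of the last field
    };
}

# Return every row of every table in $html as an array ref of trimmed cell text.
sub table_rows {
    my ($html) = @_;
    my $te = HTML::TableExtract->new;    # no constraints: match all tables
    $te->parse($html);

    my @rows;
    for my $table ( $te->tables ) {
        for my $row ( $table->rows ) {
            push @rows, [
                map { my $c = defined $_ ? $_ : ''; $c =~ s/^\s+|\s+$//g; $c }
                    @$row
            ];
        }
    }
    return @rows;
}

# Usage: perl course_search.pl AAS
if (@ARGV) {
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->post( $RESULTS_URL, Content => search_fields( $ARGV[0] ) );
    print join( "\t", @$_ ), "\n" for table_rows( $mech->content );
}
```

The term values (`termYear`, `term`, `termCode`) are the ones from the 2014 capture, so they would need updating for any other term, and the endpoint itself may have changed since then.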
