PerlMonks  

WWW::Mechanize "Input not found"

by ScottJohn (Novice)
on Jan 26, 2012 at 01:25 UTC (id://950004)

ScottJohn has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perl Monks,

I was trying to scrap data from the following website: http://web1.ncaa.org/stats/StatsSrv/careersearch and am getting the following message on 2 of 3 selection fields: "Input 'field x' not found..."

I am confused why it is apparently working for 1 field, but not the other 2. (I didn't paste the page source data since it was probably too large.) The site does navigate using Javascript. I hear that Java-based can be difficult to scrape using Perl. Should I try some other method?

My code is below:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
my $outfile = "testNCAAdata.txt";
open(OUTFILE, ">$outfile");
my $url = "http://web1.ncaa.org/stats/StatsSrv/careersearch";
$mech->get($url);
$mech->select('searchOrg','328');
$mech->select('academicYear','2009');
$mech->select('searchSport','MBB');
$mech->click();
$mech->click('submit',[0,1]);
$mech->get($url);
my $output_page = $mech->content();
print OUTFILE "$output_page";
close(OUTFILE);

I really appreciate any help you guys can offer. Thanks

Replies are listed 'Best First'.
Re: WWW::Mechanize "Input not found"
by Anonymous Monk on Jan 26, 2012 at 02:21 UTC

    I was trying to scrap data from the following website

    Um, scrap means throw away, recycle, chop into little pieces, melt down

    The site does navigate using Javascript. I hear that Java-based can be difficult to scrape using Perl. Should I try some other method?

    For the solution to every scraping problem, see Re^5: can't get WWW::Mechanize to sign in on JustAnswer or Web Testing with HTTP::Recorder or WWW::Mechanize::Firefox

    mech-dump http://web1.ncaa.org/stats/StatsSrv/careersearch will show you the noscript version of the HTML, and indeed, one of the fields you try to populate doesn't exist in the page as an HTML-only browser (such as WWW::Mechanize) sees it
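The same check can be done from inside a script before calling select(). What follows is only a rough sketch: input_names is a hypothetical helper, and $sample_html is an invented stand-in for a page, not the actual NCAA markup. It uses HTML::Form (the module WWW::Mechanize relies on for form handling) to list the input names present in the static HTML, then reports which of the three field names from the question appear there.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical helper: return the names of all named inputs
# found in every form of the given HTML text.
sub input_names {
    my ($html, $base_url) = @_;
    my @forms = HTML::Form->parse($html, $base_url);
    return map  { $_->name }
           grep { defined $_->name }
           map  { $_->inputs } @forms;
}

# Invented sample page (an assumption for illustration only):
# a form where just one of the three fields exists without JavaScript.
my $sample_html = <<'HTML';
<form action="/search" method="post">
  <select name="searchOrg"><option value="328">Example</option></select>
  <input type="submit" name="submit" value="Go">
</form>
HTML

my %present = map { $_ => 1 } input_names($sample_html, 'http://example.com/');

# Report which of the fields from the question are in the static HTML.
for my $field (qw(searchOrg academicYear searchSport)) {
    printf "%-14s %s\n", $field,
        $present{$field} ? 'found' : 'NOT in static HTML';
}
```

Against a live page, something like input_names($mech->content, $mech->base) after $mech->get($url) should give the same kind of listing. Any field reported as missing is presumably added by JavaScript, so select() cannot find it.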
