Perl mechanize get Error!

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

Greetings Monks,

Below is my code; I don't know why it is not working.

use strict;
use WWW::Mechanize;

my $url = "http://www.truro-penwith.ac.uk/";

my $mech = WWW::Mechanize->new();
print "\nURL: $url ...\n";

eval {
    $mech->agent_alias('Windows Mozilla');
    #$mech->add_header('User-Agent'=>'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0');
    #$mech->add_header('Accept'=>'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8');
    #$mech->add_header('Accept-Language'=>'en-US,en;q=0.5');
    #$mech->add_header('Accept-Encoding'=>'gzip, deflate');
    #$mech->add_header('Cookie'=>'bb2_screener_=1385998863+111.92.64.106; PHPSESSID=078fc31740655a3a3f5fb280dbdf335d');
    $mech->add_header('Connection'=>'keep-alive');

    $mech->get($url);
};

#$mech = $mech->content();
$mech = $mech->response->content();
print $mech;
exit;

Does anyone know what the actual reason could be?

The site is detecting this as a script. I tried adding headers with add_header and default_header, but nothing works. The response shows a 400 error and sometimes a 403 error. I wonder why this happens even though I set the headers. Any ideas? I have none :(

Thanks in advance


Replies are listed 'Best First'.
Re: Perl mechanize get Error! by PerlSufi (Friar) on Dec 02, 2013 at 20:32 UTC
What is your goal with this script? I have written a brief tutorial on using Mechanize that can be found here: WWW::Mechanize Basics.

If you need to do a lot of navigating on the site, I would recommend WWW::Mechanize::Firefox, since the site uses a lot of JavaScript, and WWW::Mechanize and JavaScript don't get along too well. Also, try `$mech->dump_text;`

I also recommend getting the Firebug Firefox extension and manually inspecting the page for each thing you want to access. For example, the URL for 'Latest News' is http://www.truro-penwith.ac.uk/category/news/, which I determined by using the Firebug extension. So to go there, just do `$mech->get('http://www.truro-penwith.ac.uk/category/news/');`

UPDATE: Also, simply `my $mech = WWW::Mechanize->new(); $mech->get('http://www.truro-penwith.ac.uk/'); $mech->dump_text;` worked for me. You don't need to do anything with headers.
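To see why a request is being refused, it can help to stop the `eval` from hiding the failure and look at the status line and the request that was actually sent. The following is only a minimal sketch, not a definitive diagnosis tool: `explain_status` is a hypothetical helper invented here for illustration, and the script assumes the URL is passed as the first command-line argument. It turns off `autocheck` so that `get` returns the response instead of dying.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize): a rough hint per status code.
sub explain_status {
    my ($code) = @_;
    return 'success'                                  if $code >= 200 && $code < 300;
    return 'bad request: the server rejected the request as malformed' if $code == 400;
    return 'forbidden: the server refused to serve this client'        if $code == 403;
    return 'other error';
}

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    require WWW::Mechanize;

    # autocheck => 0 makes get() return the response rather than die on errors.
    my $mech = WWW::Mechanize->new(autocheck => 0);
    $mech->agent_alias('Windows Mozilla');

    my $res = $mech->get($url);
    print $res->status_line, "\n";
    print explain_status($res->code), "\n";

    # On failure, show the request headers that were really sent.
    print $res->request->as_string unless $res->is_success;
}
```

Comparing the printed request headers with what a browser sends to the same URL is one way to narrow down which header the server objects to.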
Re^2: Perl mechanize get Error! by Anonymous Monk on Jan 04, 2014 at 08:26 UTC
Hi PerlSufi, you are great. OK, can you check this: https://thebigword-careers.irecruittotal.com/cac/SearchVacancy.aspx?EmploymentTypeID=0&Intranet=0 and give us a solution? Take it as a challenge. ;) Best, Anonymous Monk
Re^3: Perl mechanize get Error! by PerlSufi (Friar) on Jan 06, 2014 at 15:04 UTC
I'm not really sure what the 'challenge' is. Do you want to be able to submit that form?

use strict;
use warnings;
use WWW::Mechanize;

# Takes the vacancy to search for as the first command-line argument
my $mech = WWW::Mechanize->new();
$mech->get("https://thebigword-careers.irecruittotal.com/cac/SearchVacancy.aspx?EmploymentTypeID=0&Intranet=0");
my $vacancy = $ARGV[0];
# Single quotes keep the literal '$' characters in the field name
$mech->field('ctl00$mvMintPP$ctl00$ContentPlaceHolder_Main$mvMintPP$ctl00$txbJobRef', $vacancy);
$mech->click_button(value => "Search Vacancies");
$mech->dump_text;

...might work.
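One detail worth checking in that snippet: the ASP.NET field name contains literal `$` characters. Inside a double-quoted Perl string these are read as variables such as `$mvMintPP`, which is a compile error under `use strict`, so the name has to be single-quoted or have its dollar signs escaped. A small sketch showing that both spellings yield the same string (the variable names here are just for illustration):

```perl
use strict;
use warnings;

# Single quotes: no interpolation, every '$' is kept literally.
my $single  = 'ctl00$mvMintPP$ctl00$ContentPlaceHolder_Main$mvMintPP$ctl00$txbJobRef';

# Double quotes work too, but only if every '$' is escaped.
my $escaped = "ctl00\$mvMintPP\$ctl00\$ContentPlaceHolder_Main\$mvMintPP\$ctl00\$txbJobRef";

# Count the literal dollar signs that survived.
my $dollars = () = $single =~ /\$/g;

print "identical\n" if $single eq $escaped;
print "dollar signs: $dollars\n";
```

Either form can then be passed as the first argument to `$mech->field`.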
Re: Perl mechanize get Error! by Anonymous Monk on Dec 02, 2013 at 17:25 UTC
They don't want you to scrape the university website. Solution: don't.
Re^2: Perl mechanize get Error! by Anonymous Monk on Jan 04, 2014 at 08:23 UTC
what an idiot you are... :D???!!!
