www::mechanize not behaving as browser?

by jonnyfolk (Vicar)
on Jul 24, 2010 at 09:37 UTC (#851139)
jonnyfolk has asked for the wisdom of the Perl Monks concerning the following question:

I am using WWW::Mechanize to log in to a user account and search the database. This has been a straightforward task in the past: I create a GET query using a search form and send it to my script, which appends the query to a URL, retrieves the search results, and carries on from there.

However now when I use this method there is a database error:

    Microsoft VBScript runtime error '800a000d'
    Type mismatch: 'EndSQLDate'
    C:\WWWROOT\CUSTOMERS\K-O\MY\AGENT\../searchresults.asp, line 683

so I cannot retrieve the results.

The strange thing (to me) is that when I take exactly the query that retrieves results when I log in and search with a browser, and append it as the query to my script, I get the error. This seems to indicate that the browser emulated by WWW::Mechanize does not behave the same as the browser on my computer, and whatever the difference is, it has a negative consequence.

Are there some changes I could try in order to emulate the browser better, or does anyone have an idea how I can prevent this error? (By the way, the fact that I get the error at all demonstrates that the login was successful!)

use LWP::UserAgent;
use HTTP::Cookies;
use LWP::ConnCache;

sub new {
    my ( $this, $id ) = @_;
    my $ua = LWP::UserAgent->new(
        timeout => 90,
        agent   => 'User-Agent=Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.7',
    );
    # Keep cookies between requests and save them to disk
    $ua->cookie_jar( HTTP::Cookies->new( 'file' => 'cookies.txt', 'autosave' => 1 ) );
    $ua->conn_cache( LWP::ConnCache->new() );
    #$ua->show_progress(1);
    # Follow redirects issued in response to POST requests as well
    push @{ $ua->requests_redirectable }, 'POST';
    my $self = { id => $id, ua => $ua };
    bless $self, __PACKAGE__;
}

sub getsearchresults {
    my $self = shift;
    $self->login;
    # Pass the incoming query string through to the remote search page
    my $query     = $ENV{'QUERY_STRING'};
    my $searchurl = 'https://secure.url.net/agent/loadsearchresults.asp?' . $query;
    my $res;
    $res = $self->{ua}->get($searchurl)->content;
    print $res;
    exit;
}

Re: www::mechanize not behaving as browser?
by Anonymous Monk on Jul 24, 2010 at 09:49 UTC
    WWW::Mechanize is behaving as a browser.

    It is the stupid servers/CGI programs that are making assumptions, basically expecting a specific brand of browser and exploding when they get something different.

    So if you're going to fool these stupid programs, you have to pretend you are that specific brand of browser.

    The way you do this is by using a regular browser to navigate the website successfully while you record the HTTP conversation with HTTP::Recorder/WireShark/Ethereal/LiveHTTPHeaders... and then you configure WWW::Mechanize to send similar headers....
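A minimal sketch of the "send similar headers" step, assuming the capture showed the header values below (they are placeholders, not values recorded from the real site, and the `search.asp` Referer page is a hypothetical name). It uses LWP::UserAgent, as in the posted code; WWW::Mechanize is a subclass, so the same calls should apply there. Note that the agent string is passed without a `User-Agent=` prefix, because LWP supplies the header name itself. The request is only built, not sent, so its text can be compared line by line with what the browser sent.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Placeholder header values: replace them with what the capture shows.
my %browser_headers = (
    'Accept'          => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language' => 'en-us,en;q=0.5',
    'Referer'         => 'https://secure.url.net/agent/search.asp',    # hypothetical page
);

my $ua = LWP::UserAgent->new(
    timeout => 90,
    agent   => 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.7',
);

# Headers set here are added to every request made through $ua.
$ua->default_header( $_ => $browser_headers{$_} ) for keys %browser_headers;

# Build the request without sending it, then print it for comparison.
my $query   = $ENV{'QUERY_STRING'} // '';
my $request = $ua->prepare_request(
    HTTP::Request->new( GET => 'https://secure.url.net/agent/loadsearchresults.asp?' . $query )
);
print $request->as_string;
```

Once the printed request matches the browser's, the same `$ua` can be used for the real `get` call.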

      Hi, thanks for your comment - I am taking a look at Wireshark now.

      I have tried several different browsers, including IE & Firefox on Windows and Safari, Firefox and even Camino on Mac, and all have achieved successful search results. It doesn't seem too fussy about the particular browser, but there's obviously something different in my Mechanize browser which is sending things awry.

        If it isn't filtering by the user-agent string, it's probably setting a cookie via an img (which Mechanize doesn't load), or some form value via script... so it could be as simple as fetching an image, or setting some extra form values normally set via script ... it all shows up on the wire :)
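One way to act on the "fetch an image" idea is to list the sub-resources a page refers to, so they can be requested the way a browser would. This is only a sketch: `subresource_urls` is a helper name made up for illustration, and the sample HTML and its file names are hypothetical. It uses HTML::LinkExtor, which ships with HTML::Parser.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Return the absolute URLs of images, stylesheets and scripts used by a page.
sub subresource_urls {
    my ( $html, $base_url ) = @_;
    my %wanted = ( img => 'src', link => 'href', script => 'src' );
    my @urls;
    my $extractor = HTML::LinkExtor->new(
        sub {
            my ( $tag, %attr ) = @_;
            my $name = $wanted{$tag} or return;    # skip <a>, <form>, ...
            push @urls, "$attr{$name}" if defined $attr{$name};
        },
        $base_url,    # relative links are resolved against this URL
    );
    $extractor->parse($html);
    $extractor->eof;
    return @urls;
}

# Hypothetical page content, standing in for the real search form page.
my $sample = <<'HTML';
<html><head><link rel="stylesheet" href="/css/site.css"></head>
<body><img src="images/spacer.gif"><a href="other.asp">other</a></body></html>
HTML

print "$_\n" for subresource_urls( $sample, 'https://secure.url.net/agent/search.asp' );
```

Each URL could then be fetched with the same user agent object (for example `$self->{ua}->get($url)`) before the search request, so that any cookie set by those responses lands in the cookie jar.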
Re: www::mechanize not behaving as browser?
by Gangabass (Priest) on Jul 24, 2010 at 23:49 UTC

    Maybe this is JavaScript? Try clearing all site-specific cookies in the browser, then disable JavaScript and make the request to the site again (in the browser). If you still get results, then you'll know that this is not JavaScript.

    If it is not JavaScript, then it can be any file (image, CSS) which the target page refers to (you can find it by the Set-Cookie header in the session log).

    UPDATE: and I see LWP::UserAgent but not WWW::Mechanize in your code! Life is short, so use WWW::Mechanize!
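The Set-Cookie check can also be done from the Perl side. The sketch below, under the assumption that an LWP::UserAgent (or WWW::Mechanize) object makes the requests, registers a `response_done` handler that reports every Set-Cookie header it sees. `set_cookie_report` is a hypothetical helper name, and the URL and cookie in the offline demonstration are invented.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

# Describe each Set-Cookie header a response carries, one string per cookie.
sub set_cookie_report {
    my ($response) = @_;
    my $url = $response->request ? $response->request->uri : '(unknown URL)';
    return map { "$url sets $_" } $response->header('Set-Cookie');
}

my $ua = LWP::UserAgent->new;

# Called by LWP after each response it receives through this user agent.
$ua->add_handler(
    response_done => sub {
        my ($response) = @_;
        print STDERR "$_\n" for set_cookie_report($response);
        return;
    }
);

# Offline demonstration on a hand-built response (invented URL and cookie).
my $fake = HTTP::Response->new( 200, 'OK',
    [ 'Set-Cookie' => 'ASPSESSIONID=abc123; path=/' ] );
$fake->request(
    HTTP::Request->new( GET => 'https://secure.url.net/agent/images/spacer.gif' ) );
print "$_\n" for set_cookie_report($fake);
```

Comparing this report with the Set-Cookie headers in the browser's session log should show which request sets a cookie that the script never receives.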
