http://www.perlmonks.org?node_id=851139

jonnyfolk has asked for the wisdom of the Perl Monks concerning the following question:

I am using WWW::Mechanize to log in to a user account and search the database. This has been a straightforward task in the past: I create a GET query using a search form and send it to my script, the query is appended to a URL which returns the search results to my script, and I carry on from there.

However now when I use this method there is a database error:

Microsoft VBScript runtime error '800a000d' Type mismatch: 'EndSQLDate' C:\WWWROOT\CUSTOMERS\K-O\MY\AGENT\../searchresults.asp, line 683
so I cannot retrieve the results.

The strange thing (to me) is that when I take the exact query that retrieves results in a browser (after logging in and searching there) and append it as the query to my script, I get the error. This seems to indicate that the "browser" created by WWW::Mechanize does not behave the same as the browser on my computer, and whatever the difference is, it has a negative consequence.

Are there some changes I could try in order to emulate the browser better, or does anyone have an idea how I can prevent this error? (By the way, the fact that I get the error demonstrates that the login was successful!)

use LWP::UserAgent;
use HTTP::Cookies;
use LWP::ConnCache;

# Constructor: builds the user agent with a persistent cookie jar.
sub new {
    my ( $this, $id ) = @_;
    my $ua = LWP::UserAgent->new(
        timeout => 90,
        agent   => 'User-Agent=Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.7',
    );
    $ua->cookie_jar( HTTP::Cookies->new( 'file' => 'cookies.txt', 'autosave' => 1 ) );
    $ua->conn_cache( LWP::ConnCache->new() );
    #$ua->show_progress(1);
    push @{ $ua->requests_redirectable }, 'POST';    # follow redirects after POST too
    my $self = { id => $id, ua => $ua };
    bless $self, __PACKAGE__;
}

# Logs in, then fetches the search results for the incoming query string.
sub getsearchresults {
    my $self = shift;
    $self->login;
    my $query     = $ENV{'QUERY_STRING'};
    my $searchurl = 'https://secure.url.net/agent/loadsearchresults.asp?' . $query;
    my $res;
    $res = $self->{ua}->get($searchurl)->content;
    print $res;
    exit;
}

Re: www::mechanize not behaving as browser?
by Anonymous Monk on Jul 24, 2010 at 09:49 UTC
    WWW::Mechanize is behaving as a browser.

    It is the stupid servers/CGI programs which are making assumptions, basically expecting a specific brand of browser and exploding when they get something different.

    So if you're going to fool these stupid programs, you have to pretend you are that specific brand of browser.

    The way you do this is by using a regular browser to navigate the website successfully while you record the HTTP conversation with HTTP::Recorder/WireShark/Ethereal/LiveHTTPHeaders... and then you configure WWW::Mechanize to send similar headers....
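One way to act on this is to set the recorded headers on the user agent and print what the script actually sends, so it can be compared line by line with the browser capture. The following is only a sketch: the `Accept` and `Accept-Language` values, the `Referer` URL, and the `example=1` query are placeholders, to be replaced with whatever the recorded browser session shows. The `request_send` handler returns a canned response, so nothing goes over the network. Note that the `agent` option takes only the header value, with no `User-Agent=` prefix.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

my $ua = LWP::UserAgent->new(
    timeout => 90,
    agent   => 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.7',
);

# Placeholder values: copy the real ones from the recorded browser session.
$ua->default_header( 'Accept'          => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' );
$ua->default_header( 'Accept-Language' => 'en-us,en;q=0.5' );

# Capture each outgoing request. Returning a response from a request_send
# handler makes LWP skip the real network call.
my @sent;
$ua->add_handler(
    request_send => sub {
        my ($request) = @_;
        push @sent, $request->as_string;
        return HTTP::Response->new( 200, 'OK' );
    }
);

# The Referer URL and the query string here are hypothetical.
$ua->get(
    'https://secure.url.net/agent/loadsearchresults.asp?example=1',
    Referer => 'https://secure.url.net/agent/search.asp',
);

print $sent[0];    # compare this with the request the browser sent
```

Once the two requests match, remove the handler so that requests reach the server again.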

      Hi, thanks for your comment - I am taking a look at Wireshark now.

      I have tried several different browsers, including IE and Firefox on Windows and Safari, Firefox and even Camino on Mac, and all have achieved successful search results. It doesn't seem too fussy about the particular browser, but there's obviously something different in my Mechanize browser which is sending things awry.

        If it isn't filtering by the user-agent string, it's probably setting a cookie via an img (which Mechanize doesn't load), or some form value via script... so it could be as simple as fetching an image, or setting some extra form values that are normally set via script... it all shows up on the wire :)
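If the cookie does come from an image, a sketch like the following could fetch every image the page references so that any `Set-Cookie` headers end up in the cookie jar. It assumes WWW::Mechanize is in use. The `example.test` host, the page, the image path, and the `seen` cookie are all made up, and the `request_send` handler stands in for the real site so the sketch runs offline.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Response;

# Mechanize keeps an in-memory cookie jar by default.
my $mech = WWW::Mechanize->new;

# Stand-in for the real server (hypothetical page and image).
$mech->add_handler(
    request_send => sub {
        my ($request) = @_;
        if ( $request->uri->path =~ /\.gif$/ ) {
            return HTTP::Response->new( 200, 'OK',
                [ 'Content-Type' => 'image/gif', 'Set-Cookie' => 'seen=1; path=/' ],
                'GIF89a' );
        }
        return HTTP::Response->new( 200, 'OK',
            [ 'Content-Type' => 'text/html' ],
            '<html><body><img src="/img/track.gif"></body></html>' );
    }
);

$mech->get('http://example.test/search.asp');

# Fetch each image as a browser would, then step back to the page.
for my $img ( $mech->images ) {
    next unless defined $img->url;
    $mech->get( $img->url_abs );
    $mech->back;
}

print $mech->cookie_jar->as_string;    # shows any cookies the images set
```

Against the real site, drop the handler and check whether the jar gains a cookie that the browser capture also shows.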
Re: www::mechanize not behaving as browser?
by Gangabass (Vicar) on Jul 24, 2010 at 23:49 UTC

    Maybe this is JavaScript? Try clearing all site-specific cookies in the browser, then disable JavaScript and make the request to the site again (in the browser). If you still get results, then you'll know that it is not JavaScript.

    If it is not JavaScript, then it can be any file (image, CSS) that the target page refers to (you can find it by the Set-Cookie header in the session log).

    UPDATE: and I see LWP::UserAgent but not WWW::Mechanize in your code! Life is short, so use WWW::Mechanize!
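A minimal sketch of what WWW::Mechanize adds over plain LWP::UserAgent here: it parses the search form and submits it, so hidden fields the page supplies are sent along with the values you fill in. The form name `search`, the fields `token` and `keywords`, and the `example.test` host are hypothetical, and the `request_send` handler replaces the real server so the sketch runs offline.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Response;

my $mech = WWW::Mechanize->new;

# Stand-in for the real server: every request gets the same page back,
# and each outgoing request is recorded for inspection.
my @requests;
$mech->add_handler(
    request_send => sub {
        my ($request) = @_;
        push @requests, $request;
        return HTTP::Response->new( 200, 'OK',
            [ 'Content-Type' => 'text/html' ],
            '<html><body>'
              . '<form name="search" method="get" action="/agent/loadsearchresults.asp">'
              . '<input type="hidden" name="token" value="abc123">'
              . '<input type="text" name="keywords">'
              . '<input type="submit" value="Search">'
              . '</form></body></html>' );
    }
);

$mech->get('http://example.test/agent/search.asp');

# Fill in the visible field; the hidden one is submitted automatically.
$mech->submit_form(
    form_name => 'search',
    fields    => { keywords => 'cottage' },
);

print $requests[-1]->uri, "\n";    # the URL Mechanize actually requested
```

Values that the page sets via JavaScript are not filled in this way and would still have to be passed in `fields` by hand.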