
WWW::Mechanize Problem

by pilgrim (Monk)
on Oct 10, 2003 at 01:12 UTC ( [id://298151]=perlquestion )

pilgrim has asked for the wisdom of the Perl Monks concerning the following question:

Sorry about the vague title; I'm not sure if it's a problem with redirection, cookies, the site I'm trying to automate, or none of the above.

Let me start by saying I've used WWW::Mechanize to automate two of my company's internal apps, and am now trying to automate a third that starts with a login page. For this reason, I can't give the link it's failing on; you wouldn't be able to access it.

I've managed to reduce my script to a very simple case that still fails. The would-be working version tests for success at every step of the way. Also, the test case of submitting bogus login information works quite well -- I get a 403, and react accordingly.

My simple case:
#!/usr/local/bin/perl -w
use strict;
use warnings;
use WWW::Mechanize;

my $url  = 'inaccessible';
my $ua   = WWW::Mechanize->new();
my $user = 'insecure';

$ua->get($url);
$ua->set_fields("j_username" => $user, "j_password" => $user);
$ua->submit();

if ($ua->success()) {
    print "Yay!";
} else {
    print "Login failed: " . $ua->response->status_line;
}
The output of my simple case is either:
Login failed: 500 Can't read entity body: Connection reset by peer
or more commonly
Login failed: 500 EOF when chunk header expected

In a browser, I get redirected to a page with 3 links, and my script would follow() these if I could get past this error. The server's access.log shows me the initial GET request, a couple of 302s to the login form, the POST to the authenticator (also a 302), and then a GET of the page o' links as a 200 (about 3K of data).

Why is WWW::Mechanize not getting this page? When I print the uri, I see it contains the same session information as the server logs. I don't seem to be getting a cookie, though. Is there some parameter I need to set in the new()?

Thanks for any assistance (or insightful questions). I'm really stumped on this, and apparently too dense (or frustrated) to be able to access the non-login-protected XML interface with SOAP::Lite, either. I'd eventually like to learn to do that as well, since both interfaces need testing, but this seemed the more concrete problem.

If it helps:
This is perl, v5.8.0 built for cygwin-multi-64int
I haven't tried outside cygwin. I just installed WWW::Mechanize this week.

--pilgrim

Replies are listed 'Best First'.
Re: WWW::Mechanize Problem
by InfiniteSilence (Curate) on Oct 10, 2003 at 21:26 UTC
    I don't know, pilgrim, but I tried your code and I don't see anything wrong with it. I created a bogus page:
    #!/usr/bin/perl -w
    use strict;
    use CGI qw/:standard/;

    CGI::initialize_globals();
    print header;
    print start_html($ENV{'QUERY_STRING'});
    if (!defined(param('j_username'))) {
        print start_form(-name=>'a_form', -method=>'GET', -action=>'login.cgi');
        print textfield(-name=>'j_username');
        print br,password_field(-name=>'j_password');
        print br,submit();
        print end_form;
    } else {
        if ((defined(param('j_password'))) and (param('j_password') eq '1234')) {
            print h1('YEAH!');
        } else {
            print h1('Bummer, wrong password!');
        }
    }
    print end_html;
    1;
    Then I ran your code under the debugger, and a look at the response shows that it got the data just fine:
    #!/usr/local/bin/perl -w
    use strict;
    use warnings;
    use WWW::Mechanize;

    my $url       = 'http://localhost/cgi-bin/login.cgi';
    my $ua        = WWW::Mechanize->new();
    my $user      = 'foobie';
    my $clearpass = '1999';

    $ua->get($url);
    #$ua->form_number('1');
    #$ua->field('j_username',$user);
    #$ua->field('j_password',$clearpass);
    #$ua->submit();
    $ua->set_fields("j_username" => $user, "j_password" => $clearpass);
    $ua->submit();

    if ($ua->success()) {
        print "Yay!";
    } else {
        print "Login failed: " . $ua->response->status_line;
    }
    I get:
    # from perl -d node_298151.pl
      DB<7> p $ua->res->{'_content'}
    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en-US"><head><title>j_username=foobie&amp;j_password=1999</title>
    </head><body><h1>Bummer, wrong password!</h1></body></html>
      DB<8>
    That is the correct response, since I supplied a bad password (1999 instead of 1234). Is your page being protected by something in .htaccess?

    Celebrate Intellectual Diversity

      I don't believe so, InfiniteSilence. This is a WebLogic server; I don't know if .htaccess is even a factor.

      One difference between our tests is that my (under test) form does a POST to a separate validator that either redirects to the appropriate page, or returns a 403 if you give it invalid info.

      Say, thanks, that may be the critical difference. I've been re-re-reading the docs on WWW::Mechanize and LWP::UserAgent, and I think the next logical step would be a rewrite of my simple case to use LWP::UserAgent directly.

      (I'm leaving out the additional troubleshooting that's led me to that conclusion; I think it would only muddy the waters. Suffice to say I believe I can build a test case at home over the weekend that mimics the problem, and then see whether LWP::UA solves it.)

      --pilgrim
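
A minimal sketch of what such an LWP::UserAgent rewrite might look like, offered as an illustration rather than as the code pilgrim actually wrote. The sub names build_ua and describe_chain are hypothetical, and the login URL, POST target, and credentials are assumed to arrive on the command line, since the real ones are not given in the thread. Two details matter for a form that POSTs to a validator which then redirects: LWP::UserAgent follows redirects only for GET and HEAD unless told otherwise, and it keeps no cookies unless given a cookie jar.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that keeps cookies in memory and follows
# redirects issued in response to a POST (off by default in LWP).
sub build_ua {
    my $ua = LWP::UserAgent->new(
        cookie_jar => {},    # in-memory HTTP::Cookies jar
        timeout    => 30,
    );
    push @{ $ua->requests_redirectable }, 'POST';
    return $ua;
}

# Return one "status-code URI" string per hop, oldest first,
# so the client-side chain can be compared with the server's access log.
sub describe_chain {
    my ($res) = @_;
    return map { $_->code . ' ' . $_->request->uri } ($res->redirects, $res);
}

# Only touches the network when all four (hypothetical) arguments are given:
#   login_url  post_url  username  password
if (@ARGV == 4) {
    my ($login_url, $post_url, $user, $pass) = @ARGV;
    my $ua = build_ua();

    $ua->get($login_url);    # pick up any session cookie first
    my $res = $ua->post($post_url,
        { j_username => $user, j_password => $pass });

    print "$_\n" for describe_chain($res);
    print "Cookies held:\n", $ua->cookie_jar->as_string;
    print $res->is_success
        ? "Yay!\n"
        : "Login failed: " . $res->status_line . "\n";
}
```

Printing the redirect chain and the cookie jar side by side makes it easier to see whether the failure happens before or after the final GET that the server logs as a 200.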
Re: WWW::Mechanize Problem
by pilgrim (Monk) on Nov 17, 2003 at 23:10 UTC
    The same code worked fine for me both at home and at the office against my emulation of the WebLogic behavior.

    I moved the code to perl, v5.8.0 built for i386-linux-thread-multi, and it works against the real WebLogic behavior.

    I did eventually get the SOAP::Lite interface going as well, so the backend gets exercised that way and I now have a third option for putting the front end through its paces. (With the brick wall I hit on Mechanize/Cygwin, I implemented tests in Rational Robot and LoadRunner VU Generator.)

    In case it saves anybody some time, when causing SOAP::Lite to create XML queries that could be parsed by WebLogic's implementation of SAX (JAX?) I found it helpful to expressly set 'xmlns:xsi' => "http://www.w3.org/2001/XMLSchema-instance" as a SOAP::Data attribute while building the query.
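
As a rough illustration of the attribute described above, the following sketch attaches 'xmlns:xsi' to a SOAP::Data element and prints the serialized result. The element name "query", its value, and the helper build_param are hypothetical; the thread does not show the real query, so this only demonstrates where the attribute goes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use SOAP::Lite;

my $XSI_NS = 'http://www.w3.org/2001/XMLSchema-instance';

# Build a named SOAP::Data element that explicitly declares the
# xsi namespace as an attribute on itself.
sub build_param {
    my ($name, $value) = @_;
    return SOAP::Data->name($name => $value)
                     ->attr({ 'xmlns:xsi' => $XSI_NS });
}

# Hypothetical element name and value, for illustration only.
my $param = build_param(query => 'example');

# Serialize just this element to inspect the generated XML.
print SOAP::Serializer->new->serialize($param), "\n";
```

In a real call the element would be passed as an argument to the remote method; serializing it on its own is just a convenient way to check the XML before sending it.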

    (I apologize for the buzzword overuse, I'm hoping to provide SuperSearch as many opportunities as possible to locate this potential nugget.)

    --pilgrim

Node Type: perlquestion [id://298151]
Approved by particle