http://www.perlmonks.org?node_id=578226

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Weird problem that I can't solve after many, many attempts. I have a script that uses WWW::Mechanize to log in to my message board.

It loads the LOGIN page fine (I tested this by dumping $mech->content after $mech->get( $forum_login ); and seeing the HTML source code).

The script then fills in the login form fields (it does NOT error out saying it cannot find such form fields), so that step appears to be working just fine.

Then I run a test on $mech->content to check for a FAILURE string (e.g., "Username and password incorrect") that would appear on a failed login attempt. This check FAILS because the page that comes back after the login data is submitted is empty.

$forum_login = "";
# either NUMBER or NAME depending if specific form will be found by name or number
$forum_type  = "number";
$forum_call  = "1";          # form number or name for login
$forum_user  = "username";   # NAME of the forum field for username
$forum_pass  = "passwrd";    # NAME of the forum field for username
$username    = "";           # username to signing
$password    = "";           # password to signin
$forum_fail  = "incorrect";  # text to find if login failed
#
# above empty variables are emptied for obvious reasons
# in this sample source code

use LWP::Simple;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new();
$mech->cookie_jar(HTTP::Cookies->new);
$mech->get( $forum_login );

if ($forum_type eq "number") {
    $mech->submit_form(
        form_number => $forum_call,
        fields      => {
            $forum_user => $username,
            $forum_pass => $password,
        }
    );
    print "Good!";
}
else {
    $mech->submit_form(
        form_name => $forum_call,
        fields    => {
            $forum_user => $username,
            $forum_pass => $password,
        }
    );
    print "Good!";
}
print $mech->content; # comes back empty
exit;
.
.
#######################
# Was sign in successful?
#######################
if ($mech->content =~ m/$forum_fail/i || $mech->content eq "") {
    my $time = localtime();
    open(LOG, ">> error.txt") or die "Error: $!";
    print LOG $time;
    print LOG $mech->response;
    print LOG "\n";
    close(LOG) or die "Error: $!";
    print $mech->content;
    print "ERROR:" . $mech->response;
}
else {
    print "LOGGED IN!";
}

Can anyone help me figure out why I'm losing my mech content after the login form is submitted?

Replies are listed 'Best First'.
Re: $mech->content() returns empty
by sulfericacid (Deacon) on Oct 14, 2006 at 02:51 UTC
    I have a number of web bots I made with WWW::Mechanize, and your code looks fine to me.

    When using message boards, there are a few things to keep in mind.

  • Some boards have reload times: if you load the page and post the data too quickly, the bot protection will catch you and block your last action. Use a 5 or 10 second sleep() between actions to avoid the bot traps and to be nicer to the server's resources.
  • Some forums use redirects once you sign in or after something is posted. Some forums also require the very next page to load before they'll post your data. If your forum redirects you upon successful login but does not say you logged in, chances are you weren't redirected to the page that sets your session ID or cookie. Go through it manually and see where you're being redirected, then load that page using WWW::Mechanize AFTER your last action. I'm 95% certain this is your current problem.
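
The two points above could be combined roughly as follows. This is only a minimal sketch, not a tested bot: the example.com URLs, the landing_url setting, and the helper names login_failed and login_with_pause are hypothetical, while the field names and failure text are borrowed from the question. WWW::Mechanize is loaded inside the sub so the small helper can be tried without network access.

```perl
use strict;
use warnings;

# Hypothetical settings: replace with the real values for your forum.
my %cfg = (
    login_url   => 'http://forum.example.com/login',      # assumed login page
    landing_url => 'http://forum.example.com/index.php',  # assumed post-login page
    form_number => 1,
    user_field  => 'username',
    pass_field  => 'passwrd',
    fail_text   => 'incorrect',
    pause       => 5,    # seconds to wait between actions
);

# True (1) if the page body contains the forum's "login failed" text.
sub login_failed {
    my ($html, $fail_text) = @_;
    return 0 unless defined $html;
    return $html =~ /\Q$fail_text\E/i ? 1 : 0;
}

# Log in, pausing between requests, then explicitly load the landing page.
# Returns the WWW::Mechanize object, or nothing if the failure text was seen.
sub login_with_pause {
    my ($cfg, $user, $pass) = @_;
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new( autocheck => 1 );   # die on HTTP errors

    $mech->get( $cfg->{login_url} );
    sleep $cfg->{pause};

    $mech->submit_form(
        form_number => $cfg->{form_number},
        fields      => {
            $cfg->{user_field} => $user,
            $cfg->{pass_field} => $pass,
        },
    );
    return if login_failed( $mech->content, $cfg->{fail_text} );
    sleep $cfg->{pause};

    # Load the post-login page yourself instead of relying on the redirect.
    $mech->get( $cfg->{landing_url} );
    return $mech;
}
```

It would be called as something like my $mech = login_with_pause(\%cfg, $username, $password) or die "Login failed"; after which $mech->content should hold the landing page.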


    "Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

    sulfericacid
      That did it!!! I tried your idea of loading the page it redirected me to after the login attempt and it works every time now. Thank you!
Re: $mech->content() returns empty
by kwaping (Priest) on Oct 13, 2006 at 22:58 UTC
    $mech->submit_form(...) returns an HTTP::Response object (from the docs). It is that object that you need to call content on. Example:
    # ... do stuff ...
    my $resp = $mech->submit_form(%params);
    print $resp->content if $resp->is_success;

    ---
    It's all fine and dandy until someone has to look at the code.

      It also advances where the $mech object points to. $mech->content should be ok after the submit. I do agree, however, with your sentiment that checking the return value of ->submit_form is a good idea. The documentation mentions other things to check as well like ->success and ->status.
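
A rough sketch of gathering those checks in one place after a submit. The helper name submit_and_report is made up for illustration; it only relies on the submit_form, success, status, uri and content methods that WWW::Mechanize documents, and it assumes $mech has already fetched the page holding the form.

```perl
use strict;
use warnings;

# Submit a form and return the basic diagnostics as a hash reference.
# $mech is expected to be a WWW::Mechanize object; %form_args are passed
# straight through to submit_form().
sub submit_and_report {
    my ($mech, %form_args) = @_;

    $mech->submit_form(%form_args);    # also advances $mech to the new page

    return {
        success => $mech->success ? 1 : 0,          # was the response a 2xx?
        status  => $mech->status,                   # numeric HTTP status code
        uri     => "" . $mech->uri,                 # where we ended up
        length  => length( $mech->content // '' ),  # 0 means an empty body
    };
}
```

Printing the returned values after the login submit shows at a glance whether the request succeeded, where it ended up, and whether the body really is empty.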

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

        I tried condensing the code a little bit to make it easier to figure out what goes wrong. The CONTENT is still empty but the URL shows that it did in fact redirect to the proper page!

        In theory I could just get that URL, but it'd be a new instance of it when I need to know what the exact source code was once the form was submitted (to check for login errors).

        Any other suggestions?

        use WWW::Mechanize;
        my $mech = WWW::Mechanize->new();
        $mech->cookie_jar(HTTP::Cookies->new);
        $mech->get( $forum_login );

        if ($forum_type eq "number") {
            $mech->submit_form(
                form_number => $forum_call,
                fields      => {
                    $forum_user => $username,
                    $forum_pass => $password,
                }
            );
            die "Couldn't submit form" unless $mech->success;
            my $stuff = $mech->content;
            print $stuff;
            my $url = $mech->uri;
            print $url;
        }
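
One possible way to see what came back at each hop, rather than only the final page, is to walk backwards through the responses with HTTP::Response's previous method. This is a sketch under the assumption that the forum uses real HTTP redirects (a meta refresh or JavaScript redirect would not show up here); redirect_chain is a hypothetical helper name.

```perl
use strict;
use warnings;

# Given the final HTTP::Response, return one hash per hop, oldest first,
# with the status code, the requested URL and the size of the body.
sub redirect_chain {
    my ($resp) = @_;
    my @chain;
    for ( my $r = $resp; defined $r; $r = $r->previous ) {
        unshift @chain, {
            code        => $r->code,
            url         => "" . $r->request->uri,
            body_length => length( $r->content // '' ),
        };
    }
    return @chain;
}
```

After the submit, something like printf "%s %s (%d bytes)\n", @{$_}{qw(code url body_length)} for redirect_chain($mech->response); would list every response in order, which may show whether the login POST itself returned a body before the redirect.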
      I have used this syntax for quite a while where I can call $mech->content after submitting a form or following a link to get the next page's HTML source code.

      But just out of curiosity, I tried your suggestion, but it, too, comes back completely empty.