http://www.perlmonks.org?node_id=492536

cdherold has asked for the wisdom of the Perl Monks concerning the following question:

Dearest Monks,

I once again return to the monastery for monk wisdom.

I have been working with WWW::Mechanize. My goal is to browse through my brokerage account, scrape pages, and fill out forms to get stock information. I realize I have a lot of challenging work ahead of me in dealing with returned cookies and the like before I can finally get in. All I want right now is to reach the point where I can see that I have pulled down the logon page and can actually submit my user/pass. Then I can worry (and learn) about the cookies and other verification issues that need to be addressed.

I have read over Mechanize and see its power, but must confess my general lack of fluency with object oriented programming. I have mostly been using perl for link extraction and regex.

So now I am trying to learn some new stuff.

So I know where to start (straight from the docs with simple modifications) ...

    use WWW::Mechanize;
    my $mech = WWW::Mechanize->new();
    $url = "http://www.omniomix.com";
    $mech->get( $url );
    $mech->content( format => "text" );
    print $mech;

In the above code I was trying to get a view of the url text that (I had hoped) was extracted.

I got ...

Can't locate HTML/TreeBuilder.pm in @INC [...] at /usr/lib/perl5/site_perl/5.8.3/WWW/Mechanize.pm line 496.

And when I do this ...

    my $res = $ua->request(HTTP::Request->new(GET => "http://www.yahoo.no"));
    print $res;

I get this ...

LWP::UserAgent=HASH(0x8a415e8)HTTP::Response=HASH(0x8c1df48)

dudes, monks, I am lost.

Any words of wisdom that might help me in my quest would be greatly appreciated.

Thanks Monks,

Chris Herold

Replies are listed 'Best First'.
Re: Working with WWW::Mechanize
by svenXY (Deacon) on Sep 16, 2005 at 07:32 UTC
    Hi,
    I'd recommend using WWW::Mechanize::Shell. It is a shell that you can use almost like a browser. It will allow you to check HTTP return values, check where you surfed to etc.
    But the cool thing is that after you've done what you want to do with it, you can call the script() method to display your current session history as a Perl script using WWW::Mechanize. This code can then be saved.
    You can then start adjusting it to your needs.

    perl -MWWW::Mechanize::Shell -eshell will get you going.
    Regards,
    svenXY
Re: Working with WWW::Mechanize
by ikegami (Patriarch) on Sep 16, 2005 at 06:50 UTC

    Have you tried installing HTML-Tree (which contains HTML::TreeBuilder) to solve your first problem?
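
    Before installing anything, it may help to confirm which modules perl can actually find. The following is only a minimal sketch: module_available is a hypothetical helper name, not part of any library, and it simply tries to load each module from @INC.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any library): returns 1 if the named
# module can be loaded from @INC, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Turn HTML::TreeBuilder into HTML/TreeBuilder.pm, the form seen
    # in the "Can't locate ..." error message.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(HTML::TreeBuilder WWW::Mechanize)) {
    printf "%-20s %s\n", $module,
        module_available($module) ? 'installed' : 'missing';
}
```

    A module reported as missing would then need to be installed from CPAN (HTML::TreeBuilder ships in the HTML-Tree distribution, as noted above).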

    Concerning your second problem, the print statement is indicating $res is an HTTP::Response object. Have you read the documentation for this class? The is_* and content methods should be of primary interest. There's also as_string, which returns the whole response (headers and all) as text.
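
    As a rough illustration of those methods, here is a sketch that builds an HTTP::Response by hand, so it runs without touching the network. The status, header, and body values are made up for the example; a real response would come back from $ua->request(...).

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built response with illustrative values (assumption: a real one
# would be returned by LWP::UserAgent instead).
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/plain' ],
    "Hello, monks\n",
);

# Printing the object itself only shows its class and address.
print "object: $res\n";

# The is_* methods report the outcome; content returns the body.
if ($res->is_success) {
    print "body:   ", $res->content;
}
else {
    warn "failed: ", $res->status_line, "\n";
}

# as_string returns the status line, headers, and body as text.
print "--- full response ---\n", $res->as_string;
```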

Re: Working with WWW::Mechanize
by chb (Deacon) on Sep 16, 2005 at 07:27 UTC
    Every time a simple print $var; shows strange stuff with HASH(...) or ARRAY(...), try using Data::Dumper (it's a core module, included in every standard perl installation):

        use Data::Dumper;
        print Dumper($var);
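
    To make the difference concrete, here is a small sketch with a made-up data structure (the keys and values are purely illustrative). It prints the reference the plain way first, then through Dumper.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys so the dump is stable from run to run.
$Data::Dumper::Sortkeys = 1;

# Illustrative data only; any reference or object works the same way.
my $var = { user => 'chris', pages => [ 'logon', 'quotes' ] };

# A plain print shows just the reference type and address.
print "plain print: $var\n";

# Dumper returns a string showing the full structure.
my $dump = Dumper($var);
print $dump;
```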
Re: Working with WWW::Mechanize
by dn (Acolyte) on Sep 16, 2005 at 11:43 UTC
    The request() method of LWP::UserAgent returns an HTTP::Response object. This means you will need to deal with $res as an HTTP::Response object, not a simple string. From the HTTP::Response manpage, the appropriate way to deal with these would be the content() method.

        if ($res->is_success) {
            print $res->content;
        }
        else {
            warn $res->status_line, "\n";
        }

    See HTTP::Request and HTTP::Response.
      or if you're in the debugger,

      x $var

      will do something similar

Re: Working with WWW::Mechanize
by Akhasha (Scribe) on Sep 17, 2005 at 22:12 UTC
    There is another module that might make your life easier: HTTP::Recorder. It has few dependencies and allows you to record a session as you browse a site normally. The output for repeating the session is in the form of a WWW::Mechanize script.