Working with WWW::Mechanize

cdherold has asked for the wisdom of the Perl Monks concerning the following question:

Dearest Monks,

I once again return to the monastery for monk wisdom.

I have been working with WWW::Mechanize. My goal is to browse through my brokerage account, scrape pages, and fill out forms to get stock information. I realize I have a lot of challenging work before me in terms of dealing with returned cookies and such in order to finally get in. All I want to get to right now is the point where I can see that I have pulled down the logon page and can actually submit my user/pass. Then I can worry (and learn) about the cookies and other verification issues that need to be addressed.

I have read over the Mechanize docs and see its power, but must confess my general lack of fluency with object-oriented programming. I have mostly been using Perl for link extraction and regexes.

So now I am trying to learn some new stuff.

So I know where to start (straight from the docs with simple modifications) ...

use WWW::Mechanize;
my $mech = WWW::Mechanize->new();
$url = "http://www.omniomix.com";
$mech->get( $url );
$mech->content( format => "text" );
print $mech;

In the above code I was trying to get a view of the url text that (I had hoped) was extracted.

I got ...

Can't locate HTML/TreeBuilder.pm in @INC [...] at /usr/lib/perl5/site_perl/5.8.3/WWW/Mechanize.pm line 496.

And when I do this ...

my $res = $ua->request(HTTP::Request->new(GET => "http://www.yahoo.no"));
print $res;

I get this ...

LWP::UserAgent=HASH(0x8a415e8)HTTP::Response=HASH(0x8c1df48)

dudes, monks, I am lost.

Any words of wisdom that might help me in my quest would be greatly appreciated.

Thanks Monks,

Chris Herold


Replies are listed 'Best First'.
Re: Working with WWW::Mechanize by svenXY (Deacon) on Sep 16, 2005 at 07:32 UTC
Hi, I'd recommend using WWW::Mechanize::Shell. It is a shell that you can use almost like a browser. It lets you check HTTP return values, see where you surfed to, and so on. The cool thing is that after you've done what you want to do with it, you can call the script() method to display your current session history as a Perl script using WWW::Mechanize. You can save that code and then adjust it to your needs. `perl -MWWW::Mechanize::Shell -eshell` will get you going. Regards, svenXY
Re: Working with WWW::Mechanize by ikegami (Patriarch) on Sep 16, 2005 at 06:50 UTC
Have you tried installing HTML-Tree (which contains HTML::TreeBuilder) to solve your first problem? Concerning your second problem, the `print` output indicates that `$res` is an HTTP::Response object. Have you read the documentation for this class? The `is_*` and `content` methods should be of primary interest. There's also `as_string`, which returns the whole response (headers and all) as text.
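Installing HTML-Tree removes the error, but the original script would still print the `$mech` object itself rather than the page text, because the return value of `content` is thrown away. A minimal sketch of the corrected flow, assuming HTML::TreeBuilder is installed; the helper name `page_text` and the command-line handling are illustrative, not from the original post:

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Fetch a page and return its visible text.
# format => 'text' relies on HTML::TreeBuilder (HTML-Tree distribution).
# page_text is a hypothetical helper name used only for this sketch.
sub page_text {
    my ($url) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );   # die on HTTP errors
    $mech->get($url);
    # content() returns the page; use that value instead of printing $mech.
    return $mech->content( format => 'text' );
}

# Only fetch when a URL is supplied on the command line.
if (@ARGV) {
    print page_text( $ARGV[0] ), "\n";
}
```

Saved under a name of your choosing, it could be run as `perl page_text.pl http://www.omniomix.com`.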
Re: Working with WWW::Mechanize by chb (Deacon) on Sep 16, 2005 at 07:27 UTC
Every time a simple `print $var;` shows strange stuff like `HASH(...)` or `ARRAY(...)`, try using `Data::Dumper` (it's a core module, included in every standard Perl installation): `use Data::Dumper; print Dumper($var);`
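To make the difference concrete, here is a small sketch that prints the same reference both ways. The data in `$var` is made up for illustration; any reference or object would behave the same way.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical data standing in for any reference or object.
my $var = { ticker => 'XYZ', prices => [ 10.5, 11.25 ] };

# Sort hash keys so the dump is stable from run to run.
$Data::Dumper::Sortkeys = 1;

# Plain print shows only the reference type and its memory address.
print "$var\n";

# Dumper returns the whole structure as readable Perl source text.
print Dumper($var);
```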
Re: Working with WWW::Mechanize by dn (Acolyte) on Sep 16, 2005 at 11:43 UTC
The request() method of LWP::UserAgent returns an HTTP::Response object. This means you will need to deal with $res as an HTTP::Response object, not as a simple string. According to the HTTP::Response manpage, the appropriate way to get at the body is the content() method: `if ($res->is_success) { print $res->content; } else { warn $res->status_line, "\n"; }` See HTTP::Request and HTTP::Response.
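Put together as a complete script, that might look like the sketch below. It assumes LWP is installed; the helper name `describe_response` and the use of a command-line URL are illustrative choices, and the creation of `$ua` is filled in because the snippet in the question does not show it.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

# Turn a response object into printable text:
# the body on success, the status line otherwise.
# describe_response is a hypothetical helper name for this sketch.
sub describe_response {
    my ($res) = @_;
    return $res->is_success
        ? $res->content
        : 'Request failed: ' . $res->status_line;
}

# Only go to the network when a URL is given on the command line.
if (@ARGV) {
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->request( HTTP::Request->new( GET => $ARGV[0] ) );
    print describe_response($res), "\n";
}
```

Keeping the response handling in its own sub means it can be tried on a hand-built HTTP::Response without touching the network.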
Re^2: Working with WWW::Mechanize by singingfish (Novice) on Sep 19, 2005 at 12:19 UTC
Or, if you're in the debugger, `x $var` will do something similar.
Re: Working with WWW::Mechanize by Akhasha (Scribe) on Sep 17, 2005 at 22:12 UTC
There is another module that might make your life easier: HTTP::Recorder. It has few dependencies and lets you record a session as you browse a site normally. The output for repeating the session takes the form of a WWW::Mechanize script.
