PerlMonks  

Re^2: Completely Confused with Mechanize::Firefox Forms

by help_3452 (Initiate)
on Jul 12, 2013 at 17:36 UTC ( [id://1044036]=note )


in reply to Re: Completely Confused with Mechanize::Firefox Forms
in thread Completely Confused with Mechanize::Firefox Forms

Ok that's great, thank you very much BUT now I'm a bit more confused!
This is the blurb from the Mechanize::Firefox documentation:

"$mech->current_form()

    print $mech->current_form->{name};

Returns the current form. This method is incompatible with WWW::Mechanize. It returns the DOM <form> object and not a HTML::Form instance."

Thank you nonetheless! Is there a 'proper' way to do this? Or is the documentation wrong or inaccurate?

Re^3: Completely Confused with Mechanize::Firefox Forms
by PerlSufi (Friar) on Jul 12, 2013 at 17:59 UTC
    Hi there. I'm not clear on what exactly you want. Do you just want the names of the forms? WWW::Mechanize has a dump_forms method, so you may not need Mechanize::Firefox. Also, if you use the Firefox 'Firebug' extension, you can inspect any HTML element and then use its name or value in the script to mechanize what you want to do.

      Ok cool, I'll give that a go. My whole idea, though, is to avoid having to use Firebug. I've been making scripts by hand for a little while and I wanted to take the time to make something a little more intelligent. I do understand I can only cover relatively simple scenarios, but I want to take more of the legwork out of setting up a scraper. Cheers all the same.

        I would also consider using the Web::Scraper module. I haven't used it much, but it looks pretty good.

      Ah ok! Fair enough.

      What I'm trying to do is make a simplified 'web scraper'. Within our org I frequently need to collect data from various systems (say, checking the names on the internal colleague register).

      But we have many systems and some use JavaScript. Because of this I was leaning towards Mechanize::Firefox, since it handles that for you.

      I envisaged putting in the internal web address, then receiving back a set of links, forms, etc which the user could then select. I would save the choices in a config file and so allow the user to avoid repetitive checks.

      So I'm a bit confused by the answer, as it suggests using HTML::Form, but the documentation states that current_form doesn't return this type of object. I'd like to understand it properly, as I intend to expand the code substantially.
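The workflow described above (list the forms on a page, let the user pick one, save the pick in a config file) could be sketched roughly as follows. This is only an illustration under assumptions: the HTML string, the URL, and the helper names `describe_forms` and `choice_to_json` are hypothetical, and the page source is parsed separately with HTML::Form rather than taken from `current_form`, which the quoted documentation says returns a DOM object. In a real script the HTML would presumably come from the mech object's page content.

```perl
use strict;
use warnings;
use HTML::Form;
use JSON::PP;

# Hypothetical page source standing in for an internal system.
my $html = <<'HTML';
<html><body>
<form name="loginform" method="post" action="/login">
  <input type="text"     name="email">
  <input type="password" name="password">
  <input type="submit"   name="login" value="Log in">
</form>
<form name="search" method="get" action="/search">
  <input type="text" name="q">
</form>
</body></html>
HTML

# Return one hashref per form: its name, method, action and named inputs.
sub describe_forms {
    my ($html, $base_url) = @_;
    my @summary;
    for my $form (HTML::Form->parse($html, $base_url)) {
        push @summary, {
            name   => $form->attr('name') // '(unnamed)',
            method => $form->method,
            action => '' . $form->action,    # stringify the action URL
            fields => [ map { $_->name } grep { defined $_->name } $form->inputs ],
        };
    }
    return @summary;
}

# Serialise the user's pick so a later run can skip the selection step.
sub choice_to_json {
    my ($url, $form) = @_;
    return JSON::PP->new->canonical->pretty->encode({
        url    => $url,
        form   => $form->{name},
        fields => $form->{fields},
    });
}

my $url   = 'http://intranet.example/register';    # hypothetical address
my @forms = describe_forms($html, $url);
print "$_->{name}: @{ $_->{fields} }\n" for @forms;

# Pretend the user picked the first form in the list.
my $config = choice_to_json($url, $forms[0]);
print $config;
```

Writing `$config` to a file and decoding it on the next run would let the script go straight to the saved form instead of asking again.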

        Hi help_3452, there seem to be several ways of going about what you need, which makes it hard for me to tell you what to do. So I will make a few suggestions based on what I understand:
        - I would still consider trying plain WWW::Mechanize for 'collecting the data'.
        - I have bypassed JavaScript before with that module. I would also suggest getting the 'Live HTTP Headers' add-on for Firefox. It will help you bypass some of the JavaScript by showing the HTTP GET/POST requests that occur as you navigate the site.
        - I think you're making this harder than it needs to be by allowing the user to select the particulars of the forms. If this feature is a 'must have' for you, I would get the form names, let the user select which forms they intend to submit, have them enter the required input, and pass that to the submit_form method that Mechanize has. Each input could be a 'field' within the 'form', maybe. Here is an example of what I used to log in with Mechanize::Firefox:
            $mech->form_name('loginform');
            $mech->field('email' => 'me@awesome.com');
            $mech->field('password' => 'l337');
            $mech->click_button(name => 'login');
        The WWW::Mechanize module is basically the same in this regard. For each of those methods I just used Firebug to inspect each HTML element and then coded it into the script.
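To see what a login like the one above would actually put on the wire, without a browser or a live site, one option is to fill the form in with HTML::Form and look at the request it would generate. This is a minimal sketch, not the Mechanize::Firefox approach itself: the markup, the base URL, and the helper `build_request` are assumptions for illustration, and nothing is sent over the network.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical login page standing in for the real site.
my $html = <<'HTML';
<form name="loginform" method="post" action="/login">
  <input type="text"     name="email">
  <input type="password" name="password">
  <input type="submit"   name="login" value="Log in">
</form>
HTML

# Find a form by its name attribute, fill it in, and return the request
# that pressing the named button would produce. Nothing is sent.
sub build_request {
    my ($html, $base_url, $form_name, $fields, $button) = @_;
    my ($form) = grep { ($_->attr('name') // '') eq $form_name }
                 HTML::Form->parse($html, $base_url);
    die "No form named '$form_name'\n" unless $form;
    $form->value($_ => $fields->{$_}) for sort keys %$fields;
    return $form->click($button);
}

my $request = build_request(
    $html, 'http://intranet.example/', 'loginform',
    { email => 'me@awesome.com', password => 'l337' }, 'login',
);

# Show the method, target URL and encoded body of the would-be POST.
print $request->method, ' ', $request->uri, "\n";
print $request->content, "\n";
```

Comparing this output with what a header-inspection tool shows for the real page is one way to check whether a plain HTTP request would be enough, or whether JavaScript is adding something on top.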
