PerlMonks  

Proper use of HTML::Form for extracting

by bachoA4o (Sexton)
on Aug 24, 2018 at 05:32 UTC

bachoA4o has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm working on a simple project to practice my Perl. The task: I want to extract the text entered in an HTML <form> and pass it as an argument to a Perl script, without using the CGI.pm module. The web server is Apache. The things I've tried with an HTML submit didn't work out. I've read about the HTML::Form module, but I don't know the proper way to use it and didn't quite understand the syntax. In particular, I didn't understand the second line here:
    $form = HTML::Form->parse($html, $base_uri);
    $form->value(query => "Perl");
I would be grateful if someone could explain. Thanks.
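
For reference, here is a minimal sketch of what those two lines do. Note that HTML::Form works on the client side: it parses a page's form so a script can fill it in and submit it, much as a browser user would; it does not read submitted data on the server. The sample page, the field names and the example.com base URI below are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Form;

# A stand-in page; the form and its field names are hypothetical.
my $html = <<'HTML';
<html><body>
<form name="search" action="/cgi-bin/search.cgi" method="get">
  <input type="text" name="query">
  <input type="submit" name="go" value="Search">
</form>
</body></html>
HTML

# First line: parse the HTML and build a form object (in scalar context,
# the first form on the page). The base URI is used to turn the relative
# action into an absolute URL.
my $form = HTML::Form->parse($html, 'http://example.com/');

# Second line: set the input named "query" to "Perl", which is what a
# visitor does by typing into the text box.
$form->value(query => 'Perl');

# click() builds the HTTP::Request that submitting the form would send.
my $request = $form->click;
print $request->method, ' ', $request->uri, "\n";
```

The request object could then be handed to an LWP::UserAgent to actually submit the form.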

Replies are listed 'Best First'.
Re: Proper use of HTML::Form for extracting
by marto (Cardinal) on Aug 24, 2018 at 10:26 UTC

    You want to get a page and scrape form content? Mojo::UserAgent/Mojo::DOM make this trivial. Post an example and I'll give you a working solution. If your question is something else please elaborate.

Re: Proper use of HTML::Form for extracting
by TheloniusMonk (Sexton) on Aug 24, 2018 at 07:04 UTC
    HTML::Form doesn't extract. Maybe HTML::Extract will DWIM better (updated)
Re: Proper use of HTML::Form for extracting
by bachoA4o (Sexton) on Aug 26, 2018 at 12:47 UTC
    This is what I've done for the test. I made it as simple as possible, just to try the module.

    1. A page with an input:

        <!DOCTYPE html>
        <html>
        <head>
        <title>Title</title>
        </head>
        <body>
        <form name="MyForm" action="/cgi-bin/ex.cgi",method="post",id="f1">
        <input type="text" name="textfield">
        </form>
        </body>
        </html>

    2. After pressing Enter it should call ex.cgi:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use HTML::Extract;
        my $extractor=new HTML::Extract;
        print "Content-type: text/html\n\n";
        print "<html><body>\n";
        print"$extractor->gethtml('http://192.168.0.104/' ,tagname=body,returntype=text)<p1>";
        print"</html></body>";

    When I try it this way, the page shows: "HTML::Extract=HASH(0x557e1ad591e0)->gethtml('http://192.168.0.104/' ,tagname=body,returntype=text)". I've also tried it without the quotes and without generating a page, but then it only shows that it can't be found (error 500). Where is my mistake? Please advise me!
      Where is my mistake? Please advise me!

      Method calls are not interpolated. If you move the method call outside the string it will actually be called. Here's an example as an SSCCE for which I've obviously replaced your private URL with a public one (I've also fixed the arguments to the gethtml method, your reversed closing tags, removed the indirect object notation and removed the unmatched <p1> element).

          #!/usr/bin/perl
          use strict;
          use warnings;
          use HTML::Extract;

          my $extractor = HTML::Extract->new;
          print "Content-type: text/html\n\n";
          print "<html><body>\n";
          print $extractor->gethtml('http://perlsphere.net/', 'tagname=body', 'returntype=text');
          print "</body></html>";

      This could still be improved further but at least it shows you some working code. Enjoy.
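
To see the interpolation rule in isolation, here is a minimal sketch that uses only core Perl. The Greeter package and its hello method are made up for illustration and stand in for any object with a method.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class, for illustration only.
package Greeter;
sub new   { my ($class) = @_; return bless {}, $class }
sub hello { my ($self, $who) = @_; return "Hello, $who" }

package main;

my $greeter = Greeter->new;

# Inside double quotes only the variable is interpolated: $greeter is
# stringified and "->hello('monks')" stays as literal text.
my $wrong = "$greeter->hello('monks')";

# Calling the method outside the string and concatenating works.
my $right = $greeter->hello('monks') . "!";

# If the call has to sit inside a string, @{[ ... ]} interpolates its result.
my $inline = "@{[ $greeter->hello('monks') ]}!";

print "$wrong\n$right\n$inline\n";
```

The first printed line has the same shape as the HTML::Extract=HASH(...) text seen in the browser, while the other two show the method's return value.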

        Thank you! This helped a lot. But it returns the whole content of the web page. I've tried changing tagname=body to tagname=form, but then it only returns a blank page.
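
For the original goal (getting the text a visitor typed into the form, without CGI.pm), fetching the page again returns its markup, not what was typed. The script the form submits to can instead read the submitted data itself. Below is a minimal sketch using only core Perl. It assumes the form is sent as application/x-www-form-urlencoded (the browser default), that its attributes are separated by spaces rather than commas so that method="post" takes effect, and that the field is named textfield as in the page above. parse_form_data is a helper name made up here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode an application/x-www-form-urlencoded string into a hash.
sub parse_form_data {
    my ($raw) = @_;
    my %field;
    for my $pair (split /[&;]/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                               # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one byte
        }
        $field{$name} = $value;
    }
    return %field;
}

# POST data arrives on STDIN; GET data arrives in QUERY_STRING.
my $raw = '';
if (($ENV{REQUEST_METHOD} || '') eq 'POST') {
    read(STDIN, $raw, $ENV{CONTENT_LENGTH} || 0);
}
else {
    $raw = $ENV{QUERY_STRING} || '';
}
my %field = parse_form_data($raw);

my $text = defined $field{textfield} ? $field{textfield} : '';

# Escape the value before echoing user input back into HTML.
$text =~ s/&/&amp;/g;
$text =~ s/</&lt;/g;
$text =~ s/>/&gt;/g;

print "Content-type: text/html\n\n";
print "<html><body><p>You typed: $text</p></body></html>\n";
```

This keeps only the last value for a repeated field name and does not handle file uploads (multipart/form-data), so it is a practice sketch rather than a replacement for a full form-handling module.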
