PerlMonks  

Perl Mechanize

by zoya (Initiate)
on Apr 24, 2013 at 21:03 UTC (#1030529)
zoya has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have to fill in forms on multiple web pages, fetch the data, parse the HTML into text, and store it in a single file. I have the following code, and every web form has different fields to be filled in. This is the code for one of the websites; I have three more. Can anybody please tell me how I can do this? Suggestions are appreciated thanks.

use strict;
use warnings;
my $timeout = 40;
use WWW::Mechanize;
use HTML::TreeBuilder;
use HTML::FormatText;
use HTML::Parser;
use autodie qw/ open close /;
use 5.012;
use Win32::IE::Mechanize;
use Time::HiRes 'sleep';

my $m = WWW::Mechanize->new(autocheck => 1);
my $browser = Win32::IE::Mechanize->new(visible => 1);
my $snp = "rs111";
my $content = $browser->get("http://snp-nexus.org/index.html");
my $html = $browser->content;

$browser->form_name('snpnexus');
#$browser->field('query', 'dbsnp');
$browser->field('batch_text', 'dbsnp rs111');
$browser->tick('ensembl', 'ensembl');
$browser->tick('refseq', 'refseq');
$browser->tick('ucsc', 'ucsc');
$browser->tick('sift', 'sift');
$browser->tick('polyphen', 'polyphen');
$browser->tick('chb', 'chb');
$browser->tick('chd', 'chd');
$browser->tick('tfbs', 'tfbs');
$browser->tick('consv', 'consv');
$browser->tick('gwas', 'gwas');
$browser->tick('indel', 'indel');
$browser->tick('mirbase', 'mirbase');
$browser->tick('gad', 'gad');
$browser->tick('cnp', 'cnp');
$browser->click_button('value', 'RUN');

for (0 .. $timeout*20) {
    last if $browser->{agent}->ReadyState >= 5;
    sleep 0.1;
}

my $html2 = $browser->content;
my $Format = HTML::FormatText->new();
my $TreeBuilder = HTML::TreeBuilder->new();
$TreeBuilder->parse($html2);
my $parsed = $Format->format($TreeBuilder);
print $parsed;

Re: Perl Mechanize
by runrig (Abbot) on Apr 24, 2013 at 21:12 UTC
    Suggestions are appreciated thanks.

    Slow down. What are you having a problem with? Fetching data? Parsing HTML? Ask one question at a time, post a small amount of code that demonstrates your problem, and describe your problem (and expectations).

      My question is: if I have to work with more than one website, what could be the solution? I don't know how to deal with multiple websites using Perl Mechanize. My code above is an example of what I am doing with one of the websites. I have to do the same with the remaining ones, but in a single script. Hopefully I have now conveyed my problem?
        My code above is an example of what I am doing with one of the websites. I have to do the same with the remaining ones, but in a single script.

        So change the url that you pass to the get() method? Also, do you want to use WWW::Mechanize, or Win32::IE::Mechanize, because you create an object for both, but only seem to use one?
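
One way to read the suggestion above is to keep a small table of per-site settings and loop over it with a single WWW::Mechanize object, appending each converted page to one output file. The following is only a sketch under several assumptions: the second site entry (URL, form name, field) is a hypothetical placeholder, each checkbox value is assumed to equal its name as in the posted code, the names html_to_text, submit_site and scrape_all are made up for illustration, and it assumes the forms work without JavaScript (WWW::Mechanize does not run scripts, which may be why Win32::IE::Mechanize was used originally).

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTML::TreeBuilder;
use HTML::FormatText;

# One entry per website. Only the first entry mirrors the code posted above;
# the second is a hypothetical placeholder to be replaced with real values.
my @sites = (
    {
        url    => 'http://snp-nexus.org/index.html',
        form   => 'snpnexus',
        fields => { batch_text => 'dbsnp rs111' },
        ticks  => [qw(ensembl refseq ucsc sift polyphen)],
    },
    {
        url    => 'http://example.org/form.html',   # hypothetical
        form   => 'search',                         # hypothetical
        fields => { q => 'rs111' },                 # hypothetical
        ticks  => [],
    },
);

# Convert an HTML string to plain text.
sub html_to_text {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $text = HTML::FormatText->new(leftmargin => 0, rightmargin => 78)
                               ->format($tree);
    $tree->delete;    # free the parse tree
    return $text;
}

# Fetch one site, fill in its form, submit it, and return the result as text.
sub submit_site {
    my ($mech, $site) = @_;
    $mech->get($site->{url});
    $mech->form_name($site->{form});
    $mech->set_fields(%{ $site->{fields} });
    # Assumption: each checkbox's value is the same as its name.
    $mech->tick($_, $_) for @{ $site->{ticks} };
    $mech->submit;
    return html_to_text($mech->content);
}

# Process every site and write all results into a single file.
sub scrape_all {
    my ($sites, $outfile) = @_;
    my $mech = WWW::Mechanize->new(autocheck => 1, timeout => 40);
    open my $out, '>:encoding(UTF-8)', $outfile
        or die "Cannot open $outfile: $!";
    for my $site (@$sites) {
        print {$out} "==== $site->{url} ====\n",
                     submit_site($mech, $site), "\n";
    }
    close $out or die "Cannot close $outfile: $!";
    return;
}
```

Calling scrape_all(\@sites, 'results.txt') would then visit each configured site in turn. If a site's results page is generated asynchronously, extra polling would be needed, which this sketch does not attempt.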

Re: Perl Mechanize
by Anonymous Monk on Apr 25, 2013 at 03:33 UTC
      OK, thank you, I will check them out.
