PerlMonks  

Filling in a form on a web page using perl

by jonp (Novice)
on Aug 18, 2005 at 18:15 UTC ( [id://484904] : perlquestion )

jonp has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,
I am trying to write a perl script that will go to a web page (http://www.arabidopsis.org/Blast/),
select the right options from the drop down menus, fill in the Input text field, and then submit the form.
Here is my code:
#!/usr/local/bin/perl
use strict;
use lib "/common";
use LWP;

my $browser = LWP::UserAgent->new;

my $sequence = 'GTGACGAGGAGGAAGAAAGAGTGCGGTTGTTTTGTTTGACACTTCTTTTCTTTCTCCTCCAACGGTCCAACTTTGACTCTCTCTTCTCTTCTCTGAAAATTCTTCTATTCATTCATCTTCCTATCTTTCCCATGGAAGTGGAACAACAACAATATCTTCTCCAATTTCTCTAACCCTAAATTAACCGCTTCACCGCACAATTTCATATTTCCTTTCTCTCTTTGATCGGTTATCATGTCCGTTTTGGTTGTAACCGCCATGGATTTCGCCGTCCTCGGTTTTCTTATTCCCTCGCTCTGGGAGATCGAAGTCGCTTTCGCTGCCTCGCTTTTCGTTATTCTCGCGTATTGGTTCTTCACATTTAGAATCGCTGACCGTCACTCCGATCGATCACTTTCGGAAAATTCCGCCGGTGATTCCGCCGACGATAAAGTCAAGATTGGCCAGTCAAGAGGAGATTCTCAAGCCGGTTCGGCGTACCTGATTAAGTTAGAACTATTGGCTGCTAAAAATCTGATAGCTGCAAATTTAAATGGCACATCGGATCCTTACACCATCATCACATGCGGCAATGAAAAGCGATTCAGTTCCATGGTCCCTGGTTCAAGAAATCCAATGTGGGGCGAAGAGTTCAATTTTTCTGTCGATGAACTTCCTGTCCAGATCAATGTCACAATTTATGATTGGGATATAATTTGGAAAAGTGCTGTTCTTGGTTCAGTGACCGTTCCAGTTGAAAGTGAAGGTCAAACTGGTGCAGTGTGGCATACTTTGGACAGCCCATCAGGGCAGGTTTGTCTTCATATAAAAACAGAAAAAATGTCTGCAAATTCTGCCAGGATAAATGGTTATGGCGGAGCCAACACTCGAAGAAGGATACCCTTGGAAAAACAGGAACCCACAGTAGTCCATCAAAAGCCAGGACCTCTTCAAACGATATTTGAGCTTCATCCAGATGAGGTTGTTGATCATAGTTACTCTTGTGCACTTGAAAGGTCATTCTTGTACCATGGTCGTATGTATGTCTCAACATGGCACATTTGTTTCCATTCCAATGTGTTCTCGAAGCAAATGAAGGTGCTTATTCCGTTTGAAGATATAGATGAGATTCGAAGGAGTCAACATGCATTTATTAATCCTGCTATAACAATTATTCTTCGTATGGGTGCCGGTGGACATGGTGTCCCTCCTTTGGGAAGTCCTGATGGTAGAGTCAGATATAAGTTTGCGTCATTTTGGAACAGGAATCATGCAGTTAGAAGTCTACAACGTGCTGTAAAGAACTTCCGTGAAATGTTGGAAACTGAGAAGAAGGAAAATGCAGAGTCAGAATTGCGTGCACATAGCAGTTCTGTTAGACGAAGTAACATAATGGATAAGGTTCCCGAAACCAGCATGCCAAAAGCTGGAAAACGTCAAACTTTTATCAAAGAAGAGGCTTTAGTTGGTATATACAATGATGTTTTCCCCTGCACAGCAGAGCAGTTTTTTAACTTATTGTTAAAGGACGATTCAAAATTTACTAGCAAGTATCGTTCAGCACGAAAGGATACTAATCTTGTGATGGGACAATGGCATACAGCAGAAGAATATGACGGTCAAGTCCGGGAGATAACCTTCAGATCCCTTTGTAACAGCCCTATGTGCCCGCCAGACACAGCCATTACTGAGTGGCAACATGTTGTTCTATCATCTGACAAGAAAAACCTGGTGTTTGAGACTGTGCAACAGGCACACGATGTTCCACTCGGGTCCTGTTTTGAGGTGCACTGTAAATGGGGTTTGGAGACAACTGGTGAAAGTTCATGTACTCTGGACATAAGAGTGGGTGCACATTTCAAGAAATGGTGTGTGATGCAATCCAAAATAAAATCAGGGGCAATCAATGAGTACAAGAAAGAAGTTGATGTGATGTTAGATGTTGCTCGTTCATATATAAAGCCGCATACTTCTGATGACGAGAATGATAAGGCATCTTCGCCCCCTGCGGCAACTTTGGAAAATAAATTTTTCTGTGTAACCTTAGATTAGGACTTTGTTGGTCTGTGCAATATTGTAACTTTCCTTCTCTTAAGTTATTTATTTATTCTTGCAACACAGCGCCCAAAGCCATGTATATTTATTTTGATGCACAGTTGGTTTTTGTCTTGTGTATCTCTGTGGGTAACTTG';

my $URL = "http://www.arabidopsis.org/Blast/";

#BLASTX:NT query, AA db ->blastx
#AGI Proteins (Protein) ->ATH1_pep
#query sequence ->QueryText
my $response = $browser->post(
    $URL,
    [
        'Algorithem'     => 'blastx',
        'BlastTargetSet' => 'ATH1_pep',
        'QueryText'      => $sequence,
    ]
);

my $page = "/common/perlscripts/WebReading/page.html";
open(OUT,">$page") or die "can't open ", $page;
print OUT $response->decoded_content;
close(OUT);

I am trying to get the form to submit (the 'Run Blast' button), so that $response contains the data from the results page, but I can't figure out how to do that. If anyone knows the answer, or knows a good web site for me to look at, I would really appreciate it.
Thanks
Jon

Re: Filling in a form on a web page using perl
by davidrw (Prior) on Aug 18, 2005 at 18:50 UTC
    You might want to try WWW::Mechanize instead. It can make this type of thing extremely easy:

    use WWW::Mechanize;
    use strict;
    use warnings;

    my $mech = WWW::Mechanize->new;
    my $sequence = '...';
    $mech->get('http://www.arabidopsis.org/Blast/');
    $mech->submit_form(
        form_name => 'myForm',
        fields    => {
            'Algorithm'      => 'blastx',
            'BlastTargetSet' => 'ATH1_pep',
            'QueryText'      => $sequence,
        },
    );
    print $mech->content;

    Note: your code had 'Algorithem', but the actual field element is 'Algorithm'.
      Thanks a bunch, Mechanize works great.
Re: Filling in a form on a web page using perl
by davidj (Priest) on Aug 18, 2005 at 18:54 UTC
    hey jonp,

    You are off to a good start. As you already know, you need to set the values of the appropriate drop-down menus, input fields, etc. and submit them (which you are currently doing). Here's your next step: remember that a push button (the "Run BLAST" submit button) is a form control just like the others, so its name/value pair has to be sent along when you submit the form. I viewed the source of the form, and the HTML for the "Run BLAST" submit button is as follows:

    <INPUT TYPE="submit" NAME="value" VALUE="Run BLAST">
    Therefore, all you should need to do is change the following code

    my $response = $browser->post(
        $URL,
        [
            'Algorithem'     => 'blastx',
            'BlastTargetSet' => 'ATH1_pep',
            'QueryText'      => $sequence,
        ]
    );

    to

    my $response = $browser->post(
        $URL,
        [
            'Algorithem'     => 'blastx',
            'BlastTargetSet' => 'ATH1_pep',
            'QueryText'      => $sequence,
            'value'          => 'Run BLAST',
        ]
    );

    This will at least submit the form as though the "Run BLAST" submit button had been pressed. For further information you should check out Perl & LWP.

    hope this helps,

    davidj
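
To see what davidj's suggestion does on the wire, here is a minimal sketch that builds the POST request without sending it, so the encoded body can be inspected. It assumes the corrected field name 'Algorithm' and the CGI URL that other replies in this thread point out; the short $sequence is only a placeholder for the full query.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Placeholder query; the real script would pass the full nucleotide sequence.
my $sequence = 'GTGACGAGGAGGAAGAAAGAGTGCGG';

# Build (but do not send) the request. The submit button is sent as one more
# name => value pair, taken from NAME="value" VALUE="Run BLAST" in the form.
my $request = POST 'http://www.arabidopsis.org/cgi-bin/Blast/TAIRblast.pl', [
    'Algorithm'      => 'blastx',
    'BlastTargetSet' => 'ATH1_pep',
    'QueryText'      => $sequence,
    'value'          => 'Run BLAST',
];

# Print the request line, content type, and form-encoded body for inspection.
my $body = $request->content;
print $request->method, ' ', $request->uri, "\n";
print $request->header('Content-Type'), "\n";
print "$body\n";
```

Once the body looks right, the same request object can be sent with LWP::UserAgent->new->request($request), which is roughly what $browser->post does in one step.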

Re: Filling in a form on a web page using perl
by gjb (Vicar) on Aug 18, 2005 at 18:47 UTC

    First off, note that what you are trying to do might be illegal, since a number of sites prohibit the use of scripts to interact with them. You'll have to check the end user agreement to make sure.

    Technically, a number of things went wrong; see the code below for something that seems to work:

    my $URL = "http://www.arabidopsis.org/cgi-bin/Blast/TAIRblast.pl";

    #BLASTX:NT query, AA db ->blastx
    #AGI Proteins (Protein) ->ATH1_pep
    #query sequence ->QueryText
    my $response = $browser->post(
        $URL,
        [
            'Algorithm'      => 'blastx',
            'BlastTargetSet' => 'ATH1_pep',
            'QueryText'      => $sequence,
            'ReplyTo'        => 'monk@perlmonks.org'
        ]
    );

    The URL for the request was wrong: the form has to be posted to the CGI script, not to the page that displays the form. Also, the first parameter is 'Algorithm', not 'Algorithem', and you apparently need to supply an email address.

    Hope this helps, -gjb-
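
The mistakes gjb lists (wrong target URL, misspelled field name) can be caught by reading the form instead of guessing. Here is a rough sketch using HTML::Form. The HTML below is a hypothetical stand-in assembled only from the names mentioned in this thread; the POST method and the option lists are assumptions, and the real page will differ. In practice you would fetch the page with LWP and parse $response->decoded_content instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the BLAST page's form (NOT the real markup).
my $html = <<'HTML';
<form name="myForm" method="post" action="/cgi-bin/Blast/TAIRblast.pl">
  <select name="Algorithm">
    <option value="blastn">BLASTN</option>
    <option value="blastx">BLASTX</option>
  </select>
  <select name="BlastTargetSet">
    <option value="ATH1_pep">AGI Proteins</option>
  </select>
  <textarea name="QueryText"></textarea>
  <input type="text" name="ReplyTo">
  <input type="submit" name="value" value="Run BLAST">
</form>
HTML

# The second argument is the page URL, used to resolve a relative action.
my ($form) = HTML::Form->parse($html, 'http://www.arabidopsis.org/Blast/');
die "no form found\n" unless $form;

# Show where the form really posts to, and the exact field names it expects.
print "action: ", $form->action, "\n";
for my $input ($form->inputs) {
    printf "%-10s %s\n", $input->type, $input->name // '(unnamed)';
}

# Fill in the fields, then "press" the submit button named 'value'.
$form->value('Algorithm'      => 'blastx');
$form->value('BlastTargetSet' => 'ATH1_pep');
$form->value('QueryText'      => 'GTGACGAGGAGG');   # placeholder sequence
$form->value('ReplyTo'        => 'monk@perlmonks.org');

my $request = $form->click('value');   # an HTTP::Request, not yet sent
print $request->method, ' ', $request->uri, "\n";
print $request->content, "\n";
```

Because the request is built from the parsed form, the action URL and field names come from the page itself; a typo like 'Algorithem' makes $form->value die instead of being silently ignored by the server.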