Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re: Extracting Text After <pre> tag in HTML

by gellyfish (Monsignor)
on Sep 22, 2006 at 21:25 UTC ( #574437=note: print w/ replies, xml ) Need Help??


in reply to Extracting Text After <pre> tag in HTML

Just for the sake of completeness here's how you might do it with HTML::Parser:

use HTML::Parser; my $VAR1 = '<html><title>GAL7</title> <body bgcolor=white> <h2 align=center>GAL7</h2><hr> <form method="post" action="/cgi-bin/SCPD/getgene2?GAL7" enctype="appl +ication/x-www-form-urlencoded"> <input type="submit" name="action" value="Get mapped sites" /><input t +ype="submit" name="action" value="Get putative sites" /><input type=" +submit" name="action" value="Get interg enic region" /><br /><input type="submit" name="action" value="Retriev +e sequence" />Start<-ATG <input type="text" name="start" value="-450" + size="5" maxlength="5" />ATG->End <inp ut type="text" name="end" value="50" size="5" maxlength="5" /><div></d +iv></form><hr> <pre> >YBR018C GAL7 275433 275933 TTTGATATCACTCACAACTATTGCGAAGCGCTTCAGTGAAAAAATCATAA GGAAAAGTTGTAAATATTATTGGTAGTATTCGTTTGGTAAAGTAGAGGGG GTAATTTTTCCCCTTTATTTTGTTCATACATTCTTAAATTGCTTTGCCTC TCCTTTTGGAAAGCTATACTTCGGAGCACTGTTGAGCGAAGGCTCATTAG ATATATTTTCTGTCATTTTCCTTAACCCAAAAATAAGGGAAAGGGTCCAA AAAGCGCTCGGACAACTGTTGACCGTGATCCGAAGGACTGGCTATACAGT GTTCACAAAATAGCCAAGCTGAAAATAATGTGTAGCTATGTTCAGTTAGT TTGGCTAGCAAAGATATAAAAGCAGGTCGGAAATATTTATGGGCATTATT ATGCAGAGCATCAACATGATAAAAAAAAACAGTTGAATATTCCCTCAAAA ATGACTGCTGAAGAATTTGATTTTTCTAGCCATTCCCATAGACGTTACAA </pre>Some other stuff</body></html>'; sub default_start { my ($self, $tagname) = @_; if ( $tagname eq 'pre' ) { $self->handler(text => \&get_text, "self,dtext"); $self->handler(end => \&end_text, "self,tagname"); } } sub get_text { my ($self, $text) = @_; if ( not exists $self->{_text} ) { $self->{_text} = $text; } else { $self->{_text} .= $text; } } sub end_text { my ( $self, $tagname) = @_; if ( $tagname eq 'pre' ) { $self->handler(text => ''); $self->handler(start => ''); $self->handler(end => ''); } } my $parser = HTML::Parser->new(start_h => [\&default_start,'self,tagna +me']); $parser->parse($VAR1); print $parser->{_text};
This might have the advantage over using other parsers if you are dealing with large documents as it doesn't build a preparsed representation of the documentation before handing the events to you.

/J\


Comment on Re: Extracting Text After <pre> tag in HTML
Download Code
Re^2: Extracting Text After <pre> tag in HTML
by Anonymous Monk on Mar 30, 2007 at 08:41 UTC
    how to work
     PRE tag within text area

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://574437]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (10)
As of 2014-09-22 10:02 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (186 votes), past polls