Re: Extracting Text After <pre> tag in HTML

by gellyfish (Monsignor)
on Sep 22, 2006 at 21:25 UTC (#574437)

in reply to Extracting Text After <pre> tag in HTML

Just for the sake of completeness, here's how you might do it with HTML::Parser:

use strict;
use warnings;
use HTML::Parser;

my $VAR1 = '<html><title>GAL7</title>
<body bgcolor=white>
<h2 align=center>GAL7</h2><hr>
<form method="post" action="/cgi-bin/SCPD/getgene2?GAL7" enctype="application/x-www-form-urlencoded">
<input type="submit" name="action" value="Get mapped sites" /><input type="submit" name="action" value="Get putative sites" /><input type="submit" name="action" value="Get intergenic region" /><br /><input type="submit" name="action" value="Retrieve sequence" />Start<-ATG <input type="text" name="start" value="-450" size="5" maxlength="5" />ATG->End <input type="text" name="end" value="50" size="5" maxlength="5" /><div></div></form><hr>
<pre>
>YBR018C GAL7 275433 275933
TTTGATATCACTCACAACTATTGCGAAGCGCTTCAGTGAAAAAATCATAA
GGAAAAGTTGTAAATATTATTGGTAGTATTCGTTTGGTAAAGTAGAGGGG
GTAATTTTTCCCCTTTATTTTGTTCATACATTCTTAAATTGCTTTGCCTC
TCCTTTTGGAAAGCTATACTTCGGAGCACTGTTGAGCGAAGGCTCATTAG
ATATATTTTCTGTCATTTTCCTTAACCCAAAAATAAGGGAAAGGGTCCAA
AAAGCGCTCGGACAACTGTTGACCGTGATCCGAAGGACTGGCTATACAGT
GTTCACAAAATAGCCAAGCTGAAAATAATGTGTAGCTATGTTCAGTTAGT
TTGGCTAGCAAAGATATAAAAGCAGGTCGGAAATATTTATGGGCATTATT
ATGCAGAGCATCAACATGATAAAAAAAAACAGTTGAATATTCCCTCAAAA
ATGACTGCTGAAGAATTTGATTTTTCTAGCCATTCCCATAGACGTTACAA
</pre>Some other stuff</body></html>';

# Start handler: once <pre> opens, install the text and end handlers.
sub default_start {
    my ($self, $tagname) = @_;
    if ( $tagname eq 'pre' ) {
        $self->handler(text => \&get_text, "self,dtext");
        $self->handler(end  => \&end_text, "self,tagname");
    }
}

# Text handler: accumulate text on the parser object, since the parser
# may deliver it in several pieces.
sub get_text {
    my ($self, $text) = @_;
    if ( not exists $self->{_text} ) {
        $self->{_text} = $text;
    }
    else {
        $self->{_text} .= $text;
    }
}

# End handler: when </pre> closes, switch all handlers off so the rest
# of the document is ignored.
sub end_text {
    my ($self, $tagname) = @_;
    if ( $tagname eq 'pre' ) {
        $self->handler(text  => '');
        $self->handler(start => '');
        $self->handler(end   => '');
    }
}

my $parser = HTML::Parser->new(start_h => [\&default_start, 'self,tagname']);
$parser->parse($VAR1);
$parser->eof;    # signal end of input so nothing stays buffered

print $parser->{_text};

This might have the advantage over using other parsers if you are dealing with large documents, as it doesn't build a preparsed representation of the document before handing the events to you.
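
To make the large-document point a bit more concrete, here is a minimal sketch of feeding the parser in small chunks rather than as one big string. The helper name first_pre_text, the chunk size, and the sample HTML are all made up for illustration; the only HTML::Parser calls assumed are new, parse and eof.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not from the post above): read HTML from a
# filehandle in chunks and return the text of the first <pre> element.
sub first_pre_text {
    my ($fh, $chunk_size) = @_;
    $chunk_size ||= 4096;

    my $text   = '';
    my $in_pre = 0;    # true while we are between <pre> and </pre>
    my $done   = 0;    # true once the first </pre> has been seen

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub { $in_pre = 1 if !$done && $_[0] eq 'pre' }, 'tagname' ],
        text_h  => [ sub { $text .= $_[0] if $in_pre }, 'dtext' ],
        end_h   => [ sub {
            if ( $in_pre && $_[0] eq 'pre' ) { $in_pre = 0; $done = 1 }
        }, 'tagname' ],
    );

    # Feed one chunk at a time and stop reading once the first <pre>
    # has closed. No document tree is ever built.
    while ( !$done && read( $fh, my $buf, $chunk_size ) ) {
        $parser->parse($buf);
    }
    $parser->eof;    # flush anything the parser still has buffered

    return $text;
}

# Usage: an in-memory filehandle stands in for a large file here.
my $html = "<html><body><p>intro</p><pre>\nACGT\nTTGA\n</pre>tail</body></html>";
open my $fh, '<', \$html or die "cannot open in-memory handle: $!";
print first_pre_text( $fh, 16 );
close $fh;
```

The deliberately tiny chunk size in the usage lines is only there to show that tags and text split across reads are still handled, because the parser keeps incomplete tokens buffered until the next call to parse.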


Re^2: Extracting Text After <pre> tag in HTML
by Anonymous Monk on Mar 30, 2007 at 08:41 UTC
    How would this work for a PRE tag inside a text area?
