
(need feedback) Re: HTML::LinkExtractor

by PodMaster (Abbot)
on Aug 24, 2002 at 14:56 UTC (#192544=note)

in reply to HTML::LinkExtractor

I thought about adding the following:

=head2 SNIPPET

You've just gotten a link with C<_TEXT> but you don't want the HTML crap
that comes with the text. While C<HTML::LinkExtractor> won't get rid of it
for you, it's easier than easy with C<HTML::TokeParser::Simple>

    use HTML::TokeParser::Simple;

    my $Link = { '_TEXT' => '<a href=""> I am a LINK!!! </a>' };

    warn StripHTML( \$Link->{_TEXT} );
    warn StripHTML( \'<q>Turn on your love light BABY!</q>' );

    sub StripHTML {
        my $HtmlRef = shift;
        my $tp = HTML::TokeParser::Simple->new( $HtmlRef );
        my $t  = $tp->get_token();    # MUST BE A START TAG (@TAGS_IN_NEED)
                                      # otherwise it ain't come from LinkExtractor
        if ( $t->is_start_tag ) {
            return $tp->get_trimmed_text( '/' . $t->return_tag );
        }
        else {
            die " IMPOSSIBLE!!!! ";
        }
    }

=head1 AUTHOR

But then it hit me, why not just provide this as a package method?

Or provide an option to do this automatically?

Use get_text instead of get_trimmed_text (maybe make this an option as well)?
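
To make the "option" idea concrete, here is a minimal sketch of what such a helper might look like. The name strip_html and the trim option are made up for illustration and are not part of HTML::LinkExtractor; it also assumes a release of HTML::TokeParser::Simple whose tokens provide get_tag (the snippet above uses the older return_tag name).

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper (not part of HTML::LinkExtractor): strips the markup
# from a "<tag ...>text</tag>" fragment. Pass trim => 0 to keep the
# whitespace exactly as it appears; the default collapses and trims it.
sub strip_html {
    my ( $html_ref, %opt ) = @_;
    my $trim = exists $opt{trim} ? $opt{trim} : 1;

    my $tp = HTML::TokeParser::Simple->new($html_ref);
    my $t  = $tp->get_token;

    # Same assumption as the snippet above: the fragment starts with a tag.
    die "expected a start tag\n" unless $t && $t->is_start_tag;

    my $end = '/' . $t->get_tag;    # e.g. '/a'
    return $trim
        ? $tp->get_trimmed_text($end)    # whitespace collapsed and trimmed
        : $tp->get_text($end);           # text as-is, up to the end tag
}

my $link = { _TEXT => '<a href=""> I am a LINK!!! </a>' };
print '[', strip_html( \$link->{_TEXT} ), "]\n";
print '[', strip_html( \$link->{_TEXT}, trim => 0 ), "]\n";
```

Defaulting to the trimmed form keeps the behaviour of the snippet above, and anyone who cares about the original whitespace can ask for it explicitly.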

BTW ~ I'm gonna stick with HTML::TokeParser::Simple. Ovid doesn't need the publicity, but I like it. This'll be on CPAN before Monday.

update: well, I made some changes and put it up on CPAN

** The Third rule of perl club is a statement of fact: pod is sexy.
