
Re: Strip HTML tags again

by Ovid (Cardinal)
on Jun 30, 2002 at 20:20 UTC (#178410)

in reply to Strip HTML tags again

This problem looks tailor-made for my HTML::TokeParser::Simple module, when combined with HTML::Tagset. The following test will demonstrate:

#!/usr/bin/perl -w
use strict;
use HTML::TokeParser::Simple;
use HTML::Tagset;

my $html = <<'END_HTML';
<a href="mylink">text1</a> <this is normal text>
END_HTML

my $p = HTML::TokeParser::Simple->new( \$html );
while ( my $token = $p->get_token ) {
    next if ! $token->is_text and exists $HTML::Tagset::isKnown{ $token->return_tag };
    print $token->return_text;
}


<this is normal text>



Re^2: Strip HTML tags again
by Your Mother (Chancellor) on Mar 05, 2009 at 01:02 UTC

    ++ for the original. I'm posting an updated example because some changes to the module seem to have borked your example. This is an in-place stripper, based on the one you posted, with the newer, working syntax (get_tag and as_is in place of return_tag and return_text).

    sub strip_html {
        my $renew = "";
        my $p = HTML::TokeParser::Simple->new(\$_[0]);
        no warnings "uninitialized";
        while ( my $token = $p->get_token ) {
            next if ! $token->is_text and exists $HTML::Tagset::isKnown{ $token->get_tag };
            $renew .= $token->as_is;
        }
        $_[0] = $renew;
    }
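    For anyone unsure how to call an in-place stripper like this, here is a minimal sketch. It assumes HTML::TokeParser::Simple and HTML::Tagset are installed, repeats the sub so the script runs on its own, and reuses the sample string from the original post; the variable name $text is only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;
use HTML::Tagset;

# Same logic as the stripper above, repeated so this script is self-contained.
sub strip_html {
    my $renew = "";
    my $p = HTML::TokeParser::Simple->new(\$_[0]);
    no warnings "uninitialized";    # get_tag may return nothing for non-tag tokens
    while ( my $token = $p->get_token ) {
        # Skip tokens that are not text and whose tag name HTML::Tagset knows.
        next if ! $token->is_text and exists $HTML::Tagset::isKnown{ $token->get_tag };
        $renew .= $token->as_is;    # keep text and unknown "tags" verbatim
    }
    $_[0] = $renew;                 # write back through the @_ alias
}

# Illustrative input: one real tag pair plus angle-bracketed plain text.
my $text = qq{<a href="mylink">text1</a> <this is normal text>\n};

strip_html($text);    # no return value is needed; $text itself is modified
print $text;
```

    Because the sub assigns to $_[0], it has to be given a variable. Passing a string literal would die with "Modification of a read-only value attempted". If you want to keep the original, copy it into a new variable first and strip the copy.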
