
Re: Stripping HTML tags from a document

by cjf (Parson)
on Jun 30, 2002 at 15:55 UTC (#178380)

in reply to Strip HTML tags again

Have a look at HTML::Tagset. It contains various lists of valid HTML tags for different sections of a document.

Update: ++ to Ovid for providing the working example below.

Re: Re: Stripping HTML tags from a document
by dda (Friar) on Jun 30, 2002 at 16:27 UTC
    Thanks!!! That is exactly what I was looking for. Now I'd like to know how to use it in an idiomatic Perl way. Currently I have the following code (straight from perlfaq):
    sub strip_html {
        my $t = shift;
        $t =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
        return $t;
    }
    It seems I have to use the %HTML::Tagset::isKnown hash, but how do I apply it to my sub? I can't find a quick way...
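
One possible way to wire %HTML::Tagset::isKnown into a sub like the one above is sketched here. It is only an illustration, not the working example credited to Ovid. It assumes HTML::Tagset is installed from CPAN and that the hash is keyed by lowercase tag names; the sub name strip_known_html is made up for this sketch. The regex is the perlfaq one with a capture added for the tag name, so it inherits the same limits (it does not touch comments or declarations, and a proper parser is still the safer choice for real-world HTML).

```perl
use strict;
use warnings;
use HTML::Tagset;   # CPAN module; provides %HTML::Tagset::isKnown

# Hypothetical helper: remove only tags whose name HTML::Tagset knows,
# and leave anything else that merely looks like a tag in place.
sub strip_known_html {
    my ($text) = @_;
    $text =~ s{
        (                               # group 1: the whole candidate tag
          < /? ([A-Za-z][A-Za-z0-9]*)   # group 2: the tag name
          (?: [^>'"]* | (['"]).*?\3 )*  # attributes, skipping quoted values
          >
        )
    }{ $HTML::Tagset::isKnown{lc $2} ? '' : $1 }gsex;
    return $text;
}

# <b> is a known element and is removed; <foo> is not and is kept.
print strip_known_html('<b>bold</b> and <foo>x</foo>'), "\n";
# prints: bold and <foo>x</foo>
```

The /e modifier makes the replacement an expression, which is what lets the hash lookup decide per tag whether to drop it or put it back unchanged.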

