PerlMonks  

Re: Stripping HTML tags from a document

by cjf (Parson)
on Jun 30, 2002 at 15:55 UTC (#178380)


in reply to Strip HTML tags again

Have a look at HTML::Tagset. It contains various hashes listing the valid HTML tags for different sections of a document.
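
To make the suggestion concrete, here is a minimal sketch of looking up a tag name in a few of the hashsets HTML::Tagset exports as package variables. The helper name tag_categories and the short labels (known, head, body, empty) are made up for illustration; only the %HTML::Tagset::* hashes come from the module.

```perl
use strict;
use warnings;
use HTML::Tagset ();

# Hypothetical helper: report which HTML::Tagset hashsets contain a tag name.
# The hashes are keyed by lowercase element name.
sub tag_categories {
    my ($name) = @_;
    $name = lc $name;
    my %sets = (
        known => \%HTML::Tagset::isKnown,
        head  => \%HTML::Tagset::isHeadElement,
        body  => \%HTML::Tagset::isBodyElement,
        empty => \%HTML::Tagset::emptyElement,
    );
    # Keep only the labels whose hashset has a true entry for this name.
    return grep { $sets{$_}{$name} } sort keys %sets;
}

print "$_: @{[ tag_categories($_) ]}\n" for qw(title p br foo);
```

A name that is not an HTML element, such as foo, should come back with no categories at all.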

Update: ++ to Ovid for providing the working example below.


Re: Re: Stripping HTML tags from a document
by dda (Friar) on Jun 30, 2002 at 16:27 UTC
    Thanks!!! It is the stuff I was looking for. Now I'd like to know how to use it in a 'perl' manner. Currently I have the following code (right from perlfaq):
    sub strip_html {
        my $t = shift;
        $t =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
        return $t;
    }
    It seems I have to use the %HTML::Tagset::isKnown hash, but how do I apply it to my sub? I can't find a quick way...

    --dda
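
One possible way to combine the perlfaq regex with %HTML::Tagset::isKnown is sketched below. It is not the example Ovid posted, just an illustration under a few assumptions: the sub name strip_known_tags is made up, tag names are taken to be plain ASCII letters and digits, and comments or doctype declarations are left untouched. The pattern captures the element name and the replacement (run with /e) drops the tag only when that name is a key in %isKnown.

```perl
use strict;
use warnings;
use HTML::Tagset ();

# Hypothetical helper: remove only tags whose element name is listed in
# %HTML::Tagset::isKnown; anything else that merely looks like a tag
# (for example "<foo>") is left in place.
sub strip_known_tags {
    my ($text) = @_;
    # Group 1 is the whole tag, group 2 the element name, group 3 the
    # opening quote of an attribute value (which may itself contain ">").
    $text =~ s{
        (
          < /? ([A-Za-z][A-Za-z0-9]*) \b
          (?: [^>'"]++ | (['"]) .*? \3 )*
          >
        )
    }{ $HTML::Tagset::isKnown{ lc $2 } ? '' : $1 }gsex;
    return $text;
}

print strip_known_tags('<p>Hello <b>world</b> <foo></p>'), "\n";
```

Like the perlfaq original, this is still a regex and will be fooled by sufficiently odd markup; a real parser such as HTML::Parser is the more robust route when the input is not under your control.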
