PerlMonks  

Re: Stripping HTML tags from a document

by cjf (Parson)
on Jun 30, 2002 at 15:55 UTC (#178380=note)


in reply to Strip HTML tags again

Have a look at HTML::Tagset. It contains various lists of valid HTML tags for different sections of a document.

Update: ++ to Ovid for providing the working example below.

Re: Re: Stripping HTML tags from a document
by dda (Friar) on Jun 30, 2002 at 16:27 UTC
    Thanks! This is just what I was looking for. Now I'd like to know how to use it in an idiomatic Perl way. Currently I have the following code (straight from perlfaq):
    sub strip_html {
        my $t = shift;
        $t =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
        return $t;
    }
    It seems I have to use the %HTML::Tagset::isKnown hash, but how do I apply it to my sub? I can't find a quick way...

    --dda
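
One way the question above might be approached, as a minimal sketch and not a definitive answer: capture the element name in the perlfaq regex and consult %HTML::Tagset::isKnown in the replacement, so that only recognised tags are removed. The sub name strip_known_html is made up for illustration, and the sketch assumes HTML::Tagset is installed from CPAN and that the hash is keyed by lowercase element names. It is still regex-based, so comments, CDATA and malformed markup are not handled; a real parser such as HTML::Parser is safer for those.

```perl
use strict;
use warnings;
use HTML::Tagset;    # CPAN module that provides the isKnown hash

# Hypothetical helper: remove only tags whose element name is a key of
# the isKnown hash; anything else in angle brackets is left untouched.
sub strip_known_html {
    my $t = shift;
    $t =~ s{
        (                                  # group 1: the whole tag
          < /? ([A-Za-z][A-Za-z0-9]*)      # group 2: the element name
          (?: [^>'"]* | (['"]) .*? \3 )*   # attributes, with quoted values
          >
        )
    }{ $HTML::Tagset::isKnown{ lc $2 } ? '' : $1 }gsex;
    return $t;
}

print strip_known_html('<p class="x">a < b and <foo>c</foo> <b>bold</b></p>'), "\n";
# prints: a < b and <foo>c</foo> bold
```

The /e modifier makes the replacement an expression, which is what allows the hash lookup per tag; a bare "<" that is not followed by a letter never matches, so text such as "a < b" survives.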
