
Re: Remove HTML tags from document

by LazerRed (Pilgrim)
on Aug 03, 2003 at 22:12 UTC (#280520)

in reply to Remove HTML tags from document

Here's something I've been playing with lately. Maybe it'll help you.

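    use HTML::PullParser;

    sub strip {
        my $html = shift;

        # Ask the parser for text events only; tags, comments and
        # declarations are skipped.
        my $p = HTML::PullParser->new(
            doc  => $html,
            text => 'text',
        );

        # Each token is an array ref whose first element is the text chunk.
        my $result = '';
        while ( my $t = $p->get_token ) {
            $result .= $t->[0];
        }
        return $result;
    }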

I use this sub in a script that checks a status page on many different servers. The script feeds each raw status page through the sub above, then parses the resulting plain text to generate a consolidated status report.
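For what it's worth, here is a rough sketch of how that strip-then-parse step might look. The sample page and its "name: value" layout are made up for illustration (my real status pages look different), and the %stats hash is just a hypothetical place to collect the values. It repeats the sub so it runs on its own.

```perl
use strict;
use warnings;
use HTML::PullParser;

# Same approach as the sub above: request only text events and join them.
sub strip {
    my $html = shift;
    my $p = HTML::PullParser->new(
        doc  => \$html,
        text => 'text',
    );
    my $result = '';
    while ( my $t = $p->get_token ) {
        $result .= $t->[0];
    }
    return $result;
}

# Hypothetical status page; the format is an assumption for this example.
my $page = <<'HTML';
<html><body>
<h1>Server status</h1>
<p>uptime: <b>12 days</b></p>
<p>load: <b>0.42</b></p>
</body></html>
HTML

# Strip the markup, then pick out "name: value" lines from the plain text.
my $text = strip($page);
my %stats;
for my $line ( split /\n/, $text ) {
    next unless $line =~ /^\s*(\w+):\s*(.+?)\s*$/;
    $stats{$1} = $2;
}

# Print one line per collected value, sorted by name.
printf "%-8s %s\n", $_, $stats{$_} for sort keys %stats;
```

Note that 'text' hands back the text as it appears in the source, so entities like &amp;amp; stay encoded. If you want them decoded, HTML::Parser's 'dtext' argspec should do that instead.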

Whip me, Beat me, Make me use Y-ModemG.