http://www.perlmonks.org?node_id=107246


in reply to How to strip HTML using latest module

Here's a version using the HTML::Parser v.2 interface:

#!/usr/bin/perl -w
use strict;
use LWP::Simple qw(get);
use HTML::Parser;

my $parser = Example->new();
my $html = get("http://www.perlmonks.org")
    or die "Cannot fetch the HTML\n";
$parser->parse($html);
$parser->eof;    # signal end of document so any buffered text is flushed

package Example;
use base qw(HTML::Parser);

# Called for each chunk of text found between the tags
sub text {
    my ($self, $text) = @_;
    print $text;
}

And here's the same script, but using the HTML::Parser version 3 interface. This one is easier to use because you generally don't have to create a new package to parse the HTML (though you can, if you really want to!).

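#!/usr/bin/perl -w
use strict;
use LWP::Simple qw(get);
use HTML::Parser;

my $html = get("http://www.perlmonks.org")
    or die "Cannot fetch the HTML\n";

# 'dtext' hands the handler each chunk of text with entities decoded
my $parser = HTML::Parser->new(
    text_h => [ sub { print shift }, 'dtext' ],
);
$parser->parse($html);
$parser->eof;    # signal end of document so any buffered text is flushed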
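
If you'd rather get the stripped text back as a string than print it as you go, the version 3 handler can append to a variable instead. This is only a sketch: strip_html is a made-up helper name, not something HTML::Parser provides, and skipping script and style contents is my own assumption about what "stripping" should mean here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not part of HTML::Parser): returns the text
# content of an HTML string with all the tags removed.
sub strip_html {
    my ($html) = @_;
    my $text = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'dtext' passes each text chunk with entities already decoded
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Assumption: we don't want JavaScript or CSS source in the output
    $parser->ignore_elements(qw(script style));

    $parser->parse($html);
    $parser->eof;    # flush anything the parser is still buffering

    return $text;
}

print strip_html('<p>Hello, <b>world</b> &amp; friends</p>'), "\n";
```

Since it takes a string, you can feed it the result of get() from the scripts above, or test it offline with a literal as shown.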
--
my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});

Re: Re: How to strip HTML using latest module
by f0dder (Novice) on Aug 23, 2001 at 22:04 UTC
    Sweet!!! Thank you. I tried both examples and they work. I now feel so giddy. I also just learned how to turn on autocomplete in the NT cmd shell. This gives you bash-like tab completion in both NT and W2k.

    In HKEY_CURRENT_USER\Software\Microsoft\Command Processor, change CompletionChar to 9 (the Tab character).