Re: How to strip HTML using latest module

in reply to How to strip HTML using latest module

Here's a version using the HTML::Parser version 2 interface, where you subclass the parser and override its callback methods:

#!/usr/bin/perl -w
use strict;
use LWP::Simple qw(get);
use HTML::Parser;

my $parser = Example->new();
my $html   = get("http://www.perlmonks.org")
    or die "Cannot fetch the HTML\n";

$parser->parse($html);
$parser->eof;    # signal end of document so any buffered text is flushed

package Example;
use base qw(HTML::Parser);
sub text {
    my ($self,$text) = @_;
    print $text;
}

And here's the same script, but using the HTML::Parser version 3 interface. This one is easier to use because you generally don't have to make a new package to parse the HTML (though you can, if you really want to!). Instead, you pass the handlers straight to the constructor.

#!/usr/bin/perl -w
use strict;
use LWP::Simple qw(get);
use HTML::Parser;

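my $html = get("http://www.perlmonks.org")
    or die "Cannot fetch the HTML\n";

my $parser = HTML::Parser->new(
    # 'dtext' hands the callback the text with entities already decoded
    text_h => [ sub { print shift }, 'dtext' ]
);
$parser->parse($html);
$parser->eof;    # signal end of document so any buffered text is flushed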
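If you'd rather get the stripped text back in a variable than print it as you go, something along these lines should work. This is only a sketch: strip_html() is a name I made up for illustration, not part of HTML::Parser, and I'm assuming you also want to throw away the contents of script and style elements, which the scripts above would print as plain text.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# strip_html() is a hypothetical helper, not something HTML::Parser provides.
sub strip_html {
    my ($html) = @_;
    my $text = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Append each chunk of entity-decoded text to $text.
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Assumption: script and style contents are not wanted in the output.
    $parser->ignore_elements(qw(script style));

    $parser->parse($html);
    $parser->eof;    # flush any text the parser is still holding

    return $text;
}

print strip_html('<p>Fish &amp; <b>chips</b></p><script>var x = 1;</script>'), "\n";
```

The closure keeps the version 3 convenience of not needing a separate package, and the caller decides what to do with the result.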

--
my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});


Re: Re: How to strip HTML using latest module by f0dder (Novice) on Aug 23, 2001 at 22:04 UTC
Sweet!!! Thank you. I tried both examples and they work. I now feel so giddy. I also just learned how to turn on autocomplete in the NT cmd shell. This allows bash-like autocomplete in both NT & W2k. In HKEY_CURRENT_USER\Software\Microsoft\Command Processor, change CompletionChar to 9.
