PerlMonks  

Re: Parsing HTML

by thunders (Priest)
on Jun 14, 2004 at 06:05 UTC (#366418)


in reply to Parsing HTML

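Many of the modules that inherit from HTML::Parser are well suited to this kind of task. I like HTML::TokeParser::Simple. The script below fetches the page at the URL given on the command line and prints only its text tokens: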

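#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use HTML::TokeParser::Simple;

# URL to fetch, taken from the command line
my $site = $ARGV[0] or die "Usage: $0 URL\n";

# get() returns undef on failure, so check before parsing
my $content = get($site);
defined $content or die "Could not fetch $site\n";

my $parser = HTML::TokeParser::Simple->new(\$content);

# Print only the text tokens, skipping tags and comments
while ( my $token = $parser->get_token ) {
    next unless $token->is_text;
    print $token->as_is;
}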

