http://www.perlmonks.org?node_id=607887


in reply to Simplify parsing a file

For the parsing part I would use HTML::TokeParser::Simple, written by our brother Ovid; here is an example from the documentation:

use HTML::TokeParser::Simple;

my $p = HTML::TokeParser::Simple->new( $somefile );

while ( my $token = $p->get_token ) {
    # This prints all text in an HTML doc (i.e., it strips the HTML)
    next unless $token->is_text;
    print $token->as_is;
}

Nice, isn't it? HTML parsing is not as easy as it may seem, and relying on a well-written module is not a sin :)
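
If the same extraction has to run over several files, one possible approach is to wrap the loop in a subroutine and call it once per file. This is only a sketch, not something taken from the module's documentation: the name text_of_file is made up here, and taking the file names from @ARGV is an assumption about how the script would be invoked.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper: return all text tokens of one HTML file as a string.
sub text_of_file {
    my ($path) = @_;

    # Same single-argument constructor as in the example above.
    my $p = HTML::TokeParser::Simple->new($path)
        or die "Cannot open $path: $!";

    my $text = '';
    while ( my $token = $p->get_token ) {
        next unless $token->is_text;    # skip tags, comments, etc.
        $text .= $token->as_is;
    }
    return $text;
}

# Assumption: every command-line argument is the name of an HTML file.
for my $path (@ARGV) {
    print "== $path ==\n", text_of_file($path), "\n";
}
```

Each file gets its own parser object, so no state is carried over from one file to the next.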

Ciao, Valerio

Re^2: Simplify parsing a file
by myrrdyn (Novice) on Apr 02, 2007 at 18:30 UTC
    Wow. That is sweet. I'll post my implementation of it when I get a chance. Still, any thoughts on the multiple-file issue, or do you think this will address that too?