http://www.perlmonks.org?node_id=406572


in reply to HTML TokeParser - help with using get_text, get_trimmed_text

HTML::Parser, while harder to use, gives you control over which tags are parsed and, if needed, how each tag is handled.
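As a rough illustration of that control, here is a minimal sketch using the version 3 handler API of HTML::Parser. The sub name filter_with_parser and the sample markup are made up for this example; report_tags limits the start and end events to the tags you list, while text is still reported everywhere.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: return the text of $html plus only the tags in @tags.
sub filter_with_parser {
    my ($html, @tags) = @_;
    my $out = '';

    # One handler for every event: append the original source text.
    my $append = sub { $out .= $_[0] };

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [ $append, 'text' ],
        end_h       => [ $append, 'text' ],
        text_h      => [ $append, 'text' ],
    );

    # Only these tags trigger the start/end handlers; all others are skipped.
    $parser->report_tags(@tags);

    $parser->parse($html);
    $parser->eof;    # flush any text still buffered
    return $out;
}

print filter_with_parser('<p>Some <b>bold</b> text</p>', 'b'), "\n";
```

If you need different behaviour per tag, ask for 'tagname' in the argspec as well and branch on it inside the handler.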

Since HTML::TokeParser breaks every tag down into tokens for you, you have to examine each token it gives you and put "back together" the ones you want to output. Each token has enough information to reconstruct the data for output.
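Something along these lines shows the idea of putting tokens back together. It is only a sketch: the sub name rebuild_keeping and the choice of tags to keep are assumptions for the example, not anything from your script. It relies on the documented token layout, where start and end tokens carry their original source text as the last element and text tokens carry it as the second.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: rebuild $html from its tokens, keeping all text
# but only the tags named in @keep_tags.
sub rebuild_keeping {
    my ($html, @keep_tags) = @_;
    my %keep = map { lc($_) => 1 } @keep_tags;

    # A reference to a scalar is parsed as the literal document.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";

    my $out = '';
    while (my $token = $parser->get_token) {
        my $type = $token->[0];
        if ($type eq 'T') {
            # Text token: ["T", $text, $is_data]
            $out .= $token->[1];
        }
        elsif ($type eq 'S' || $type eq 'E') {
            # Start: ["S", $tag, $attr, $attrseq, $text]; end: ["E", $tag, $text]
            # The last element is the tag exactly as it appeared in the source.
            $out .= $token->[-1] if $keep{ $token->[1] };
        }
        # Comments, declarations and processing instructions are dropped.
    }
    return $out;
}

print rebuild_keeping('<p>Some <b>bold</b> and <i>italic</i> text</p>', 'b', 'i'), "\n";
```

Note that get_token returns the text as it appears in the source, so entities are not decoded here, unlike with get_text and get_trimmed_text.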
