http://www.perlmonks.org?node_id=1012982


in reply to HTML Parser suggestions

I'm biased, but I'll suggest HTML::HTML5::Parser. It uses the HTML5 parsing algorithm, so if faced with messy tag soup HTML, should very closely match how most desktop browsers parse HTML.

Quick example:

use 5.010; use strict; use warnings; use HTML::HTML5::Parser; use XML::LibXML::QuerySelector; my @elements = HTML::HTML5::Parser:: -> load_html(location => "http://www.perlmonks.org/?node_id=101298 +0") -> querySelectorAll("title"); say for @elements;
perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'