Re: HTML Parser suggestions

by tobyink (Abbot)
on Jan 11, 2013 at 21:27 UTC

in reply to HTML Parser suggestions

I'm biased, but I'll suggest HTML::HTML5::Parser. It uses the HTML5 parsing algorithm, so when faced with messy tag-soup HTML, it should very closely match how most desktop browsers parse it.

Quick example:

use 5.010;
use strict;
use warnings;
use HTML::HTML5::Parser;
use XML::LibXML::QuerySelector;

my @elements = HTML::HTML5::Parser::
    -> load_html(location => " +0")
    -> querySelectorAll("title");

say for @elements;
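
To make the tag-soup point concrete, here is a minimal sketch that parses a deliberately broken snippet from a string instead of fetching a URL. The `$soup` input is made up for illustration, and it assumes `parse_string` on a parser object in place of the `load_html` call used in the example here; treat it as a rough demonstration rather than the recommended way to use the module.

```perl
use 5.010;
use strict;
use warnings;
use HTML::HTML5::Parser;
use XML::LibXML::QuerySelector;

# Hypothetical malformed input: no <html>/<head>/<body>,
# and neither <p> nor <b> is ever closed.
my $soup = '<title>Tag soup</title><p>First<p>Second <b>bold';

# parse_string returns an XML::LibXML::Document built by the
# HTML5 parsing algorithm, which fills in the missing structure.
my $parser = HTML::HTML5::Parser->new;
my $doc    = $parser->parse_string($soup);

# XML::LibXML::QuerySelector adds querySelectorAll to the document.
my @paras = $doc->querySelectorAll('p');

say scalar(@paras), ' paragraph element(s) found';
say $_->textContent for @paras;
```

The second `<p>` start tag implicitly closes the first, as it would in a browser, so each paragraph ends up as its own element even though the source never closes either one.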
perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
