PerlMonks
Hi Monks,
When I say parse, I mean extracting the text along with formatting information such as font size, bold, whether the text is inside an anchor, etc. The main requirements are speed and fault tolerance. A few possible solutions came to mind:

- HTML::Parser would be the easiest, but is it the fastest?
- Parse::RecDescent
- Regex
- XML::Parser, though maybe not fault tolerant enough
- Build something with flex or bison and write an XS binding

What do you think is the fastest way to parse real-world(tm) HTML?

In reply to What is the fastest way to parse HTML? by sri
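To make the HTML::Parser option concrete, here is a minimal sketch of what "text plus formatting information" could look like with its version 3 event API. The helper name `extract_runs` and the shape of the returned records are made up for illustration, and it assumes that bold means `<b>`/`<strong>` and that font size comes from `<font size=...>`. It says nothing about speed; it only shows that unbalanced markup does not stop the parse.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper (not part of HTML::Parser): returns a list of
# [ $text, { bold => 0|1, anchor => 0|1, size => $font_size_or_undef } ].
sub extract_runs {
    my ($html) = @_;

    my %depth = ( b => 0, strong => 0, a => 0 );   # counters for open tags
    my @font_sizes;                                 # stack of <font size=...> values
    my @runs;

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ( $tag, $attr ) = @_;
                $depth{$tag}++ if exists $depth{$tag};
                push @font_sizes, $attr->{size} if $tag eq 'font';
            },
            'tagname, attr',
        ],
        end_h => [
            sub {
                my ($tag) = @_;
                # Ignore stray close tags instead of going negative.
                $depth{$tag}-- if exists $depth{$tag} && $depth{$tag} > 0;
                pop @font_sizes if $tag eq 'font';
            },
            'tagname',
        ],
        text_h => [
            sub {
                my ($text) = @_;
                return unless $text =~ /\S/;        # skip whitespace-only chunks
                push @runs, [
                    $text,
                    {
                        bold   => ( $depth{b} || $depth{strong} ) ? 1 : 0,
                        anchor => $depth{a} ? 1 : 0,
                        size   => $font_sizes[-1],  # undef when no <font> is open
                    },
                ];
            },
            'dtext',
        ],
    );

    $p->unbroken_text(1);    # deliver each text segment in one piece
    $p->parse($html);
    $p->eof;                 # flush any buffered text

    return @runs;
}
```

Calling `extract_runs('<p>plain <b>bold</b> <a href="x">link')` returns one record per text run, and the unclosed `<a>` is simply treated as still open. The same input set could then be fed to the other candidates and timed with the core Benchmark module to compare them.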