Re^4: Any spider framework?
by jdrago999 (Pilgrim) on Jan 08, 2012 at 06:40 UTC
As promised, the patches, updates and POD have been applied. The code is now hosted on GitHub, and the newest release is at https://github.com/jdrago999/WWW-Crawler-Lite
Thanks everyone for your suggestions and time...
Now you can get the HTML::LinkExtor version of link-parsing by specifying 'link_parser => "HTML::LinkExtor"' in the constructor. Otherwise you get the 'default' (original, regexp-based) way.
Actually, maybe this could be changed to use something slick like Web::Query to get at that information (which, for me, was the whole point).
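
For anyone wondering how the two link-parsing styles differ, here is a rough standalone sketch. It is not the crawler's actual code: the sample HTML and the helper names links_by_regexp and links_by_linkextor are made up for illustration, and the regexp is only my guess at what a "regexp-based" extractor might look like. The HTML::LinkExtor calls (new, parse, eof, links) are the module's documented interface.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical sample page, for illustration only.
my $html = <<'HTML';
<html><body>
<a href="/about">About</a>
<a class="nav" href='http://example.com/contact'>Contact</a>
<img src="/logo.png">
</body></html>
HTML

# Regexp-based extraction (an assumed pattern, not the module's own):
# quick, but it only sees quoted href attributes on <a> tags.
sub links_by_regexp {
    my ($html) = @_;
    my @links;
    while ( $html =~ m{<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1}gis ) {
        push @links, $2;
    }
    return @links;
}

# Parser-based extraction: HTML::LinkExtor collects every link-bearing
# attribute; here we keep only the href of <a> tags.
sub links_by_linkextor {
    my ($html) = @_;
    my $parser = HTML::LinkExtor->new;
    $parser->parse($html);
    $parser->eof;
    my @links;
    for my $link ( $parser->links ) {
        my ( $tag, %attrs ) = @$link;    # [ $tag, attr => url, ... ]
        next unless $tag eq 'a' && defined $attrs{href};
        push @links, $attrs{href};
    }
    return @links;
}

print "regexp:    $_\n" for links_by_regexp($html);
print "linkextor: $_\n" for links_by_linkextor($html);
```

On tidy markup like this both approaches find the same links. The parser version should hold up better on unquoted attributes or odd spacing, which is the usual argument for switching away from a regexp.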