
Re^4: Any spider framework?

by jdrago999 (Pilgrim)
on Jan 08, 2012 at 06:40 UTC (#946828)

in reply to Re^3: Any spider framework?
in thread Any spider framework?


As promised, the patches, updates, and POD have been applied. GitHub now hosts the code, and I've put the newest release on GitHub at

Thanks everyone for your suggestions and time...

Now you can get the HTML::LinkExtor version of link-parsing by specifying 'link_parser => "HTML::LinkExtor"' in the constructor. Otherwise you get the 'default' (original, regexp-based) way.
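For anyone wondering what the difference between the two styles looks like in practice, here is a rough standalone sketch. It does not use the spider module itself; links_by_regexp and links_by_linkextor are hypothetical helper names made up for illustration, and the regexp is only my guess at what a typical regexp-based extractor does, not the module's actual code. HTML::LinkExtor comes from the HTML-Parser distribution on CPAN.

```perl
use strict;
use warnings;
use HTML::LinkExtor;    # ships with the HTML-Parser distribution

# Hypothetical helper: a naive regexp-based extractor (illustration only,
# not the module's real "default" parser).
sub links_by_regexp {
    my ($html) = @_;
    my @links;
    # Capture the quoted href value of every <a ...> tag.
    while ( $html =~ m{<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1}gis ) {
        push @links, $2;
    }
    return @links;
}

# Hypothetical helper: the same job done with a real HTML parser.
sub links_by_linkextor {
    my ($html) = @_;
    my $parser = HTML::LinkExtor->new;    # no callback: links are stored internally
    $parser->parse($html);
    $parser->eof;
    my @links;
    for my $link ( $parser->links ) {
        # Each entry is [ $tag, $attr_name => $url, ... ]
        my ( $tag, %attrs ) = @$link;
        push @links, $attrs{href} if $tag eq 'a' && defined $attrs{href};
    }
    return @links;
}

my $html = <<'HTML';
<p><a href="/about">About</a>
<!-- <a href="/hidden">commented out</a> -->
<A HREF='/contact'>Contact</A></p>
HTML

print "regexp:    ", join( ", ", links_by_regexp($html) ),    "\n";
print "LinkExtor: ", join( ", ", links_by_linkextor($html) ), "\n";
```

The point of the sample document is the commented-out anchor: a regexp like the one above has no idea it is inside a comment and will report it, while a parser-based extractor only sees real start tags. That is the usual argument for preferring the parser when the pages you crawl are messy.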

Maybe this could use something slick like Web::Query to get at that information (which, for me, was the whole point).
