Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^2: Web scraping toolkit?

by mzedeler (Pilgrim)
on Jan 27, 2012 at 08:44 UTC ( #950285=note: print w/ replies, xml ) Need Help??


in reply to Re: Web scraping toolkit?
in thread Web scraping toolkit?

I think that App::scrape may turn out to be insufficient, not covering some edge cases that needs handling. But again - thats my general worry, not having tried any of the scraping modules yet (the same goes for Web::Scraper and Scrappy).

WWW::Mechanize::Firefox looks very promising, and implementing the few extra features that Scrapie has (logging and such) shouldn't be a problem. The real drawback lies in having to rely on firefox (or some similar component) in development and production.

I'll go back to the drawing board and see what to do. Thanks for the pointers.


Comment on Re^2: Web scraping toolkit?

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://950285]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (4)
As of 2015-08-31 04:20 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The oldest computer book still on my shelves (or on my digital media) is ...













    Results (352 votes), past polls