PerlMonks |
> Hello all. I am attempting to write a web crawler in Scrappy.

Well, there's your problem! Even you say "It is hard to find examples of a working Scrappy script" -- there is a good reason for that: scrappy is too much pee :) I would not recommend Scrappy, but Web::Scraper.

See WWW::Mechanize, the WWW::Scripter::Plugin::JavaScript subclass, WWW::Mechanize::Firefox, Web::Scraper, and App::scrape, along with these threads:

- extracting data from html using xpath
- extract data from html with xpath
- extract a substring between two elements
- its css/xpath time again
- creating a web crawler with Mechanize
- Super Search for Mechanize/Scripter examples

See also Re^5: can't get WWW::Mechanize to sign in on JustAnswer; Re: Mimicking Internet Explorer (IE) via LWP or Mechanize?; Are there any memory-efficient web scrapers?; Get 10,000 web pages fast; and Async DNS with LWP.

In reply to Re: Scrappy user_agent error
by Anonymous Monk
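The reply above recommends Web::Scraper over Scrappy. A minimal sketch of what that looks like, assuming you want the page title and every link from a single page (the URL is a placeholder, not from the thread):

```perl
# Minimal Web::Scraper sketch: grab the page title and all link
# hrefs from one page. http://example.com/ is a placeholder URL.
use strict;
use warnings;
use URI;
use Web::Scraper;

my $page = scraper {
    process 'title', 'title'  => 'TEXT';    # text of the <title> element
    process 'a',     'urls[]' => '@href';   # href attribute of every anchor
};

my $res = $page->scrape( URI->new('http://example.com/') );
print "$res->{title}\n";
print "$_\n" for @{ $res->{urls} || [] };
```

The `process` rules are CSS selectors (XPath also works), which is exactly the css/xpath approach several of the linked threads walk through.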
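For the "creating a web crawler with Mechanize" pointer, a bare-bones crawl loop with WWW::Mechanize might look like this; the start URL and the same-host filter are placeholder assumptions, and politeness (delays, robots.txt, depth limits) is left out:

```perl
# Minimal single-host crawler sketch with WWW::Mechanize.
# Start URL and host filter are placeholders; add rate limiting
# and robots.txt handling before pointing this at a real site.
use strict;
use warnings;
use WWW::Mechanize;

my $mech  = WWW::Mechanize->new( autocheck => 0 );
my @queue = ('http://example.com/');
my %seen;

while ( my $url = shift @queue ) {
    next if $seen{$url}++;              # skip pages we already fetched
    $mech->get($url);
    next unless $mech->success && $mech->is_html;
    for my $link ( $mech->links ) {
        my $abs = $link->url_abs->as_string;   # resolve relative links
        push @queue, $abs if $abs =~ m{^http://example\.com/};
    }
}
```

For JavaScript-heavy sites this plain-HTTP approach won't be enough, which is where the WWW::Scripter::Plugin::JavaScript and WWW::Mechanize::Firefox suggestions above come in.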