PerlMonks  

Re^2: The State of Web spidering in Perl

by digital_carver (Sexton)
on Sep 22, 2013 at 16:49 UTC (#1055192)


in reply to Re: The State of Web spidering in Perl
in thread The State of Web spidering in Perl

I'll give HTML::Parser a second look, thanks for the suggestion. How do you match something like //div[@id='blah']/p, though? Do you explicitly maintain state?
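To make the question concrete, here is a rough sketch of what "explicitly maintaining state" might look like with HTML::Parser's event handlers: a stack of open elements, so that a p counts only when its parent is the div with id 'blah'. The sample markup, the handler names and the @stack/@matches variables are made up for illustration, and the sketch assumes every element has an explicit closing tag, which real-world HTML often does not.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample input; 'blah' is the id from the XPath above.
my $html = <<'HTML';
<html><body>
<div id="other"><p>skip me</p></div>
<div id="blah"><p>first</p><span><p>nested, not a direct child</p></span><p>second</p></div>
<p>outside</p>
</body></html>
HTML

my @stack;      # one hashref per currently open element
my $capture;    # defined only while inside a matching <p>
my @matches;    # text of every p that is a direct child of the target div

sub start {
    my ($tag, $attr) = @_;
    # A <p> matches only if its parent (top of the stack) is the target div.
    if ($tag eq 'p' && @stack && $stack[-1]{target}) {
        $capture = '';
        push @stack, { match => 1 };
        return;
    }
    my $is_target = ($tag eq 'div' && ($attr->{id} // '') eq 'blah') ? 1 : 0;
    push @stack, { target => $is_target };
}

sub text {
    my ($text) = @_;
    # Text may arrive in several chunks, so append rather than assign.
    $capture .= $text if defined $capture;
}

sub end {
    # Assumes well-formed input: each end tag closes the innermost open element.
    return unless @stack;
    my $el = pop @stack;
    if ($el->{match}) {
        push @matches, $capture;
        undef $capture;
    }
}

my $parser = HTML::Parser->new(
    api_version => 3,
    start_h     => [ \&start, 'tagname, attr' ],
    text_h      => [ \&text,  'dtext' ],
    end_h       => [ \&end,   'tagname' ],
);
$parser->parse($html);
$parser->eof;

print "$_\n" for @matches;
```

This is only bookkeeping for one fixed path; anything with void elements, omitted end tags or more involved XPath expressions would need more state than this.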

As for LWP vs Mech, LWP does work for my use case; I just prefer Mech for a few niceties like autocheck, auto-delegation of $mech->content() to $response->decoded_content(), cookie_jar defaulting to on, etc.
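For comparison, a minimal sketch of roughly the same conveniences done by hand with plain LWP::UserAgent. The fetch_content helper is a hypothetical name, not part of either module, and this is only an approximation of what Mech does for you.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Passing an empty hashref asks LWP to create an in-memory HTTP::Cookies jar,
# which approximates Mech's cookie jar being on by default.
my $ua = LWP::UserAgent->new( cookie_jar => {} );

# Hypothetical helper: die on failure (a stand-in for autocheck) and
# return the decoded body (what $mech->content() would hand back).
sub fetch_content {
    my ($agent, $url) = @_;
    my $res = $agent->get($url);
    die "GET $url failed: " . $res->status_line . "\n"
        unless $res->is_success;
    return $res->decoded_content;
}
```

Usage would be something like my $page = fetch_content($ua, $url); with Mech the same thing is $mech->get($url) followed by $mech->content().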

