Re^2: The State of Web spidering in Perl

by digital_carver (Sexton)
on Sep 22, 2013 at 16:49 UTC


in reply to Re: The State of Web spidering in Perl
in thread The State of Web spidering in Perl

I'll give HTML::Parser a second look; thanks for the suggestion. How do you match something like //div[@id='blah']/p with it, though? Do you explicitly maintain state?
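For concreteness, here is a minimal sketch of what "explicitly maintaining state" could look like for that XPath with HTML::Parser's version 3 event API. The helper name paragraphs_under_div, the %VOID list, and the stack layout are assumptions made for illustration, not something from the thread, and the sketch ignores implicit end tags (such as an unclosed <p>) and self-closing syntax like <br/>.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Void elements never get an end tag, so they are kept off the stack.
my %VOID = map { $_ => 1 }
    qw(area base br col embed hr img input link meta source track wbr);

# Hypothetical helper: returns the text of every <p> that is a direct
# child of <div id="$id">, roughly what //div[@id='...']/p selects.
sub paragraphs_under_div {
    my ($html, $id) = @_;
    my (@stack, @found);
    my $capture;    # stack depth of the <p> being captured, else undef

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag, $attr) = @_;
            return if $VOID{$tag};
            my $parent = $stack[-1];
            push @stack, { tag => $tag, id => $attr->{id} // '' };
            if (   !defined $capture
                && $tag eq 'p'
                && $parent
                && $parent->{tag} eq 'div'
                && $parent->{id} eq $id)
            {
                $capture = @stack;    # remember how deep the <p> sits
                push @found, '';
            }
        }, 'tagname, attr' ],
        text_h => [ sub {
            # Text may arrive in several chunks, so append.
            $found[-1] .= $_[0] if defined $capture;
        }, 'dtext' ],
        end_h => [ sub {
            my ($tag) = @_;
            # Ignore stray end tags that match nothing on the stack.
            return unless grep { $_->{tag} eq $tag } @stack;
            while (@stack) {
                my $open = pop @stack;
                last if $open->{tag} eq $tag;
            }
            # Stop capturing once the captured <p> has been closed.
            undef $capture if defined $capture && @stack < $capture;
        }, 'tagname' ],
    );
    $parser->parse($html);
    $parser->eof;
    return @found;
}
```

Calling paragraphs_under_div($html, 'blah') returns a list of paragraph texts. The stack is the "state": every start tag pushes, every end tag pops, and the parent check only looks at the top of the stack, which is what makes this a child match rather than a descendant match.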

As for LWP vs Mech, LWP does work for my use case; I just prefer Mech for a few niceties like autocheck, auto-delegation of $mech->content() to $response->decoded_content(), cookie_jar defaulting to on, etc.
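A rough side-by-side of those niceties, as a sketch only: the sub names fetch_with_lwp and fetch_with_mech are made up for illustration, and both assume that LWP::UserAgent and WWW::Mechanize are installed.

```perl
use strict;
use warnings;
use LWP::UserAgent ();
use WWW::Mechanize ();

# Plain LWP: the cookie jar, the error check, and the decoding are opt-in.
sub fetch_with_lwp {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new( cookie_jar => {} );    # in-memory jar
    my $response = $ua->get($url);
    die "GET $url failed: ", $response->status_line, "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

# Mech: a cookie jar is set up by default, autocheck makes a failed
# request die, and content() hands back the decoded content.
sub fetch_with_mech {
    my ($url) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);
    return $mech->content;
}
```

Both subs return the decoded page body or die on failure; the Mech version just needs less ceremony to get there.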

Replies are listed 'Best First'.
Re^3: The State of Web spidering in Perl
by Anonymous Monk on Sep 23, 2013 at 00:03 UTC
