
Re^2: Why doesn't my scraper work?

by jdlev (Scribe)
on Nov 19, 2013 at 22:25 UTC (#1063403)


in reply to Re: Why doesn't my scraper work?
in thread Why doesn't my scraper work?

Is there a way for HTML::TableExtract to look up a table by its class attribute (e.g. class="etc")? I've tried that before, and it doesn't seem to like looking for a class name.
I love it when a program comes together - jdhannibal

Re^3: Why doesn't my scraper work?
by tangent (Priest) on Nov 19, 2013 at 23:00 UTC
    it seems it doesn't like looking for a class name
    In what way does it not like it? As long as you initialise the module with the attributes you want it should not have a problem:
    use HTML::TableExtract;

    # Only match tables whose class attribute is 'class-name'
    my $te = HTML::TableExtract->new( attribs => { class => 'class-name' } );
    $te->parse($html_string);
    for my $ts ($te->tables) {
        print "Table with class 'class-name' found\n";
    }
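To make the class lookup above concrete, here is a minimal self-contained sketch that parses an inline page and prints the rows of the matching table. The sample HTML, the class names `nav` and `stats`, and the helper `rows_for_class` are made up for illustration. As far as I know the `attribs` match compares the whole attribute value, so a table with several classes (class="stats wide") may need the full string; treat that as an assumption to verify against your page.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample page: two tables, only one has the class we want.
my $html_string = <<'HTML';
<html><body>
<table class="nav"><tr><td>Home</td><td>About</td></tr></table>
<table class="stats">
  <tr><th>Player</th><th>Points</th></tr>
  <tr><td>Smith</td><td>21</td></tr>
  <tr><td>Jones</td><td>17</td></tr>
</table>
</body></html>
HTML

# Hypothetical helper: return the rows (array refs of cell text) of every
# table whose class attribute equals $class.
sub rows_for_class {
    my ($html, $class) = @_;
    my $te = HTML::TableExtract->new( attribs => { class => $class } );
    $te->parse($html);
    my @rows;
    for my $ts ($te->tables) {
        # Empty cells can come back undef, so normalise them to ''.
        push @rows, map { [ map { defined $_ ? $_ : '' } @$_ ] } $ts->rows;
    }
    return @rows;
}

for my $row ( rows_for_class($html_string, 'stats') ) {
    print join(' | ', @$row), "\n";
}
```

If this prints nothing for your own page, it is worth dumping the fetched HTML first to check that the table and its class are actually in what was downloaded.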
Re^3: Why doesn't my scraper work?
by jdlev (Scribe) on Nov 19, 2013 at 22:58 UTC
    For others wondering: I figured it out by using WWW::Mechanize instead of LWP::Simple to fetch the original data. With WWW::Mechanize it's at least saving the full page source now.
    I love it when a program comes together - jdhannibal
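For anyone wanting to try the same switch, a minimal sketch of fetching the page with WWW::Mechanize might look like the following. The thread does not say why LWP::Simple returned less of the page, so this only shows the fetch itself; the helper name `fetch_html` and the URL in the comment are placeholders, not taken from the original post.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: fetch a page and return its HTML as a string.
# With autocheck on, a failed request dies instead of failing silently.
sub fetch_html {
    my ($url) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);
    return $mech->content;
}

# Placeholder URL, substitute the page you are scraping:
# my $html_string = fetch_html('http://www.example.com/stats.html');
```

The returned string can be passed straight to `$te->parse(...)` on an HTML::TableExtract object.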
