http://www.perlmonks.org?node_id=1063400


in reply to Why doesn't my scraper work?

My best guess is that there's some JavaScript involved, which makes scraping a lot more complicated.

You should also keep in mind that scraping a site like http://www.cbssports.com might be against their Terms Of Use. If there's an API that you can use instead, all the better.

Alex / talexb / Toronto

Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

Re^2: Why doesn't my scraper work?
by jdlev (Scribe) on Nov 19, 2013 at 22:25 UTC
    Is there a way for HTML::TableExtract to look up a table by an attribute such as class="..."? I've tried that before and it seems it doesn't like looking for a class name.
    I love it when a program comes together - jdhannibal
      it seems it doesn't like looking for a class name
      In what way does it not like it? As long as you initialise the module with the attributes you want it should not have a problem:
      my $te = HTML::TableExtract->new( attribs => { class => 'class-name' } );
      $te->parse($html_string);
      for my $ts ($te->tables) {
          print "Table with class 'class-name' found\n";
      }
      For others wondering: I figured it out by using WWW::Mechanize instead of LWP::Simple to fetch the original data. With WWW::Mechanize it's at least saving the full code from the page now.
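In case it helps anyone landing here later, the two pieces from this thread (fetching with WWW::Mechanize, then filtering tables by class with HTML::TableExtract) can be put together roughly like this. It is only a sketch: the helper name rows_for_class is made up for illustration, and the URL and class name are whatever you pass on the command line. I believe the attribs match is against the full attribute value, so a table with class="stats sortable" would need that whole string.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTML::TableExtract;

# Hypothetical helper: return one array ref of rows for every table
# whose class attribute equals $class.
sub rows_for_class {
    my ($html, $class) = @_;
    my $te = HTML::TableExtract->new( attribs => { class => $class } );
    $te->parse($html);
    $te->eof;
    return map { [ $_->rows ] } $te->tables;
}

# Only fetch when a URL and a class name are given on the command line.
if (@ARGV == 2) {
    my ($url, $class) = @ARGV;

    # autocheck makes get() die on a failed request instead of failing silently
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);

    for my $rows ( rows_for_class( $mech->content, $class ) ) {
        for my $row (@$rows) {
            # Empty cells come back undefined, so print them as empty strings
            print join( "\t", map { defined $_ ? $_ : '' } @$row ), "\n";
        }
        print "\n";
    }
}
```

Run it as, say, perl tables.pl URL CLASS. Keeping the parsing in its own sub means you can also feed it a page saved to disk and check the class matching without hitting the site again.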
      I love it when a program comes together - jdhannibal