Re: Re: HTML:TableExtract...?

by Trimbach (Curate)
on Dec 30, 2002 at 22:37 UTC ( #223201=note )

in reply to Re: HTML:TableExtract...?
in thread HTML:TableExtract...?

The "table states" are used to deal with situations where you have tables within cells of other tables within cells of other tables (etc.). The primary aim of the module is to "extract" data from heavily formatted web pages, not from tables used to store plain data. Yeah, it's overkill if you want to suck data out of a data table with no embedded subtables, but that's why there are shortcut methods.
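To make the nested-table problem concrete, here is a rough sketch of the bookkeeping involved. It is not HTML::TableExtract and not its API, just a small Python stand-in built on the standard library's html.parser. The names NestedTableReader, extract_tables and PAGE are made up for illustration, and keying each table by (depth, count) is my assumption about how you'd want to address a table buried inside another table's cell.

```python
from html.parser import HTMLParser


class NestedTableReader(HTMLParser):
    """Hypothetical helper: collect cell text per table, keyed by (depth, count)."""

    def __init__(self):
        super().__init__()
        self.tables = {}     # (depth, count) -> list of rows, each a list of cell strings
        self._stack = []     # keys of the tables currently open, outermost first
        self._seen = {}      # depth -> how many tables have been seen at that depth
        self._in_cell = []   # parallel to _stack: True while inside a td/th of that table

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            depth = len(self._stack)
            count = self._seen.get(depth, 0)
            self._seen[depth] = count + 1
            key = (depth, count)
            self.tables[key] = []
            self._stack.append(key)
            self._in_cell.append(False)
        elif self._stack:
            rows = self.tables[self._stack[-1]]
            if tag == "tr":
                rows.append([])
                self._in_cell[-1] = False
            elif tag in ("td", "th"):
                if not rows:          # tolerate a cell with no enclosing tr
                    rows.append([])
                rows[-1].append("")
                self._in_cell[-1] = True

    def handle_endtag(self, tag):
        if not self._stack:
            return
        if tag == "table":
            # Closing an inner table returns us to the enclosing table's state.
            self._stack.pop()
            self._in_cell.pop()
        elif tag in ("td", "th"):
            self._in_cell[-1] = False

    def handle_data(self, data):
        # Text always belongs to the innermost open table's current cell.
        if self._stack and self._in_cell[-1]:
            self.tables[self._stack[-1]][-1][-1] += data


def extract_tables(html):
    """Parse html and return {(depth, count): rows} with whitespace-trimmed cells."""
    reader = NestedTableReader()
    reader.feed(html)
    reader.close()
    return {
        key: [[cell.strip() for cell in row] for row in rows]
        for key, rows in reader.tables.items()
    }


# A layout table whose second cell holds the table we actually care about.
PAGE = """
<table>
  <tr><td>nav</td>
      <td>
        <table>
          <tr><th>Item</th><th>Price</th></tr>
          <tr><td>Widget</td><td>4.50</td></tr>
        </table>
      </td></tr>
</table>
"""

tables = extract_tables(PAGE)
print(tables[(1, 0)])   # the data table nested one level down
```

The stack is the whole point: each open table keeps its own state, so text inside the inner table never leaks into the outer table's cells. That per-table state is the kind of thing a flat regex has no good way to track.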

It's actually a pretty useful module for anyone who's tried to zero in on some piece of data on a webpage and has pulled their hair out trying to get a home-rolled regex-based or HTML::Parser solution to work.

It's quite a spiffy module. It lets you get on with worrying about things other than deciphering a page full of td tags. :-)

Gary Blackburn
Trained Killer
