Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses

Re: Untemplating

by Chmrr (Vicar)
on Jul 16, 2002 at 03:39 UTC ( #181989=note: print w/replies, xml ) Need Help??

in reply to Untemplating

By far the most common solution to this is to use one of the HTML modules. Yes, you say that the html is "non-standard" -- but, truth to be told, most HTML out there is, and the HTML-parsing modules know that, and are perfectly able to cope. If they were only able to deal with perfectly syntactic HTML, they'd be called XML-parsing, not HTML-parsing. :)

My personal favorite tool for extracting data from web pages is HTML::TreeBuilder -- in your case, it would be a simple matter of asking for all <td> elements, and grabbing the various answers out of them. You may find the dump method particularly useful in examining what the parser makes of your HTML.

perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://181989]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (3)
As of 2016-10-21 00:35 GMT
Find Nodes?
    Voting Booth?
    How many different varieties (color, size, etc) of socks do you have in your sock drawer?

    Results (284 votes). Check out past polls.