
Re: Untemplating

by Chmrr (Vicar)
on Jul 16, 2002 at 03:39 UTC (#181989)

in reply to Untemplating

By far the most common solution to this is to use one of the HTML modules. Yes, you say that the HTML is "non-standard" -- but, truth be told, most HTML out there is, and the HTML-parsing modules know that and are perfectly able to cope. If they could only deal with well-formed markup, they'd be called XML-parsing, not HTML-parsing. :)

My personal favorite tool for extracting data from web pages is HTML::TreeBuilder -- in your case, it would be a simple matter of asking for all <td> elements, and grabbing the various answers out of them. You may find the dump method particularly useful in examining what the parser makes of your HTML.
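
Something along these lines is what I have in mind. It's only a rough sketch: the markup in `$html` is made up for illustration (deliberately sloppy, with unclosed `<td>` and `<tr>` tags), since I don't have your actual pages, so adjust the `look_down` criteria to fit whatever your real structure turns out to be.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical input: no <html>/<body>, and none of the cells or rows are closed.
my $html = <<'END_HTML';
<table>
<tr><td>Question 1<td>Yes
<tr><td>Question 2<td>No
</table>
END_HTML

# Build the parse tree; the parser fills in the implied tags itself.
my $tree = HTML::TreeBuilder->new_from_content($html);

# look_down in list context returns every matching element in document order.
my @cells = map { $_->as_trimmed_text } $tree->look_down( _tag => 'td' );

print "$_\n" for @cells;

# Uncomment to print the tree the parser actually built from the markup:
# $tree->dump;

# Free the tree explicitly once finished with it.
$tree->delete;
```

If the cells you want sit alongside others you don't, `look_down` also accepts attribute criteria (for example `class => 'answer'`, again a made-up name), which is usually tidier than filtering the list afterwards.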

perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'
