Re: Untemplating

by Chmrr (Vicar)
on Jul 16, 2002 at 03:39 UTC (#181989=note)


in reply to Untemplating

By far the most common solution to this is to use one of the HTML modules. Yes, you say that the HTML is "non-standard" -- but, truth be told, most HTML out there is, and the HTML-parsing modules know that and are perfectly able to cope. If they could only deal with perfectly well-formed HTML, they'd be called XML-parsing, not HTML-parsing. :)

My personal favorite tool for extracting data from web pages is HTML::TreeBuilder -- in your case, it would be a simple matter of asking for all <td> elements and grabbing the various answers out of them. You may find the dump method particularly useful for examining what the parser makes of your HTML.
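
Something along these lines might do as a starting point. This is only a rough sketch: the HTML snippet is made up (I haven't seen your actual pages), and it's deliberately sloppy, with unclosed <td> and <tr> tags, to show the parser coping. You'd replace $html with your own content, or use new_from_file on a saved page.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical input, not the original poster's data: note the
# unclosed <td>/<tr> tags and the missing <html>/<body> wrappers.
my $html = <<'HTML';
<table>
<tr><td>Name<td>Chmrr
<tr><td>Level<td>Vicar
</table>
HTML

# Parse the string into a tree; the parser supplies implied tags.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down returns every matching element
# in document order. Here we ask for all <td> elements.
my @cells = map { $_->as_trimmed_text } $tree->look_down(_tag => 'td');
print join(' | ', @cells), "\n";

# dump prints the parse tree to STDOUT, so you can see how the
# parser made sense of the markup.
$tree->dump;

# Free the tree explicitly; older HTML::Tree versions need this
# because of circular parent/child references.
$tree->delete;
```

From there it's a question of which cells hold your answers; if they carry attributes, look_down accepts further criteria (for example class => 'whatever') to narrow the search.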

perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

