PerlMonks  

Re: Untemplating

by Chmrr (Vicar)
on Jul 16, 2002 at 03:39 UTC ( #181989=note )


in reply to Untemplating

By far the most common solution to this is to use one of the HTML-parsing modules. Yes, you say that the HTML is "non-standard" -- but, truth be told, most HTML out there is, and the HTML-parsing modules know that and are perfectly able to cope. If they could only deal with perfectly well-formed HTML, they'd be called XML parsers, not HTML parsers. :)

My personal favorite tool for extracting data from web pages is HTML::TreeBuilder -- in your case, it would be a simple matter of asking for all <td> elements, and grabbing the various answers out of them. You may find the dump method particularly useful in examining what the parser makes of your HTML.
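
To make that concrete, here is a minimal sketch of the "ask for all <td> elements" approach. The sample HTML and the td_texts helper are made up for illustration (I have not seen your actual pages), so treat this as a starting point rather than a finished solution:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sloppy input: no <html>/<body>, and the <td>/<tr>
# tags are never closed. Substitute your own page source here.
my $html = <<'END_HTML';
<table>
<tr><td>Name<td>Chmrr
<tr><td>Rank<td>Vicar
</table>
END_HTML

# td_texts is an illustrative helper, not part of HTML::TreeBuilder.
# It returns the trimmed text of every <td> in the given HTML string.
sub td_texts {
    my ($source) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($source);

    # look_down in list context returns every matching element.
    my @cells = map { $_->as_trimmed_text } $tree->look_down( _tag => 'td' );

    $tree->delete;    # break the tree's circular references
    return @cells;
}

print "$_\n" for td_texts($html);
```

If the cells don't come out the way you expect, call $tree->dump before the delete to see exactly how the parser nested your markup.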

perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

