PerlMonks  

Re: using the headers method of HTML::TableExtract to find an image

by kal (Hermit)
on Apr 02, 2001 at 17:19 UTC (#68989)


in reply to using the headers method of HTML::TableExtract to find an image

Forgive me, but I'm not sure I understand your question. If I've misread it, try rephrasing - with examples, if possible.

Now, as I understand it, you're trying to pick out a table that has an <img ..> tag inside a <th ..> tag? I've never tried this myself, but it's quite possible that the header matching only evaluates text nodes - that is, the tag is markup, not content, even if it has attributes. That would make sense, because <img ..> is an empty tag - in XHTML it would be written <img ../>, making it plain that it contains no text nodes.

Probably the best way will be to write your own parser with HTML::Parser, or (better) extend HTML::TableExtract so that it can use 'nodes' (the tags :) and their attributes in the evaluation. Or, if you're dealing with XHTML, you could parse it with XML::Parser and then use XML::XPath to build a query that finds your answer directly! (Check out XPath if you haven't before - you can search through parsed XML trees for tags based on their name, their text content, their attributes, their lineage, etc. - sooper :) That's probably the preferred way, but I suspect you're parsing someone else's web pages, which may well not be well-formed XHTML, so I guess it's probably not possible.
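To make the XPath idea a bit more concrete, here is a minimal sketch, assuming the page really is well-formed XHTML and that XML::XPath is installed. The sample markup, the table ids, and the logo.gif file name are all made up for illustration; they are not taken from the original question.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical XHTML sample; the ids and the image name are invented.
my $xhtml = <<'XHTML';
<html>
  <body>
    <table id="plain">
      <tr><th>Name</th><th>Price</th></tr>
      <tr><td>Widget</td><td>10</td></tr>
    </table>
    <table id="with-image">
      <tr><th><img src="logo.gif" alt="Logo"/></th><th>Price</th></tr>
      <tr><td>Gadget</td><td>20</td></tr>
    </table>
  </body>
</html>
XHTML

# XML::XPath parses the string itself, so no separate parsing step is needed here.
my $xp = XML::XPath->new(xml => $xhtml);

# Select every table that has, somewhere inside a th, an img with this src.
my @tables = $xp->findnodes('//table[.//th//img[@src="logo.gif"]]');

# Collect the id attribute of each matching table element.
my @ids = map { $_->getAttribute('id') } @tables;
print "matched table: $_\n" for @ids;
```

The predicate in square brackets is what does the work: it tests the tag's lineage (img under th under table) and an attribute value, neither of which is a text node. This only helps if the input parses as XML; for ordinary tag-soup HTML the HTML::Parser route is the more realistic one.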

Have I made any sense??

