Dear Perl Monks,
I am parsing HTML to find all the hrefs that match a particular URL (let's call it the "target URL") and then get the anchor text. I have used HTML::TreeBuilder, HTML::TokeParser, etc., but a problem arises if the HTML contains a link like the one below:
<a href="http://www.yahoo.com" target=_blank><img src=http://us.i1.yimg.com/nw.gif height=11 width=11 border=0 alt="Open this result in new window"> </a>
If my target URL is "http://www.yahoo.com", then since there is no anchor text, I get the alt text of the img tag ("Open this result in new window") as the anchor text.
I was wondering if anyone can help me out with a regexp that parses all the anchor tags and, if the href matches my target URL, returns the anchor text if it exists or "image" if there is an img tag instead.
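To make the desired behaviour concrete, here is a minimal regexp-based sketch of what I am after. The helper name anchor_labels is hypothetical, and the sketch assumes that anchors are not nested, that attribute values contain no ">" character, and that the href must equal the target URL exactly. It is only an illustration; a real parser will cope better with messy HTML.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one label for every <a> whose href equals $target.
# The label is the anchor text, or "image" when the anchor holds only an <img>.
sub anchor_labels {
    my ($html, $target) = @_;
    my @labels;

    # Walk over each <a ...>...</a> pair; /s lets "." span newlines, /i ignores case.
    while ($html =~ m{<a\b([^>]*)>(.*?)</a\s*>}gis) {
        my ($attrs, $inner) = ($1, $2);

        # Extract the href value: double-quoted, single-quoted, or unquoted.
        # (?|...) is a branch reset, so the value is always capture group 1.
        my ($href) = $attrs =~ m{\bhref\s*=\s*(?|"([^"]*)"|'([^']*)'|([^\s"'>]+))}i;
        next unless defined $href && $href eq $target;

        # Note whether an <img> is present before the tags are stripped.
        my $has_img = $inner =~ m{<img\b}i ? 1 : 0;

        # Remove all tags (alt text lives inside the tag, so it goes too),
        # then trim surrounding whitespace.
        (my $text = $inner) =~ s{<[^>]*>}{}g;
        $text =~ s/^\s+|\s+$//g;

        push @labels, length($text) ? $text : $has_img ? 'image' : '';
    }
    return @labels;
}

my $html = <<'HTML';
<a href="http://www.yahoo.com" target=_blank><img src=http://us.i1.yimg.com/nw.gif height=11 width=11 border=0 alt="Open this result in new window"> </a>
<a href="http://www.yahoo.com">Yahoo!</a>
<a href="http://example.com/">Other</a>
HTML

print "$_\n" for anchor_labels($html, 'http://www.yahoo.com');
# prints:
# image
# Yahoo!
```

An anchor that matches but has neither text nor an img yields an empty string here; that choice is mine and could just as well be skipped.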
Thanks in advance.