ganeshPerlStarter has asked for the wisdom of the Perl Monks concerning the following question:
Dear Friends
I am learning Perl and trying to use it in my project.
I want to extract the text content from HTML files.
However, I want to ignore tables and images in those files. I used HTML::TreeBuilder::XPath for the HTML parsing, but could not find a way to skip specific tags.
I also thought of grepping out the lines between the opening and closing tags, but that broke some other HTML tags.
How can I ignore such tags in an HTML file and then get the text content of that file?
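To make the goal concrete, here is a rough sketch of the kind of thing I have in mind. It is only an assumption about how this might be done, not a confirmed solution: it deletes the unwanted elements from the parsed tree and then asks the tree for its text. The helper name text_without_tags and the sample HTML are made up for illustration; look_down, delete and as_text are methods that HTML::TreeBuilder::XPath inherits from HTML::Element.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical helper (not part of any module): parse $html, remove every
# element whose tag name is listed in @tags, and return the remaining text.
sub text_without_tags {
    my ($html, @tags) = @_;
    my %skip = map { lc($_) => 1 } @tags;

    my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

    # Delete one matching element at a time and search again, so that
    # nested matches (a table inside a table) are never touched twice.
    while (my $node = $tree->look_down(sub { $skip{ $_[0]->tag } })) {
        $node->delete;
    }

    my $text = $tree->as_text;   # text of whatever is left in the tree
    $tree->delete;               # free the tree explicitly
    return $text;
}

# Made-up sample input for illustration only.
my $html = '<html><body><p>Hello</p>'
         . '<table><tr><td>skip me</td></tr></table>'
         . '<img src="a.png" alt="pic"><p>World</p></body></html>';

print text_without_tags($html, qw(table img)), "\n";
```

As far as I understand, as_text only returns text nodes, so an img element contributes nothing to the output anyway; deleting it only matters if the tree is later dumped as HTML.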
Thanks in advance for your help and time.
Best Regards
ganesh