PerlMonks  

Re: In HTML , I Want to process only Data and Not tags

by GrandFather (Saint)
on Jul 25, 2006 at 20:39 UTC


in reply to In HTML , I Want to process only Data and Not tags

No. No way. Never ever (well, not often) try to use simple regexes to parse markup. Life is too short to spend the months you would likely need to get the bugs out when others have already done the work for you. For HTML see HTML::TreeBuilder. For XHTML or XML see XML::Twig.
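As a rough illustration of the parser-based approach, here is a minimal sketch that pulls only the text out of an HTML string with HTML::TreeBuilder. The sample markup and the helper name html_to_text are made up for this example, and it assumes HTML::TreeBuilder is installed from CPAN. Treat it as a starting point rather than a finished solution.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of the module): parse a string of HTML
# and return its text content with the tags dropped.
sub html_to_text {
    my ($markup) = @_;

    # Build a parse tree from the markup held in memory.
    my $tree = HTML::TreeBuilder->new_from_content($markup);

    # as_text() collects the text found under the root element.
    my $text = $tree->as_text;

    # Explicitly release the tree once we are done with it.
    $tree->delete;

    return $text;
}

# Made-up sample input; any HTML string would do here.
my $html = '<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>';

print html_to_text($html), "\n";
```

Note that as_text() runs the text of neighbouring elements together without adding whitespace, so if you need to process each piece of data separately you will probably want to walk the tree and handle the text nodes one at a time.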

See some of the answers to how to eliminate all html tags in a given string ??, and in particular Re: how to eliminate all html tags in a given string ??, for sample code and other related suggestions.


DWIM is Perl's answer to Gödel
Re^2: In HTML , I Want to process only Data and Not tags
by revdiablo (Prior) on Jul 25, 2006 at 21:33 UTC

    I second the vote for HTML::TreeBuilder, but I also would like to recommend XML::TreeBuilder. It uses the same handy API, which just makes my life so much simpler. There are most likely cases where other modules -- such as XML::Twig -- make more sense, but I don't know of them off the top of my head.
