PerlMonks  

Re: Looking for a module that strips an HTML tag and its associated 'TEXT'

by jcb (Parson)
on Jul 30, 2020 at 02:08 UTC


in reply to Looking for a module that strips an HTML tag and its associated 'TEXT'

I would suggest HTML::Parser and a simple state machine, but it will take more than two or three lines. You might even be able to play some tricks with the ignore_elements, ignore_tags, and report_tags filters and the skipped_text argspec to make the XS code do most of the filtering work. The handler callbacks then simply print or discard the text as needed, or you can pass an array reference as the handler so that the XS code stuffs a parse trace into an array for use later in your program.
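As a rough sketch of both ideas, something like the following might do. The function names strip_with_state_machine and strip_with_ignore_elements are made up for illustration, and the code assumes the element to be stripped always has a matching end tag and that the markup should otherwise pass through unchanged.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: the state machine lives in the Perl callbacks.
# $depth counts how deep we are inside the element being stripped.
sub strip_with_state_machine {
    my ($html, $tag) = @_;
    $tag = lc $tag;    # tagname is reported in lower case by default
    my $out   = '';
    my $depth = 0;

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($name, $text) = @_;
            if ($name eq $tag) { $depth++; return }
            $out .= $text unless $depth;
        }, 'tagname, text' ],
        end_h => [ sub {
            my ($name, $text) = @_;
            if ($name eq $tag) { $depth-- if $depth; return }
            $out .= $text unless $depth;
        }, 'tagname, text' ],
        # Text, comments, declarations, etc. all land here.
        default_h => [ sub { $out .= shift unless $depth }, 'text' ],
    );
    $p->parse($html);
    $p->eof;
    return $out;
}

# Hypothetical helper: let the XS filter drop the element and its
# content, so the only callback just copies what is left.
sub strip_with_ignore_elements {
    my ($html, $tag) = @_;
    my $out = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        default_h   => [ sub { $out .= shift }, 'text' ],
    );
    $p->ignore_elements(lc $tag);
    $p->parse($html);
    $p->eof;
    return $out;
}

print strip_with_ignore_elements('<p>Keep <b>drop me</b> this</p>', 'b'), "\n";
```

The second version is shorter because the filtering happens before any handler is called. The first is the one to extend if the decision to keep or drop depends on attributes or on surrounding context.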
