Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Re: HTML parsing OR capturing text from a string within tags

by astaines (Curate)
on Dec 24, 2006 at 02:52 UTC ( #591488=note: print w/ replies, xml ) Need Help??


in reply to HTML parsing OR capturing text from a string within tags

Well, let's see. LWP::UserAgent returns a HTTP::Response object from it's get function. According to the documents the content function of this in turn returns a HTTP::Message object, and the content function of this returns the text body of the webpage, as a string of bytes. You then need to do something intelligent with this string, presumably.

You don't describe how you are using HTML::Strip, but this is really intended to produce a pure text representation of the page. I suspect something like HTML::TreeBuilder which actually parses the HTML, and HTML::Element which lets you disassemble it at your leisure, would suit your needs better.

-- Anthony Staines


Comment on Re: HTML parsing OR capturing text from a string within tags

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://591488]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (6)
As of 2014-12-28 19:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (182 votes), past polls