
Re: html analysis tool via regex

by marto (Archbishop)
on Oct 13, 2005 at 08:24 UTC (#499811)

in reply to html analysis tool via regex

Hi stabu,

I am not 100% sure what you are trying to achieve, but you may want to check out the WWW::Mechanize and HTML::TokeParser modules. They may suit your requirements of fetching HTML pages and extracting information from them.
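
As a rough illustration of how the two modules could fit together, here is a minimal sketch. The helper names `fetch_html` and `extract_links` and the sample markup are made up for this example and are not from the original question; WWW::Mechanize is loaded lazily so the parsing part can be tried without network access.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: fetch a page with WWW::Mechanize and return its HTML.
sub fetch_html {
    my ($url) = @_;
    require WWW::Mechanize;    # loaded only when a page is actually fetched
    my $mech = WWW::Mechanize->new( autocheck => 1 );    # die on HTTP errors
    $mech->get($url);
    return $mech->content;
}

# Hypothetical helper: return [href, link text] pairs for each <a href="..."> tag.
sub extract_links {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new( \$html )
        or die "Cannot create parser";
    my @links;
    while ( my $tag = $parser->get_tag('a') ) {
        my $href = $tag->[1]{href};    # element 1 is the attribute hash
        next unless defined $href;     # skip anchors without an href
        my $text = $parser->get_trimmed_text('/a');    # text up to </a>
        push @links, [ $href, $text ];
    }
    return @links;
}

# Sample markup (an assumption, standing in for a fetched page).
my $html = '<p>See <a href="/docs">the <b>docs</b></a> and <a href="/faq">FAQ</a>.</p>';
for my $link ( extract_links($html) ) {
    print "$link->[0] => $link->[1]\n";
}
```

To run this against a live page, pass the result of `fetch_html($url)` to `extract_links` instead of the sample string.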

Hope this helps.


Re^2: html analysis tool via regex
by pajout (Curate) on Oct 13, 2005 at 08:33 UTC
    Yes, I think that regexps are only suited to the simplest digs into HTML or XML. Of course, you can write a very sophisticated regexp, but that approach is, IMHO, read-only and more painful.
    So I suggest using an HTML parser, especially if you _really_ cannot get a better data source than HTML.
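
To make the regexp-versus-parser point concrete, here is a small sketch comparing a deliberately naive pattern with HTML::TokeParser on the same input. The sample markup and the pattern are assumptions chosen for illustration; a more careful regexp would cope with these particular cases, at the cost of the readability mentioned above.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Sample markup (made up): attribute order, quoting style and case all vary.
my $html = q{<a class="x" href='/one'>One</a> <A HREF="/two">Two</A>};

# Naive regexp: assumes a lowercase tag with a double-quoted href as the only attribute.
my @by_regex = $html =~ /<a href="([^"]+)">/g;

# Parser: tag and attribute names are normalised to lowercase, quotes are handled.
my $parser = HTML::TokeParser->new( \$html )
    or die "Cannot create parser";
my @by_parser;
while ( my $tag = $parser->get_tag('a') ) {
    push @by_parser, $tag->[1]{href} if defined $tag->[1]{href};
}

printf "regexp found %d link(s), parser found %d\n",
    scalar @by_regex, scalar @by_parser;
```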
