PerlMonks  

Re: html analysis tool via regex

by davorg (Chancellor)
on Oct 13, 2005 at 09:12 UTC


in reply to html analysis tool via regex

"I use regex to precisely tell perl what I want pulled out."

You really don't want to do that. Regular expressions will potentially break on all but the simplest HTML pages. If you want to parse HTML then you should use HTML::Parser or one of its subclasses. Personally I usually use HTML::TreeBuilder.
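
As a rough illustration of the parser approach, here is a minimal sketch using HTML::TreeBuilder to pull links out of a page. The sample HTML and the extract_links helper are made up for this example and are not from the original question; the point is only that tag case, quoting style and line breaks inside a tag, which would trip up a naive pattern like /<a href="(.*?)">/, are handled by the parser.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input: the second link uses upper-case tag and
# attribute names, single quotes, an extra attribute before href, and a
# line break inside the tag.
my $html = <<'HTML';
<html><body>
<a href="http://example.com/one">One</a>
<A class='x' HREF='http://example.com/two'
   >Two</A>
</body></html>
HTML

# Hypothetical helper: returns a list of [href, link text] pairs.
sub extract_links {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);
    my @links;
    for my $anchor ($tree->look_down(_tag => 'a')) {
        my $href = $anchor->attr('href');
        next unless defined $href;    # skip anchors without an href
        push @links, [ $href, $anchor->as_text ];
    }
    $tree->delete;    # free the tree explicitly once we are done with it
    return @links;
}

printf "%s => %s\n", @$_ for extract_links($html);
```

look_down accepts other criteria as well (for example an attribute name and value pair), so the same pattern should adapt to whatever elements you actually need to pull out.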

--
<http://dave.org.uk>

"The first rule of Perl club is you do not talk about Perl club."
-- Chip Salzenberg
