Re: HTML Parsing

by BazB (Priest)
on Jan 09, 2002 at 20:20 UTC ( [id://137479] )

in reply to Extracting information

Why try to parse this sort of thing by hand when Perl's not-exactly-secret weapon, CPAN, has plenty of well-tested modules that'll do all this for you?

HTML::Parser would be one place to start - it includes modules to slice and dice your HTML in several different ways - check the README. As far as I can see, you should be able to replace the snippet you've posted with these modules.
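
To give a rough idea of what that looks like, here is a minimal sketch using HTML::Parser's version 3 handler API. I'm assuming the goal is to pull out each link's target and text, since that is my guess at what the original snippet was after. The sample input is just the tidier markup from the update further down this post, and the names @links and $current are my own, not anything from your code.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Sample input (an assumption): the tidier markup from the update in this post.
my $html = '<font size="-1"><a href="link">this is a link</a></font>';

my @links;      # collected { href => ..., text => ... } records
my $current;    # the <a> element currently being read, if any

my $parser = HTML::Parser->new(
    api_version => 3,

    # Start tags: begin a new record when we see <a href="...">.
    start_h => [ sub {
        my ($tag, $attr) = @_;
        $current = { href => $attr->{href}, text => '' }
            if $tag eq 'a' && defined $attr->{href};
    }, 'tagname, attr' ],

    # Text: append decoded text while we are inside a link.
    text_h => [ sub {
        my ($text) = @_;
        $current->{text} .= $text if $current;
    }, 'dtext' ],

    # End tags: </a> closes the record and stores it.
    end_h => [ sub {
        my ($tag) = @_;
        if ($tag eq 'a' && $current) {
            push @links, $current;
            undef $current;
        }
    }, 'tagname' ],
);

$parser->parse($html);
$parser->eof;

print "$_->{href}: $_->{text}\n" for @links;   # prints "link: this is a link"
```

If all you want is links, HTML::LinkExtor (shipped in the same distribution) is probably less typing, but the handler style above adapts to other tags easily.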



Update: Is that even valid HTML? It looks pretty horrid either way. A much nicer way of doing it would be:

<font size="-1"><a href="link">this is a link</a></font>

Further update: Yes, it is valid HTML. Still looks hideous :-)