As ww hinted, letting a parser do the heavy lifting is better than hand-rolling code to parse XML. The downside for someone starting out is that the typical modules (XML::Twig is the one recommended here) look pretty daunting. They seem as though they will take much longer to learn than hand-rolling the little bit of code you think you will need. In fact, that impression is generally wrong. Especially for XML and HTML, unless you are generating the source files yourself, there are many subtleties and edge cases (entities, CDATA sections, attribute quoting, comments, and so on) that will bite you.
If you are writing this code to learn Perl, then rolling your own parser will be very educational, but don't expect to get a reliable script written any time soon. If you are writing this to get a job done and learn some Perl along the way, it is well worth spending the time to figure out how to use CPAN and (in this case) XML::Twig. Learning to use modules from CPAN is an important part of learning to use Perl effectively.
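To give a feel for how little code a first XML::Twig script needs, here is a minimal sketch. The sample data, the element names (inventory, item, name) and the id attribute are invented for illustration and are not taken from the original question. It assumes XML::Twig has been installed from CPAN.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data: element and attribute names are made up.
# Note the mixed quote styles, the entity and the CDATA section, all of
# which a hand-rolled regex parser would have to handle explicitly.
my $xml = <<'XML';
<inventory>
  <item id='1'><name>Nuts &amp; bolts</name></item>
  <item id="2"><name><![CDATA[Washers <large>]]></name></item>
</inventory>
XML

my @items;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each complete <item> element.
        item => sub {
            my ($t, $item) = @_;
            push @items, [ $item->att('id'), $item->first_child_text('name') ];
            $t->purge;    # release memory for what has been handled so far
        },
    },
);
$twig->parse($xml);

print "$_->[0]: $_->[1]\n" for @items;
```

The handler only ever sees decoded text, so the entity and the CDATA section need no special treatment in your own code. For a real file you would call parsefile with a file name instead of parse with a string.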
True laziness is hard work