First, I would try Parse::RecDescent and see whether you can get it to work for your problem.

If that fails, I would use the fact that when you match in scalar context with /g, each match picks up where the previous one left off, so you can loop over the variable as you parse it. This allows you to create little parse engines. It is not as cute for small problems, but it lets you turn one huge regex into a series of small ones plus some looping logic, which is much, much better!
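To make that concrete, here is a minimal sketch of such a parse engine. The token set (numbers, identifiers, a few operators) and the name tokenize are made up for illustration, so adapt them to whatever you are actually parsing. It relies on \G to anchor each small regex at the current position, and on /c so that a failed match does not reset pos().

```perl
use strict;
use warnings;

# Hypothetical tokenizer: the token types and the sub name are
# illustrative assumptions, not something from the original question.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;    # start scanning at the beginning of the string

    while (pos($text) < length $text) {
        # Each small regex is tried at the current position (\G).
        # /g in scalar context advances pos() on success;
        # /c keeps pos() where it was on failure.
        if    ($text =~ /\G\s+/gc)            { next }    # skip whitespace
        elsif ($text =~ /\G(\d+)/gc)          { push @tokens, [ NUM   => $1 ] }
        elsif ($text =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, [ IDENT => $1 ] }
        elsif ($text =~ /\G([-+*\/()=])/gc)   { push @tokens, [ OP    => $1 ] }
        else {
            die sprintf "Unexpected character '%s' at offset %d\n",
                substr($text, pos($text), 1), pos($text);
        }
    }
    return @tokens;
}

# Usage: print one "TYPE:value" pair per token.
print join(' ', map { join ':', @$_ } tokenize('x = 3 + foo')), "\n";
```

Each branch is a small regex that is easy to read and test on its own, and the looping logic is where you would add state (nesting depth, current context, and so on) if the format needs it.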

Confession: I don't face this kind of problem often, and I have not faced it since hearing of Parse::RecDescent. So while I think that is a better answer, parse engines are what I have personally done.

    In addition to Parse::RecDescent (which is EXCELLENT; I am using it 2 hours per day, and it has a ton of awesome features), DCONWAY has also written Text::Balanced, which has an HTML/XML tag-parsing function (extract_tagged) built in.
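For what it is worth, here is a small sketch of the Text::Balanced function meant for this job, extract_tagged. The sample string is made up, and the call uses the default arguments (any XML-style opening tag, with the closing tag derived from it), so check the module's documentation before relying on the details.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_tagged);

# Hypothetical sample input, for illustration only.
my $text = '<b>bold <i>nested</i> text</b> trailing';

# In list context extract_tagged returns the tagged substring, the
# remainder of the string, the skipped prefix, the opening tag, the
# text between the tags, and the closing tag.
my ($extracted, $remainder, $prefix, $open, $inner, $close)
    = extract_tagged($text);

if (defined $extracted && length $extracted) {
    print "tagged:    $extracted\n";
    print "contents:  $inner\n";
    print "remainder: $remainder\n";
}
else {
    print "No balanced tag found at the start of the text\n";
}
```

Note that it only extracts a tagged chunk from the front of the string, so for a whole document you would still call it in a loop or combine it with the /g approach described above.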

