PerlMonks
Re: HTML document modification
by ryantate (Friar) on May 28, 2004 at 21:43 UTC ( [id://357379] )
It might be simplest to just use quantifier greediness to your advantage. You wouldn't have to read the file backward or use an awkward lookahead, which requires the presence of a closing HTML tag in already suspect documents. The example you link to looks like this:
If you take the following, similar regex, and run it against the HTML document (as a whole), you should match only the last closing body tag, even if there are multiple closing body tags in the document. *This is untested*:
The greedy plus quantifier ("+") applied to the match-all dot (".") will eat up all the text in the document, and the regex engine will then backtrack from the end of the file until the closing body tag can match. Note that the dot only matches newlines under the /s modifier, so the regex needs /s when it is run against a whole document. This approach has the advantage of being very simple to implement. The disadvantages are that you have to slurp the whole HTML document into memory, and there is likely some overhead as the regex engine backtracks from the end of the document to the closing body tag and captures everything before it into the first match ($1, which holds the results of "(.+)").
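To make the greedy-backtracking idea concrete, here is a minimal sketch. It is not the regex from the original post. The helper name insert_before_last_body_close, the sample document, and the inserted comment are all made up for illustration, and I am assuming the goal is to insert text just before the last closing body tag.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a copy of
# $html with $snippet inserted before the LAST closing body tag.
sub insert_before_last_body_close {
    my ($html, $snippet) = @_;

    # (.+) is greedy: it first consumes the whole string, then gives back
    # characters from the end until </body> can match, so the tag it
    # settles on is the last one in the document.
    # /s lets "." match newlines; /i also accepts </BODY>.
    (my $copy = $html) =~ s{\A(.+)(</body\s*>)}{$1$snippet$2}si;

    # If there is no closing body tag, the copy is returned unchanged.
    return $copy;
}

# Made-up sample: a suspect document with two closing body tags.
my $html = <<'HTML';
<html><body>
<p>first</p>
</body>
<p>stray content</p>
</body>
</html>
HTML

print insert_before_last_body_close($html, "<!-- footer -->\n");
```

With this sample, the comment should land just before the second closing body tag and the first one is left alone. The whole document still has to be in memory as a single string, so the trade-offs described above apply here too.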