I can't give a precise answer, but the recursion that passes $page by value
is probably a good place to start looking. More importantly, if your task
really is as simple as stated, why not just use a progressive regex
(\G with /gc)? Here's my quick hack at it, and it runs quite quickly.
(Note that I haven't actually validated the output string, other than
noting the length change reported at exit):
use Carp;
use strict;
use warnings;
my $page = join('',
    "This is some filler.\n" x 20000,
    "<SECTION[test1]>\n",
    "This is some filler.\n" x 20000,
    "</SECTION>\n",
    "This is some filler.\n" x 20000,
    "<SECTION[test2]>\n",
    "This is some filler.\n" x 20000,
    "</SECTION>\n",
    "This is some filler.\n" x 20000);
print "Calling process_section. Length of page=[" . length($page) . "]\n";
my $newpage = process_section(\$page, {test1 => 1});
print "Done. Length of newpage=[" . length($newpage) ."]\n";
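sub process_section {
    my ($page, $hashref) = @_;    # $page is a reference to the page text
    my $return = '';
    # Progressive match: \G anchors at pos($$page); /c keeps pos on failure.
    while ($$page =~ /\G(.*?)<SECTION\[([^\]]+)\]>\n?/igcs) {
        $return .= $1;            # text before the opening tag is always kept
        my $tag  = $2;
        my $post = pos($$page);   # position just past the opening tag
        if ($$page =~ /\G(.*?)<\/SECTION>\n?/igcs) {
            # Keep the section body only if its tag is listed in the hash.
            $return .= $1
                if exists $hashref->{$tag};
            next;
        }
        my $excerpt = substr($$page, $post, 100);
        print STDERR "\n";
        Carp::carp("Warning: Unbalanced SECTION tags. Fix the template! Error near: $excerpt\n");
        pos($$page) = $post;      # resume scanning right after the bad tag
    }
    # pos() is undef if nothing ever matched; treat that as offset 0.
    my $rest = pos($$page);
    $rest = 0 unless defined $rest;
    $return .= substr($$page, $rest)
        if $rest < length($$page);
    return $return;
}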
Note that I've changed to use a ref to the $page, and eliminated recursion.
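To illustrate why the ref works with the progressive match, here is a minimal sketch. The helper name next_word and the sample string are made up for illustration and are not part of the code above. The point it tries to show is that pos() lives on the scalar itself, so a sub that matches through a reference advances the caller's string without copying it:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the next word from the caller's string,
# or undef when no word is left.
sub next_word {
    my ($ref) = @_;               # copies only the reference, not the text
    # \G starts where the last match on $$ref stopped; /c keeps pos on failure.
    return $$ref =~ /\G\s*(\w+)/gc ? $1 : undef;
}

my $text = "one two three";
my @words;
while (defined(my $w = next_word(\$text))) {
    push @words, $w;
}

# pos($text) was advanced by the matches done inside next_word.
print join(",", @words), " pos=", pos($text), "\n";
```

With a large page the same idea means each call shares one copy of the text, instead of duplicating it at every level of a recursion.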
Text::Balanced is a great tool for dealing with complex things like
parsing Perl code, but if you can get by with a good old RE,
I'd say it's probably a much better solution.
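For anyone who hasn't used the \G idiom, here is a stripped-down sketch of it. The <B> tags and the sample text are invented for the example; only the \G, /g, /c and pos() behaviour is what matters:

```perl
use strict;
use warnings;

my $text = "alpha <B>bold</B> omega";
my @parts;

# Each iteration starts at pos($text), i.e. where the previous match ended.
# /g records that position; /c stops a failed match from resetting it.
while ($text =~ /\G(.*?)<B>(.*?)<\/B>/gcs) {
    push @parts, "text:$1", "bold:$2";
}

# The final, failed match left pos() alone, so the tail is whatever follows.
my $rest = pos($text);
$rest = 0 unless defined $rest;   # undef if nothing matched at all
push @parts, "tail:" . substr($text, $rest);

print join("|", @parts), "\n";
```

The string is walked once from left to right, which is roughly why this approach stays quick even on a large page.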
Perl Contrarian & SQL fanboy