PerlMonks  

Re: Am I using Text::Balanced correctly? Speed issues.

by renodino (Curate)
on Aug 31, 2007 at 20:16 UTC (#636422)


in reply to Am I using Text::Balanced correctly? Speed issues.

I can't provide a precise answer, but the recursion that passes $page by value is probably a good place to start looking, since every call copies the whole page. More importantly, if your task really is as simple as stated, why not just use a progressive regex, i.e. one that matches with \G and /gc so that each match resumes where the previous one stopped? Here's my quick hack at it, and it runs quite quickly. (Note that I haven't actually validated the output string, other than noting the length change reported at exit.)

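    use Carp;
    use strict;
    use warnings;

    # Build a large test page: filler text around two tagged sections.
    my $page = join('',
        "This is some filler.\n" x 20000,
        "<SECTION[test1]>\n",
        "This is some filler.\n" x 20000,
        "</SECTION>\n",
        "This is some filler.\n" x 20000,
        "<SECTION[test2]>\n",
        "This is some filler.\n" x 20000,
        "</SECTION>\n",
        "This is some filler.\n" x 20000);

    print "Calling process_section. Length of page=[" . length($page) . "]\n";
    my $newpage = process_section(\$page, {test1 => 1});
    print "Done. Length of newpage=[" . length($newpage) . "]\n";

    sub process_section {
        my ($page, $hashref) = @_;    # $page is a reference, so nothing is copied
        my $return = '';

        # Each pass consumes the text up to and including the next opening tag.
        while ($$page =~ /\G(.*?)<SECTION\[([^\]]+)\]>\n?/igcs) {
            $return .= $1;            # text before the tag is always kept
            my $tag  = $2;
            my $post = pos($$page);   # position just after the opening tag

            # Consume the section body; keep it only if its tag is in the hash.
            if ($$page =~ /\G(.*?)<\/SECTION>\n?/igcs) {
                $return .= $1 if exists $hashref->{$tag};
                next;
            }

            # No closing tag found: warn and resume right after the opening tag.
            my $excerpt = substr($$page, $post, 100);
            print STDERR "\n";
            Carp::carp("Warning: Unbalanced SECTION tags. Fix the template! Error near: $excerpt\n");
            pos($$page) = $post;
        }

        # Append whatever follows the last match. pos() is undef if no
        # opening tag ever matched, so fall back to 0 to avoid a warning.
        my $rest = pos($$page) || 0;
        $return .= substr($$page, $rest) if $rest < length($$page);
        return $return;
    }
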
Note that I now pass a reference to $page instead of passing it by value, and that I've replaced the recursion with a loop.
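
For anyone who hasn't used the \G anchor with the /gc modifiers before, here is a minimal sketch of the same progressive-matching idea on a tiny input. The scan_tags helper and the sample string are made up for illustration and are not part of the code above. The point is that a failed match under /c leaves pos() where it was, so several patterns can take turns at the same position:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): split a string into
# [kind, text] pairs by resuming each match where the previous one stopped.
sub scan_tags {
    my ($text_ref) = @_;
    my @pieces;
    pos($$text_ref) = 0;    # start scanning from the beginning

    while (1) {
        if ($$text_ref =~ /\G<([^>]+)>/gc) {
            push @pieces, [ tag => $1 ];     # a complete <...> tag
        }
        elsif ($$text_ref =~ /\G([^<]+)/gc) {
            push @pieces, [ text => $1 ];    # plain text up to the next '<'
        }
        elsif ($$text_ref =~ /\G</gc) {
            push @pieces, [ text => '<' ];   # stray '<' that starts no tag
        }
        else {
            last;    # nothing matches at pos(): end of string
        }
    }
    return \@pieces;
}

my $sample = "a<b>c";
my $pieces = scan_tags(\$sample);
print join(' ', map { "$_->[0]=$_->[1]" } @$pieces), "\n";
# prints: text=a tag=b text=c
```

Every branch consumes at least one character, so the loop always terminates, and the string is walked once from left to right rather than being copied or rescanned.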

Text::Balanced is a great tool for dealing with complex things like parsing Perl code, but if you can get by with a good old RE, I'd say it's probably a much better solution.


Perl Contrarian & SQL fanboy

