Re: Am I using Text::Balanced correctly? Speed issues.

by renodino (Curate)
on Aug 31, 2007 at 20:16 UTC (#636422)


in reply to Am I using Text::Balanced correctly? Speed issues.

I can't give a precise answer, but the recursion that passes $page by value is probably a good place to start looking. More importantly, if your task really is as simple as stated, why not just use a progressive regex? Here's my quick hack at it, which runs quite quickly (note that I haven't validated the output string beyond noting the length change reported at exit):

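use Carp;
use strict;
use warnings;

# Build a large test page with two SECTION blocks surrounded by filler.
my $page = join('',
    "This is some filler.\n" x 20000,
    "<SECTION[test1]>\n",
    "This is some filler.\n" x 20000,
    "</SECTION>\n",
    "This is some filler.\n" x 20000,
    "<SECTION[test2]>\n",
    "This is some filler.\n" x 20000,
    "</SECTION>\n",
    "This is some filler.\n" x 20000);

print "Calling process_section. Length of page=[" . length($page) . "]\n";
my $newpage = process_section(\$page, {test1 => 1});
print "Done. Length of newpage=[" . length($newpage) . "]\n";

# Takes a reference to the page and a hashref of section names to keep.
# Returns the page with all SECTION tags removed and the bodies of
# unlisted sections dropped.
sub process_section {
    my ($page, $hashref) = @_;
    my $return = '';

    # \G with /gc resumes each match where the previous one ended.
    while ($$page =~ /\G(.*?)<SECTION\[([^\]]+)\]>\n?/igcs) {
        $return .= $1;              # text before the opening tag
        my $tag  = $2;
        my $post = pos($$page);     # position just after the opening tag

        if ($$page =~ /\G(.*?)<\/SECTION>\n?/igcs) {
            $return .= $1 if exists $hashref->{$tag};
            next;
        }

        # No closing tag found: warn and carry on from the opening tag.
        my $excerpt = substr($$page, $post, 100);
        print STDERR "\n";
        Carp::carp("Warning: Unbalanced SECTION tags. Fix the template! Error near: $excerpt\n");
        pos($$page) = $post;
    }

    # Append whatever follows the last tag. pos() is undef if no SECTION
    # tag ever matched, so fall back to 0 to avoid a warning.
    my $rest = pos($$page) || 0;
    $return .= substr($$page, $rest) if $rest < length($$page);
    return $return;
}
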
Note that I now pass a reference to $page instead of the string itself, which avoids copying the large string on every call, and I've eliminated the recursion.
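To make the "progressive regex" idea concrete, here is a minimal sketch of the \G plus /gc mechanics on their own. The input string and variable names are made up for illustration and are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical input, chosen only to show how the match position advances.
my $text = "abc<X>def<Y>ghi";
my @chunks;

# \G anchors each attempt at the end of the previous match, so the
# string is consumed left to right without rescanning from the start.
while ($text =~ /\G(.*?)<(\w+)>/gcs) {
    push @chunks, "$1|$2";
}

# Because of /c, the final failed attempt did not reset pos(), so the
# unmatched tail can still be taken from that position.
my $tail = substr($text, pos($text) || 0);

print join(',', @chunks), " tail=$tail\n";
```

Without the /c modifier, the failed match that ends the loop would reset pos() and the tail would be lost, which is why the code above uses /gc throughout.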

Text::Balanced is a great tool for dealing with complex things like parsing Perl code, but if you can get by with a good old RE, I'd say that's probably a much better solution.
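For contrast, here is a small sketch of the kind of nested input where Text::Balanced is the better fit, since a plain non-greedy regex would stop at the first closing delimiter. It uses the module's extract_bracketed function; the sample string and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical input with nested parentheses.
my $text = "(a (b) c) rest";

# In list context, extract_bracketed returns the balanced substring and
# the remainder of the input, and leaves $text itself unmodified.
my ($extracted, $remainder) = extract_bracketed($text, '()');

print "extracted=[$extracted] remainder=[$remainder]\n";
```

The SECTION tags in the question don't nest, which is why the regex approach is enough there.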


Perl Contrarian & SQL fanboy

