Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re^2: substitute with recursive regexps?

by hexcoder (Hermit)
on Sep 04, 2011 at 19:19 UTC ( #924114=note: print w/ replies, xml ) Need Help??


in reply to Re: substitute with recursive regexps?
in thread substitute with recursive regexps?

Thanks for the clarification and the pointers! I will have a look. Regarding building a parse tree with a recursive regexp, I found an inspiring example at ppp from demerphq (which seems a tiny bit hackish though). It uses a different technique with assertions that evaluate Perl code:

use strict; use warnings; use Data::Dumper; $^R=undef; my $re = qr/ (?{[$^R]}) # initialize a new node in the par +setree # preserving the old as the 0th +element \( # match a paren (?> # atomic match ([^()]+) # capture (?{ push @{$^R},$1; # push the capture string into the + current node $^R # propagate the current node }) | (?0) # recurse (?{ push @{$^R->[0]},$^R; # push the current node into the o +ld node shift @{$^R} # and restore the old node to be t +he current # (while removing it from the new) }) )* \) /x; "(a(b)(c(d)e)(f)g)" =~ m/$re/x and print "matched\n"; print Dumper $^R;
which gives (with perl 5.10.0):
matched $VAR1 = [ undef, 'a', [ 'b' ], [ 'c', [ 'd' ], 'e' ], [ 'f' ], 'g' ];
But my current goal is not parsing, it is optimization of stringified output from parsed productions. It would have been cool to do optimizing transforms with recursive pattern substitutions. Then optimizations could be applied on the most nested level first, and propagate to higher levels.


Comment on Re^2: substitute with recursive regexps?
Select or Download Code
Re^3: substitute with recursive regexps?
by Anonymous Monk on Sep 04, 2011 at 21:14 UTC
    Yes, Regexp::Grammars basically just automates the same technique as demerphq's example uses: injection of a distributed stack-based tree-constructor via embedded code blocks.

    But you really want that automation if the regex gets much more complex than the above example. Apart from the obvious maintainabilty issues of sprinkling dozens of (?{...}) blocks through your regex, there's a much more fundamental problem: if the regex can ever backtrack a repetition, that will almost certainly screw up the synchronization between the stack and the parser state, unless you take special precautions to unwind the stack correctly (which is what the module does for you).

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://924114]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (6)
As of 2014-10-25 13:26 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (143 votes), past polls