Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number

Re^2: substitute with recursive regexps?

by hexcoder (Chaplain)
on Sep 04, 2011 at 19:19 UTC ( #924114=note: print w/replies, xml ) Need Help??

in reply to Re: substitute with recursive regexps?
in thread substitute with recursive regexps?

Thanks for the clarification and the pointers! I will have a look. Regarding building a parse tree with a recursive regexp, I found an inspiring example at ppp from demerphq (which seems a tiny bit hackish though). It uses a different technique with assertions that evaluate Perl code:
use strict; use warnings; use Data::Dumper; $^R=undef; my $re = qr/ (?{[$^R]}) # initialize a new node in the par +setree # preserving the old as the 0th +element \( # match a paren (?> # atomic match ([^()]+) # capture (?{ push @{$^R},$1; # push the capture string into the + current node $^R # propagate the current node }) | (?0) # recurse (?{ push @{$^R->[0]},$^R; # push the current node into the o +ld node shift @{$^R} # and restore the old node to be t +he current # (while removing it from the new) }) )* \) /x; "(a(b)(c(d)e)(f)g)" =~ m/$re/x and print "matched\n"; print Dumper $^R;
which gives (with perl 5.10.0):
matched $VAR1 = [ undef, 'a', [ 'b' ], [ 'c', [ 'd' ], 'e' ], [ 'f' ], 'g' ];
But my current goal is not parsing, it is optimization of stringified output from parsed productions. It would have been cool to do optimizing transforms with recursive pattern substitutions. Then optimizations could be applied on the most nested level first, and propagate to higher levels.

Replies are listed 'Best First'.
Re^3: substitute with recursive regexps?
by Anonymous Monk on Sep 04, 2011 at 21:14 UTC
    Yes, Regexp::Grammars basically just automates the same technique as demerphq's example uses: injection of a distributed stack-based tree-constructor via embedded code blocks.

    But you really want that automation if the regex gets much more complex than the above example. Apart from the obvious maintainabilty issues of sprinkling dozens of (?{...}) blocks through your regex, there's a much more fundamental problem: if the regex can ever backtrack a repetition, that will almost certainly screw up the synchronization between the stack and the parser state, unless you take special precautions to unwind the stack correctly (which is what the module does for you).

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://924114]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (3)
As of 2018-06-25 04:34 GMT
Find Nodes?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?

    Results (126 votes). Check out past polls.