Eternal question of parsing parentheses

by Anonymous Monk
on Oct 24, 2009 at 12:59 UTC ( #803034=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

This must be a common question, but I can't find any answer.

In a string containing nested parentheses, I want to extract the contents of the first parenthesized group, so that I can evaluate it before the rest. So for $_='a(b(c(d)(e))f)g(h)((i)j)'; the result should be b(c(d)(e))f.

This can be done the sad and boring way by splitting the string and going through it, counting parenthesis signs as you go. But surely there must be some clever regex to do it?
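For comparison, the "sad and boring" counting approach might look roughly like the sketch below. The helper name first_group is made up for illustration and is not from the thread; it assumes only plain ( and ) matter and returns nothing if the first parenthesis is never closed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the contents of the
# first balanced parenthesized group in $str, or undef if there is none.
sub first_group {
    my ($str) = @_;
    my $start = index($str, '(');
    return if $start < 0;                 # no opening parenthesis at all

    my $depth = 0;
    for my $pos ($start .. length($str) - 1) {
        my $ch = substr($str, $pos, 1);
        if    ($ch eq '(') { $depth++ }
        elsif ($ch eq ')') { $depth-- }
        # Depth drops back to zero at the ')' that closes the first '('.
        return substr($str, $start + 1, $pos - $start - 1) if $depth == 0;
    }
    return;                               # the first '(' is never closed
}

print first_group('a(b(c(d)(e))f)g(h)((i)j)'), "\n";    # b(c(d)(e))f
```

It is longer than a regex, but it makes a single pass over the string and does not depend on the Perl version.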

I came up with one solution, which looks wonderfully cryptic:

while (/^[^(]*\([^)]*\(/) { s/\(([^()]*)\)/[$1]/ }
s/.*?\((.*?)\).*/$1/;
tr/[]/()/;

While the first '(' is followed by another '(' before any ')' (i.e. the first parenthesis has inner parentheses), find the first '(' which is NOT followed by another '(' before the next ')' (i.e. the first innermost parenthesis) and replace that pair with [...]. Then extract the first remaining parenthesis, and replace all [ ] with ( ).
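To see the marking steps described above, the same idea can be wrapped in a sub that records each intermediate string. This is only a sketch: the name first_group_by_marking is hypothetical, and it assumes the input has balanced parentheses and contains no literal [ or ] (those would be turned into parentheses at the end).

```perl
use strict;
use warnings;

# Hypothetical wrapper around the marking approach; assumes balanced
# parentheses and no literal '[' or ']' in the input.
sub first_group_by_marking {
    my ($str) = @_;
    my @steps = ($str);

    # While the first '(' is followed by another '(' before any ')',
    # turn the first innermost (...) into [...].
    while ($str =~ /^[^(]*\([^)]*\(/) {
        # 'or last' avoids looping forever on unbalanced input.
        $str =~ s/\(([^()]*)\)/[$1]/ or last;
        push @steps, $str;
    }

    # Extract the first remaining group, then restore the parentheses.
    $str =~ s/.*?\((.*?)\).*/$1/;
    $str =~ tr/[]/()/;
    return ($str, @steps);
}

my ($result, @steps) = first_group_by_marking('a(b(c(d)(e))f)g(h)((i)j)');
print "$_\n" for @steps;
print "result: $result\n";
```

For the example string this takes three substitutions before the first parenthesis has no inner parentheses left, so the cost grows with the number of nested groups.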

Is this a good idea? Can you improve it? Is there a better way?

Replies are listed 'Best First'.
Re: Eternal question of parsing parentheses
by JavaFan (Canon) on Oct 24, 2009 at 13:23 UTC
    use Regexp::Common qw /balanced/;
    /($RE{balanced}{-parens=>'()'})/ and print $1;
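If installing Regexp::Common from CPAN is not an option, a similar result can probably be had with the core module Text::Balanced. This is a different technique from the one in the reply above, sketched here as an alternative; the sub name first_group_balanced is made up for illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: contents of the first balanced (...) group,
# or undef if extraction fails.
sub first_group_balanced {
    my ($str) = @_;

    # Third argument: a prefix pattern to skip before the opening
    # bracket (here, everything that is not a '(').
    my ($group) = extract_bracketed($str, '()', '[^(]*');
    return unless defined $group && length $group;

    # $group includes the outer parentheses; strip them.
    return substr($group, 1, -1);
}

print first_group_balanced('a(b(c(d)(e))f)g(h)((i)j)'), "\n";
```

Like the Regexp::Common pattern, extract_bracketed returns the group with its outer parentheses, hence the substr at the end.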
Re: Eternal question of parsing parentheses
by ikegami (Pope) on Oct 24, 2009 at 16:39 UTC
Re: Eternal question of parsing parentheses
by LanX (Bishop) on Oct 24, 2009 at 17:44 UTC
    Eternal answer RTFM¹ ! :)

    Reading perldoc perlre (Perl 5.10 or later) and searching for "recursive" turned up the following code, in which I only had to replace "foo" with \w*:

    $_='a(b(c(d)(e))f)g(h)((i)j)';
    $re = qr{
        (                       # paren group 1 (full function)
          \w*
          (                     # paren group 2 (parens)
            \(
              (                 # paren group 3 (contents of parens)
                (?:
                  (?> [^()]+ )  # Non-parens without backtracking
                |
                  (?2)          # Recurse to start of paren group 2
                )*
              )
            \)
          )
        )
    }x;
    @matches=/$re/;
    print "@matches";

    perl /tmp/
    a(b(c(d)(e))f) (b(c(d)(e))f) b(c(d)(e))f
    Compilation finished at Sat Oct 24 19:42:58

    Cheers Rolf

    (¹) SCNR 8)

    UPDATE: untabified code.


      I simplified the code, and replaced the parentheses with angle brackets (tr/()/<>/) to make it more readable:

      $_='a<b<c<d><e>>f>g<h><<i>j>';
      $re = qr{
          <                     # anchor at first paren as wanted
          (                     # paren group 1
            (?:
              (?> [^<>]+)       # Non-parens without backtracking
            |
              < (?1) >          # Recurse to start of paren group 1
            )*
          )
      }x;
      /$re/;
      print $1;

      perl /tmp/
      b<c<d><e>>f
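Translated back to real parentheses, the simplified pattern might be used as in the sketch below. The sub name first_group_recursive is hypothetical, and the trailing \) is an addition not present in the code above: it makes the match fail at an opening parenthesis that is never closed.

```perl
use 5.010;      # (?1) recursion needs Perl 5.10 or later
use strict;
use warnings;

my $re = qr{
    \(                      # anchor at the first opening parenthesis
    (                       # paren group 1: contents of that parenthesis
      (?:
        (?> [^()]+ )        # non-parens without backtracking
      |
        \( (?1) \)          # a nested group: recurse into group 1
      )*
    )
    \)                      # matching close (an addition, see above)
}x;

# Hypothetical helper: contents of the first balanced (...) group.
sub first_group_recursive {
    my ($str) = @_;
    return $str =~ $re ? $1 : undef;
}

print first_group_recursive('a(b(c(d)(e))f)g(h)((i)j)'), "\n";    # b(c(d)(e))f
```

Note that (?1) refers to the group by number, so interpolating $re into a larger pattern that has capture groups of its own would change what it points at.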

      Cheers Rolf

      For those still limited to 5.8 and before, see also the discussion of the (??{ code }) construct under Extended Patterns (just before the discussion of (?PARNO)) in perlre.
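A minimal sketch of that (??{ code }) approach, modelled on the balanced-parentheses example in perlre, could look like this. The names $balanced and first_group_postponed are made up for illustration.

```perl
use strict;
use warnings;

# Package variable, so the code block inside the pattern can see it
# when it runs at match time.
our $balanced;
$balanced = qr{
    \(
    (?:
      (?> [^()]+ )          # non-parens without backtracking
    |
      (??{ $balanced })     # a nested group: re-enter the same pattern
    )*
    \)
}x;

# Hypothetical helper: contents of the first balanced (...) group.
sub first_group_postponed {
    my ($str) = @_;
    return unless $str =~ /($balanced)/;
    return substr($1, 1, -1);    # drop the outer parentheses
}

print first_group_postponed('a(b(c(d)(e))f)g(h)((i)j)'), "\n";
```

On 5.10 and later the (?PARNO) form is usually the simpler choice, since it needs no variable that refers to itself.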

Node Type: perlquestion [id://803034]
Approved by Corion
Front-paged by toolic