
Count capturing parentheses in a compiled regexp

by BooK (Curate)
on May 02, 2004 at 10:28 UTC

This subroutine counts the number of capturing parentheses in a compiled regular expression. Since the regexp is compiled, we know it's valid, and thus we only have to count the opening parentheses.

Update: added /s in the while condition.

Update: This other discussion led to a much better solution, so I commented out the original code.

#sub captures {
#    local $_ = shift;
#    croak "$_ is not a compiled regexp" unless ref eq 'Regexp';
#    my $n = 0;
#    while( /\G(?=.)/gcs ) {
#        /\G[^\\(]*/gc;      # ignore uninteresting stuff
#        /\G(?:\\.)*/gc;     # ignore backslashed stuff
#        /\G\(\?/gc;         # ignore special regexps
#        /\G\(/gc && $n++;   # a capturing (, count it!
#    }
#    $n;
#}

sub captures { ( @_ = '' =~ /(@{[shift]})??/ ) - 1; }
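
For readers who find the one-liner dense, here is a minimal sketch of the same idea written out step by step. The variable names and the sample patterns are illustrative and not from the original post; the behaviour relied on is that a successful match in list context returns one value per capturing group.

```perl
use strict;
use warnings;

# Same idea as the one-liner, spelled out.
sub captures {
    my $re = shift;

    # Wrap the pattern in one extra group made optional and lazy (??).
    # Against the empty string the engine tries zero repetitions first,
    # so the match succeeds without $re having to match anything.
    my @groups = '' =~ /($re)??/;

    # In list context a successful match returns one value per capturing
    # group (all undef here). Subtract 1 for the wrapper group added above.
    return @groups - 1;
}

print captures(qr/(\d+)-(\d+)/), "\n";          # pattern with two groups
print captures(qr/(?:a|b)+\(literal\)/), "\n";  # pattern with no capturing groups
```

Because perl itself parses the pattern, escaped parentheses and non-capturing groups need no special handling here.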

Replies are listed 'Best First'.
Re: Count capturing parentheses in a compiled regexp
by hv (Prior) on May 02, 2004 at 12:12 UTC

    Nice snippet, but a couple of problems: the outer lookahead needs to be //s, else eg:

    qr{( x )}x;
    will fail.

    Also, this will find parens in embedded code and comments and treat them as captures. If that doesn't seem worth worrying about, it'd be enough to add a caveat I guess; otherwise I think you can mimic perl's simplistic parsing reasonably easily for the code (just count to the balancing close-brace). Comments may actually be the trickiest, since you'll need to know when //x is in force:

    qr{ (?x: # (comment) ) (?-x: # (capture) ) }

    Oops, another one: parens in [ ... ] should be ignored too; I'm not sure how easy those would be to parse, since not every ] closes the selection.

    Hugo

      Thanks a lot for finding these shortcomings in my code. :-) I'll submit updated versions as I correct them.
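
Two of the cases hv raises (parens inside a character class and inside an //x comment) can be checked with a small comparison. This is only a sketch: `scan_captures` is a hypothetical reworking of the commented-out scanner that operates on the stringified pattern, `engine_captures` restates the match-based approach from the root node, and the sample patterns are made up for illustration.

```perl
use strict;
use warnings;

# Text-scanning counter in the style of the commented-out original.
sub scan_captures {
    my $s = "$_[0]";                # stringified pattern, e.g. (?^:...)
    my $n = 0;
    while ( $s =~ /\G(?=.)/gcs ) {
        $s =~ /\G[^\\(]*/gc;        # skip uninteresting characters
        $s =~ /\G(?:\\.)*/gc;       # skip backslashed characters
        $s =~ /\G\(\?/gc;           # skip (?...) constructs
        $s =~ /\G\(/gc and $n++;    # any other ( is counted as a capture
    }
    return $n;
}

# Engine-based counter: let perl parse the pattern itself.
sub engine_captures {
    my $re = shift;
    my @groups = '' =~ /($re)??/;
    return @groups - 1;
}

my $in_class   = qr/[(]/;           # paren inside a character class
my $in_comment = qr{
    a   # (just a note)
}x;                                 # paren inside an /x comment

for my $re ( $in_class, $in_comment ) {
    printf "scan=%d engine=%d\n", scan_captures($re), engine_captures($re);
}
```

Neither pattern has a capturing group, yet the scanner counts one in each, while the engine-based version reports zero.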

Re: Count capturing parentheses in a compiled regexp
by japhy (Canon) on May 02, 2004 at 16:03 UTC
    Once I get Regexp::Parser working (it's the update to YAPE::Regex), you'll be able to do this via:
    use Regexp::Parser;  # pushes itself to @Regexp::ISA
    print qr/.../->nparens;
    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;
Re: Count capturing parentheses in a compiled regexp
by japhy (Canon) on Jun 27, 2004 at 03:35 UTC
    In the spirit of "match the regex to get the number of parens in it", here's another way:
    sub nparens { "" =~ /|$_[0]/ and $#+ }
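
A slightly expanded sketch of the same trick, with comments on why it works. The sample patterns are illustrative; the only thing assumed beyond the one-liner is the documented behaviour of `@+`, whose last index `$#+` gives the number of groups in the last successfully matched pattern.

```perl
use strict;
use warnings;

sub nparens {
    my $re = shift;

    # The empty first alternative always matches the empty string,
    # so $re is compiled as part of the pattern but never has to match.
    '' =~ /|$re/ or return;

    # @+ has one entry for the whole match plus one per group in the
    # pattern, so its last index is the number of capturing groups.
    return $#+;
}

print nparens(qr/(\d+)-(\d+)/), "\n";   # pattern with two groups
print nparens(qr/no groups here/), "\n";
```

Unlike the list-context version, nothing has to be subtracted, because no wrapper group is added around the pattern.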
