
Re: Trying to count the captures in a compiled regular expression

by Roy Johnson (Monsignor)
on May 03, 2004 at 02:46 UTC


in reply to Trying to count the captures in a compiled regular expression

How about just using it, marked optional, and seeing how many captures you get?
$_ = 'anything';
my @capture_count;
my $regex = qr/one(.*?)((two)(four))/;
@capture_count = /(?:$regex)?/;
print @capture_count." captures\n";

The PerlMonk tr/// Advocate

Replies are listed 'Best First'.
Re: Re: Trying to count the captures in a compiled regular expression
by BooK (Curate) on May 03, 2004 at 06:48 UTC

    Ack. I guess spending two days trying to grok yacc and Parse::Yapp led me to choose to do it the hard way. Thanks for a healthy dose of simplicity.

    Unfortunately, the following regexp breaks your very good idea:

    $regex = qr/foo/; # should return 0

    That only means your code needs at least one capture:

    sub captures {
        my $re = shift;
        # wrap $re in one extra capture group, then subtract it from the count
        return scalar ( @_ = '' =~ /(?:($re))?/ ) - 1;
    }

    This passes my whole test suite, except for this regexp:

    $regex = qr/(?x)  # (comment)
                (?-x) # (capture) (?x) # (comment)/;

    Which dies horribly with the message:

    Unmatched ( in regex; marked by <-- HERE in m/(?:(( <-- HERE ?-xism:(?x)  # (comment)
                (?-x) # (capture) (?x) # (comment))))?/

    So, $regex compiles, but qr/(?:($regex))?/ doesn't? I'm lost.

      except for this regexp...

      I guess the matching close paren gets commented out in the combined string. If so, that's probably a bug: I think we tried to fix it a while back so that a comment in an embedded qr// would not leak out to the enclosing pattern.

      Hugo

Re: Re: Trying to count the captures in a compiled regular expression
by hv (Prior) on May 03, 2004 at 08:31 UTC

    Very nice. :)

    You can go a step better by making the outer match minimal, which means it will immediately match zero times and thus avoid the time and danger of trying to match the interior at all:

    @capture_count = /($regex)??/;
    print @capture_count - 1, " captures\n";
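    The lazy-group version above can be wrapped in a small sub. This is only a sketch: the name count_captures_lazy is made up for illustration and does not appear anywhere in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is ours, not from the thread): count the
# capture groups in a compiled regex by matching it inside a lazy optional group.
sub count_captures_lazy {
    my ($re) = @_;
    # List context: a successful match returns one element per capture group.
    # The outer (...)?? is tried with zero repetitions first, which already
    # succeeds against the empty string, so $re itself is never attempted.
    my @groups = ( '' =~ /($re)??/ );
    return @groups - 1;    # subtract the outer group we added ourselves
}

print count_captures_lazy(qr/foo/), "\n";                    # 0
print count_captures_lazy(qr/one(.*?)((two)(four))/), "\n";  # 4
```

    Matching against the empty string rather than $_ keeps the sub independent of whatever happens to be in $_ at the call site.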

    Hugo

Re: Re: Trying to count the captures in a compiled regular expression
by sgifford (Prior) on May 03, 2004 at 04:56 UTC
    Wow, that works! I knew there had to be a way to evaluate the regexp to get an answer. Good stuff! :)
      Security alert: that will run code in (??{ }) / (?p{ }). Maybe something like this instead?
      $regex = qr:(??{print "look ma, no rm -rf /\n"}):;
      $captures = (() = ""=~/(|$regex)/) - 1;
      Also, note that you have to add a () set and then subtract it from the count to be able to distinguish between 0 captures and 1 capture.
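      One way to check that the empty-alternative form really does leave embedded code alone is to put a counter in a code block and see whether it moves. A rough sketch, assuming a reasonably recent perl; count_captures_alt and $ran are names invented here for illustration:

```perl
use strict;
use warnings;

our $ran = 0;    # bumped by the embedded code block, if it ever runs

# Hypothetical helper (our name): the empty alternative is tried first and
# always matches, so the regex engine never enters $re.
sub count_captures_alt {
    my ($re) = @_;
    my $count = () = ( '' =~ /(|$re)/ );   # list assignment in scalar context
    return $count - 1;                     # drop the outer group we added
}

my $risky = qr/(?{ $ran++ })(a)(b)/;
my $n = count_captures_alt($risky);
print "captures: $n, code block ran: $ran time(s)\n";
```

      If the counter stays at zero, the code block was compiled but never executed, which is the point of putting the empty alternative first.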
Re: Re: Trying to count the captures in a compiled regular expression
by eric256 (Parson) on May 03, 2004 at 14:14 UTC

    Could you explain why/how that works?


    ___________
    Eric Hodges
      Upon a successful match in list context, the match operator returns a list of the captures, one element for each set of capturing parentheses, even if the captured value is empty. (In scalar context it returns only true or false, so the match has to be evaluated in list context and the resulting list counted.) By making the entire pattern optional, we ensure a successful match, and thus Perl will tell us how many groupings there were.

      As was pointed out by ysth and hugo, the case of no groupings will not return zero (because we need a true value to indicate a successful match), so we ought to put capturing parentheses around the expression, and subtract one from the result. And the possibility of embedded code, which we wouldn't want to run, is another caveat, so we want to use minimal matching. Hence:

      my $regex = qr/foo/;
      $_ = 'anything';
      my $matches = (() = /($regex)??/) - 1; # oops! fixed
      print "There were $matches groupings\n";

      The PerlMonk tr/// Advocate
        That runs in scalar context, so won't work; try:
        $matches = (() = /($regex)??/) - 1;
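        To see the difference the context makes, here is a small sketch (variable names are arbitrary) that evaluates the same match both ways:

```perl
use strict;
use warnings;

my $re = qr/(a)(b)/;

# Scalar context: the match operator only reports success (1) or failure.
my $scalar = ( '' =~ /($re)??/ );

# List assignment evaluated in scalar context yields the number of elements
# on its right-hand side, i.e. one per capture group (outer group included).
my $count = () = ( '' =~ /($re)??/ );

print "scalar context: $scalar\n";   # 1
print "list count:     $count\n";    # 3 (the outer group plus two inner ones)
```

        The parentheses around the match on the scalar line do not create list context; only the empty-list assignment does.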
