diotalevi has asked for the wisdom of the Perl Monks concerning the following question:

If I write a regex which includes multiple capturing groups (such as qr/(\D+)(\d+)/) and (?{ }) eval blocks, is there a way to find out which numbered variable was just assigned to? So if I wrote qr/(\D+)(\d+)(\D+)(?{ foo( ?? ) })/, is there a way to find out that I should be reading from $3 and not a different variable?

I tried writing some diagnostic code to compare the addresses of $+ and the numbered variables, but that didn't do anything useful. It showed that $+, $1, $2 and $3 are all separate variables; $+ isn't temporarily aliased to $3, which would at least have let me compare references like \ $+ == \ $1. So what works?

$text = "abc123def";
$text_match = qr/(\D+)(\d+)(?{ dumpr() })(\D+)/;
$text =~ $text_match;

use Devel::Peek;
sub dumpr {
    Dump($+), Dump($1), Dump($2), Dump($3);
    local $\ = "\n";
    print $+;
    print $1;
    print $2;
    print $3;
}

Replies are listed 'Best First'.
Re: Finding the right $<*digit*> capture variable
by Enlil (Parson) on Apr 15, 2003 at 22:34 UTC
    I don't know where you are going with this, but the $^N variable might be what you are looking for (granted, I am aware that it does not get you the number). According to perlvar:

    This is primarily used inside (?{...}) blocks for examining text recently matched. For example, to effectively capture text to a variable (in addition to $1, $2, etc.), replace (...) with

    (?:(...)(?{ $var = $^N }))
    By setting and then using $var in this way relieves you from having to worry about exactly which numbered set of parentheses they are.
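    As a rough illustration of the perlvar idiom quoted above, the sketch below wraps each group as (?:(...)(?{ ... })) and records $^N as each group closes. The @seen array is a hypothetical accumulator introduced only for this example, and the code assumes a perl with $^N (5.8.0 or later). Note that it records the captured text, not the group number.

```perl
use strict;
use warnings;

my $text = "abc123def";
my @seen;    # hypothetical accumulator, not part of perlvar's example

# Each group is wrapped as (?:(...)(?{ ... })), so $^N holds the text of
# the group that closed immediately before the code block ran.
$text =~ /(?:(\D+)(?{ push @seen, $^N }))(?:(\d+)(?{ push @seen, $^N }))/;

print join(", ", @seen), "\n";
```

    If the regex engine backtracks through one of these code blocks, the block can run more than once, so a real use would probably want local-ized assignments rather than a plain push.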


      Not only is $^N specific to >=5.8.0, it also doesn't provide any information regarding which numbered variable it refers to. Thank you though.

Re: Finding the right $<*digit*> capture variable
by Enlil (Parson) on Apr 16, 2003 at 09:09 UTC
    Let me try again. How about $#- .

    Again from perlvar:

    One can use $#- to find the last matched subgroup in the last successful match. Contrast with $#+, the number of subgroups in the regular expression.
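    A small sketch of the difference perlvar describes, read after a completed match. The variable names $last_matched and $group_count are made up for this example; the pattern has an optional second group that does not take part in the match.

```perl
use strict;
use warnings;

# Group 2 is optional and does not participate in this match.
"b" =~ /(b)(a)?/ or die "no match";

# Copy the values straight away, before another match overwrites them.
my $last_matched = $#-;   # highest-numbered group that actually matched
my $group_count  = $#+;   # number of groups in the pattern

print "last matched: $last_matched, groups: $group_count\n";
```

    Here $#- stops at the last group that matched, while $#+ counts every group in the pattern whether it matched or not.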

    Some code that might or might not result in what you want (as I am still not sure where you are going with this.)


Re: Finding the right $<*digit*> capture variable
by tekkie (Beadle) on Apr 16, 2003 at 14:07 UTC
    I'm not sure where you're going with this either, but $#- probably does what you're looking for.

    You could also write a subroutine to check which $<DIGIT> values are defined, and return the last one found:
    sub last_match {
        # Accepts no args
        # Returns the digit of the last parenthesis match or undef if there isn't one
        no strict 'refs';    # $$match_num is a symbolic reference to $1, $2, ...
        my $match_num = 1;
        if (defined($$match_num)) {
            while (defined($$match_num)) {
                $match_num++;
            }
            $match_num--;
        }
        else {
            undef $match_num;
        }
        return $match_num;
    }

      Beware that this approach cannot discover the last defined match variable, since there may be gaps in the list: for example after "b" =~ /(a)?(b)/, $1 will not be defined even though $2 is.

      Since there is no useful absolute limit on the highest numbered match variable you might need to check for, I wondered whether walking the symbol table might give a clue, but no such luck - the symbol table entries for *1 etc are created only if they are explicitly referenced in the code:

      "abcde" =~ /(.)(.)(.)(.)(.)/;
      my $var = $4;
      print join ', ', grep !/\D/, keys %::;
      prints "0, 4".

      Accordingly, I think it is not possible in pure perl to discover the highest numbered defined match variable using any perl before v5.6.0 (when @+ and friends were first introduced).
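      On perl 5.6.0 or later, one way to cope with such gaps is to scan a copy of @+ from the top down. This is only a sketch: highest_defined_group is a hypothetical helper name, and it assumes it is handed @+ immediately after a successful match.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a copy of @+ and returns the number of the
# highest group with a defined end offset, or nothing if no group matched.
sub highest_defined_group {
    my @ends = @_;
    for my $n (reverse 1 .. $#ends) {
        return $n if defined $ends[$n];
    }
    return;
}

# $1 is undefined here, so counting upward from $1 would stop too early.
"b" =~ /(a)?(b)/ or die "no match";
my $with_gap = highest_defined_group(@+);

# Here the trailing optional group is the one that did not match.
"b" =~ /(b)(a)?/ or die "no match";
my $trailing_gap = highest_defined_group(@+);

print "with gap: $with_gap, trailing gap: $trailing_gap\n";
```

      Scanning downward means an undefined $1 in front of a defined $2 does not hide the later group.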


        I was thinking of taking the sideways route - instrument the ')' part of a '(...)' group so that the contents of @+ are copied elsewhere for safekeeping. Any time that I'm interested in knowing which newly closed group was just passed, I'd look for the index of the newly defined value in @+.

        So after (a)? $+[1] and $+[2] are undefined. After (b) $+[1] is still undefined but $+[2] is defined (and equal to 1, the end offset of the match against "b"). So... if I can know which @+ entry was just created, is that sufficient or will that break as well?