Finding the right $<*digit*> capture variable

by diotalevi (Canon)
on Apr 15, 2003 at 22:11 UTC ( #250704=perlquestion )
diotalevi has asked for the wisdom of the Perl Monks concerning the following question:

If I write a regex which includes multiple capturing groups (qr/(\D+)(\d+)/) and (?{ }) eval blocks, is there a way to find out which numbered variable was just assigned to? So if I wrote qr/(\D+)(\d+)(\D+)(?{ foo( ?? ) })/, is there a way to find out that I should be reading from $3 and not a different variable?

I tried writing some diagnostic code to compare the addresses of $+ and the numbered variables, but that didn't do anything useful. It indicated that $+, $1, $2 and $3 are all separate variables; $+ isn't temporarily aliased to $3, which would at least have let me compare references like \ $+ == \ $1. So what works?

$text = "abc123def";
$text_match = qr/(\D+)(\d+)(?{ dumpr() })(\D+)/;
$text =~ $text_match;

use Devel::Peek;
sub dumpr {
    Dump($+), Dump($1), Dump($2), Dump($3);
    local $\ = "\n";
    print $+;
    print $1;
    print $2;
    print $3;
}

Replies are listed 'Best First'.
Re: Finding the right $<*digit*> capture variable
by Enlil (Parson) on Apr 15, 2003 at 22:34 UTC
    I don't know where you are going with this, but the $^N variable might be what you are looking for (granted, I am aware that it does not get you the number). According to perlvar:

    This is primarily used inside (?{...}) blocks for examining text recently matched. For example, to effectively capture text to a variable (in addition to $1, $2, etc.), replace (...) with

    (?:(...)(?{ $var = $^N }))
    By setting and then using $var in this way relieves you from having to worry about exactly which numbered set of parentheses they are.

    -enlil
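To make the perlvar idiom quoted above concrete, here is a minimal sketch, assuming Perl 5.8.0 or later. The variable name $word and the sample string are illustrative only. It uses a package variable (our) rather than a lexical, because lexicals inside (?{ }) blocks were unreliable in older perls.

```perl
use strict;
use warnings;

# Hypothetical variable that receives the text of the group that just closed.
our $word;

my $text = "abc123def";

# (?{ $word = $^N }) runs right after (\D+) closes, so $^N holds that
# group's text no matter which number the group ends up with.
if ($text =~ /(?:(\D+)(?{ $word = $^N }))(\d+)/) {
    print "captured via \$^N: $word\n";
    print "digits in \$2: $2\n";
}
```

As noted in the reply, this gives you the text of the most recently closed group, not its number.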

      Not only is $^N specific to >=5.8.0, it doesn't provide any information regarding which numbered variable it refers to. Thank you though.

Re: Finding the right $<*digit*> capture variable
by Enlil (Parson) on Apr 16, 2003 at 09:09 UTC
    Let me try again. How about $#- .

    Again from perlvar:

    One can use $#- to find the last matched subgroup in the last successful match. Contrast with $#+, the number of subgroups in the regular expression.

    Some code that might or might not result in what you want (as I am still not sure where you are going with this).

    -enlil
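The following is not the code from the reply above, only a minimal sketch of how $#- might be read inside a (?{ }) block, assuming a perl new enough to have @- and @+ (5.6.0 or later). The names @closed_so_far, $last_matched and $total_groups are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical package variable: records $#- each time the block runs.
our @closed_so_far;

my $text = "abc123def";

# The block sits after the second group, so at that point two groups
# have closed and the third has not started yet.
$text =~ /(\D+)(\d+)(?{ push @closed_so_far, $#- })(\D+)/
    or die "no match";

# Copy the values right away, before any other match can change them.
my $last_matched = $#-;   # last matched subgroup of the finished match
my $total_groups = $#+;   # number of groups in the pattern

print "inside the block: ", $closed_so_far[0], "\n";
print "after the match: ", $last_matched, " of ", $total_groups, "\n";
```

If the pattern backtracks, the block can run more than once, so the recorded list may hold several entries; only the simple non-backtracking case is shown here.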

Re: Finding the right $<*digit*> capture variable
by tekkie (Beadle) on Apr 16, 2003 at 14:07 UTC
    I'm not sure where you're going with this either, but $#- probably does what you're looking for.

    You could also write a subroutine to check which $<DIGIT> values are defined, and return the last one found:
    sub last_match {
        # Accepts no args
        # Returns the digit of the last parenthesis match or undef if there isn't one
        no strict 'refs';    # $$match_num is a symbolic reference to $1, $2, ...
        my $match_num = 1;
        if (defined($$match_num)) {
            # Walk upwards until the first undefined capture variable
            while (defined($$match_num)) {
                $match_num++;
            }
            $match_num--;
        }
        else {
            undef $match_num;
        }
        return $match_num;
    }

      Beware that this approach cannot discover the last defined match variable, since there may be gaps in the list: for example after "b" =~ /(a)?(b)/, $1 will not be defined even though $2 is.
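The gap described here can be seen directly by inspecting @- after the match. This is a small sketch, assuming perl 5.6.0 or later; the variable names are illustrative.

```perl
use strict;
use warnings;

"b" =~ /(a)?(b)/ or die "no match";

# Copy everything right away, before another match can overwrite it.
my $first   = $1;     # undef: the optional group did not take part
my $second  = $2;     # "b"
my $highest = $#-;    # highest-numbered group that did take part

# Groups with a defined start offset are the ones that participated.
my @participating = grep { defined($-[$_]) } 1 .. $#+;

print "first group defined? ", (defined $first ? "yes" : "no"), "\n";
print "highest participating group: $highest\n";
print "participating groups: @participating\n";
```

A scan that stops at the first undefined variable would give up at $1 here, while $#- still reports 2.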

      Since there is no useful absolute limit on the highest numbered match variable you might need to check for, I wondered whether walking the symbol table might give a clue, but no such luck - the symbol table entries for *1 etc are created only if they are explicitly referenced in the code:

      "abcde" =~ /(.)(.)(.)(.)(.)/;
      my $var = $4;
      print join ', ', grep !/\D/, keys %::;
      prints "0, 4".

      Accordingly, I think it is not possible in pure perl to discover the highest numbered defined match variable using any perl before v5.6.0 (when @+ and friends were first introduced).

      Hugo

        I was thinking of taking the sideways route - instrument the ')' part of a '(...)' group so that the contents of @+ is copied elsewhere for safekeeping. Any time that I'm interested in knowing which newly closed group was just passed I'd look for the index of the newly defined value in @+.

        So after (a)? $+[1] and $+[2] are undefined. After (b) $+[1] is still undefined but $+[2] is defined (it holds the end offset of that group, which is 1 for the string "b"). So... if I can know which @+ entry was just created is that sufficient or will that break as well?
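One way the snapshot idea might look is sketched below, assuming @+ can be read reliably from inside (?{ }) blocks on the perl in use. The helper note_new_groups and the variables @prev_ends and @closed are hypothetical names, and the sketch ignores backtracking, which could make a previously defined entry become undefined again.

```perl
use strict;
use warnings;

our @prev_ends;   # hypothetical: copy of @+ from the previous checkpoint
our @closed;      # hypothetical: group numbers in the order they closed

# Compare the current @+ with the previous snapshot and record every
# group whose end offset has become defined since then.
sub note_new_groups {
    for my $n (1 .. $#+) {
        my $end = $+[$n];
        if (defined $end && !defined $prev_ends[$n]) {
            push @closed, $n;
        }
        $prev_ends[$n] = $end;
    }
}

# A checkpoint is placed after each group's closing parenthesis.
"b" =~ /(a)?(?{ note_new_groups() })(b)(?{ note_new_groups() })/
    or die "no match";

print "groups seen closing: @closed\n";
```

For this input only group 2 is recorded, since (a)? never takes part in the match. Whether the approach holds up under heavy backtracking is exactly the open question in the post above.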
