PerlMonks  

What's like $+ but not gives the ordinal?

by John M. Dlugosz (Monsignor)
on Jun 28, 2001 at 00:49 UTC ( [id://92044]=perlquestion )

John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

The special variable $+ will hold the same contents as the last matched capture. But which one is that? That is, given a set of possibilities
(foo)|(bar)|(baz)
I'll get something in either $1, $2, or $3. Depending on which case matched, I want to do different logic. I can do something like this:
if (defined $1) { sub1 } elsif (defined $2) { sub2 } ...
But I'm thinking that something along the lines of
$sub[$n]->();
would be much better, if only I knew the value of n. Computing it by looking at the definedness of each capture is just as bad as doing the work directly as above, so don't bother.

It seems to me that there should be a value for this somewhere. Anybody know for sure, or know of an elegant way to find which $n is chosen by $+?

—John

Replies are listed 'Best First'.
(Ovid) Re: What's like $+ but not gives the ordinal?
by Ovid (Cardinal) on Jun 28, 2001 at 01:21 UTC

    If you want the number of the backreference and you know in advance the number of possible backreferences, here's a quick hack:

    #!/usr/bin/perl -w
    use strict;

    my $string = 'abcbar';
    my $sub    = get_backref( $string );
    print $sub;

    sub get_backref {
        my $string  = shift;
        my $regex   = '(foo)|(bar)|(baz)';
        my $backref = 0;

        # This is the number of possible backreferences
        # It's easier to set than generate dynamically
        my $limit = 3;

        local ( $1, $2, $3 );
        $string =~ /$regex/;
        for ( 1 .. $limit ) {
            no strict 'refs';
            $backref = $_, last if defined $$_;
        }
        return $backref;
    }

    Update: Here's a much more robust example (though with no validation of arguments to the sub). Pass the string, the number of backreferences, and the regex, and it will return the number of the backreference that matched:

    #!/usr/bin/perl -w
    use strict;

    my $string = 'abcbar';
    my $sub    = get_backref( $string, 3, '(foo)|(bar)|(baz)' );
    print $sub;

    sub get_backref {
        my ( $string, $num_refs, $regex ) = @_;
        my $backref = 0;

        # This is the number of possible backreferences
        # easier to set than count
        my $local_brefs = '';
        for ( 1 .. $num_refs ) {
            $local_brefs .= "\$$_,";
        }
        chop $local_brefs;

        my $code = <<"    END_OF_CODE";
            local ( $local_brefs );
            \$string =~ /$regex/;
            for ( 1 .. $num_refs ) {
                no strict 'refs';
                \$backref = \$_, last if defined \$\$_;
            }
        END_OF_CODE
        eval $code;
        return $backref;
    }

    Of course, it returns the number of the backreference that matched first (i.e., if $2 matched first, it returns 2). It returns zero if no match is found. You'll have to test for this if you're using it as an array index unless $array[0] is a default.

    Cheers,
    Ovid

    Vote for paco!

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

      I don't understand why you need to localize the backref variables. Also, you're finding the first capture, not the last.

      Here is a way that finds the max by itself, as implied by another message on this thread:

      sub last_paren_match_ordinal() {
          my $n = scalar @+;   # gives number of captures present.
          while ($n) {
              no strict 'refs';
              last if defined $$n;
              --$n;
          }
          return $n;
      }
      In your new code, you are localizing the same name as you are my-ing. What's that for? It's not even used inside the block where it's localized.

      —John

        John, this stuff gets a bit tricky because what I've done is write a code generator.

        I don't understand why you need to localize the backref variables.

        Because if the match fails on the current attempt, but it succeeded on a previous attempt, the backref variables ($1, $2, etc) will contain the values from the previous match. The following code demonstrates this:

        my $string = '1234';
        $string =~ /(2)/;
        $string =~ /(a)/;
        print $1;
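One hedged alternative to localizing the capture variables is to read `$1` only when the match itself returned true. A minimal sketch of that idea; the helper name `capture_or_undef` is made up for illustration and is not part of the code in this thread:

```perl
use strict;
use warnings;

# Return the first capture only if THIS match succeeded; otherwise undef.
# (capture_or_undef is a hypothetical helper name.)
sub capture_or_undef {
    my ($string, $regex) = @_;
    my $hit = $string =~ /$regex/ ? $1 : undef;
    return $hit;
}

for my $re ('(2)', '(a)') {
    my $got = capture_or_undef('1234', $re);
    print "$re: ", (defined $got ? "captured '$got'" : "no match"), "\n";
}
```

Testing the result of the match means a stale `$1` left over from an earlier successful match is never consulted.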
        Also, you're finding the first capture, not the last.

        I misread your post then. Make the following change:

        - for ( 1 .. $num_refs ) {
        + for ( $num_refs .. 1 ) {
        Here is a way that finds the max by itself...

        I didn't know about the @+ variable :)

        In your new code, you are localizing the same name as you are my-ing.

        Actually, I'm not. Before the eval statement, add the statement print $code;. That will show you what's going on. The HERE document is a scalar containing generated code to be evaled. That's all.

        As I mentioned, my code is probably not worth the effort as I did not know about @+. :)

        Update 2: Duh! Of course $num_refs .. 1 isn't going to work. Sigh.
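The range operator only counts upward, so `$num_refs .. 1` yields an empty list. A sketch of one way to walk the groups from highest to lowest is to wrap the ascending range in `reverse`. The function name `last_defined_group` is invented here, and the sketch reads the captures from a list-context match rather than through symbolic references:

```perl
use strict;
use warnings;

# Return the highest-numbered group that captured something, or 0 if the
# match failed. (last_defined_group is a hypothetical name.)
sub last_defined_group {
    my ($string, $num_refs, $regex) = @_;

    # In list context a successful match returns ($1, $2, ...).
    my @captures = $string =~ /$regex/ or return 0;

    for my $n (reverse 1 .. $num_refs) {
        return $n if defined $captures[$n - 1];
    }
    return 0;
}

print last_defined_group('abcbar', 3, '(foo)|(bar)|(baz)'), "\n";
```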

        Cheers,
        Ovid


Re: What's like $+ but not gives the ordinal?
by japhy (Canon) on Jun 28, 2001 at 01:01 UTC
    You could do a kludge in your regex:
    if ($string =~ /(foo(?{ $N = 1 }))|(bar(?{ $N = 2 }))/) {
        if ($N == 1) { ... }
        else { ... }
    }
    Or you could do:
    if (my @capt = $string =~ /(foo)|(bar)/) {
        $N = @capt;
        for (reverse @capt) {
            last if defined;
            $N--;
        }
        if ($N == 1) { ... }
        else { ... }
    }
    Since I seem to be (becoming) the regcomp.c pumpking, I'll see if I can kludge something together.

    japhy -- Perl and Regex Hacker
      I thought about using (?{$N=1}) etc. in each branch, but shy away from the "highly experimental" feature that may be changed without notice.

      The assignment to an array inside the if is interesting, but won't work for me because I'm doing a substitution, not just a match. The docs give a /\G.../gc idiom for lexing, but there is no equivalent for altering the parts, not just detecting them.

      As for kludging something together, since we're getting away from funny globals altogether, how about making a bold move and providing a variable named $Regexp::last_paren_match_count, or better yet @Regexp::matches that aliases the same values as $1, $2, etc.? Then they can be manipulated as an array in all the normal ways: index with -1, get array length, etc.
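No such built-in array is assumed to exist here, but something close can be sketched in user code from the offset arrays: perlvar documents that `$1` is the same as `substr($var, $-[1], $+[1] - $-[1])`. In the sketch below, `captures_of` is a hypothetical helper and returns copies, not aliases:

```perl
use strict;
use warnings;

# Build an ordinary array holding copies of $1, $2, ... from @- and @+.
# (captures_of is a hypothetical helper name.)
sub captures_of {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;

    my @captures;
    for my $n (1 .. $#+) {
        push @captures,
            defined $-[$n]
                ? substr($string, $-[$n], $+[$n] - $-[$n])
                : undef;
    }
    return @captures;
}

my @m = captures_of('abcbar', qr/(foo)|(bar)|(baz)/);
print scalar(@m), " groups; group 2 = $m[1]\n";
```

The result can then be indexed, counted, or reversed like any other array.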

      —John

Re: What's like $+ but not gives the ordinal?
by jima (Vicar) on Jun 28, 2001 at 01:27 UTC
    You might also have luck with @-, brand-new to 5.6.0 (it says here in perlretut).
    $-[0] is the offset of the start of the last successful match. $-[n] is the offset of the start of the substring matched by n-th subpattern, or undef if the subpattern did not match.

    for (qw(foo bar baz)) {
        if (/(foo)|(bar)|(baz)/) {
            print "\$+ = $+\n";
            print "\@- = " . join('/',@-) . "\n";
            print "scalar \@- = " . scalar(@-) . "\n";
        }
    }
    __END__
    $+ = foo
    @- = 0/0
    scalar @- = 2
    $+ = bar
    @- = 0//0
    scalar @- = 3
    $+ = baz
    @- = 0///0
    scalar @- = 4
    which would seem to indicate that you could do $sub[scalar(@-)-2]->();, although I think I'd prefer something easier to understand, if I were going to maintain this.
      That's a useless use of scalar. You could write $sub[@--2]->() too. ;-)

      -- Abigail

        Some of us always use scalar rather than implying it. It saves accidents and sometimes makes it clearer. Really, though, if I don't type it frequently I'll forget how to spell it.
      Now that's something I never thought to try—the behavior of @+ and @- is totally different! (Hey Japhy, want to check out why?)

      I can put that into a function call, along with the other code in a comment, for maintainability. If a documented way arrives, easy to change. If it stops working, delete two lines and the other way takes over.

      —John

        This appears to be a bug. perlvar says that $#+ should return the number of successful groupings, but it doesn't -- it returns the number of attempted groupings. $#-, on the other hand, works as expected.
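Taking `$#-` as the index of the last group that took part in the match, which is how it behaves in the output shown above, the dispatch from the original question can be sketched inside a substitution with `/e`. The handler subs and the `rewrite` name are placeholders invented for this sketch:

```perl
use strict;
use warnings;

# Hypothetical handlers, one per alternative, indexed by group number.
my @handler = (
    undef,                    # slot 0 unused: group numbers start at 1
    sub { "<foo:$_[0]>" },
    sub { "<bar:$_[0]>" },
    sub { "<baz:$_[0]>" },
);

sub rewrite {
    my ($text) = @_;
    # $#- picks the handler; $+ is the text captured by that group.
    $text =~ s{(foo)|(bar)|(baz)}{ $handler[$#-]->($+) }ge;
    return $text;
}

print rewrite("a foo, a baz, a bar"), "\n";
```

This only works when each alternative is exactly one capture group; nested groups inside an alternative would shift the numbering.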

        japhy -- Perl and Regex Hacker
Re: What's like $+ but not gives the ordinal?
by grinder (Bishop) on Jun 28, 2001 at 01:09 UTC

    Look at @+ instead (documentation in perlvar). It won't quite do exactly what you want, but you'll at least be able to determine (via defined) whether a given paren matched or not.

    Another alternative would be to say

    /((?:foo)|(?:bar)|(?:baz))/

    because then you don't care what matched, you can just refer to $1. If you need to do different things according to what matched, it might just be cleaner to break it up into different regexes, but without more context, it's difficult to say.
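If the alternatives are plain literal strings, the single-group form can still drive different behaviour per alternative by keying a hash on the matched text. A rough sketch, with an `%action` table and `replace_words` name that are purely illustrative:

```perl
use strict;
use warnings;

# Hypothetical actions keyed by the literal text that matched.
my %action = (
    foo => sub { 'FOO' },
    bar => sub { 'BAR' },
    baz => sub { 'BAZ' },
);

sub replace_words {
    my ($text) = @_;
    # One capture group; $1 is whichever alternative matched.
    $text =~ s/((?:foo)|(?:bar)|(?:baz))/$action{$1}->()/ge;
    return $text;
}

print replace_words("foo then baz"), "\n";
```

This stops working once an alternative is a real pattern rather than a fixed string, since the matched text no longer identifies the branch.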


    --
    g r i n d e r

     

    Edit: chipmunk 2001-06-27

      scalar @+ always returns the number of paren groups in the RE, regardless of which one matched. Checking defined on the elements is really no different from doing it with the vars directly, but I see the advantage is being able to use a loop with subscripts.
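The difference between the two arrays can be checked directly. Element 0 of both `@+` and `@-` describes the whole match, so `$#+` counts the groups in the pattern, while `$#-` stops at the last group that matched. A small sketch, where `group_counts` is a made-up helper name:

```perl
use strict;
use warnings;

# Return ($#+, $#-) for a match against a fixed three-way alternation,
# or an empty list if nothing matched. (group_counts is hypothetical.)
sub group_counts {
    my ($string) = @_;
    return unless $string =~ /(foo)|(bar)|(baz)/;
    my ($total, $last) = ($#+, $#-);   # copy before leaving the sub
    return ($total, $last);
}

for my $word (qw(foo bar baz)) {
    my ($total, $last) = group_counts($word);
    print "$word: groups in pattern = $total, last group matched = $last\n";
}
```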

      I can't break it up into different regexes because replacing doesn't have the \G and one-at-a-time feature that just searching does.

      —John

Re: What's like $+ but not gives the ordinal?(boo)
by boo_radley (Parson) on Jun 28, 2001 at 01:54 UTC
    Close?
    I tested it with a variety of cases, and I think it does what you're looking for.
    my @subs = (\&hi, \&bye, \&bueno, \&vivid);
    my $foo = "here is a test for you";
    if ($foo =~ /(hyah)|(is)|(a)|(yodu)/) { #pass
        my $m = &lastmatch;
        print "--$m--";
        --$m;
        &{$subs[ $m]};
    }

    sub lastmatch {
        my $found;
        for (1..10) { #whatever is appropriate.
            if (eval ("\$\$_")) { $found = $_; next; }
            return $found if $found; #short circuit for loop
        }
    }

    sub hi    {print "hi"}
    sub bye   {print "bye"}
    sub bueno {print "bueno"}
    sub vivid {print "vivid"}
