PerlMonks  

%{^CAPTURE}, %{^CAPTURE_ALL} and %- don't produce expected output

by vr (Deacon)
on Jun 11, 2019 at 17:39 UTC ( #11101258=perlquestion )

vr has asked for the wisdom of the Perl Monks concerning the following question:

use strict;
use warnings;
use Data::Dump 'dd';

my $s = "a aa aaa aaaa";
$s =~ /(?<a>a+) (?<a>a+) (?:(?<a>a+)bbb)?/;

dd \@{^CAPTURE};        # all captured groups
dd $+{a};               # leftmost defined "a"
dd ${^CAPTURE}{a};      # ditto
dd $-{a};               # all defined "a"'s groups
dd ${^CAPTURE_ALL}{a};  # ditto

__END__
["a", "aa"]             # correct
"a"                     # correct
undef
["a", "aa", undef]
"a"

The documentation says little about the new %{^CAPTURE} and %{^CAPTURE_ALL} variables. They are listed as if they were English synonyms for the old %+ and %-, but they obviously are not; their behavior looks plain wrong to me.

The "aaa" was deleted from the @{^CAPTURE} array (or was never added to begin with) when the rightmost cluster failed to match, and it was deleted as an array element from @{$-{a}}. But $#{$-{a}} was not changed from the wrong 2 to the expected 1, hence the unexpected undefined element in @{$-{a}}.

Update: Actually, w/r/t %-, re-reading the docs, there's no phrase "all defined "a"'s groups" as I stated above.

To each capture group name found in the regular expression, it associates a reference to an array containing the list of values captured by all buffers with that name (should there be several of them), in the order where they appear.

Yes, they say "all buffers with that name", but can undef be said to be "captured"? Can a failed sub-expression "capture"? It's ambiguous.
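To make the reading of the docs quoted above concrete, here is a minimal sketch that uses only the long-standing %+ and %- (not the newer ^CAPTURE variables). It assumes the behavior shown in the output above, namely that %- keeps one slot per group with that name and leaves undef where a group did not take part in the match. The variable names @slots, @participating and $leftmost are only for illustration.

```perl
use strict;
use warnings;

my $s = "a aa aaa aaaa";
$s =~ /(?<a>a+) (?<a>a+) (?:(?<a>a+)bbb)?/ or die "no match";

# %- gives one slot per group named "a", in pattern order.
# A group that did not take part in the match leaves undef in its slot.
my @slots = @{ $-{a} };

# Keep only the groups that actually captured something.
my @participating = grep { defined } @slots;

# %+ gives the leftmost defined group with that name.
my $leftmost = $+{a};

printf "slots: %d, participating: %d, leftmost: %s\n",
    scalar @slots, scalar @participating, $leftmost;
```

Filtering with grep { defined } is one way to get "all defined a's" if that is what you actually want from %-.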

Update 2: Mixing these "CAPTURE" things is broken:

#dd \@{^CAPTURE};
dd \%{^CAPTURE};

is OK, but un-commenting the 1st line results in an empty unblessed hash in the 2nd.
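Until the ^CAPTURE variables can be mixed safely, one possible workaround (a sketch, not something taken from the docs for those variables) is to avoid them altogether: take the numbered groups from the list-context return value of the match and the named groups from a copy of %+. The names @numbered and %named are made up for this example.

```perl
use strict;
use warnings;

my $s = "a aa aaa aaaa";

# In list context a successful match returns ($1, $2, $3, ...);
# groups that did not take part in the match come back as undef.
my @numbered = $s =~ /(?<a>a+) (?<a>a+) (?:(?<a>a+)bbb)?/
    or die "no match";

# Copy %+ right away so later matches cannot change it.
my %named = %+;

print "numbered groups: ", scalar(@numbered), "\n";
print "named a: $named{a}\n";
```

Note that, unlike the @{^CAPTURE} output above, this list keeps the trailing undef for the group that did not match.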

Re: %{^CAPTURE}, %{^CAPTURE_ALL} and %- don't produce expected output (updated x3!)
by haukex (Chancellor) on Jun 11, 2019 at 18:15 UTC

    %{^CAPTURE_ALL} is documented to be an alias to %-, but it's clearly not. I think this qualifies as a bug. <update2> Yep: #131867 </update2> <update3> Plus #134193 </update3>

    use warnings;
    use strict;
    use Data::Dump;

    my $s = "a aa aaa aaaa";
    $s =~ /(?<a>a+) (?<a>a+) (?:(?<a>a+)bbb)?/;

    dd \%+;
    dd \%{^CAPTURE};
    dd \%-;
    dd \%{^CAPTURE_ALL};
    tie my %hash, "Tie::Hash::NamedCapture", all => 1;
    dd \%hash

    __END__
    {
      # tied Tie::Hash::NamedCapture
      a => "a",
    }
    {
      # tied Tie::Hash::NamedCapture
      a => "a",
    }
    {
      # tied Tie::Hash::NamedCapture
      a => ["a", "aa", undef],
    }
    {
      # tied Tie::Hash::NamedCapture
      a => "a",
    }
    {
      # tied Tie::Hash::NamedCapture
      a => ["a", "aa", undef],
    }

    Note, from perlre: "When different groups within the same pattern have the same name, any reference to that name assumes the leftmost defined group. ... Named captures are implemented as being aliases to numbered groups holding the captures ..." and I personally wouldn't use two named capture groups with the same name.
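As a sketch of the alternative hinted at above, giving each group its own name sidesteps the "leftmost defined" rule entirely. The group names first, second and third are invented for this example and do not come from the original post.

```perl
use strict;
use warnings;

my $s = "a aa aaa aaaa";
$s =~ /(?<first>a+) (?<second>a+) (?:(?<third>a+)bbb)?/ or die "no match";

# Copy %+; it only lists names whose group actually captured something,
# so "third" is absent here because the optional cluster did not match.
my %got   = %+;
my @names = sort keys %got;

print "$_ => $got{$_}\n" for @names;
```

With distinct names there is no need for %- at all, since every name maps to at most one group.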

    Update: The source is confusing me even more...

    case '\003':        /* $^CHILD_ERROR_NATIVE */
        if (memEQs(name, len, "\003HILD_ERROR_NATIVE"))
            goto magicalize;
                        /* @{^CAPTURE} %{^CAPTURE} */
        if (memEQs(name, len, "\003APTURE")) {
            AV* const av = GvAVn(gv);
            const Size_t n = *name;

            sv_magic(MUTABLE_SV(av), (SV*)n, PERL_MAGIC_regdata, NULL, 0);
            SvREADONLY_on(av);

            if (sv_type == SVt_PVHV || sv_type == SVt_PVGV)
                require_tie_mod_s(gv, '-', "Tie::Hash::NamedCapture",0);

        } else          /* %{^CAPTURE_ALL} */
        if (memEQs(name, len, "\003APTURE_ALL")) {
            if (sv_type == SVt_PVHV || sv_type == SVt_PVGV)
                require_tie_mod_s(gv, '+', "Tie::Hash::NamedCapture",0);
        }
        break;

    And it appears %{^CAPTURE_ALL} is not tested in the test suite, possibly explaining the issue...

Re: %{^CAPTURE}, %{^CAPTURE_ALL} and %- don't produce expected output
by haukex (Chancellor) on Jun 19, 2019 at 20:48 UTC

    Update: Both issues (#131867 and #134193) should now be fixed in blead.
