Re: Perl Idioms Explained - @ary = $str =~ m/(stuff)/g

by Roger (Parson)
on Sep 16, 2003 at 00:17 UTC (#291688)

in reply to Perl Idioms Explained - @ary = $str =~ m/(stuff)/g

Hi, I have a question on the capturing parentheses. I have made the following test sample to explore the necessity of the parentheses:

    $str = "Ab stuff Cd stuff Ef stuff";

    # case 1
    @ary1 = $str =~ m/(stuff)/g;

    # case 2
    @ary2 = $str =~ m/(?:stuff)/g;

    # case 3
    @ary3 = $str =~ m/stuff/g;

    print "\@ary1 = @ary1\n";
    print "\@ary2 = @ary2\n";
    print "\@ary3 = @ary3\n";

All three cases return the same result. The explanation for case 1 is covered in earlier posts. However, I was puzzled by the use of parentheses in the example, so I added ?: to tell the regular expression to forget the value in the capture parentheses, if any. The result is the same! So the result does not depend on the $1 variable captured by the parentheses at all. I then eliminated the parentheses entirely and still got the same result.

Ok, my instinct tells me that this Perl idiom relies on the behaviour of m//g, or more specifically the g modifier. It seems the g modifier introduces its own pattern matching memory behaviour and discards the regular expression memory in some cases.

I looked up the perldoc, which states:
The /g modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.

Ok, my question is, what is the expected behaviour of the g modifier? Why is the /g modifier capturing the value that I want it to forget (with ?:)? Is it a feature or bug? Or perhaps /(?:pattern)/ is equivalent to /pattern/?

Replies are listed 'Best First'.
Re: Re: Perl Idioms Explained - @ary = $str =~ m/(stuff)/g
by antirice (Priest) on Sep 16, 2003 at 02:08 UTC

    I've always attributed this behavior to perl's DWIM approach to usability. The reason all three return the same thing is pretty simple: in the case where your regex doesn't capture anything, the actual instance that matches is returned instead. Also, if you have more than one capturing portion, it will push the extras onto the array as well. Try this:

        $str = "Ab stuff Cd stuff Ef stuff";

        # case 1
        @ary1 = $str =~ m/(stuff)/g;
        @ary2 = $str =~ m/(st)u(ff)/g;

        print "\@ary1 = @ary1\n";
        print "\@ary2 = @ary2\n";

        __DATA__
        outputs:
        @ary1 = stuff stuff stuff
        @ary2 = st ff st ff st ff

    Nifty, eh? I rather like this behavior. Also please note that the g only means return all instances where the pattern matches. If you remove the g from the regexes above, then only the first match is returned.
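
The remark about dropping the g can be checked with a small sketch. The variable names here are illustrative, not from the post. One caveat worth knowing: without /g and without any capturing parentheses, a successful match in list context returns the list (1) rather than the matched text, so the "whole match" fallback is specific to /g.

```perl
use strict;
use warnings;

my $str = "Ab stuff Cd stuff Ef stuff";

# No /g, one capture group: only the first match's capture is returned.
my @first_one = $str =~ m/(stuff)/;

# No /g, two capture groups: both captures of the first match are returned.
my @first_two = $str =~ m/(st)u(ff)/;

# No /g and no capture groups: list context yields (1) on success,
# not the matched substring.
my @no_groups = $str =~ m/stuff/;

print "\@first_one = @first_one\n";   # stuff
print "\@first_two = @first_two\n";   # st ff
print "\@no_groups = @no_groups\n";   # 1
```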

    Hope this helps.

    I just noticed this is my 200th post. Yay.

    The first rule of Perl club is - use Perl
    ith rule of Perl club is - follow rule i - 1 for i > 1

Re: Re: Perl Idioms Explained - @ary = $str =~ m/(stuff)/g
by tachyon (Chancellor) on Sep 16, 2003 at 04:44 UTC

    An interesting observation. The behaviour for (?:...) and naked match strings is, however, appropriate provided you have at least one capture (...) in the RE.

        @ary = 'stuff stuff stuff' =~ m/(?:st)u(ff)/g;
        print "@ary";

        __DATA__
        ff ff ff

    Depending on your viewpoint your observation represents a bug or a feature!
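
As a rough illustration of why (?:...) is still useful with m//g in list context: it lets you group (for alternation, say) without changing what the match returns. The sample string and variable names below are made up for the sketch.

```perl
use strict;
use warnings;

my $str = "foo12 bar7 baz3 foo9";

# (?:...) groups the alternation but captures nothing, so the pattern has
# no capture groups and /g returns each whole match.
my @whole = $str =~ m/(?:foo|bar)\d+/g;

# Plain (...) captures, so /g returns only what the group matched.
my @names = $str =~ m/(foo|bar)\d+/g;

# Mixing the two: the non-capturing group is ignored and only the
# capturing group's text comes back.
my @digits = $str =~ m/(?:foo|bar)(\d+)/g;

print "\@whole  = @whole\n";    # foo12 bar7 foo9
print "\@names  = @names\n";    # foo bar foo
print "\@digits = @digits\n";   # 12 7 9
```

So /(?:pattern)/ and /pattern/ return the same list here because neither contains a capture group; the difference only shows once a real (...) appears somewhere in the pattern.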



