PerlMonks  

Re: perl 5.10 bug or not?

by ww (Bishop)
on Aug 16, 2013 at 20:32 UTC (#1049788)


in reply to perl 5.10 bug or not?

... and also note that your first capture element, the dreaded dot-star (which matches anything, any number of times including zero), is greedy, so the second match would never occur, even if the code compiled, because the first match would have swallowed everything in your (then-current) line of data.

Regexen can, of course, be "greedy" without dot-star. Greedy means that a quantified regex element (in this case, one using the star/asterisk quantifier) will match as much data as possible until a newline (or some other construct not present in your regex) shuts it down.
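
A minimal sketch of the newline point, assuming a made-up two-line string (the variable names are illustrative and not from the OP's code): without the /s modifier, '.' does not match a newline, so a greedy .* stops at the end of the first line.

```perl
use strict;
use warnings;

# Hypothetical sample data: two lines in one string.
my $data = "first line\nsecond line";

# Without /s, '.' does not match "\n", so greedy .* stops at the line end.
my ($one_line) = $data =~ /(.*)/;

# With /s, '.' also matches "\n", so the same .* runs to the end of the string.
my ($everything) = $data =~ /(.*)/s;

print "without /s: ", length($one_line),   " characters captured\n";
print "with /s:    ", length($everything), " characters captured\n";
```

Here the first capture should hold only 'first line', while the second holds the whole string.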

Super Search this site (or use big G) for "greed" and/or "greedy" for examples.

If I've misconstrued your question or the logic needed to answer it, I offer my apologies to all those electrons which were inconvenienced by the creation of this post.


Re^2: perl 5.10 bug or not?
by tobyink (Abbot) on Aug 16, 2013 at 22:09 UTC

    "so the second match would never occur, even if the code compiled, because the first match would have swallowed everything in your (then-current) line of data."

    Not so!

    use v5.10;
    my $email = 'foo <bar>';
    $email =~ /(.*) (<.*>)/;
    say "1='$1' 2='$2'";    # prints: 1='foo' 2='<bar>'

    Greedy matches don't automatically swallow everything. A quote from perlre, my emphasis:

    "By default, a quantified subpattern is "greedy", that is, it will match as many times as possible (given a particular starting location) while still allowing the rest of the pattern to match."

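To make the perlre quote concrete, here is a small sketch contrasting a greedy and a non-greedy quantifier on the same input. The sample string 'a b c' and the variable names are my own assumptions, chosen only so that the two forms give visibly different captures.

```perl
use strict;
use warnings;

# Hypothetical input with two spaces, so the split point is ambiguous.
my $text = 'a b c';

# Greedy: (.*) first takes the whole string, then backs off just far
# enough for the literal space and the second group to match.
my ($g1, $g2) = $text =~ /(.*) (.*)/;

# Non-greedy: (.*?) takes as little as it can, so it stops before the
# first space and leaves the rest to the second group.
my ($l1, $l2) = $text =~ /(.*?) (.*)/;

print "greedy:     1='$g1' 2='$g2'\n";    # greedy:     1='a b' 2='c'
print "non-greedy: 1='$l1' 2='$l2'\n";    # non-greedy: 1='a' 2='b c'
```

Either way the overall match succeeds; greediness only decides which of the possible splits is reported.
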
    package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name
      True... (and probably I should give myself a downvote for lack of precision)... BUT, there are no GT or LT symbols in OP's sample
      my $email = 'someuser@example.com'

      hence, dot star swallows all, in the relevant example.

        ... there are no GT or LT symbols in OP's sample
            my $email = 'someuser@example.com'
        hence, dot star swallows all, in the relevant example.

        If the relevant example is
            my $email = 'someuser@example.com';
            $email =~ /(.*) (<.*>)/;
            say "1='$1' 2='$2'";
        from the OP, then dot star – i.e., (.*) – swallows nothing at all. The '<' and '>' characters are required to match, therefore there is no match, no swallowing, no capturing, nuttin'. As pointed out elsewhere, not even $1, $2, etc., are altered. In fact, because the first (.*) capture group is followed by a required ' ' (space), the angle brackets don't even figure in: .* will initially grab everything, then backtrack repeatedly to try to match a space; finding none, it will eventually give up everything it grabbed and announce its miserable failure, triggering overall match failure.
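
A rough sketch of the "not even $1, $2 are altered" point, reusing the two sample strings from this thread (the variable names and the qr// wrapper are my own). It only shows that a failed match leaves the capture variables as the last successful match in the same scope set them; it does not claim to reproduce what the OP saw on 5.10.0.

```perl
use strict;
use warnings;

my $re = qr/(.*) (<.*>)/;

# A successful match sets $1 and $2.
my $first_ok    = 'foo <bar>' =~ $re;
my @after_first = ($1, $2);

# No space in this string, so the match fails as a whole.
my $second_ok = 'someuser@example.com' =~ $re;

# The failed match did not reset $1 and $2: they still hold the values
# from the earlier successful match in this scope.
my @after_second = ($1, $2);

# Safer habit: use the match's own return value. In list context a
# failed match returns an empty list, so nothing stale leaks through.
my @captures = 'someuser@example.com' =~ $re;

print "second match ", ($second_ok ? "succeeded" : "failed"),
      ", yet \$1='$after_second[0]' \$2='$after_second[1]'\n";
print "captures returned by the failed match: ", scalar(@captures), "\n";
```

Testing the result of the match (if ($email =~ /.../) { ... }) before reading $1 and $2 avoids mistaking leftovers for fresh captures.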

        The actual problem is why I get a capture at all when there was no previous successful match.

        With "foo <bar>" I definitely should get two captures, but I have no clue why I get the capture "bar.com" with the "foo@bar.com" sample. Observed on v5.10.0 DEVEL34916.
