Re: question about the star(*) quantifier

by Eimi Metamorphoumai (Deacon)
on Dec 21, 2006 at 17:32 UTC (#591136)


in reply to question about the star(*) quantifier

The problem is that /(\d*)/ allows zero or more digits, so it will match every string. Perl first attempts to match at the beginning of the string, taking as many digits as possible. If the string does not start with a digit, it sees that zero digits match there and succeeds immediately with an empty match. The simple solution would be to use /(\d+)/, which requires at least one digit and so forces the engine to keep advancing through the string until it finds one. Depending on your data and exact situation, there may be other approaches that would work better for you.

Re^2: question about the star(*) quantifier
by ikegami (Pope) on Dec 21, 2006 at 17:57 UTC
    Here's an illustration of what Eimi Metamorphoumai described.
        sub test {
            my ($sentence, $re) = @_;
            print("Sentence: $sentence\n");
            print("Regexp: $re\n");
            if ($sentence =~ /($re)/) {
                printf("Matched %d characters (%s) at pos %d\n", length($1), $1, $-[1]);
            } else {
                print("No match\n");
            }
            print("\n");
        }

        test("1234",    qr/\d*/);  # Matched 4 characters (1234) at pos 0
        test("abc1234", qr/\d*/);  # Matched 0 characters () at pos 0
        test("abc",     qr/\d*/);  # Matched 0 characters () at pos 0
        test("1234",    qr/\d+/);  # Matched 4 characters (1234) at pos 0
        test("abc1234", qr/\d+/);  # Matched 4 characters (1234) at pos 3 (*1)
        test("abc",     qr/\d+/);  # No match (*2)

    *1 - Since it failed at positions 0, 1 and 2.
    *2 - Since it failed at all positions.

    Update: Combined the code and the output to condense the node and improve readability.

Re^2: question about the star(*) quantifier
by wojtyk (Friar) on Dec 21, 2006 at 17:39 UTC
    You would see similar behavior if you try this:
        my $sentence = "I fear that i will be extinct after 1000 or 2000 years";
        if ($sentence =~ /(\d+)/) {
            print "That said '$1' years.\n";
        }
    Since Perl is stopping at the first successful match, you'll only get the first number. Since the empty string is a successful match for \d*, that's what you're getting :)
      Untrue! Please try it and I think you'll see that 1000 is indeed matched. Perl's + quantifier is greedy, meaning it will match as much as it can. Perl also takes the first match it can find, which is why the \d* version stopped at the start of the string with an empty match. (Or perhaps you meant "1000" by the "first number". I took it to mean "1", which some regex engines might yield.)

      -sam
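
A minimal sketch of the behaviour discussed in this thread, reusing the sample sentence from the reply above. The helper name first_capture is hypothetical and was introduced only for this illustration; it is not from the original posts. It compares the star quantifier, the greedy plus, and the non-greedy plus (\d+?), the last of which shows how a "1" could come out instead of "1000".

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture group of $re in $text,
# or undef when the pattern does not match at all.
sub first_capture {
    my ($text, $re) = @_;
    my ($capture) = $text =~ $re;   # list context yields the captures
    return $capture;
}

my $sentence = "I fear that i will be extinct after 1000 or 2000 years";

my @cases = (
    [ '\d*'  => qr/(\d*)/  ],   # zero digits match at position 0
    [ '\d+'  => qr/(\d+)/  ],   # greedy: takes every digit of the first number
    [ '\d+?' => qr/(\d+?)/ ],   # non-greedy: stops after a single digit
);

for my $case (@cases) {
    my ($label, $re) = @$case;
    my $got = first_capture($sentence, $re);
    printf("%-5s => %s\n", $label, defined $got ? "'$got'" : 'no match');
}
# \d*   => ''
# \d+   => '1000'
# \d+?  => '1'
```

In every case the engine reports the leftmost position at which the pattern can succeed; greediness only decides how much is consumed once that position is fixed.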
