PerlMonks  

Re^5: Regular expression

by kcott (Archbishop)
on Nov 01, 2017 at 03:08 UTC


in reply to Re^4: Regular expression
in thread Regular expression

"print "Weights are : $1 Kg\n" while $x=~/\G(\d+)\s*kg\s*/ig; #print -1"

The condition for the first while iteration is FALSE: the while loop does not iterate.

Here's a blow-by-blow description of what's occurring. Bear in mind that character positions use a zero-based index: the first character in the string is at position 0.

  • We start at position 0 in $x (that's character "1").
  • As no previous regex match with the 'g' modifier had occurred, the last match position is 0. The '\G' assertion is satisfied at the start of the string (i.e. position 0). That's a zero-width assertion, so we stay at position 0.
  • (\d+) matches "1". This is temporarily assigned to $1 (i.e. $1 eq '1').
  • We move to position 1 in $x (that's character " ", a space).
  • \s* matches " ".
  • We move to position 2 in $x (that's character "2").
  • The literal sequence "kg" does not match "2".
  • We move back to position 1 in $x and the regex engine backtracks to \s*.
  • \s* means zero or more spaces greedily. Last time one space was matched. We can also satisfy this by matching zero spaces: position stays at 1.
  • The literal sequence "kg" does not match " ".
  • The temporary value in $1 is removed. The regex engine backtracks to \G looking for another way to find a match from the current position 1.
  • The last match position is 0; the current position is 1: the \G assertion is not satisfied.
  • We now move to position 2 in $x: again, the \G assertion is not satisfied.
  • The regex engine moves along $x, one position at a time, attempting to find a match. Because none of these positions are 0, the \G assertion is never satisfied.
  • Eventually, after 139 steps, the regex engine runs out of string (i.e. the end of $x is reached) and the match is FALSE.
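The outcome of the steps above can be checked with a short script. This is only a sketch: it assumes $x holds the same string that appears elsewhere in this thread, and it collects captures into an array (@weights, a name chosen here for illustration) rather than printing inside the loop.

```perl
use strict;
use warnings;

# Assumption: $x is the same string used elsewhere in this thread.
my $x = q{1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15};

# Same regex as the quoted code: \G pins every attempt to the last
# match position, which is the start of the string on the first try.
my @weights;
push @weights, $1 while $x =~ /\G(\d+)\s*kg\s*/ig;

my $count = scalar @weights;
my $pos   = defined pos($x) ? pos($x) : 'undef';

print "Matches found: $count\n";            # Matches found: 0
print "pos(\$x) after the loop: $pos\n";    # pos($x) after the loop: undef
```

No capture is ever collected, which is consistent with the loop body never running.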

I got all that information by running your code through Regexp::Debugger. I highly recommend this module: not only will you find bugs in your regex, you'll also learn a lot about them (in that respect, it's just as useful for regexes that work as those that don't).

You can fix your current problem by adding '.*?' after the '\G':

$ perle 'my $x = q{1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15}; say $1 while $x =~ /\G.*?(\d+)\s*kg\s*/gi'
3
10
13

"I will try reading about it more."

The link I provided before does lead to more links. One in particular, which you should definitely read, is "perlop: Regexp Quote-Like Operators". You'll need to scroll down a fair way: look for the "\G assertion" section.
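As a rough companion to that section, here's a sketch of how \G tracks pos(). The string in $y is a hypothetical input made up for illustration (it is not from this thread), and the variable names are mine. It relies on documented behaviour: a failed scalar-context match with 'g' resets pos() to undef, unless the 'c' modifier is also given.

```perl
use strict;
use warnings;

my $y = q{3kg 10Kg oops 13kg};   # hypothetical input, for illustration only

# Each successful scalar-context /g match advances pos($y);
# \G anchors the next attempt at exactly that position.
my @trace;
while ($y =~ /\G(\d+)kg\s*/gi) {
    push @trace, "$1 ends at " . pos($y);
}
# The failed attempt at "oops" ended the loop and reset pos($y).
my $after_plain = pos($y);

# With the 'c' modifier, a failed match leaves pos($y) where it was.
1 while $y =~ /\G\d+kg\s*/gci;
my $after_c = pos($y);

print "$_\n" for @trace;   # "3 ends at 4", then "10 ends at 9"
print "after /g:  ", (defined $after_plain ? $after_plain : 'undef'), "\n";   # undef
print "after /gc: ", (defined $after_c ? $after_c : 'undef'), "\n";           # 9
```

Note that "13kg" is never reached in either loop: \G refuses to skip over "oops", which is the same effect as in the walkthrough above.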

— Ken

Re^6: Regular expression
by pravakta (Novice) on Nov 02, 2017 at 20:17 UTC

    Ahhh, I understand it. Anomalous also replied along the same lines, and I have a question which I put in my reply to him. The question is about the scope of \G: how long does \G hold its value? Until the next unsuccessful match?
