PerlMonks  

anchor ^ and \G

by Anonymous Monk
on Jun 27, 2015 at 23:14 UTC ( [id://1132289]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Is it possible to use both ^ and \G ?

    my $string = " a 1 # ";
    my $i = 0;
    while () {
        if ( $string =~ /^\G\s+/gc ) {
            print "whitespace\n";
        }
        elsif ( $string =~ /^\G[0-9]+/gc ) {
            print "integer\n";
        }
        elsif ( $string =~ /^\G\w+/gc ) {
            print "word\n";
        }
        else {
            print "done\n";
            last;
        }
    }

desired output:

    whitespace
    word
    whitespace
    integer
    whitespace
    done

Replies are listed 'Best First'.
Re: anchor ^ and \G
by kcott (Archbishop) on Jun 27, 2015 at 23:57 UTC

    After the first match, \G no longer matches at the start of the string, which is the only place ^ can match, so the output will just be:

        whitespace
        done

    Both ^ and \G are called "assertions". The Assertions section of perlre has quite a bit to say about \G and provides links to further information.
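
To make the interaction between the two assertions concrete, here is a minimal sketch that reuses the string from the question. The variable names ($first, $second, $third, $pos_after_first) are illustrative only and do not appear in the thread.

```perl
use strict;
use warnings;

my $string = " a 1 # ";

# First attempt: pos() is still undef, so \G asserts offset 0,
# which is also the one place ^ can match. Both assertions hold.
my $first = $string =~ /^\G\s+/gc ? 1 : 0;
my $pos_after_first = pos($string);

# Second attempt: \G now asserts offset 1, but ^ (without /m) still
# only matches at offset 0, so the two can never hold together.
# Because of /c, the failed match leaves pos() untouched.
my $second = $string =~ /^\G\w+/gc ? 1 : 0;

# Dropping ^ lets \G alone anchor the match at the current position.
my $third = $string =~ /\G\w+/gc ? 1 : 0;

printf "first=%d pos=%d second=%d third=%d pos=%d\n",
    $first, $pos_after_first, $second, $third, pos($string);
```

The second match is the one that produces "done" in the original loop: nothing is wrong with \w+ itself, only with requiring offset 0 and offset 1 at the same time.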

    It is certainly possible to use both of those assertions in the same regular expression. In fact, changing each instance of '^\G' in your code to '^.*\G' provides your desired output, because .* can consume everything between the start of the string and the current position.

    Here's my test code:

        #!/usr/bin/env perl
        use strict;
        use warnings;

        my $string = " a 1 # ";
        my $i = 0;
        while () {
            if ( $string =~ /^.*\G\s+/gc ) {
                print "whitespace\n";
            }
            elsif ( $string =~ /^.*\G[0-9]+/gc ) {
                print "integer\n";
            }
            elsif ( $string =~ /^.*\G\w+/gc ) {
                print "word\n";
            }
            else {
                print "done\n";
                last;
            }
        }

    Output:

        whitespace
        word
        whitespace
        integer
        whitespace
        done

    -- Ken

      Thanks!! helped so much
Re: anchor ^ and \G
by roboticus (Chancellor) on Jun 28, 2015 at 00:26 UTC

    Anonymous Monk:

    It's certainly possible to use both \G and ^ in a regular expression, but I expect that would be a very unusual use case. Both are markers telling the regex a reference point in a string. I've never needed to use both in a regex before, but you can certainly do it. Here's a contrived example:

        $ cat gah.pl
        use strict;
        use warnings;

        my $t = "The quick red fox jumped over the lazy brown dog.";

        if ($t =~ /(jump)/g) {
            print "Found $1 ending at ", pos($t), ".\n";
        }
        if ($t =~ /(\w+)\s+(\w+)\G/) {
            print "Found <$1> before <$2>\n";
        }

        $t = <<EOTxt;
        Now a funkier thing we can
        do is to find weird stuffs
        such as the text before
        a word on the same line
        EOTxt

        if ($t =~ /(weird)/g) {
            print "Found $1\n";
            if ($t =~ /^([^\n]+)\G/gm) {
                print "<$1>\n";
            }
        }

        $ perl gah.pl
        Found jump ending at 22.
        Found <fox> before <jump>
        Found weird
        <do is to find weird>

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      \G and ^ make sense together with /m (though I think I'd just match \n).

      It also makes sense if you want to treat the first line specially. For example, /\G((?:^=|\+).*\n)/ extracts the first message from the following log file, line by line:

          = This is a
          + very long
          + error message
          = Message 2
          = Message 3
          + is also
          + very long
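
A minimal sketch of how that pattern might be applied, assuming each "=" or "+" in the sample begins a new line and that the pattern is used with /g in list context. The names $log and @first_message are illustrative, not from the post.

```perl
use strict;
use warnings;

# The sample log, rebuilt as a multi-line string (line breaks assumed).
my $log = join '',
    "= This is a\n",
    "+ very long\n",
    "+ error message\n",
    "= Message 2\n",
    "= Message 3\n",
    "+ is also\n",
    "+ very long\n";

# Without /m, ^ only matches at offset 0, so "=" is accepted on the
# first line only; every later line has to start with "+". \G keeps
# successive matches contiguous, so collection stops at the first
# line that fits neither alternative.
my @first_message = $log =~ /\G((?:^=|\+).*\n)/g;

print @first_message;
```

Here ^ is doing real work next to \G: it is what stops "= Message 2" from being treated as a continuation of the first message.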
Re: anchor ^ and \G
by stevieb (Canon) on Jun 28, 2015 at 00:34 UTC

    As others have clearly pointed out, ^ matches only at the beginning of a string (unless /m is in effect). Another way to make \G and ^ work in your scenario would be to lop off each match as you go (and the /c modifier is useless in this case, so it has been removed).

        #!/usr/bin/perl
        use warnings;
        use strict;

        my $string = " a 1 # ";
        my $i = 0;
        while () {
            if ( $string =~ s/^\G\s+//g ) {
                print "whitespace\n";
            }
            elsif ( $string =~ s/^\G[0-9]+//g ) {
                print "integer\n";
            }
            elsif ( $string =~ s/^\G\w+//g ) {
                print "word\n";
            }
            else {
                print "done\n";
                last;
            }
        }

    I can't see much use for both being used in the same context in your case though. perlretut describes use re 'debug';, which is very helpful for things like this: it lets you see why parts of a regex are (or aren't) matching.
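
As a rough illustration of that suggestion, the sketch below traces the failing second match from the question. Setting pos() by hand to simulate the first match, and the $matched variable, are assumptions made for the example; the trace itself goes to STDERR and its exact format varies between Perl versions, so it is not reproduced here.

```perl
use strict;
use warnings;

my $string = " a 1 # ";
pos($string) = 1;    # pretend the leading whitespace was already consumed

my $matched;
{
    # Kept inside a block so that (on modern Perls, where the pragma
    # is lexically scoped) only this one regex is traced.
    use re 'debug';
    $matched = $string =~ /^\G\w+/gc ? 1 : 0;
}

print "matched: $matched\n";
```

Running it shows the compiled program for the pattern followed by the failed attempt at offset 1, which is usually enough to spot an anchor that cannot be satisfied.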

    -stevieb

Re: anchor ^ and \G
by GrandFather (Saint) on Jun 28, 2015 at 00:16 UTC

    In the example you give, combining ^ and \G isn't helpful. However, you can achieve what you want using multiple captures and alternation:

        use strict;
        use warnings;

        my $string = " a 1 # ";

        while ($string =~ /\G(?: (\s+) | ([0-9]+) | (\w+))/xgc) {
            if (defined $1) {
                print "whitespace\n";
            }
            elsif (defined $2) {
                print "integer\n";
            }
            elsif (defined $3) {
                print "word\n";
            }
        }

    Prints:

        whitespace
        word
        whitespace
        integer
        whitespace
    Perl is the programming world's equivalent of English
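
The single-regex loop above stops silently at the "#". One possible way to also get the "done" line from the question, and to see where scanning stopped, is to read pos() after the loop. This is only a sketch; @tokens and $stopped_at are names invented for the example.

```perl
use strict;
use warnings;

my $string = " a 1 # ";
my @tokens;

# Same alternation as above; exactly one capture group is defined per match.
while ($string =~ /\G(?: (\s+) | ([0-9]+) | (\w+) )/xgc) {
    push @tokens, defined $1 ? 'whitespace'
                : defined $2 ? 'integer'
                :              'word';
}

# /c preserved pos() across the failed match, so it marks the first
# character that none of the alternatives could consume.
my $stopped_at = pos($string) // 0;
push @tokens, 'done';

print "$_\n" for @tokens;
print "stopped at offset $stopped_at of ", length($string), "\n";
```

Comparing $stopped_at with length($string) distinguishes "ran out of input" from "hit something unrecognised", which the if/elsif version cannot do without extra bookkeeping.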
Re: anchor ^ and \G
by ww (Archbishop) on Jun 28, 2015 at 00:17 UTC

    What happened when you tried it?

    You did try it, didn't you? Or is that why you didn't tell us the actual results of your code snippet?

    And did you read http://perldoc.perl.org/perlre.html (search for "metacharacters")?

    "In particular the following metacharacters have their standard egrep-ish meanings:

        \    Quote the next metacharacter
        ^    Match the beginning of the line
        ...

    By default, the "^" character is guaranteed to match only the beginning of the string..."

    and, from http://perldoc.perl.org/perlre.html (search for "Assertions"):

        \G   Match only at pos() (e.g. at the end-of-match position of prior m//g)
        ....

    The \G assertion can be used to chain global matches (using m//g), as described in Regexp Quote-Like Operators in perlop. It is also useful when writing lex-like scanners, when you have several patterns that you want to match against consequent substrings of your string; see the previous reference. The actual location where \G will match can also be influenced by using pos() as an lvalue: see pos.

    But, as the docs make clear, the caret restricts the regex engine to matching your expression only at the beginning of the string.

    You need to read the docs (and understand them!). THEN, if you don't understand something, come back with code that reflects info from the docs and tell us how the output doesn't match your understanding of what you've studied.
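
The lex-like scanner that the quoted documentation mentions can be sketched by keeping the if/elsif chain from the question and simply dropping ^, so that \G is the only anchor and sits at the very start of each pattern (perlop also cautions that this is the position where \G is properly supported). The helper scan_tokens and its $offset parameter are hypothetical, introduced only to show pos() being used as an lvalue.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): classify tokens in $string,
# starting at character offset $offset.
sub scan_tokens {
    my ($string, $offset) = @_;
    pos($string) = $offset;    # pos() as an lvalue moves where \G will match
    my @tokens;
    while (1) {
        # Each pattern is anchored by \G alone; /c keeps pos() in place
        # when a branch fails so that the next branch tries the same spot.
        if    ($string =~ /\G\s+/gc)    { push @tokens, 'whitespace' }
        elsif ($string =~ /\G[0-9]+/gc) { push @tokens, 'integer' }
        elsif ($string =~ /\G\w+/gc)    { push @tokens, 'word' }
        else                            { push @tokens, 'done'; last }
    }
    return @tokens;
}

print join(' ', scan_tokens(" a 1 # ", 0)), "\n";
print join(' ', scan_tokens(" a 1 # ", 2)), "\n";
```

Started at offset 0, this yields the sequence the question asked for; started at offset 2, it skips the leading whitespace and word and begins with the whitespace before the integer.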

Node Type: perlquestion [id://1132289]
Approved by GrandFather
Front-paged by stevieb