
Why this Regex code is not working??

by Rohit Jain (Sexton)
on Sep 22, 2012 at 17:28 UTC (#995111=perlquestion)
Rohit Jain has asked for the wisdom of the Perl Monks concerning the following question:

I want to make a pattern that matches any word ending with the letter 'a', and that also captures up to the next 5 characters after the matched word.

Here's the code I have come up with:

#!/perl/bin
use v5.14;
use warnings;

while (<>) {
    chomp;
    if (m/(?<WORD>\b\w*a\b)(?<EXTRA>.{0, 5})/) {
        say "Word contains '$+{WORD}'";
        say "Extra characters after Word are '$+{EXTRA}'";
    } else {
        say "No Match: |$_|";
    }
}

But I am getting "No Match" for the input below:

Input: I saw Wilma Yesterday

The output I expected is:

Word contains 'Wilma'
Extra Characters are ' Yest'

Re: Why this Regex code is not working??
by AnomalousMonk (Chancellor) on Sep 22, 2012 at 18:17 UTC

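    The regex compiler doesn't accept the extra whitespace within the .{0, 5} expression: with the space after the comma it is not parsed as a counted quantifier. Written as .{0,5} it works: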

    >perl -wMstrict -le
    "use v5.14;
     use warnings;
     ;;
     while (<>) {
       chomp;
       if (m/(?<WORD>\b\w*a\b)(?<EXTRA>.{0,5})/) {
         say qq{Word contains '$+{WORD}'};
         say qq{Extra characters after Word are '$+{EXTRA}'};
         }
       else {
         say qq{No Match: |$_|};
         }
       }
    "
    I saw Wilma yesterday
    Word contains 'Wilma'
    Extra characters after Word are ' yest'
    ^Z
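For anyone who prefers a script to a one-liner, here is a minimal sketch of the same fix. The helper name word_and_extra and the two sample lines are my own, chosen only for illustration; the regex is the corrected one from the reply above, with no whitespace inside the quantifier braces.

```perl
#!/usr/bin/perl
use v5.14;
use warnings;

# Hypothetical helper (not from the thread): returns the first word ending
# in 'a' plus up to five characters that follow it, or an empty list.
sub word_and_extra {
    my ($line) = @_;
    # Note: no whitespace inside the {0,5} quantifier.
    if ($line =~ m/(?<WORD>\b\w*a\b)(?<EXTRA>.{0,5})/) {
        return ($+{WORD}, $+{EXTRA});
    }
    return;
}

# Sample lines are assumptions for illustration; swap in <> to read input.
for my $line ('I saw Wilma Yesterday', 'no luck here') {
    if (my ($word, $extra) = word_and_extra($line)) {
        say "Word contains '$word'";
        say "Extra characters after Word are '$extra'";
    }
    else {
        say "No Match: |$line|";
    }
}
```

Returning the two captures as a list keeps the caller independent of %+, which is reset by the next successful match.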

      My God! I could not have found that problem in a hell of a lot of days, because of my coding style. Thanks so much.

      That will save me from several problems in the future.

        The use re 'debug'; statement/pragma (see re) is useful to gain insight into what the Perl regex compiler thinks you wrote.

        The debug output is just a teensy bit opaque, but some insight can be had just by noting that in both cases below, the 'a' literal substring part of the regex is denoted as
             6:   EXACT <a> (8)
        whereas in the case that doesn't work, what you think is a counted quantifier is denoted
            14:   EXACT <{0, 5}> (17)
        (i.e., the regex compiler thinks it's a '{0, 5}' literal substring), versus
            13:   CURLY {0,5} (16)
            15:     REG_ANY (0)
        in the case of the correctly written counted quantifier (the one that works as expected).

        >perl -wMstrict -le
        "use re 'debug';
         ;;
         my $rx = qr/(?<WORD>\b\w*a\b)(?<EXTRA>.{0, 5})/;
        "
        Compiling REx "(?<WORD>\b\w*a\b)(?<EXTRA>.{0, 5})"
        Final program:
           1: OPEN1 'WORD' (3)
           3:   BOUND (4)
           4:   STAR (6)
           5:     ALNUM (0)
           6:   EXACT <a> (8)
           8:   BOUND (9)
           9: CLOSE1 'WORD' (11)
          11: OPEN2 'EXTRA' (13)
          13:   REG_ANY (14)
          14:   EXACT <{0, 5}> (17)
          17: CLOSE2 'EXTRA' (19)
          19: END (0)
        floating "{0, 5}" at 2..2147483647 (checking floating) stclass BOUND minlen 8
        Freeing REx: "(?<WORD>\b\w*a\b)(?<EXTRA>.{0, 5})"

        >perl -wMstrict -le
        "use re 'debug';
         ;;
         my $rx = qr/(?<WORD>\b\w*a\b)(?<EXTRA>.{0,5})/;
        "
        Compiling REx "(?<WORD>\b\w*a\b)(?<EXTRA>.{0,5})"
        Final program:
           1: OPEN1 'WORD' (3)
           3:   BOUND (4)
           4:   STAR (6)
           5:     ALNUM (0)
           6:   EXACT <a> (8)
           8:   BOUND (9)
           9: CLOSE1 'WORD' (11)
          11: OPEN2 'EXTRA' (13)
          13:   CURLY {0,5} (16)
          15:     REG_ANY (0)
          16: CLOSE2 'EXTRA' (18)
          18: END (0)
        floating "a" at 0..2147483647 (checking floating) stclass BOUND minlen 1
        Freeing REx: "(?<WORD>\b\w*a\b)(?<EXTRA>.{0,5})"
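The same check can be run from a script. This is only a sketch: it relies on use re 'debug' being lexically scoped, so wrapping it in a bare block limits the dump to the one pattern of interest. The dump goes to STDERR, and its exact layout (and how a spaced-out {0, 5} is treated) may differ between Perl versions, so only the working pattern is compiled here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $rx;
{
    # Lexically scoped: only patterns compiled inside this block are dumped.
    use re 'debug';
    $rx = qr/(?<WORD>\b\w*a\b)(?<EXTRA>.{0,5})/;
}

# Sanity check that the compiled pattern behaves as intended.
if ('I saw Wilma yesterday' =~ $rx) {
    print "Word: '$+{WORD}', extra: '$+{EXTRA}'\n";
}
```

Look for a CURLY {0,5} node in the dump; an EXACT <{0, 5}> node in its place would mean the braces were taken as literal text.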

        The YAPE::Regex::Explain module can also offer helpful insight, but unfortunately it doesn't support regex constructs much beyond Perl version 5.6, and newer constructs are just what you're using: named captures such as (?<WORD>...) arrived in Perl 5.10.
