PerlMonks  

ERROR ... nested quantifiers in regex

by ŞuRvīvőr (Novice)
on Jul 06, 2011 at 11:18 UTC ([id://912948]=perlquestion)

ŞuRvīvőr has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone, I'm new to this community. I came here after hitting an error that I don't understand and don't know how to fix. The code was:
my $regex = qr/
    (                   # start of bracket 1
        {{              # match an opening {{ bracket
        (?:
            [^{}]++     # one or more {} brackets, non backtracking
        |
            (?1)        # recurse to bracket 1
        )*
        }}              # match a closing }} bracket
    )                   # end of bracket 1
/x;
I found this regex and tried to understand it, but failed. When I tried to run it, it gave the error "nested quantifiers in regex". Can anyone help me with this regex?

Replies are listed 'Best First'.
Re: ERROR ... nested quantifiers in regex
by Corion (Patriarch) on Jul 06, 2011 at 11:21 UTC

    { is special to the regex engine. See perlre.

    You might want to replace your literal { with \{ or [{].
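A minimal sketch of the two spellings suggested above, using a made-up sample string and a simple non-nested pattern (both are assumptions for illustration, not the pattern from the question). Either form makes the braces unambiguous literals:

```perl
use strict;
use warnings;

# Hypothetical sample text, for illustration only.
my $text = 'before {{ inside }} after';

# Spelling 1: backslash-escape each brace.
my $escaped = qr/ \{\{ (.*?) \}\} /x;

# Spelling 2: put each brace in a one-character class.
my $classed = qr/ [{][{] (.*?) [}][}] /x;

# In list context a successful match returns the captured text.
my ($via_escape) = $text =~ $escaped;
my ($via_class)  = $text =~ $classed;

print "escaped: '$via_escape'\n";
print "classed: '$via_class'\n";
```

Both patterns capture the same text between the double braces; neither uses any 5.10-only syntax.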

      FWIW, v5.12.2 has no trouble "parsing" this regex
      Compiling REx "%n ( # start of bracket 1%n {{ # match an opening {{ b"...
      Final program:
         1: OPEN1 (3)
         3: EXACT <{{> (5)
         5: CURLYX[0] {0,32767} (30)
         7: BRANCH (24)
         8: SUSPEND (29)
        10: PLUS (22)
        11: ANYOF[\0-z|~-\377][{unicode_all}] (0)
        22: SUCCEED (0)
        23: TAIL (28)
        24: BRANCH (FAIL)
        25: GOSUB1[-24] (29)
        28: TAIL (29)
        29: WHILEM[2/1] (0)
        30: NOTHING (31)
        31: EXACT <}}> (33)
        33: CLOSE1 (35)
        35: END (0)
      anchored "{{" at 0 floating "}}" at 2..2147483647 (checking floating) minlen 4
      and "(?1) # recurse to bracket 1" appears to be a feature introduced in 5.10: (?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)
        I think this may be the problem, because I use ActivePerl 5.8. I guess I have to download the newest version. I'll try it and let you know. Thanks a lot for your support.
      It seems that this line, "[^{}]++ # one or more {} brackets, non backtracking", is causing the error. To be honest, I tried to understand it but couldn't.
        You need perl 5.10 or newer to use the regex features you were trying to use; they were introduced in
        p595-Possessive Quantifiers
        Perl now supports the "possessive quantifier" syntax of the "atomic match" pattern. Basically a possessive quantifier matches as much as it can and never gives any back. Thus it can be used to control backtracking. The syntax is similar to non-greedy matching, except instead of using a '?' as the modifier the '+' is used. Thus ?+, *+, ++, {min,max}+ are now legal quantifiers. (Yves Orton)
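As a rough illustration of the paragraph quoted above, the following sketch compares a greedy quantifier, a possessive quantifier, and the equivalent atomic group on a toy string. The string and patterns are invented for the example, and the possessive form needs perl 5.10 or newer:

```perl
use 5.010;      # the ++ quantifier below does not compile on 5.8
use strict;
use warnings;

my $s = 'aaa';

# Greedy: a+ first takes all three a's, then gives one back
# so that the trailing a can still match.
my $greedy = $s =~ /^a+a$/ ? 1 : 0;

# Possessive: a++ takes all three a's and never gives any back,
# so the trailing a has nothing left to match.
my $possessive = $s =~ /^a++a$/ ? 1 : 0;

# Atomic group: the older spelling of the same behaviour.
my $atomic = $s =~ /^(?>a+)a$/ ? 1 : 0;

print "greedy=$greedy possessive=$possessive atomic=$atomic\n";
```

On 5.8 the parser reads the second + in ++ as a quantifier applied to a quantifier, which is where the "nested quantifiers" message comes from.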
      I replaced every { and } with \{ and \}, and it still gives the same error.
Re: ERROR ... nested quantifiers in regex
by AnomalousMonk (Archbishop) on Jul 06, 2011 at 20:01 UTC

    Not only possessive quantifiers, but also the  (?1) family of extended patterns were introduced with 5.10. However, the 'recursive parsing' trick can still be done with 5.8, so if 5.10/5.12 cannot be installed, check back here for more info. (But see Update below.)

    (See Text::Balanced for all of the following functionality – and there's more!)

    Following code requires 5.10+.

    I find it useful to decompose regexes. (Closing sequence arbitrarily redefined to ']]' in example; could be any multi-character sequence.)

    >perl -wMstrict -le
    "my $open = '{{';
     my $close = ']]';
     ;;
     my $opener = qr{ \Q$open\E }xms;
     my $closer = qr{ \Q$close\E }xms;
     my $body = qr{ [^\Q$open$close\E] }xms;
     ;;
     my $regex = qr{ ( $opener (?: $body++ | (?1) )* $closer ) }xms;
     ;;
     my $s = 'xxx {{ foo {{ bar ]] baz ]] yyy {{ fee ]] zzz';
     ;;
     print qq{'$1'} while $s =~ m{ $regex }xmsg;
    "
    '{{ foo {{ bar ]] baz ]]'
    '{{ fee ]]'

    This approach breaks down when we alter the string being searched to
        my $s =
          'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
    producing the output
        '{{ bar ]]'
        '{{ fee ]]'
    because of the presence of the substring '[OK]' having the character ']' from the closing sequence.

    This problem can be fixed by changing the definition of  $body to
        my $body = qr{ (?! $opener) (?! $closer) . }xms;
    which restores the output to the expected
        '{{ foo {{ bar ]] baz [OK] ]]'
        '{{ fee ]]'
    again.
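The pieces above can be put together as a standalone script rather than a Windows one-liner. This is only a sketch of that assembly: the array @found is a name introduced here for illustration, and everything else is taken from the post. It requires 5.10+:

```perl
use 5.010;      # possessive ++ and (?1) need 5.10 or newer
use strict;
use warnings;

my $open  = '{{';
my $close = ']]';

my $opener = qr{ \Q$open\E }xms;
my $closer = qr{ \Q$close\E }xms;

# One character that does not begin an opening or closing sequence.
my $body = qr{ (?! $opener) (?! $closer) . }xms;

my $regex = qr{
    (                       # capture group 1
        $opener
        (?:
            $body++         # run of body characters, no backtracking
        |
            (?1)            # recurse into capture group 1
        )*
        $closer
    )
}xms;

my $s = 'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';

# @found is a hypothetical helper array, not part of the original post.
my @found;
push @found, $1 while $s =~ m{ $regex }xmsg;

print "'$_'\n" for @found;
```

Because $body now tests for the whole opening and closing sequences instead of their individual characters, a lone ']' as in '[OK]' counts as ordinary body text.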

    Update: Oh, what the heck... Here's the 5.8.9 version:

    >perl -wMstrict -le
    "print qq{perl version $]};
     ;;
     my $opener = qr{ \{\{ }xms;
     my $closer = qr{ \]\] }xms;
     my $body = qr{ (?! $opener) (?! $closer) . }xms;
     ;;
     use re 'eval';
     our $regex = qr{ $opener (?: (?> $body+) | (??{ $regex }) )* $closer }xms;
     ;;
     my $s = 'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
     ;;
     print qq{'$1'} while $s =~ m{ ($regex) }xmsg;
    "
    perl version 5.008009
    '{{ foo {{ bar ]] baz [OK] ]]'
    '{{ fee ]]'
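For readers who want the 5.8-style pattern spelled out, here is a sketch of the same idea as a commented script. The comments are an editorial reading of the pattern, not the poster's own annotations. Declaring $regex on its own line before assigning it, and the @found array, are choices made for this sketch:

```perl
use strict;
use warnings;
use re 'eval';      # allow (??{ ... }) alongside interpolated variables

my $opener = qr{ \{\{ }xms;
my $closer = qr{ \]\] }xms;

# One character that does not begin an opening or closing sequence.
my $body = qr{ (?! $opener) (?! $closer) . }xms;

# A package variable, declared before use, so the code block inside
# (??{ ... }) can look it up when the match is actually running.
our $regex;
$regex = qr{
    $opener
    (?:
        (?> $body+ )        # atomic group: a run of body characters,
                            # never given back on backtracking
    |
        (??{ $regex })      # insert $regex itself here at match time,
                            # which handles a nested {{ ... ]] pair
    )*
    $closer
}xms;

my $s = 'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';

# @found is a hypothetical helper array, not part of the original post.
my @found;
push @found, $1 while $s =~ m{ ($regex) }xmsg;

print "'$_'\n" for @found;
```

The atomic group (?> ... ) plays the role of the possessive ++ from the 5.10 version, and (??{ $regex }) plays the role of (?1). The capturing parentheses sit in the final match rather than inside $regex, since the recursion here does not refer to a group number.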

      Would you please provide some notations to the code you just posted? I'm still a newbie at regex, especially the freaky nested regex.

      Besides, do you have any simple reference that helps understanding regex with all its tricks?

        Would you please provide some notations to the code you just posted ...

        I don't have time at the moment, but will try to do so tomorrow.

        ... simple reference that helps understanding regex with all its tricks ...

        Believe me, brother, there ain't no such thing! If there's one thing regular expressions are not, it's simple. (They're also not regular.) The following  perldoc and on-line documentation should be helpful: perlre, perlretut, perlreref, perlfaq6, perlrecharclass, perlrebackslash. Jeffrey Friedl's book is excellent – and priced accordingly: Mastering Regular Expressions. See also the Tutorials section of the Monastery.

        Would you please provide some notations to the code you just posted ...

        I have taken a look at the discussions of the  (??{ code }) (5.8 and 5.10+) and  (?PARNO) (5.10+ only) constructs in the Extended Patterns section of perlre and I must say that while they are not ideal for a new learner of regex, they are better than anything I could provide.

        Let me suggest that you (re-)read these sections thoroughly, ponder the replies to your posts above, do some (better yet, lots of) experiments, and then come back to the Monastery and post any remaining or new questions you may have. If you do all this right, you should have lots of questions because regular expressions are very powerful and can be quite subtle. Don't expect to learn it all overnight, but the effort you invest will be well rewarded.

        BTW: If you have any more regex questions, be sure to mention the version of Perl you settle on using.
