Both possessive quantifiers and the (?1) family of recursive extended patterns were introduced with Perl 5.10. However, the 'recursive parsing' trick can still be done with 5.8, so if 5.10/5.12 cannot be installed, check back here for more info. (But see Update below.)
(See Text::Balanced for all of the following functionality – and there's more!)
The following code requires Perl 5.10+.
I find it useful to decompose regexes. (Closing sequence arbitrarily redefined to ']]' in example; could be any multi-character sequence.)
>perl -wMstrict -le
"my $open = '{{';
my $close = ']]';
;;
my $opener = qr{ \Q$open\E }xms;
my $closer = qr{ \Q$close\E }xms;
my $body = qr{ [^\Q$open$close\E] }xms;
;;
my $regex = qr{
    (
        $opener
        (?:
            $body++
            |
            (?1)
        )*
        $closer
    )
}xms;
;;
my $s = 'xxx {{ foo {{ bar ]] baz ]] yyy {{ fee ]] zzz';
;;
print qq{'$1'} while $s =~ m{ $regex }xmsg;
"
'{{ foo {{ bar ]] baz ]]'
'{{ fee ]]'
This approach breaks down when we alter the string being searched to
my $s =
'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
producing the output
'{{ bar ]]'
'{{ fee ]]'
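because $body is a negated character class built from the individual characters of both delimiters, so the lone ']' in '[OK]' cannot match as body text even though it is not a closing sequence. The outer group therefore fails to match, and only the inner '{{ bar ]]' is found.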
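To see the failure in isolation, here is a minimal sketch (script form rather than a one-liner, and assuming the same '{{' and ']]' delimiters) that tests single characters against the character-class version of $body. The list of test characters is just an illustration.

```perl
use strict;
use warnings;

my $open  = '{{';
my $close = ']]';

# Same definition as the character-class $body used so far:
# any one character that appears in neither delimiter.
my $body = qr{ [^\Q$open$close\E] }xms;

# Illustrative test characters (not from the original example).
for my $char ('a', '[', ']', '{') {
    my $verdict = $char =~ m{ \A $body \z }xms ? 'accepted' : 'rejected';
    print "'$char' is $verdict as body text\n";
}
```

A single ']' or '{' is rejected here even though neither is a complete delimiter, which is exactly what trips up the match on '[OK]'.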
This problem can be fixed by changing the definition of $body to
my $body = qr{ (?! $opener) (?! $closer) . }xms;
which restores the output to the expected
'{{ foo {{ bar ]] baz [OK] ]]'
'{{ fee ]]'
again.
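For use in a script file rather than on the command line, the same pieces can be wrapped in a couple of subs. This is only a sketch: balanced_regex() and balanced_groups() are names made up for this example, and the code assumes Perl 5.10+ for (?1) and the possessive quantifier.

```perl
use strict;
use warnings;
use 5.010;    # recursion via (?1) and possessive quantifiers need 5.10+

# Hypothetical helper: build a regex matching one balanced
# $open ... $close group, nesting included.
sub balanced_regex {
    my ($open, $close) = @_;

    my $opener = qr{ \Q$open\E  }xms;
    my $closer = qr{ \Q$close\E }xms;
    # One character that starts neither delimiter sequence.
    my $body   = qr{ (?! $opener) (?! $closer) . }xms;

    return qr{
        (
            $opener
            (?: $body++ | (?1) )*
            $closer
        )
    }xms;
}

# Hypothetical helper: return every outermost balanced group found in $s.
sub balanced_groups {
    my ($s, $open, $close) = @_;
    my $regex = balanced_regex($open, $close);
    my @groups;
    while ($s =~ m{ $regex }xmsg) {
        push @groups, $1;
    }
    return @groups;
}

my $s = 'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
print qq{'$_'\n} for balanced_groups($s, '{{', ']]');
```

Note that the capture group must stay the first group in the final pattern, since (?1) recurses by group number; embedding the returned regex after another capturing group would need the number adjusted.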
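Update: Oh, what the heck... Here's the 5.8.9 version, which swaps (?1) for a deferred (??{ $regex }) sub-pattern and the possessive $body++ for an atomic group (?> $body+):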
>perl -wMstrict -le
"print qq{perl version $]};
;;
my $opener = qr{ \{\{ }xms;
my $closer = qr{ \]\] }xms;
my $body = qr{ (?! $opener) (?! $closer) . }xms;
;;
use re 'eval';
our $regex = qr{
    $opener
    (?: (?> $body+) | (??{ $regex }) )*
    $closer
}xms;
;;
my $s =
'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
;;
print qq{'$1'} while $s =~ m{ ($regex) }xmsg;
"
perl version 5.008009
'{{ foo {{ bar ]] baz [OK] ]]'
'{{ fee ]]'