
Maximal match in a recursive regex

by diotalevi (Canon)
on Jun 26, 2003 at 16:48 UTC
diotalevi has asked for the wisdom of the Perl Monks concerning the following question:

Given data like "a[b[c[d]]" the following regex matches the inner-most [d]. I'd like suggestions on how I can match the outermost [c[d]]. I feel like there's something simple I'm missing but ... well, I'm missing it.

    my $re;
    $re = qr/ \[             # Opening bracket
              (              # Capture the contents
                [^][]+       # Body of a link
              |
                (??{$re})    # Or recurse
              )
              \]             # Closing bracket
            /x;

    $k = "a[b[c[d]]";
    $k =~ s/$re/<$1>/g;
    print $k;
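A minimal sketch of why the regex above stops at the innermost group, assuming a reasonably modern Perl. The group accepts either a run of non-bracket text or exactly one nested group, never both in sequence, so content such as c[d] cannot satisfy it. The helper name mark_innermost is hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# The regex from the question, unchanged in behavior.
my $inner_re;
$inner_re = qr/
    \[                      # opening bracket
    (                       # capture the contents
        [^][]+              # plain text with no brackets
      |
        (??{ $inner_re })   # or exactly one nested group
    )
    \]                      # closing bracket
/x;

# Hypothetical helper: wrap each match's contents in angle brackets.
sub mark_innermost {
    my ($text) = @_;
    (my $copy = $text) =~ s/$inner_re/<$1>/g;
    return $copy;
}

print mark_innermost("a[b[c[d]]"), "\n";   # prints a[b[c<d>]
```

Only [d] matches: at the bracket before c, the text alternative consumes c and then needs a closing bracket, while the recursive alternative needs an opening bracket where c sits.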

Re: Maximal match in a recursive regex
by diotalevi (Canon) on Jun 26, 2003 at 17:06 UTC

    Ah, I see. The key is to let the capturing group match one or more times instead of just once: the closing ) simply becomes )+.

    Added: I goofed. That is *part* of the key. The above change produces a maximal match, but a repeated capturing group keeps only its last repetition, so the contents of the non-innermost matches are lost. Here's a version that *works*:

        $re = qr/ \[             # Opening bracket
                  ((?:           # Capture the contents
                    [^][]+       # Body of a link
                  |
                    (??{$re})    # Or recurse
                  )+)            # and allow repeats internally
                  \]             # Closing bracket
                /x;
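The corrected pattern can be tried end to end with a sketch like the following. It assumes the same input as the question plus one balanced string added for comparison; the helper name mark_outermost is hypothetical. The character class is written with backslashes here, which matches the same characters as [^][].

```perl
use strict;
use warnings;

my $re;
$re = qr/
    \[                    # opening bracket
    (                     # capture everything between the brackets
        (?:
            [^\[\]]+      # a run of non-bracket characters
          |
            (??{ $re })   # or a nested, balanced bracket group
        )+                # repeated, so text and nested groups can mix
    )
    \]                    # closing bracket
/x;

# Hypothetical helper: replace each outermost balanced group with <contents>.
sub mark_outermost {
    my ($text) = @_;
    (my $copy = $text) =~ s/$re/<$1>/g;
    return $copy;
}

print mark_outermost("a[b[c[d]]"), "\n";   # prints a[b<c[d]>
print mark_outermost("x[a[b]c]y"), "\n";   # prints x<a[b]c>y
```

In the first string the bracket before b is never closed, so the outermost balanced group is [c[d]], and $1 holds its contents without the surrounding brackets.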

    Noted: for a brief period there was also some pushing into @f. That shouldn't have been posted so I removed it.

      This is how I fixed it.

          my $re;
          $re = qr/ \[             # Opening bracket
                    (              # Capture the contents
                      [^][]+       # Body of a link
                    |
                      (??{$re})    # Or recurse
                    )+             # added per diotalevi's instructions
                    \]             # Closing bracket
                  /x;

          $k = "a[b[c[d]]";
          $k =~ s/($re)/<$1>/g;    # I added the ()'s :)
          print $k;

      Bah! It captures the outer square brackets too :(

Re: Maximal match in a recursive regex ([^][]+)
by tye (Cardinal) on Jun 26, 2003 at 18:26 UTC

    Just a quick note that I consider using [^][]+ in a regex to be obfuscation. (: I realize that backwhacks are a bit ugly, but I don't condone relying on the little-used trick that ] is not special when it is the first character (including after the optional "^") of a character class.

    I'd prefer [^\[\]]+, even though the eye doesn't have the easiest time lining up the brackets (it is ugly while your construct is pretty but misleading, like an optical illusion). :)

                    - tye
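As a quick check that the two spellings really describe the same character class, here is a small sketch. The sample strings and the helper name classes_agree are assumptions made for illustration.

```perl
use strict;
use warnings;

# Two spellings of "one or more characters that are not square brackets".
my $terse   = qr/\A[^][]+\z/;      # relies on ] being literal when it comes first
my $escaped = qr/\A[^\[\]]+\z/;    # escapes both brackets explicitly

# Hypothetical helper: true when both patterns give the same verdict.
sub classes_agree {
    my ($s) = @_;
    my $t = ($s =~ $terse)   ? 1 : 0;
    my $e = ($s =~ $escaped) ? 1 : 0;
    return $t == $e;
}

for my $sample ("abc", "a[b", "a]b", "[]", "x y") {
    printf "%-5s %s\n", $sample, classes_agree($sample) ? "same" : "DIFFERENT";
}
```

Both forms compile to the same class, so the choice between them is purely about readability.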

      Huh. And I wasn't even attempting to mentally match the internal brackets with the external ones. I just wrote it correctly so I wouldn't need backwhacks, and the fact that it coincidentally looks like two classes entirely escaped me. Thanks for alerting me to that mental blind spot.
