http://www.perlmonks.org?node_id=982969


in reply to regex extraction for variable number of args

Hi,

I think you're looking for Jeffrey Friedl's balanced-parentheses regex. After tweaking it to handle quoted strings, it looks something like this:

my ($np);
# The initial balanced-parentheses expression with an embedded set is:
#   $np=qr/\( ([^()] | (??{$np})) *\)/x;
# And quoted strings are "(?:[^"]|\")*" and '(?:[^']|\')*' or ('|")(?:[^\1]|\\1)*\1 so
$np=qr/
        \(                      # The opening "("
        ((                      # We'll want this hence the capturing ()
           '(?:[^']|\')*?'      #' a single quote string
        |  "(?:[^"]|\")*?"      #" a double quote string
        |  [^()]                # not a parenthesis
        |  (??{$np})
        )*)
        \)                      # and the closing ")"
        /x;
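As a quick illustration of how the pattern above is meant to be used, here is a minimal sketch. The sample text and the `$inner` variable are made up for the example. It only shows that a ")" inside a quoted string does not close the group and that nested parentheses are consumed by the recursive `(??{...})` call.

```perl
use strict;
use warnings;

my $np;
$np = qr/
    \(                        # opening parenthesis
    ((
        '(?:[^']|\')*?'       # single-quoted string
    |   "(?:[^"]|\")*?"       # double-quoted string
    |   [^()]                 # anything that is not a parenthesis
    |   (??{ $np })           # a nested, balanced group
    )*)
    \)                        # closing parenthesis
/x;

# Hypothetical input: the quoted ')' and the nested group must both be skipped.
my $text = q{call(a, 'b)', (c, d)) tail};

# In list context the match returns its captures; the first one is
# everything between the outermost parentheses.
my ($inner) = $text =~ $np;
print "inside: $inner\n";
```

The first capture holds the whole argument list, which can then be split further in a second pass.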

Re^2: regex extraction for variable number of args
by clueless newbie (Curate) on Jul 22, 2012 at 14:33 UTC

    Hi, NetWallah,

    The OP is indeed interesting and goads me into this.

    Thanks!

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Smart::Comments;

    print "\n" x 5;
    my ($np_1,$np_2);
    # The initial balanced-parentheses expression with an embedded set is:
    #   $np=qr/\( ([^()] | (??{$np})) *\)/x;
    # And quoted strings are "(?:[^"]|\")*" and '(?:[^']|\')*' or ('|")(?:[^\1]|\\1)*\1 so
    $np_1=qr/
            \(                          # The opening "("
            ((?:                        # We'll want this hence the capturing ()
               [^'"()]                  #' not a ' " ( or )
            |  (?:'(?:[^']|\')*?')      #' a single quote string
            |  (?:"(?:[^"]|\")*?")      #" a double quote string
            |  (??{$np_1})              # parenthesized expression
            )*)
            \)                          # and the closing ")"
            /x;
    # Deal with the argument list
    $np_2=qr/
            (                           # We'll want this hence the capturing ()
            (?:
               [^'"(),]                 #' not a ' " ( ) or ,
            |  (?:'(?:[^']|\')*?')      #' a single quote string
            |  (?:"(?:[^"]|\")*?")      #" a double quote string
            |  $np_1                    # parenthesized expression, see above
            )*
            )(,|\z)                     # terminating , or \z
            /x;

    my $string=q{other stuff &COMPAREEQUAL(First-param.one,'(,',Third-param); more stuff other stuff &COMPAREEQUAL(one(foo('bar'))+two(foobar),'(,',Third-param); more stuff dude() };
    ### $string
    while ($string =~ m/\b(\w+)\s*$np_1/g) {
        my ($subroutine_name,$argument_list)=($1,$2);
        ### $subroutine_name
        ### $argument_list
        # length() rather than plain truth, so that an argument of "0" is still reported
        while ($argument_list =~ m/$np_2/g and length $1) {
            ### argument: $1
        };
    };
    __END__
    ### $string: 'other stuff &COMPAREEQUAL(First-param.one,\'(,\',Third-param); more stuff other stuff &COMPAREEQUAL(one(foo(\'bar\'))+two(foobar),\'(,\',Third-param); more stuff dude() '
    ### $subroutine_name: 'COMPAREEQUAL'
    ### $argument_list: 'First-param.one,\'(,\',Third-param'
    ### argument: 'First-param.one'
    ### argument: '\'(,\''
    ### argument: 'Third-param'
    ### $subroutine_name: 'COMPAREEQUAL'
    ### $argument_list: 'one(foo(\'bar\'))+two(foobar),\'(,\',Third-param'
    ### argument: 'one(foo(\'bar\'))+two(foobar)'
    ### argument: '\'(,\''
    ### argument: 'Third-param'
    ### $subroutine_name: 'dude'
    ### $argument_list: ''
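For comparison, here is a hedged sketch of the same two-pass idea written with the named recursion available since Perl 5.10, `(?(DEFINE)...)` and `(?&name)`, instead of `(??{...})`. It is a different technique from the code in this thread, not a drop-in copy of it. The names `$defs`, `extract_calls`, and the group names are invented for the example. One assumption differs from the patterns above: inside a regex `\'` is just a literal `'`, so the quoted-string rules here use `(?:[^'\\]|\\.)*` so that a backslash-escaped quote stays inside the string. Empty arguments and whitespace around commas are not treated specially.

```perl
use strict;
use warnings;
use 5.010;    # (?(DEFINE)...) and (?&name) need Perl 5.10 or later

# Reusable grammar fragments; nothing here matches on its own.
my $defs = qr/
    (?(DEFINE)
        (?<sq>     ' (?: [^'\\] | \\. )* ' )
        (?<dq>     " (?: [^"\\] | \\. )* " )
        (?<inner>  (?: [^'"()] | (?&sq) | (?&dq) | (?&parens) )* )
        (?<parens> \( (?&inner) \) )
        (?<arg>    (?: [^'"(),] | (?&sq) | (?&dq) | (?&parens) )+ )
    )
/xs;

# Returns one array ref per call found: [ name, arg1, arg2, ... ].
sub extract_calls {
    my ($text) = @_;
    my @calls;
    while ($text =~ /\b(?<name>\w+)\s*\((?<args>(?&inner))\)$defs/g) {
        # Copy the captures before the inner match overwrites them.
        my ($name, $arglist) = ($+{name}, $+{args});
        my @args;
        # \G keeps each argument anchored right after the previous separator.
        while ($arglist =~ /\G(?<a>(?&arg))(?:,|\z)$defs/g) {
            push @args, $+{a};
        }
        push @calls, [ $name, @args ];
    }
    return @calls;
}

my $string = q{other stuff &COMPAREEQUAL(one(foo('bar'))+two(foobar),'(,',Third-param); dude()};
for my $call (extract_calls($string)) {
    my ($name, @args) = @$call;
    print "$name: ", join(' | ', @args), "\n";
}
```

Because `arg` requires at least one character, the inner loop stops by itself at the end of the list and needs no extra truth test on the capture.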