http://www.perlmonks.org?node_id=1229817


in reply to Re: regex for strings with escaped quotes
in thread regex for strings with escaped quotes

It somehow stops working when I use the regex directly and I cannot see why:
my $str = q{ x "foo \"bar\"" y };
$str =~ /(?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;
print $1, "\n"; # prints nothing

Re^3: regex for strings with escaped quotes
by haukex (Archbishop) on Feb 12, 2019 at 15:55 UTC
    $str =~ /(?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;

    It's missing the capture group that I added: /($RE{delimited}{-delim=>'"'})/

      I don't quite understand...

      I do basically this:

      use Regexp::Common qw/delimited/; print $RE{delimited}{-delim=>'"'};
      And take what is printed.

      How can I get to a pure regex that works then?

        The regex is:

        (?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))

        which contains no capturing groups, hence $1 isn't populated. In my first bit of code, in the regex I added an extra set of parentheses, /($RE{delimited}{-delim=>'"'})/, so there is a capturing group there. That's the equivalent of:

        ((?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\"))))

        Alternatively, you can take the first regex above and just change the non-capturing group (?:...) that surrounds the entire expression into a capturing group:

        ((?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))

        Or, since you said you wanted to capture the stuff between the quotes, change that non-capturing group into a capturing one:

        (?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))

        Plus, there are several simplifications that could be made to the first regex above anyway (really just removing unnecessary groups):

        / \" [^\\\"]* (?: \\. [^\\\"]* )* \" /x

        and just add capturing groups to that as needed.
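As one possible way of adding a capturing group to the simplified pattern, the sketch below wraps the part between the quotes in parentheses and uses /g in list context to collect every quoted string. The second quoted string "baz" in the input and the names $re and @contents are additions for illustration only.

```perl
use strict;
use warnings;

# Input from the question, plus a second quoted string for illustration.
my $str = q{ x "foo \"bar\"" y "baz" };

# Simplified pattern with /x; the added parentheses capture the text
# between the quotes (escaped characters are kept as-is, not unescaped).
my $re = qr/ \" ( [^\\\"]* (?: \\. [^\\\"]* )* ) \" /x;

# In list context, /g returns the capture from every match.
my @contents = $str =~ /$re/g;

print "$_\n" for @contents;
```

Under /x the whitespace in the pattern is ignored, except inside the bracketed character classes, which is why those contain no spaces.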