http://www.perlmonks.org?node_id=1229820


in reply to Re^3: regex for strings with escaped quotes
in thread regex for strings with escaped quotes

I don't quite understand...

I do basically this:

use Regexp::Common qw/delimited/;
print $RE{delimited}{-delim=>'"'};
And take what is printed.

How can I get to a pure regex that works then?

Re^5: regex for strings with escaped quotes
by haukex (Bishop) on Feb 12, 2019 at 16:47 UTC

    The regex is:

    (?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))

    which contains no capturing groups, so $1 isn't populated. In my first bit of code I wrapped the regex in an extra set of parentheses, /($RE{delimited}{-delim=>'"'})/, which adds a capturing group. That's equivalent to:

    ((?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\"))))
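
    To see the difference in practice, here is a minimal sketch that uses the printed pattern literally, so it does not need Regexp::Common installed. The sample input and the variable names ($bare, $input, $whole, $bare_capture) are made up for illustration.

    ```perl
    use strict;
    use warnings;

    # The pattern exactly as printed by Regexp::Common; no capturing group in it.
    my $bare = qr/(?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;

    # Sample input (an assumption): one quoted string with escaped quotes inside.
    my $input = 'say "hello \"world\"" now';

    my ($bare_capture, $whole);

    if ($input =~ /$bare/) {
        $bare_capture = $1;   # stays undef: the match succeeds but captures nothing
    }
    if ($input =~ /($bare)/) {
        $whole = $1;          # the extra parentheses capture the whole match
    }

    print "bare:  ", (defined($bare_capture) ? $bare_capture : "(undef)"), "\n";
    print "whole: $whole\n";
    ```

    The first match succeeds but leaves $1 undefined; the second puts the whole quoted string, quotes included, into $1.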

    Alternatively, you can take the first regex above and just change the non-capturing group (?:...) that surrounds the entire expression into a capturing group:

    ((?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))

    Or, since you said you wanted to capture the stuff between the quotes, change that non-capturing group into a capturing one:

    (?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))

    Plus, there are a number of simplifications that could be made to the first regex above anyway (really just removing unnecessary groups):

    / \" [^\\\"]* (?: \\. [^\\\"]* )* \" /x

    and just add capturing groups to that as needed.
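
    For example, a rough sketch of the simplified pattern with one capturing group around the part between the quotes. The sample input and the names $quoted, $input and @contents are assumptions for illustration only.

    ```perl
    use strict;
    use warnings;

    # The simplified pattern, with parentheses added around the content
    # between the quotes. /x lets the whitespace serve as visual spacing.
    my $quoted = qr/ \" ( [^\\\"]* (?: \\. [^\\\"]* )* ) \" /x;

    # Sample input (an assumption): two quoted strings, one with an escaped quote.
    my $input = 'a "first" and "se\"cond" b';

    # In list context with /g, each match contributes its captured group.
    my @contents = $input =~ /$quoted/g;

    print "$_\n" for @contents;
    ```

    Note that the captured text still contains the backslash escapes as written in the input; the regex only finds the string, it does not unescape it.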

      Ahh - I had not noticed that you introduced a new capturing group by putting the regex into an extra pair of parentheses.

      Now it all makes sense - many thanks.

      I've never used Regexp::Common before but it seems to contain useful stuff - I see a bright future for it in my Go-Programs :-)