PerlMonks  

regex for strings with escaped quotes

by morgon (Priest)
on Feb 12, 2019 at 15:10 UTC ( [id://1229800]=perlquestion )

morgon has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am trying to construct a regex that extracts the string between two quotes, with the complication that the string may contain backslash-escaped quotes.

To illustrate:

    my $regex = ????;   # this is what I am after

    my ($m1) = '"hubba bubba"' =~ $regex;
    print "ok\n" if $m1 eq 'hubba bubba';       # should print "ok"

    my ($m2) = '"hubba \"bubba\""' =~ $regex;
    print "ok\n" if $m2 eq 'hubba "bubba"';     # should also print "ok"

I hope that is understandable...

I tried to do this with negative lookbehinds, but my attempt failed with "Variable length lookbehind not implemented", so I am looking for some help here.

Many thanks!

Replies are listed 'Best First'.
Re: regex for strings with escaped quotes
by haukex (Archbishop) on Feb 12, 2019 at 15:13 UTC
    use Regexp::Common qw/delimited/;
    my $str = q{ x "foo \"bar\"" y };
    $str =~ /($RE{delimited}{-delim=>'"'})/;
    print $1, "\n";
    print $RE{delimited}{-delim=>'"'}, "\n";
    __END__
    "foo \"bar\""
    (?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))

      It somehow stops working when I use the regex directly and I cannot see why:
      my $str = q{ x "foo \"bar\"" y };
      $str =~ /(?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;
      print $1, "\n"; # prints nothing

        $str =~ /(?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;

        It's missing the capture group that I added: /($RE{delimited}{-delim=>'"'})/
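
A minimal sketch of that fix, assuming the pattern is pasted verbatim from the Regexp::Common output earlier in the thread: the printed regex contains only non-capturing groups, so wrapping it in one extra pair of parentheses is what makes $1 available. The variable names `$re` and `$quoted` are illustrative only.

```perl
use strict;
use warnings;

my $str = q{ x "foo \"bar\"" y };

# Raw pattern copied from the Regexp::Common output, wrapped in one extra
# pair of parentheses. Everything inside is non-capturing, so the outer
# parentheses become capture group 1.
my $re = qr/((?:(?|(?:\")(?:[^\\\"]*(?:\\.[^\\\"]*)*)(?:\"))))/;

# In list context the match returns the captured text.
my ($quoted) = $str =~ $re;
print "$quoted\n";   # prints: "foo \"bar\""
```

Note that the captured text still includes the surrounding quotes, because the extra parentheses enclose the whole pattern.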

      thanks, but I forgot to mention one detail:

      I do not need this for a perl-program but for a perl5-compatible regex engine in Go.

      Is there a way to print the regex that is used?

        Is there a way to print the regex that is used?

        That's what the second line of output above is. Simplification of that regex is left as an exercise to the reader :-) (Update: Nevermind.)

        BTW, you can also use the -keep feature to get only the part between the quotes:

        use Regexp::Common qw/delimited/;
        q{ x "foo \"bar\"" y } =~ /$RE{delimited}{-delim=>'"'}{-keep}/;
        print $3, "\n"; # prints: foo \"bar\"
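
For an engine without Regexp::Common, one possible hand-simplified form of the printed pattern is sketched here. It is an assumption, not the module's own output: the redundant non-capturing groups and the branch reset are dropped, and one capture group is placed around the text between the quotes. It uses only character classes, one non-capturing group and one capture group, with no lookbehind; whether a particular Go engine accepts it has not been checked here. A capture is always a substring of the input, so the backslashes remain in the result, and the second test in the question can only pass after a separate unescaping step. The helper `unescape` is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hand-simplified pattern (assumption): opening quote, a run of characters
# that are neither quote nor backslash, optionally interleaved with
# backslash-escaped characters, then the closing quote.
my $regex = qr/"([^"\\]*(?:\\.[^"\\]*)*)"/;

# Hypothetical helper: remove the backslash in front of each escaped character.
sub unescape {
    my ($s) = @_;
    $s =~ s/\\(.)/$1/g;
    return $s;
}

my ($m1) = '"hubba bubba"' =~ $regex;
print "ok\n" if unescape($m1) eq 'hubba bubba';

my ($m2) = '"hubba \"bubba\""' =~ $regex;
print "raw: $m2\n";                                # raw: hubba \"bubba\"
print "ok\n" if unescape($m2) eq 'hubba "bubba"';
```

Keeping the unescaping outside the regex keeps the pattern itself portable; the same two steps (match, then strip backslashes) can be reproduced in whatever host language the engine is embedded in.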
