FWIW and just as a matter of interest, the reason your OPed regex
my @strings = $data =~ /\"[^\"]+\"/g;
was "... extracting almost every line..." may be that it does not handle an empty (i.e., zero-length) string properly: the [^\"]+ sub-expression requires at least one non-double-quote character. If the text contains an empty string "", matching gets "out of sync": the opening quote of the empty string cannot start a match, so the regex takes its closing quote as the opening quote of a spurious string that runs to the next double-quote in the text.
use warnings;
use strict;
use Data::Dump qw(dd);
my $data = do { local $/; <DATA> };    # slurp all of DATA into one string
my @strings = $data =~ /\"[^\"]+\"/g;  # OP's regex: + needs at least one non-quote char
dd \@strings;
__DATA__
nothing
"hello"
foo "bar" quz
"hello2" "world"
foo2 "bar2" quz2 "baz" blah
blah2 "" blah3
many
lines
of
unquoted stuff
"example 1 for instance"
Output:
c:\@Work\Perl\monks\kepler>perl extract_double_quote_bodies_2.pl
[
"\"hello\"",
"\"bar\"",
"\"hello2\"",
"\"world\"",
"\"bar2\"",
"\"baz\"",
"\" blah3\nmany\nlines\nof\nunquoted stuff\n\"",
]
Note that [^"] ("not a double-quote") also matches the newline character, which is why the last, spurious match in the output spans several lines.
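As a rough sketch of one way around this (not necessarily what the OP needs): changing + to * lets "" match on its own, and adding \n to the character class keeps a stray quote from swallowing the following lines. The sample text here is made up, and the second regex assumes quoted strings never span lines.

```perl
use warnings;
use strict;

# Made-up sample text containing an empty string "".
my $data = qq{foo "bar" quz\nblah2 "" blah3\nmany\nlines\n"example 1 for instance"\n};

# * instead of + allows zero characters between the quotes,
# so "" is matched as a string in its own right.
my @strings = $data =~ /"[^"]*"/g;

# Assumption: strings never contain a newline. Excluding \n
# limits the damage from an unbalanced quote to a single line.
my @one_line = $data =~ /"[^"\n]*"/g;

print "$_\n" for @strings;
```

With this sample both regexes return the same three strings, the empty one included.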
Update: Also note that neither /"[^"]+"/g nor /"[^"]*"/g properly handles a double-quoted string containing an escaped double-quote (e.g., "x\"y"); both end up "out of sync" in the same way that /"[^"]+"/g does with an empty string.
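If escaped quotes do matter, a common idiom is to treat a backslash plus the following character as a single unit inside the string body. This is only a sketch with made-up sample text; it assumes backslash is the only escape mechanism and that no escaped quote appears outside a quoted string.

```perl
use warnings;
use strict;

# Made-up sample text: single quotes keep the backslash literal.
my $data = 'a "x\"y" b "" c "plain" d';

# Body is any run of: a character that is neither quote nor backslash,
# or a backslash followed by any character (/s lets . match a newline).
my @escaped_ok = $data =~ /"(?:[^"\\]|\\.)*"/gs;

print "$_\n" for @escaped_ok;
```

For real-world input, a module such as Text::Balanced may be a better fit than a hand-rolled regex.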
Give a man a fish: <%-{-{-{-<