PerlMonks  

How can I find nested delimiters?

(#6879: categorized question)
Contributed by Anonymous Monk on Apr 05, 2000 at 01:27 UTC


Description:

Given the string
QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA
What regular expression will match the entire string within the outer double quotes, plus the string after the double-quoted string? In other words, after the match I want
SOME QUERY WITH "" (DOUBLE QUOTES)
in $1 and
YEEHA
in $2

Answer: Double quotes within double quotes
contributed by little_mistress

I'm starting to get a little timid about posting on this site, but I'm going to anyway.
Here goes.

One thing to remember about regular expressions in Perl is that the quantifiers "*" and "+" are "greedy". In other words, they gobble up as many characters as they possibly can while still letting the rest of the pattern match. Hence the "*?" and "+?" constructions, which make them "non-greedy": they stop gobbling up characters as soon as the rest of the pattern can match. So here is where greedy regular expressions come in handy:

$text = "QUERY = \"SOME QUERY WITH \"\" (DOUBLE QUOTES)\" YEEHA";
$text =~ m/^(.*)"+\s(\w+)$/g;
print "dollar one =$1\n\ndollar two = $2\n\n";
# $1 == QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)
# $2 == YEEHA

Hope that helped

Just for fun, try it like this and see what you get:

$text =~ m/^(.*)?"+\s(\w+)?$/g;
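To see the greedy versus non-greedy difference in isolation, here is a small sketch (my own check, not part of the original answer) run against the same sample string. It also runs the "just for fun" pattern: as far as I can tell, `(.*)?` is an optional greedy group rather than a non-greedy one, so the captures come out the same as before.

```perl
use strict;
use warnings;

my $text = 'QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA';

# Greedy: .* runs to the last double quote in the string.
my ($greedy) = $text =~ /"(.*)"/;

# Non-greedy: .*? stops at the first double quote that lets the match succeed.
my ($lazy) = $text =~ /"(.*?)"/;

# (.*)? makes the whole group optional; the .* inside is still greedy.
my @optional = $text =~ /^(.*)?"+\s(\w+)?$/;

print "greedy   = [$greedy]\n";   # [SOME QUERY WITH "" (DOUBLE QUOTES)]
print "lazy     = [$lazy]\n";     # [SOME QUERY WITH ]
print "optional = [$optional[0]] [$optional[1]]\n";
# optional = [QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)] [YEEHA]
```

The non-greedy form is usually what you want for simple quoted strings, but it is exactly what breaks on doubled quotes, which is why the greedy form helps here.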

little_mistress@mainhall.com

Answer: Double quotes within double quotes
contributed by btrott

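my $query = qq("SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA);
if ($query =~ /"(.*)"(.*)/) {
    my $inside  = $1;
    my $outside = $2;
    print $inside, "\n", $outside, "\n";
}

This works because the first .* is greedy: it runs to the last double quote in the string, so the doubled quotes in the middle end up inside $1. Note that $2 keeps the leading space before YEEHA.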

But I'm not sure if this is really what you want: was that query merely an example, or is that really what all your queries look like?
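The question matters because the greedy pattern assumes exactly one quoted section per string. A quick sketch with a made-up input (the string is hypothetical, not from the original question) shows what happens when there are two:

```perl
use strict;
use warnings;

# Hypothetical input with two separate quoted sections.
my $query = '"first" middle "second" tail';

# The greedy .* still runs to the LAST double quote in the string.
my ($inside, $outside) = $query =~ /"(.*)"(.*)/;

print "inside  = [$inside]\n";    # [first" middle "second]
print "outside = [$outside]\n";   # [ tail]
```

Everything between the first and the last quote lands in the first capture, so this approach only fits data shaped like the example.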

If it's the former, and you're trying to do something more complicated than what you describe above (for example, match balanced text), you might take a look at perlfaq6, "Can I use Perl regular expressions to match balanced text?".

Also, see Text::Balanced on CPAN. -- Ed.

Answer: How can I find nested delimiters?
contributed by ignatz

Using Text::Balanced:

#!/usr/bin/perl -w
use strict;
use Text::Balanced "extract_delimited";

my @queries = (
    qq(QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA),
    qq(Johnson looked up and said "Pie is tasty!" People in Clarkstown liked ""Pie"".),
    qq("You sir are an ""A\$\@hole""!" The chipmunks were naturally shocked.),
    qq("""No Way!""" """Way!""")
);

for (@queries) {
    my ($extracted, $remainder, $prefix) = extract_delimited(
        undef,    # defaults to $_
        '"',      # Our chosen delimiter
        '[^"]*',  # Allow for text before the delimiter
        '"');     # Escape delimiter when doubled

    # Strip the delimiter since Text::Balanced leaves it in
    $extracted =~ s/^\"(.*)\"$/$1/;

    print;
    print "\n\$prefix = '$prefix'\n";
    print "\$extracted = '$extracted'\n";
    print "\$remainder = '$remainder'\n\n";
}

RETURNS:

QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA
$prefix = 'QUERY = '
$extracted = 'SOME QUERY WITH "" (DOUBLE QUOTES)'
$remainder = ' YEEHA'

Johnson looked up and said "Pie is tasty!" People in Clarkstown liked ""Pie"".
$prefix = 'Johnson looked up and said '
$extracted = 'Pie is tasty!'
$remainder = ' People in Clarkstown liked ""Pie"".'

"You sir are an ""A$@hole""!" The chipmunks were naturally shocked.
$prefix = ''
$extracted = 'You sir are an ""A$@hole""!'
$remainder = ' The chipmunks were naturally shocked.'

"""No Way!""" """Way!"""
$prefix = ''
$extracted = '""No Way!""'
$remainder = ' """Way!"""'

mixing double delimiters
Answer: How can I find nested delimiters?
contributed by Anonymous Monk

$query = qq("SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA);
$query =~ /"((?:[^"]|"")*)"(.*)/;
print "dollar one =$1\n\ndollar two = $2\n\n";

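The pattern above treats a doubled quote as an escaped quote inside the string: each step consumes either one non-quote character or a `""` pair, so the first lone quote closes the match. As a rough sketch of how that might be packaged, the helper below adds a prefix capture and collapses the doubled quotes afterwards. The name `split_quoted`, the prefix group, and the un-doubling step are my own assumptions, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (prefix, inside, remainder), or an empty
# list when the string contains no complete quoted section.
sub split_quoted {
    my ($string) = @_;
    return $string =~ /^([^"]*)"((?:[^"]|"")*)"(.*)$/s;
}

for my $query (
    'QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA',
    '"""No Way!""" """Way!"""',
) {
    my @parts = split_quoted($query) or next;
    my ($prefix, $inside, $rest) = @parts;

    # Collapse each doubled quote into a single literal quote.
    (my $unescaped = $inside) =~ s/""/"/g;

    print "prefix    = [$prefix]\n";
    print "inside    = [$inside]\n";
    print "unescaped = [$unescaped]\n";
    print "remainder = [$rest]\n\n";
}
```

For the first string this gives `SOME QUERY WITH "" (DOUBLE QUOTES)` as the inside and ` YEEHA` as the remainder. For the second, the inside is `""No Way!""`, the same split the Text::Balanced answer reports.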
Answer: How can I find nested delimiters?
contributed by Anonymous Monk

Either of these will match the longest possible quoted string with only an even number of quotes inside.

m#"(?:"[^"]*"|[^"]*)"#;
m#"(?:("?)[^"]*\1)+"#;
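It is worth running both patterns against the sample string from the question before relying on them. The sketch below is my own check, not part of the original answer. By my reading, the first pattern has no repetition, so it accepts either no inner quotes or a single inner quoted section that fills the whole string. On the sample it therefore stops at the first inner quote. The second pattern repeats its group and gets through the doubled quotes.

```perl
use strict;
use warnings;

my $text = 'QUERY = "SOME QUERY WITH "" (DOUBLE QUOTES)" YEEHA';

# Pattern 1: a single alternative, no repetition.
my $first = $text =~ m#"(?:"[^"]*"|[^"]*)"# ? $& : undef;

# Pattern 2: repeated runs of text, each optionally wrapped in a quote pair.
my $second = $text =~ m#"(?:("?)[^"]*\1)+"# ? $& : undef;

print "first  = [$first]\n";    # ["SOME QUERY WITH "]
print "second = [$second]\n";   # ["SOME QUERY WITH "" (DOUBLE QUOTES)"]
```

Both patterns return the match with the outer quotes still attached, so strip them afterwards if you only want the contents.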
