
Parsing Statements with Quotes in Parse::RecDescent

by Starky (Chaplain)
on Feb 04, 2008 at 15:55 UTC (#665986)
Starky has asked for the wisdom of the Perl Monks concerning the following question:

I am just getting started with Parse::RecDescent, a powerful mind bender if there ever was one, to build a parser for a somewhat obscure language and am a bit flummoxed at the first hurdle.

As in many languages, semicolons separate statements. However, semicolons may also appear inside quotes, e.g.:

    This is a statement ; Another statement ; A statement "with a quoted ';'" ;

I am inclined to start with something like

    startrule: statement(s /;/)
    statement: ...

but am stuck on how to elegantly handle the quoting issues. Does one need to resort to a big, nasty regex or is there a more elegant way?

I've taken a look at resources such as the Parse::RecDescent Tutorial but haven't been able to find an answer. This seems the kind of thing that P::RD monks would have encountered on many an occasion.

Replies are listed 'Best First'.
Re: Parsing Statements with Quotes in Parse::RecDescent
by Anonymous Monk on Feb 04, 2008 at 16:35 UTC
    The elegant way is to define tokens and delimiters. Set up a rule that defines a string first. Start by looking for a quote and stopping when you find the next quote. Then you won't worry about stripping out semi-colons.

    Do a super search on this module's name and you will find lots of examples to work from.

      Ah, the old "once you find a quote, keep going until you find the next one" trick.

      Somehow, I suspected there would be something fancier involved. I seem to have outsmarted myself.
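
For readers following along, here is a minimal sketch of the "define a string token first" advice. The rule names (chunk, quoted, bare) are made up for illustration, and it assumes double-quoted strings with no escape sequences; it is not the grammar anyone in this thread actually posted.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical grammar: a statement is a run of chunks ended by ';'.
# A quoted chunk is consumed whole, so a ';' inside quotes is never
# seen by the statement terminator.
my $grammar = <<'GRAMMAR';
startrule: statement(s) /\z/  { $item[1] }
statement: chunk(s) ';'       { join ' ', @{ $item[1] } }
chunk:     quoted | bare
quoted:    /"[^"]*"/
bare:      /[^\s;"]+/
GRAMMAR

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";

my $text = q{This is a statement ; Another statement ; A statement "with a quoted ';'" ;};
my $statements = $parser->startrule($text) or die "Parse failed\n";

# Print one bracketed statement per line.
print "[$_]\n" for @$statements;
```

Because the default skip pattern eats whitespace between tokens, the chunks are rejoined with single spaces here; a grammar that must preserve the original spacing would need a different approach.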

Re: Parsing Statements with Quotes in Parse::RecDescent
by ikegami (Pope) on Feb 05, 2008 at 18:22 UTC
    As for the string literal,
        {
            use strict;
            use warnings;

            sub dequote_single {
                for (my $s = @_ ? $_[0] : $_) {
                    s/^'//;
                    s/'$//;
                    s{\\([\\'])}{$1}gs;
                    return STR_LIT => $_;
                }
            }

            sub dequote_double {
                for (my $s = @_ ? $_[0] : $_) {
                    ...
                    # A single piece is returned as-is; several pieces become a CONCAT node.
                    if (@pieces == 1) {
                        return $pieces[0];
                    } else {
                        return CONCAT => \@pieces;
                    }
                }
            }
        }

        STR_LIT: /'(?:\\.|[^\\'])*'/s { [ dequote_single($item[1]) ] }
               | /"(?:...)*"/s { [ dequote_double($item[1]) ] }

    (assuming Perl-like literals)
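
To see what the single-quoted pattern accepts, the regex can be tried outside the grammar. This is only a sketch in plain Perl: the input string and variable names are invented, and the dequoting mirrors dequote_single above without the STR_LIT tagging.

```perl
use strict;
use warnings;

# Same single-quoted pattern as in the STR_LIT rule above.
my $single = qr/'(?:\\.|[^\\'])*'/s;

# Hypothetical input: a literal containing an escaped quote and a semicolon.
my $text = q{'it\'s; fine' ; rest};

# Anchor at the start, much as Parse::RecDescent anchors at the current position.
my ($literal) = $text =~ /\A($single)/ or die "no literal at start\n";

# Strip the delimiters, then unescape \\ and \' as dequote_single does.
(my $value = $literal) =~ s/\A'|'\z//g;
$value =~ s{\\([\\'])}{$1}g;

print "$literal => $value\n";
```

The semicolon inside the literal is swallowed by the token, so only the semicolon after the closing quote is left for the statement separator to match.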

Re: Parsing Statements with Quotes in Parse::RecDescent
by ikegami (Pope) on Feb 05, 2008 at 18:09 UTC

    /;/ won't detect the semicolon in the middle of the quoted construct because the regex match is anchored at the current position (at the end of the statement).

        stmt_list:  statement stmt_list_(?)
        stmt_list_: /;/ stmt_list

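
A sketch of how the stmt_list shape above might be fleshed out into something runnable. The actions, the startrule wrapper, and the statement and piece rules are assumptions added for illustration; only the stmt_list and stmt_list_ productions come from the post.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# stmt_list / stmt_list_ follow the post; everything else is hypothetical.
# The /;/ separator is only tried after a whole statement (including any
# quoted piece) has been consumed, so a quoted ';' is never a separator.
my $grammar = <<'GRAMMAR';
startrule:  stmt_list /\z/           { $item[1] }
stmt_list:  statement stmt_list_(?)  { [ $item[1], map { @$_ } @{ $item[2] } ] }
stmt_list_: /;/ stmt_list            { $item[2] }
statement:  piece(s)                 { join ' ', @{ $item[1] } }
piece:      /"[^"]*"/ | /[^\s;"]+/
GRAMMAR

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";

my $list = $parser->startrule(q{first ; second "a;b" ; third})
    or die "Parse failed\n";

# Print one bracketed statement per line.
print "[$_]\n" for @$list;
```

Note that in this form every semicolon must be followed by another statement, so input with a trailing semicolon (as in the original question) would need an extra optional /;/ at the end of startrule or a similar tweak.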
Re: Parsing Statements with Quotes in Parse::RecDescent
by metaperl (Curate) on Feb 05, 2008 at 17:55 UTC

Node Type: perlquestion [id://665986]
Approved by Corion