http://www.perlmonks.org?node_id=483279

halley has asked for the wisdom of the Perl Monks concerning the following question:

I have an application which has two separate mini-languages. What the languages are doesn't really matter, so I'll use surrogates that people are familiar with.

I wrote the first parser already, and it works well. It's a complicated grammar that reads a single entity. You could think of it like "how to parse a single database query." (It's not SQL but it's roughly the same complexity.) Call the parser on a string, and it returns an entity.

I was then going to write the other parser, which sometimes needs to parse entities of the first kind. Think of it like "how to parse a user interface template" for the web, a GUI, or something similar. Call the parser, and read the whole template.

In several circumstances, the template could include a full "query" type entity. The question is how to parse both mini-languages which can be interlaced, without using just one huge grammar.

    # query example
    query:[ select foo from table X where foo.whatever is acceptable ]

    # template example
    identifier TEMPLATE {
        code;
        code;
        code;
        query:[....];
        more code that uses query results;
        etc.
    }

Now, I'm sure that technically, I *could* just write one huge massive grammar that handles both cases, where the query is just one possible sub-rule.

However, I don't want to go that route, since templates and queries are not really the same thing, and I want to maintain and/or release them separately.

    $TemplateParser = new Parse::RecDescent(<<'__GRAMMAR__');
    template:  identifier 'TEMPLATE' '{' statement(s) '}'
    statement: expression
             | conditional
             | loopcontrol
             | <do-this-other-parser: $QueryParser->query() >
    __GRAMMAR__

What I'd like to see is a Parse::RecDescent way of calling a different parser to accomplish a subrule. Is there any such mechanism that I've missed in the documentation and examples?

--
[ e d @ h a l l e y . c c ]

Re: using two Parse::RecDescent parsers together
by ikegami (Patriarch) on Aug 12, 2005 at 14:26 UTC

    From the docs,

    $text: The remaining (unparsed) text. Changes to $text do not propagate out of unsuccessful productions, but do survive successful productions.

    So we need to remove from $text what the query parser matches. Also from the docs,

    If, however, the text to be matched is passed by reference [...] then any text which was consumed during the match will be removed from the start of [the referenced string].

    So it's quite simple:

        $TemplateParser = Parse::RecDescent->new(<<'__GRAMMAR__');
        template:  identifier 'TEMPLATE' '{' statement(s) '}'
        statement: expression
                 | conditional
                 | loopcontrol
                 | { $QueryParser->query(\$text) }
        __GRAMMAR__

    Tested.
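
    A fuller, self-contained sketch of the same idea, with toy grammars standing in for the real ones. The query grammar, the embedded_query rule name, and the sample input are all invented for illustration. The action refers to the query parser as $::QueryParser (a package variable in main), on the assumption that grammar actions are compiled inside a namespace generated by Parse::RecDescent and so cannot see the caller's lexical variables.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Toy stand-in for the real query grammar (hypothetical).
# Declared with "our" so grammar actions can reach it as $::QueryParser.
our $QueryParser = Parse::RecDescent->new(<<'__QUERY__');
query: 'query:[' /\w+(?:\s+\w+)*/ ']'
       { $return = { query => $item[2] } }
__QUERY__

# Toy template grammar (hypothetical). The embedded_query rule is a bare
# action: it hands the remaining text to the other parser by reference,
# so whatever that parser consumes is removed from $text. If the query
# parser returns undef, the action fails and the next production is tried.
my $TemplateParser = Parse::RecDescent->new(<<'__TEMPLATE__');
template:       identifier 'TEMPLATE' '{' statement(s) '}'
                { $return = $item[4] }
statement:      embedded_query ';' { $return = $item[1] }
              | /\w+/ ';'          { $return = { code => $item[1] } }
embedded_query: { $::QueryParser->query(\$text) }
identifier:     /\w+/
__TEMPLATE__

# Parse a template that interleaves plain statements and one query.
my $result = $TemplateParser->template(
    'page TEMPLATE { setup; query:[ select foo from X ]; render; }'
);
```

    If this behaves as intended, $result is a reference to a list of statements in which the middle entry is whatever the query parser returned, so the two grammars can still be maintained and released separately.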

Re: using two Parse::RecDescent parsers together
by GrandFather (Saint) on Aug 12, 2005 at 14:10 UTC

    Can you write a lightweight pre-parser that spits stuff to one parser or the other as appropriate? Your sample makes it look like that could be done with a little finessing of the output from the query parser to make it palatable to the template parser.


    Perl is Huffman encoded by design.

      Aye, I was thinking that if the "template" knew that there was a construct like "query:[ stuffstuffstuff ]", it could just absorb it and try to parse it separately in the Perl block for that rule.

      However, I'm concerned that "stuffstuffstuff" may need to deal with matching up square brackets and string literals that include brackets, to accurately find the end of the stuff and the close of the query.

      It's much like the magic that Perl must use to identify the proper beginnings and endings of regular expressions in code, all while allowing weirdities like s###. (I hope my case ends up being simpler.)
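
      One way the end-of-query search might look, as a rough sketch in plain Perl. It assumes that brackets nest, that string literals use single or double quotes with backslash escapes, and that extract_query is a made-up helper name; the real query language may well have different quoting rules.

```perl
use strict;
use warnings;

# Hypothetical helper: given text that starts with "query:[", return
# (body, rest), or an empty list if no balanced query is found.
# Brackets nest; brackets inside '...' or "..." literals are ignored.
sub extract_query {
    my ($text) = @_;
    return unless $text =~ /\G\s*query:\[/gc;
    my $start = pos $text;
    my $depth = 1;
    while ($depth > 0) {
        if    ($text =~ /\G"(?:[^"\\]|\\.)*"/gcs) { }          # double-quoted literal
        elsif ($text =~ /\G'(?:[^'\\]|\\.)*'/gcs) { }          # single-quoted literal
        elsif ($text =~ /\G\[/gc)                 { $depth++ } # nested open bracket
        elsif ($text =~ /\G\]/gc)                 { $depth-- } # close bracket
        elsif ($text =~ /\G[^\[\]"']+/gc)         { }          # ordinary text
        else                                      { return }   # unterminated input
    }
    my $end  = pos $text;
    my $body = substr $text, $start, $end - $start - 1;        # drop the final "]"
    return ($body, substr $text, $end);
}

# The "]" inside the string literal and the nested a[1] do not end the query.
my ($body, $rest) = extract_query(
    'query:[ select a[1] from t where s = "]" ]; more code;'
);
```

      The body could then be handed to the query parser and the rest returned to the template parser. The core module Text::Balanced, with its extract_bracketed function, may be worth a look before hand-rolling something like this.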

      --
      [ e d @ h a l l e y . c c ]