Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Recently I have been participating in the Parse::RecDescent mailing list. One of the subjects that came up was that of

What type of parser is Parse::RecDescent, LL or LR and whats the difference?

In answering this question (twice, once poorly, once not so poorly ;-) I made the observation that since P::RD is a topdown parser it can only handle LL grammars, and that these grammars have the property that they cannot contain left recursion. (All of these points are made in the documentation.)

I went on to explain what this meant and to provide a simple method by which this issue can be resolved. merlyn responded back by saying that he had had an interesting day or two trying to resolve a left recursion problem that he encountered while writing one of his columns. Well it occured to me that if left recursion left (no pun intended) merlyn banging his head on a table for a while then maybe there were a few more of our community that would appreciate an overview of this matter. So here goes...

(Caveat, this is extracted from Red Dragon.. Use the force read the source...)
UPDATE See abstracts node below for a link to the Red Dragon aka the Dragon Book. An excellent resource.

What is Left Recursion?

<super>(No its not when the commies get re-elected.)</super>
Left recusion is when you have a rule that calls itself as its very first subrule. Ie:

expr: expr '+' term | term
When Parse::RecDescent encounters something like this it should error (its documented that it will, but I haven't checked) as it will go into an endless loop on the 'expr' rule.


Changed the original paragraph that was here as it originally asserted that TheDamian had not implemented multilevel left recursion checks. He has. What he has not done is provided automatic checking for more subtle cases of left-recursion. He lists this as a bug/irritation in the documentation. See that for a better explanation.

Eliminating Left Recursion

Now there is a very simple technique for eliminating left recursion like this. Lets consider an abstract rule

A : A x | y
Now A is an arbitrary rule. x and y are other rules that DO NOT begin with A. This maps quite nicely onto our earlier example, A being "expr" x being "'+' term" and y being "term".

We can convert this rule into a non left recursive equivelent by changing it to

A : y R R : x R | e
R is a new rule that we will synthesize and e is the empty rule. (It always matches.)

So lets apply this to the earlier example: (Updated as per Anonny and Abigail-II's correction)

expr : term expr_tail expr_tail : '+' term expr_tail | {1}
I use the {1} here to emulate the behaviour of an e match. Perhaps theres a better way, if so let me know.

Anyway, hopefully somebody out there saves themselves some table/head banging because they read this.

Any comments and or corrections are welcome indeed.

Oh, it turns out that there is an algorithmic way of eliminating _all_ left recursion, at whatever depth from an unambiguous grammer. If anyone is interested ill post it in pseudo code, but I do think it would be interesting to write a helper module for P::RD that would be able to this automatically. So if anyone is looking for an interesting project....

Yves / DeMerphq
Writing a good benchmark isnt as easy as it might look.

In reply to Eliminating Left Recursion in Parse::RecDescent by demerphq

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and all is quiet...

    How do I use this? | Other CB clients
    Other Users?
    Others about the Monastery: (6)
    As of 2018-06-19 09:00 GMT
    Find Nodes?
      Voting Booth?
      Should cpanminus be part of the standard Perl release?

      Results (111 votes). Check out past polls.