Eliminating Left Recursion in Parse::RecDescent

Recently I have been participating in the Parse::RecDescent mailing list. One of the subjects that came up was that of

What type of parser is Parse::RecDescent, LL or LR and whats the difference?

In answering this question (twice, once poorly, once not so poorly ;-) I made the observation that since P::RD is a topdown parser it can only handle LL grammars, and that these grammars have the property that they cannot contain left recursion. (All of these points are made in the documentation.)

I went on to explain what this meant and to provide a simple method by which this issue can be resolved. merlyn responded back by saying that he had had an interesting day or two trying to resolve a left recursion problem that he encountered while writing one of his columns. Well it occured to me that if left recursion left (no pun intended) merlyn banging his head on a table for a while then maybe there were a few more of our community that would appreciate an overview of this matter. So here goes...

(Caveat, this is extracted from Red Dragon.. Use the force read the source...)
UPDATE See abstracts node below for a link to the Red Dragon aka the Dragon Book. An excellent resource.

What is Left Recursion?

_{<super>(No its not when the commies get re-elected.)</super>}
Left recusion is when you have a rule that calls itself as its very first subrule. Ie:

expr: expr '+' term | term
[download]

When Parse::RecDescent encounters something like this it should error (its documented that it will, but I haven't checked) as it will go into an endless loop on the 'expr' rule.

UPDATE

Changed the original paragraph that was here as it originally asserted that TheDamian had not implemented multilevel left recursion checks. He has. What he has not done is provided automatic checking for more subtle cases of left-recursion. He lists this as a bug/irritation in the documentation. See that for a better explanation.

Eliminating Left Recursion

Now there is a very simple technique for eliminating left recursion like this. Lets consider an abstract rule

A : A x | y
[download]

Now A is an arbitrary rule. x and y are other rules that DO NOT begin with A. This maps quite nicely onto our earlier example, A being "expr" x being "'+' term" and y being "term".

We can convert this rule into a non left recursive equivelent by changing it to

A : y R
R : x R | e
[download]

R is a new rule that we will synthesize and e is the empty rule. (It always matches.)

So lets apply this to the earlier example: (Updated as per Anonny and Abigail-II's correction)

expr      :     term expr_tail
expr_tail : '+' term expr_tail | {1}
[download]

I use the {1} here to emulate the behaviour of an e match. Perhaps theres a better way, if so let me know.

Anyway, hopefully somebody out there saves themselves some table/head banging because they read this.

Any comments and or corrections are welcome indeed.

Oh, it turns out that there is an algorithmic way of eliminating _all_ left recursion, at whatever depth from an unambiguous grammer. If anyone is interested ill post it in pseudo code, but I do think it would be interesting to write a helper module for P::RD that would be able to this automatically. So if anyone is looking for an interesting project....

Yves / DeMerphq
---
Writing a good benchmark isnt as easy as it might look.

Comment on Eliminating Left Recursion in Parse::RecDescent Select or Download Code

Replies are listed 'Best First'.
Re: Eliminating Left Recursion in Parse::RecDescent by jmcnamara (Monsignor) on Mar 21, 2002 at 00:13 UTC
Left-recursion and the workaround is discussed in D. Conway's article The man(1) of descent in The Perl Journal #12. It is also available in the tutorial dir of the Parse::RecDescent distro. See the section entitled "Left-recurse ye not!" where the following example is given. The left-recursive: `Addition: Addition '+' Term \| Term` Can be handled as follows: `Addition: (Term '+')(s?) Term` -- John.	[reply] [d/l] [select]
Re: Eliminating Left Recursion in Parse::RecDescent by abstracts (Hermit) on Mar 21, 2002 at 08:24 UTC
Hello Just wanted to point out (for readers interested in this topic) that a eliminating left recursion is discussed thoroughly in the dragon book Compilers--Principles, Techniques and Tools. Refer to algorithm 4.1. Aziz,,,	[reply]
Re: Eliminating Left Recursion in Parse::RecDescent by cogent (Monk) on Mar 21, 2002 at 17:01 UTC
I'm currently going through the Dragon Book in order to write a parser for JavaScript in pure Perl. Since I'd planned on using Parse::RecDescent, and the grammar (summarized at the end of the standard) is quite large, I'm planning on writing a module to systematically eliminate left recursion. (I also plan to do some other grammar munging, like eliminating epsilon-expressions.)	[reply]
Re: Eliminating Left Recursion in Parse::RecDescent by Matts (Deacon) on Mar 21, 2002 at 18:53 UTC
Also note that if you have a tokeniser as a regexp, then once you've eliminated left-recursion from a grammar it's dead easy to write a pure perl parser for that grammar without using a parser module. See XML::XPath::Parser for an example of this, and also XML::SAX::PurePerl. Edited: Sun Mar 31 22:11:49 2002, by footpad.	[reply]
Re: Eliminating Left Recursion in Parse::RecDescent by ikegami (Patriarch) on Jun 06, 2006 at 19:14 UTC
I use the `{1}` here to emulate the behaviour of an e match. Perhaps theres a better way, if so let me know. Just use nothing. `expr : term expr_tail expr_tail : '+' term expr_tail \|` [download] or `expr : term expr_tail expr_tail : '+' term expr_tail \|` [download] or a comment `expr : term expr_tail expr_tail : '+' term expr_tail \| # Nothing` [download] or an empty rule `expr : term expr_tail expr_tail : '+' term expr_tail \| Epsilon Epsilon :` [download]	[reply] [d/l] [select]
Re: Eliminating Left Recursion in Parse::RecDescent by Anonymous Monk on May 29, 2002 at 00:16 UTC
all this does is change the expression to expr : term '+' term \| term which works fine but does not solve the problem. (a+b) + (c+d) etc.	[reply]
Re: Eliminating Left Recursion in Parse::RecDescent by Abigail-II (Bishop) on May 29, 2002 at 10:42 UTC
all this does is change the expression to `expr : term '+' term \| term` which works fine but does not solve the problem. `(a+b) + (c+d)` That is because demerphq applied the rule wrongly. After applying the rule, we should get: `expr : term expr_tail expr_tail : '+' term expr_tail \| {1}` [download] Abigail	[reply] [d/l]
Re: Re: Eliminating Left Recursion in Parse::RecDescent by demerphq (Chancellor) on May 29, 2002 at 14:55 UTC
Doh. Yes. I did too. And you guys are the first to spot the error as well. In my defence, I got it right in one part, here `A : y R R : x R \| e` [download] But omitted the R element in the example. I have updated the node. Thanks for your hawklike eyes. Yves / DeMerphq --- Writing a good benchmark isnt as easy as it might look.	[reply] [d/l]


Clear questions and runnable code get the best and fastest answer
	PerlMonks