PerlMonks  

Parse::RecDescent matching same line twice

by kudra (Vicar)
on Mar 01, 2009 at 16:22 UTC (#747319=perlquestion)
kudra has asked for the wisdom of the Perl Monks concerning the following question:

It's been a few years since I used Parse::RecDescent, so it might be that my expectations are wrong, but I was under the impression that once some text had matched a rule, that same text would not be used to match a subsequent rule.

Excluding the actions that I take (printing the result) and the subrules (which are all regexes), the relationship between the rules is this:

    start : (index pre risk) | (pre risk)
    risk  : (risk1 | risk2)
    index : (pre1 | pre2 | pre3)
    pre   : (pre1 | pre2 | pre3)

Most of my test cases match the 'index pre risk' pattern, often using the same pre# rule for both the index and pre matches. However, in 8 out of my 100 cases, the index and pre match are matching exactly the same line (as seen in both the output and $thisline).

Perhaps someone more familiar with P::RD could tell me if this is something they've seen before, and how I might prevent it.

Update: I do have a minimal test case, but because the sample data file is so large I don't want to attach it to this post.

Re: Parse::RecDescent matching same line twice
by ikegami (Pope) on Mar 01, 2009 at 18:12 UTC

    Once the parse has finished, a character of the text can only have been matched by two rules if one rule is used, directly or indirectly, in a production of the other.

    text: struct {   int   foo   ;   int   bar   ;   }
          ------ -   ---   ---   -   ---   ---   -   -
          IDENT "{" IDENT IDENT ";" IDENT IDENT ";" "}"
                     ---   ---       ---   ---
                     type  var       type  var
                     ---------       ---------
                     decl            decl
                     -------------------------
                     decl_list
          ---------------------------------------------
          struct
          ---------------------------------------------
          parse

    But in reaching that state, a rule can match, then be unmatched by a backtrack. For example, given the grammar

    parse : foo1 foo2 | bar1 bar2

    foo1 could match, but PRD will backtrack if it can't follow it with a foo2 match. It will then try bar1.

    I'm guessing one of your productions has side effects, so you mistakenly believed it had matched even though a backtrack unmatched it. I could very well be wrong because I have very little data to go on.
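The backtracking described above can be sketched in plain Perl, without Parse::RecDescent. This is only a hand-written stand-in for the `parse : foo1 foo2 | bar1 bar2` grammar, not how PRD is implemented; the `literal` helper, the `%rule` table and the `@log` array are hypothetical names made up for this illustration. It shows that a side effect performed at match time survives even when the match itself is later thrown away.

```perl
use strict;
use warnings;

my @log;    # records every rule that matched, even if the match is later undone

# Build a rule that matches one literal character at position $pos.
# It returns the new position on success and nothing on failure.
sub literal {
    my ($name, $char) = @_;
    return sub {
        my ($text, $pos) = @_;
        return unless substr($text, $pos, 1) eq $char;
        push @log, $name;    # side effect: happens at match time
        return $pos + 1;
    };
}

my %rule = (
    foo1 => literal(foo1 => 'X'),
    foo2 => literal(foo2 => 'Y'),
    bar1 => literal(bar1 => 'X'),
    bar2 => literal(bar2 => 'Z'),
);

# parse : foo1 foo2 | bar1 bar2
sub parse {
    my ($text) = @_;
    PRODUCTION:
    for my $production ([qw( foo1 foo2 )], [qw( bar1 bar2 )]) {
        my $pos = 0;    # backtrack: every production restarts at the beginning
        for my $name (@$production) {
            $pos = $rule{$name}->($text, $pos);
            next PRODUCTION unless defined $pos;
        }
        return $production if $pos == length $text;
    }
    return;
}

my $matched = parse('XZ');
print "side effects: @log\n";         # foo1 bar1 bar2
print "final match:  @$matched\n";    # bar1 bar2
```

The position is reset when the first production fails, but nothing resets `@log`, which is why `foo1` still shows up among the side effects.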

    Update: In the following example, you'll see foo1 on the screen even though its match was discarded by a backtrack.

    # First script: precompile the grammar into the Grammar module.
    use strict;
    use warnings;

    use Parse::RecDescent qw( );

    my $grammar = <<'__EOI__';

       {
          use strict;
          use warnings;
       }

       parse : foo1 foo2 /\Z/ { [ @item[0,1,2] ] }
             | bar1 bar2 /\Z/ { [ @item[0,1,2] ] }

       foo1 : "X" { print("$item[0]\n"); [ @item[0,1] ] }
       foo2 : "Y" { print("$item[0]\n"); [ @item[0,1] ] }
       bar1 : "X" { print("$item[0]\n"); [ @item[0,1] ] }
       bar2 : "Z" { print("$item[0]\n"); [ @item[0,1] ] }

    __EOI__

    Parse::RecDescent->Precompile($grammar, 'Grammar')
       or die("Bad grammar\n");

    # Second script: load the precompiled parser and parse 'XZ'.
    use strict;
    use warnings;

    use Data::Dumper qw( Dumper );
    use Grammar      qw( );

    my $parser = Grammar->new();
    my $matches = $parser->parse('XZ')
       or die("Bad input\n");

    print("\n");
    print(Dumper($matches));

    foo1
    bar1
    bar2

    $VAR1 = [
              'parse',
              [
                'bar1',
                'X'
              ],
              [
                'bar2',
                'Z'
              ]
            ];

      Thank you for your answer; that appears to be what is happening. I figured it was due to my lack of knowledge about P::RD. I was puzzled by the way it appeared some of my test cases were rejecting the first rule without consequence.
        It's just like in regexps:

        'XZ' =~ /
           ^
           (?: X (?{ print "X" }) Y (?{ print "Y" })
           |   X (?{ print "X" }) Z (?{ print "Z" })
           )
           \z
        /x;
        print("\n");

        XXZ
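One way to avoid the misleading output, assuming the cause really is printing from inside the actions, is to have each action only return data and to do the printing after the parse has finished. The sketch below is a variation on the example grammar from this thread, not kudra's actual grammar (which was not posted). It uses `Parse::RecDescent->new` instead of `Precompile` so that it fits in a single script.

```perl
use strict;
use warnings;

use Parse::RecDescent qw( );

# Same rules as the example in this thread, but the actions no longer
# print anything. Each one just returns [ rule name, matched text ].
my $grammar = <<'__EOI__';

   parse : foo1 foo2 /\Z/ { [ @item[1,2] ] }
         | bar1 bar2 /\Z/ { [ @item[1,2] ] }

   foo1 : "X" { [ @item[0,1] ] }
   foo2 : "Y" { [ @item[0,1] ] }
   bar1 : "X" { [ @item[0,1] ] }
   bar2 : "Z" { [ @item[0,1] ] }

__EOI__

my $parser = Parse::RecDescent->new($grammar)
   or die("Bad grammar\n");

my $matches = $parser->parse('XZ')
   or die("Bad input\n");

# Only rules that are part of the final match are reported,
# because the value returned by a discarded production is thrown away.
print("$_->[0]\n") for @$matches;
```

Here the output lists only `bar1` and `bar2`; the abandoned `foo1` match leaves no trace because its return value is dropped when the first production fails.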
