PerlMonks  

First steps with Marpa::R2 and BNF

by Discipulus (Abbot)
on Jan 13, 2021 at 13:35 UTC ( #11126847=perlquestion )

Discipulus has asked for the wisdom of the Perl Monks concerning the following question:

Hello nuns and monks!

I've been shaking the rust off my hands with a toy project recently uploaded to CPAN: a dice roller system.

After ~100 lines of coding that module I had a sudden desire to use Marpa::R2 for parsing the dice expressions. In the end I finished the module without any grammar, and it accepts dice expressions like:

    3d6        # simplest one
    3d6+3      # with a result modifier
    3d8r1      # reroll and discard any 1
    3d8rlt3    # reroll and discard any lesser than 3
    3d8rgt6    # reroll and discard any greater than 6
    3d8rgt6+2  # reroll and discard any greater than 6 and add +2 to the final result

..and so on. See the synopsis of my module for more examples.

Now I want to parse these expressions with Marpa::R2, and I have produced some cool code I'm proud of, which you can review at the end of this post.

Marpa and BNF in general have a lot of documentation and, well.. I did not read it all :) It is damn complicated: the enormous amount of documentation produced by the Marpa::R2 author is invaluable, but it is also scattered and not really aimed at beginners. Or I'm dumb.

Anyway I have some questions and I'd like simple answers.

1 - Marpa uses regexes but..

The RHS (right-hand side) of a rule often looks like digits ~ [\d]+ but it seems only character classes can be used. I tried something ~ [(?:this|or|that)]+ but that does not seem to be the correct usage.

What can be put at the end of a definition chain?

2 - Optional terms

In the example below the Dice_with_modifier_x rule works, but in Dice_with_modifier_r I wanted to introduce optional terms. It must accept all of these expressions: 3d6r1 3d6rgt4 3d6rlt3, so the r is always present but may be followed by an optional gt or lt (greater than and lesser than).

When I pass 3d8r1 and 3d8rgt1 I get lists of different sizes (as shown by dd from Data::Dump):

    # 3d8r1
    modifier_r received: ({}, { die_type => "1d8", rolls => [8, 1, 7] }, "r", 1)
    # 3d8rgt1
    modifier_r received: ({}, { die_type => "1d8", rolls => [4, 8, 1] }, "r", "gt", 1)

Should I work on the size of the list? In my head I'd like something like an Optional_Modifier_Comparison that defaults to eq when it is not present.

How should I treat optional, possibly empty terms?

I have put Die_Modifier_Comp ~ 'gt' | 'lt' but wonder if this is the way. I also tried Die_Modifier_Comp ~ 'gt' | 'lt' | '' and it dies.

3 - Returned structures

I put :default ::= action => [name,values] and I see that the hashrefs returned by my subs get wrapped: the actual return structure is \["Dice_Expression", { die_type => "1d4", rolls => [2] }]. If I remove name from the default action, the "Dice_Expression" part disappears. Either way a reference to an arrayref is returned, and then I have to write ugly things like: $$$new[1]->{rolls}->[0]

How can I use name in a profitable way? Is there a way to simplify the returned structures (well, I can unwrap them at the beginning of the sub..)?
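For illustration only, here is a minimal sketch of the "unwrap it once" idea mentioned above. It uses plain Perl and a hand-built copy of the structure shown in the dump; the variable names $name and $payload are made up for the example and are not part of Marpa.

```perl
use strict;
use warnings;

# Same shape as the dump quoted above: a reference to an array reference
# holding the rule name followed by the value returned by the action.
my $new = \[ "Dice_Expression", { die_type => "1d4", rolls => [2] } ];

# Dereference once, then unpack the rule name and the payload (hypothetical names).
my ( $name, $payload ) = @{ $$new };

# From here on the payload is an ordinary hashref.
print "$name rolled $payload->{rolls}[0]\n";
```

After the single unwrap, $payload->{rolls}[0] reads the same value as $$$new[1]->{rolls}->[0].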

Below is my current code.

    use Marpa::R2;
    use Data::Dump;

    # resources (along with ones on cpan):
    # http://marpa-guide.github.io/
    # http://savage.net.au/Perl-modules/html/marpa.faq/faq.html
    # http://savage.net.au/Perl-modules/html/marpa.papers/
    # https://github.com/choroba/marpa-enhanced-calculator
    # https://perlmaven.com/marpa-for-building-parsers
    # https://perlmaven.com/marpa-debugging

    my $dsl = <<'END_OF_DSL';
    #:default ::= action => [values]
    :default ::= action => [name,values]
    lexeme default = latm => 1

    Dice_Expression ::= Simple_Dice
                       |Dice_with_modifier_x
                       |Dice_with_modifier_r

    Dice_with_modifier_x ::= Simple_Dice 'x' Die_Modifier_Val action => modifier_x

    Dice_with_modifier_r ::= Simple_Dice 'r' Die_Modifier_Val action => modifier_r
                            |Simple_Dice 'r' <Die_Modifier_Comp> Die_Modifier_Val action => modifier_r

    Simple_Dice ::= Rolls 'd' Sides action => do_simple_roll

    Die_Modifier_Val  ~ digits
    Die_Modifier_Comp ~ 'gt' | 'lt'
    Rolls  ~ digits
    Sides  ~ digits
    digits ~ [\d]+
    :discard   ~ whitespace
    whitespace ~ [\s]+
    END_OF_DSL

    my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
    my $input = $ARGV[0] // '6d4x1';
    my $value_ref = $grammar->parse( \$input, 'My_Actions' );
    print "\n\nFinal result: "; dd $value_ref;

    sub My_Actions::modifier_r {
        print "modifier_r received: "; dd @_;
    }

    sub My_Actions::do_simple_roll {
        my ( undef, $rolls, undef, $sides ) = @_;
        print "do_simple_roll received: "; dd @_;
        my $res = [];
        map { my $die = 1+int(rand($sides)); print "\tRolled : $die\n"; push @$res, $die } 1..$rolls;
        my $return = { die_type => "1d$sides", rolls => $res };
        print "do_simple_roll returning: "; dd $return;
        return $return;
    }

    sub My_Actions::modifier_x {
        my ( undef, $rolls_ref, $modifier, $modifier_val ) = @_;
        print "modifier_x received: "; dd @_; #dd ($rolls_ref,$modifier, $modifier_val );
        my @descr = @{$rolls_ref->{rolls}};
        # some roll need to be exploded
        while ( 0 < grep { $_ =~ /^$modifier_val$/ } @descr ){
            foreach my $roll ( @descr ){
                print "\tanalyzing: $roll\n";
                if ( $roll == $modifier_val ){
                    $roll = $roll."x";
                    print "\t\texploding a die..\n";
                    my $new = $grammar->parse( \$rolls_ref->{die_type}, 'My_Actions' );
                    print "\tdo_simple_roll returned: "; dd $new;
                    push @descr, $$$new[1]->{rolls}->[0];
                }
            }
        }
        my @numbers = map { $_ =~ /(\d+)/; $1 } @descr;
        my $sum = 0;
        $sum += $_ for @numbers;
        my $return = { result => $sum, description => join ' ', @descr };
        print "do_roll_with_die_modifier_x returning: "; dd $return;
        return $return;
    }

Thanks for reading

L*

PS January 16 2021 I published A dice roller system with Marpa::R2

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

Replies are listed 'Best First'.
Re: First steps with Marpa::R2 and BNF
by duelafn (Vicar) on Jan 13, 2021 at 15:01 UTC

    I'd suggest separating parsing from evaluation. The first argument to the action subroutines is a hash that you can use for any purpose. In this case, I'd store parts of the specification in there, then return that hash (which makes the convenient :default ::= action => ::first work for the Dice_Expression rule). Also, I split the modifier_r_comp action from the modifier_r action, but you could equally merge them and count the number of arguments.

        use Marpa::R2;
        use Data::Dump;

        my $dsl = <<'END_OF_DSL';
        :default ::= action => ::first
        lexeme default = latm => 1

        Dice_Expression  ::= Dice_Expression1
                           | Dice_Expression1 add_modifier

        Dice_Expression1 ::= Simple_Dice
                           | Simple_Dice x_modifier
                           | Simple_Dice r_modifier

        Simple_Dice  ::= Rolls 'd' Sides                         action => simple_roll
        add_modifier ::= '+' Die_Modifier_Val                    action => modifier_add
                       | '-' Die_Modifier_Val                    action => modifier_add
        x_modifier   ::= 'x' Die_Modifier_Val                    action => modifier_x
        r_modifier   ::= 'r' Die_Modifier_Val                    action => modifier_r
                       | 'r' Die_Modifier_Comp Die_Modifier_Val  action => modifier_r_comp

        Die_Modifier_Val  ~ digits
        Die_Modifier_Comp ~ 'gt' | 'lt'
        Rolls  ~ digits
        Sides  ~ digits
        digits ~ [\d]+
        :discard   ~ whitespace
        whitespace ~ [\s]+
        END_OF_DSL

        my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
        my $input = $ARGV[0] // '6d4x1';
        my $parsed = $grammar->parse( \$input, 'My_Actions' );
        print "\n\nParsed result: "; dd $parsed;
        # print "\n\nFinal result: "; dd evaluate_rolls($parsed);

        sub evaluate_rolls {
            my $spec = shift;
            # TODO...
        }

        sub My_Actions::modifier_add {
            my ( $self, $sign, $val ) = @_;
            $$self{add} = 0 + "$sign$val";
            $self;
        }

        sub My_Actions::modifier_r {
            my ( $self, undef, $reroll ) = @_;
            $$self{r} = $reroll;
            return $self;
        }

        sub My_Actions::modifier_r_comp {
            my ( $self, undef, $comp, $reroll ) = @_;
            $$self{comp} = $comp;
            $$self{r} = $reroll;
            return $self;
        }

        sub My_Actions::simple_roll {
            my ( $self, $rolls, undef, $sides ) = @_;
            $$self{rolls} = $rolls;
            $$self{sides} = $sides;
            return $self;
        }

        sub My_Actions::modifier_x {
            my ( $self, $modifier, $modifier_val ) = @_;
            $$self{x} = $modifier_val;
            return $self;
        }

    Update: Added +/- modifiers
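
    To make the "separate parsing from evaluation" idea a bit more concrete, here is one possible sketch of the evaluate_rolls stub in plain Perl. Everything in it is an assumption for illustration: it takes the unwrapped spec hash (keys rolls, sides, r, comp, add as stored by the actions above), treats a missing comp as eq, accepts an optional $rng code ref so the dice can be faked in tests, caps rerolls at 100 tries, and leaves the exploding x modifier out entirely.

```perl
use strict;
use warnings;

# Hypothetical evaluator for a spec hash such as
# { rolls => 3, sides => 8, r => 6, comp => 'gt', add => 2 }.
# $rng is an optional code ref that receives the number of sides.
sub evaluate_rolls {
    my ( $spec, $rng ) = @_;
    $rng //= sub { 1 + int rand $_[0] };

    # Comparison used to decide whether a die must be rerolled.
    my %test = (
        eq => sub { $_[0] == $_[1] },
        lt => sub { $_[0] <  $_[1] },
        gt => sub { $_[0] >  $_[1] },
    );
    my $must_reroll = $test{ $spec->{comp} // 'eq' };

    my @kept;
    for ( 1 .. $spec->{rolls} ) {
        my $die = $rng->( $spec->{sides} );
        if ( defined $spec->{r} ) {
            my $tries = 0;    # guard against specs that can never succeed
            $die = $rng->( $spec->{sides} )
                while $must_reroll->( $die, $spec->{r} ) && $tries++ < 100;
        }
        push @kept, $die;
    }

    my $sum = $spec->{add} // 0;
    $sum += $_ for @kept;
    return { rolls => \@kept, result => $sum };
}
```

    With real input the spec would come from ${ $grammar->parse(...) }; keeping the random source injectable is only a convenience for testing.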

    Good Day,
        Dean

      Thanks duelafn,

      I really appreciate your code example, as it seems to me a good starting point. I really like the way you exploit $self to add optional terms to the expression. Now perhaps I understand it better.

      I still do not fully understand your:

      > which makes the convenient :default ::= action => ::first work for the Dice_Expression rule

      If you have the patience to expand on this, it will help me understand ::first (see also my answer to GrandFather below).

      Again about optional sub elements:

      So the only way to specify something optional is:

          Dice_with_modifier_r ::= Simple_Dice 'r' Die_Modifier_Val action => modifier_r
                                  |Simple_Dice 'r' Die_Modifier_Comp Die_Modifier_Val action => modifier_r

      Right? There is no <Optional> syntax to play with?

      I tried a single rule like:

      Dice_with_modifier_r ::= Simple_Dice 'r' <Die_Modifier_Comp>* Die_Modifier_Val action => modifier_r

      as found at the end of this post but it throws the error:

          Parse of BNF/Scanless source failed
          Error in SLIF parse: No lexeme found at line 11, column 61
          * String before error: modifier_r ::= Simple_Dice 'r' <Die_Modifier_Comp>
          * The error was at line 11, column 61, and at character 0x002a '*', ...
          * here: * Die_Modifier_Val action => modifier_r\n\n\nSimpl
          Marpa::R2 exception at marpa07.pl line 41.
          Marpa::R2 exception at marpa07.pl line 41.

      Why does marpa-for-building-parsers have: declaration ::= assignment* action => doResult

      while I cannot do

          Dice_with_modifier_r ::= Simple_Dice 'r' Die_Modifier_Comp* Die_Modifier_Val action => modifier_r
          Die_Modifier_Comp ~ 'gt' | 'lt'

      L*


        Late responding, so you may have some of this figured out, but...

        default ::first: ::first is just a subroutine which returns the result of the first token. That is: sub ::first { return $_[1] }. Thus, Dice_Expression returns whatever Dice_Expression1 returns, which returns whatever Simple_Dice returns, which happens to be our "$self". After posting, I decided that if I were doing it, I would probably avoid using the default action and instead do something like:

            root ::= Dice_Expression action => finish
            Dice_Expression ::= ...
            ... same as above ...

            sub My_Actions::finish {
                my $self = shift;
                # ... additional cleanup or else change whole sub to just return $_[0]
                return $self
            }

        The advantage being that this is explicit and gives a nice hook for modifying the final result just before returning it.

        Optionals: For the most part, yes, I spell it out. The "*" syntax has a big limitation: the RHS alternative must consist of a single RHS primary, which means only "NAME ::= ITEM*" rules. You can't have multiple things on the right hand side. So, it would look like:

            r_modifier ::= 'r' Die_Modifier_Comp Die_Modifier_Val  action => modifier_r_comp
            Die_Modifier_Comp ::= Die_Modifier_Comp_Toke*          action => ::first
            Die_Modifier_Comp_Toke ~ 'gt' | 'lt'

        The star has to move to its own rule by itself (and then we renamed the token rule). Of course, that rule won't do what you want, since it doesn't limit the number of "Die_Modifier_Comp_Toke". You could instead use an empty rule to achieve a 0-or-1 match:

            r_modifier ::= 'r' Die_Modifier_Comp Die_Modifier_Val  action => modifier_r_comp
            Die_Modifier_Comp ::= Die_Modifier_Comp_Toke           action => ::first
                                | None
            Die_Modifier_Comp_Toke ~ 'gt' | 'lt'
            None ::=

        That seems to work, but requires "None" to be a "::=" rule; a "~" rule throws an error. I'm not sure if that means it is an abuse of syntax to do that or not, but if it works and doesn't seem to slow down the parsing, I'd say go for it. It will leave an undef in the Die_Modifier_Comp slot.
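
        As a rough, self-contained sketch of the empty-rule approach described above (not a definitive implementation), the fragment below parses only the r modifier. It assumes the comparison defaults to eq when omitted, hides the literal 'r' with parentheses so the action does not receive it, and uses a hypothetical helper parse_r_modifier plus a single modifier_r action that were not in the posts above.

```perl
use strict;
use warnings;
use Marpa::R2;

my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
lexeme default = latm => 1

r_modifier        ::= ('r') Die_Modifier_Comp Die_Modifier_Val action => modifier_r
Die_Modifier_Comp ::= Die_Modifier_Comp_Toke
                    | None
Die_Modifier_Comp_Toke ~ 'gt' | 'lt'
Die_Modifier_Val       ~ [\d]+
None ::=
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

# One action serves both forms; an omitted comparison is expected to
# arrive as undef and is replaced by the assumed default 'eq'.
sub My_Actions::modifier_r {
    my ( undef, $comp, $val ) = @_;
    return { comp => $comp // 'eq', r => $val };
}

# Hypothetical helper: parse one modifier string and unwrap the value reference.
sub parse_r_modifier {
    my ($input) = @_;
    my $value_ref = $grammar->parse( \$input, 'My_Actions' );
    return ${$value_ref};
}
```

        Calling parse_r_modifier('rgt3') and parse_r_modifier('r1') should then give hashes of the same shape, so the action no longer needs to count its arguments.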

        Good Day,
            Dean

Re: First steps with Marpa::R2 and BNF
by choroba (Archbishop) on Jan 13, 2021 at 14:56 UTC
    Before reaching for Marpa, make sure you need it. Is the language you're trying to recognise more complex than regular? See Chomsky hierarchy for an explanation.

    If it's regular, regular expressions should do just fine. Using a context-free grammar is overkill.

    I don't know what combinations are possible, but I have a feeling there's no nesting involved, which would mean you don't need to go context-free.
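
    To illustrate the point, here is a sketch of how the expressions listed in the question could be recognised with a single regular expression. It assumes the shapes shown there (NdS, an optional x or r modifier, an optional +/- adjustment) and that a missing comparison means eq; the helper name parse_dice and the hash keys are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one dice expression with a single regex and
# return a hashref of its parts, or undef when the string does not match.
sub parse_dice {
    my ($expr) = @_;
    return unless $expr =~ m{
        \A \s*
        (?<rolls> \d+ ) d (?<sides> \d+ )       # e.g. 3d8
        (?:
            x (?<x> \d+ )                       # explode on this value
          | r (?<comp> gt | lt )? (?<r> \d+ )   # reroll, optional comparison
        )?
        (?<add> [+-] \d+ )?                     # result modifier
        \s* \z
    }x;

    my %spec = %+;                               # only groups that matched
    $spec{comp} //= 'eq' if exists $spec{r};     # assumed default comparison
    $spec{add} += 0 if exists $spec{add};        # "+2" becomes 2
    return \%spec;
}
```

    For example, parse_dice('3d8rgt6+2') yields rolls 3, sides 8, comp gt, r 6 and add 2.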

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Dear choroba,

      I consider you a friend, but your reply made me laugh :)

      Let's rephrase it:

      > Q: Hey I need some help on the usage of my new electronic drill because its instruction contains too much theory..

      > A: Go study quantum mechanics to see if you can use a hammer instead

      I looked at the document you linked but I must confess I don't understand any of it. I'll bookmark it anyway, and maybe one of my friends (a math professor) can some day explain its meaning to me.

      About overkill techniques: I don't fall back to a batch file to check hosts in my network just because perl is overkill for such a simple task. I use perl because I'm better at it and it is fun!

      So I'd like to learn Marpa::R2 and BNF using this dice simulation as a playground, even if it can be accomplished easily with simpler techniques (and wait: I have already done it ;) because creativity is fed by wider horizons when you know many different ways to approach the same problem.

      That said, thanks for the examples you linked in the chat, which I have put at the top of my program as reminders.

      L*

Re: First steps with Marpa::R2 and BNF
by GrandFather (Saint) on Jan 14, 2021 at 09:38 UTC

    You seem to be at least at the level of this brief intro to Marpa, but may be able to find something of interest in it. Perhaps of more interest is Marpa debugging. Both these are articles I contributed to Gabor Szabo's Perl Maven website.

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: First steps with Marpa::R2 and BNF
by GrandFather (Saint) on Jan 15, 2021 at 09:23 UTC
    3 - Returned structures

    First: sorry for the piecemeal replies. I'm playing with your code and refreshing my Marpa as I go - it's been a while since I played with it.

    You are getting a "bogus" array because that is the default action and you haven't specified actions for Dice_Expression ::= .... You can fix it by:

        Dice_Expression ::= Simple_Dice          action => ::first
                           |Dice_with_modifier_x action => ::first
                           |Dice_with_modifier_r action => ::first

    which will cause just the hash ref to be returned from $grammar->parse(\$input, 'My_Actions'). See the semantics section in the Marpa::R2 documentation.

      No need for a sorry, GrandFather, just a big thanks.

      Anyway, I don't understand your reply. I removed :default ::= action => [name,values]; is this the root cause of the "bogus" array? (It makes sense to know which name is being passed around, but I consider it misleading in the synopsis.)

      I read in the doc you linked:

      > The "::first" action indicates that the value of a rule is to be the value of its first child, that is, the value corresponding to the first symbol of the rule's RHS.

      But I cannot see how your addition of action => ::first can fix it. Given that I removed the :default ::= action line, I get the same results with or without ::first:

          Dice_Expression ::= Simple_Dice
                             |Dice_with_modifier_x
                             |Dice_with_modifier_r

          do_simple_roll received: ({}, 4, "d", 6)
                  Rolled : 4
                  Rolled : 3
                  Rolled : 6
                  Rolled : 2
          do_simple_roll returning: { die_type => "1d6", rolls => [4, 3, 6, 2] }
          modifier_r received: ({}, { die_type => "1d6", rolls => [4, 3, 6, 2] }, "r", 1)

          Dice_Expression ::= Simple_Dice          action => ::first
                             |Dice_with_modifier_x action => ::first
                             |Dice_with_modifier_r action => ::first

          do_simple_roll received: ({}, 4, "d", 6)
                  Rolled : 3
                  Rolled : 2
                  Rolled : 5
                  Rolled : 6
          do_simple_roll returning: { die_type => "1d6", rolls => [3, 2, 5, 6] }
          modifier_r received: ({}, { die_type => "1d6", rolls => [3, 2, 5, 6] }, "r", 1)

      L*


        Perhaps a full example will help:

            use strict;
            use warnings;
            use Marpa::R2;

            my $dsl = <<'END_OF_DSL';
            lexeme default = latm => 1

            Dice_Expression ::= Simple_Dice action => ::first
            Simple_Dice ::= Rolls ('d') Sides action => do_simple_roll

            Rolls  ~ digits
            Sides  ~ digits
            digits ~ [\d]+
            :discard   ~ whiteSpace
            whiteSpace ~ [\s]+
            END_OF_DSL

            package Actions;

            sub do_simple_roll {
                my (undef, $rolls, $sides) = @_;
                my @res = map {1 + int(rand($sides))} 1 .. $rolls;
                return \@res;
            }

            package main;

            my $grammar = Marpa::R2::Scanless::G->new({ source => \$dsl});
            my $input = '2d6';
            my $rolls = $grammar->parse(\$input, 'Actions');
            print "Result: @$$rolls\n";

        Note the :default ::= action => ::first. Actually we can omit the default line altogether and then change the Dice_Expression line to:

        Dice_Expression ::= Simple_Dice action => ::first

        with the same result. The action determines what is passed up to the next level. If there is no action and no explicit default action, undef is passed back up (although that isn't what the documentation seems to say). The default action is used where there is no explicit action. Perhaps the key thing to understand is that the called sub gets a parameter in its argument list for each RHS primary (see RHS alternatives) unless the primary is hidden. RHS values are either constant strings, the LHS of a G1 rule, or an L0 symbol.

Re: First steps with Marpa::R2 and BNF
by GrandFather (Saint) on Jan 15, 2021 at 08:26 UTC
    1 - Marpa uses regexes but..

    Um, actually Marpa uses character classes, just character classes, or literal matches (with 'single quoted strings'). See Marpa's DSL documentation.
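
    As a small illustration of that point, the this|or|that attempt from the question can be written as an alternation of single-quoted literals instead of a character class. This is only a sketch: the symbol names words and word and the helper lex_words are invented for the example, and it uses the Marpa::R2::Scanless::R recognizer with the ::array action to collect the matched tokens.

```perl
use strict;
use warnings;
use Marpa::R2;

my $dsl = <<'END_OF_DSL';
lexeme default = latm => 1
words ::= word+ action => ::array
word  ~ 'this' | 'or' | 'that'
:discard   ~ whitespace
whitespace ~ [\s]+
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

# Hypothetical helper: return an arrayref of the words found in $input.
sub lex_words {
    my ($input) = @_;
    my $recce = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
    $recce->read( \$input );
    my $value_ref = $recce->value;
    return ${$value_ref};
}

print join( ',', @{ lex_words('this or that') } ), "\n";
```

    Character classes such as [\d]+ and quoted literals can be mixed freely on the right hand side of "~" rules, and one "~" rule can refer to another, as digits does in the grammars above.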


Node Type: perlquestion [id://11126847]
Approved by LanX
Front-paged by hippo