in reply to Re^3: Cannot get Marpa::R2 to prioritise one rule over another
in thread Cannot get Marpa::R2 to prioritise one rule over another

The first argument is there for you, you can store whatever you want in it. But if you can build the result just by composition, I don't see a reason to use it.

AFAIK, there aren't many predefined actions (::first, [name,values]). Concatenation is definitely not a universal thing, you typically propagate structures, not strings.

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Replies are listed 'Best First'.
Re^5: Cannot get Marpa::R2 to prioritise one rule over another
by Anonymous Monk on Jan 21, 2021 at 21:56 UTC

    I might not be explaining myself properly...

    In the contrived below example, the default action is [name,values], as you recommend:

    #!/usr/bin/env perl use warnings; use strict; use Data::Dumper::Concise; use Term::ANSIColor qw(:constants); use Marpa::R2; package main; my $rules = <<'END_OF_GRAMMAR'; lexeme default = latm => 1 :default ::= action => [name,values] :start ::= <entry> <entry> ::= 'foo' (SP) <hostaddr4> | 'bar' (SP) <hostaddr4> | 'baz' (SP) <hostaddr4> <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER <hostname> ::= NAMECH+ separator => DOT <hostaddr4> ::= <ipv4> | <hostname> SP ~ [\s]+ DOT ~ '.' NAMECH ~ [^\s.:]+ NUMBER ~ [\d]+ END_OF_GRAMMAR my $input = <<'END_OF_INPUT'; foo 192.0.2.1 foo www.example.org bar 192.0.2.2 bar 3.2.0.192.in-arpa.net baz 192.0.2 baz flibble.example.com END_OF_INPUT my $grammar = Marpa::R2::Scanless::G->new({source => \$rules}); for (split /^/m, $input) { chomp; if (length $_) { print "\n\n$_\n"; my $recce = Marpa::R2::Scanless::R->new({grammar => $grammar, +ranking_method => 'rule', semantics_package => 'main'}); eval { $recce->read(\$_ ) }; print(($@ ? (RED . "$@\n") : GREEN), $recce->show_progress(), "\n", Dumper($recce->value), "\n\n", RESET); } }

    This results in a parse result that makes it clear whether the hostaddr4 component is an ipv4 or a hostname, by the first element that gets pushed onto the array, but requires that I later recompose both IPv4 addresses and hostnames, e.g. by shift; join '.', @_:

    \[ "entry", "foo", [ "hostaddr4", [ "ipv4", 192, 0, 2, 1, ], ], ] ... \[ "entry", "foo", [ "hostaddr4", [ "hostname", "www", "example", "org", ], ], ] ...

    If I update the grammar slightly, to use a custom action to recompose these for me:

    <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER + action => joindot <hostname> ::= NAMECH+ separator => DOT + action => joindot ... sub joindot { shift, join '.', @_ }

    ...the resulting parse structure no longer has the ipv4 or hostname indicators, and there appears not to be enough information in the arguments passed to joindot for it to return it:

    \[ "entry", "foo", [ "hostaddr4", "192.0.2.1", ], ] ... \[ "entry", "foo", [ "hostaddr4", "www.example.org", ], ]

    The only approach I can see to do so is to define the grammar with two near-identical function, one that emits 'ipv4' and one that emits 'hostname':

    <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER + action => joinipv4 <hostname> ::= NAMECH+ separator => DOT + action => joinhostname ... sub joindot { join '.', @_ } sub joinipv4 { shift, ["ipv4", (joindot @_)] } sub joinhostname { shift, ["hostname", (joindot @_)] }

    But, in a larger grammar, that repetition is a pain, especially when the information is already known to the parser (and can be emitted automatically where the action is non-custom).

    Am I missing a trick here?

      This is Perl. Hack around!
      <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER act +ion => joindot <hostname> ::= NAMECH+ separator => DOT act +ion => joindot <hostaddr4> ::= <ipv4> act +ion => add_ip | <hostname> act +ion => add_host ... # There should be a semicolon, not a comma. # | # v sub joindot { shift; join '.', @_ } sub AUTOLOAD { die 'Invalid action' unless $::AUTOLOAD =~ /^add_(ip|host)$/; {type => $1, $1 => $_[1]} }

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]