
Re^3: Cannot get Marpa::R2 to prioritise one rule over another

by Anonymous Monk
on Jan 21, 2021 at 21:07 UTC (#11127236)

in reply to Re^2: Cannot get Marpa::R2 to prioritise one rule over another
in thread Cannot get Marpa::R2 to prioritise one rule over another

Thanks for demonstrating how to recompose the dotted components of hostnames and IPs, using a custom action. I had been wondering how best to go about that, and you have given me a starting point.

One question, regarding your concat subroutine, if I may: Is it possible to generalise it to return the [rulename,concatted-string] pair, so it conforms to the tokens emitted by the default action [name,values], or would I have to have a separate subroutine for each rule (and return the rulename literally)?

I had originally thought there might be context in the first argument, which you shift off, but that appears to be an empty hashref in every case I've seen.

Re^4: Cannot get Marpa::R2 to prioritise one rule over another
by choroba (Archbishop) on Jan 21, 2021 at 21:14 UTC
    The first argument is there for you; you can store whatever you want in it. But if you can build the result just by composition, I don't see a reason to use it.

    AFAIK, there aren't many predefined actions (::first, [name,values]). Concatenation is definitely not a universal thing; you typically propagate structures, not strings.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      I might not be explaining myself properly...

      In the contrived example below, the default action is [name,values], as you recommend:

          #!/usr/bin/env perl
          use warnings;
          use strict;
          use Data::Dumper::Concise;
          use Term::ANSIColor qw(:constants);
          use Marpa::R2;

          package main;

          my $rules = <<'END_OF_GRAMMAR';
          lexeme default = latm => 1
          :default ::= action => [name,values]
          :start ::= <entry>

          <entry> ::= 'foo' (SP) <hostaddr4>
                    | 'bar' (SP) <hostaddr4>
                    | 'baz' (SP) <hostaddr4>
          <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER
          <hostname> ::= NAMECH+ separator => DOT
          <hostaddr4> ::= <ipv4> | <hostname>

          SP ~ [\s]+
          DOT ~ '.'
          NAMECH ~ [^\s.:]+
          NUMBER ~ [\d]+
          END_OF_GRAMMAR

          my $input = <<'END_OF_INPUT';
          foo foo
          bar bar
          baz 192.0.2
          baz
          END_OF_INPUT

          my $grammar = Marpa::R2::Scanless::G->new({source => \$rules});

          for (split /^/m, $input) {
              chomp;
              if (length $_) {
                  print "\n\n$_\n";
                  my $recce = Marpa::R2::Scanless::R->new({
                      grammar           => $grammar,
                      ranking_method    => 'rule',
                      semantics_package => 'main',
                  });
                  eval { $recce->read(\$_ ) };
                  print(($@ ? (RED . "$@\n") : GREEN), $recce->show_progress(), "\n", Dumper($recce->value), "\n\n", RESET);
              }
          }

      This results in a parse result that makes it clear whether the hostaddr4 component is an ipv4 or a hostname, by the first element that gets pushed onto the array, but requires that I later recompose both IPv4 addresses and hostnames, e.g. by shift; join '.', @_:

          \[ "entry", "foo", [ "hostaddr4", [ "ipv4", 192, 0, 2, 1, ], ], ]
          ...
          \[ "entry", "foo", [ "hostaddr4", [ "hostname", "www", "example", "org", ], ], ]
          ...
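
The "recompose later" step mentioned above could be done in one pass over the finished tree rather than inside the grammar. This is a minimal pure-Perl sketch, not part of Marpa::R2: `recompose` and `%JOIN_WITH_DOT` are hypothetical names, and the sample tree is typed in by hand to match the dump above.

```perl
use strict;
use warnings;

# Node names whose children should be glued back together with dots.
# (Assumption: these match the rule names used in the grammar above.)
my %JOIN_WITH_DOT = map { $_ => 1 } qw(ipv4 hostname);

# Walk a [name, values...] tree and collapse selected nodes to [name, "a.b.c"].
sub recompose {
    my ($node) = @_;
    return $node unless ref $node eq 'ARRAY';    # plain token: leave as-is
    my ($name, @children) = @$node;
    return [ $name, join '.', @children ] if $JOIN_WITH_DOT{$name};
    return [ $name, map { recompose($_) } @children ];
}

# Hand-written tree with the same shape as ${ $recce->value } in the dump.
my $tree = [ 'entry', 'foo', [ 'hostaddr4', [ 'ipv4', 192, 0, 2, 1 ] ] ];
my $flat = recompose($tree);
print join(' ', $flat->[2][1][0], $flat->[2][1][1]), "\n";    # ipv4 192.0.2.1
```

The rule name survives because it is still the first element of each node, so no custom action is needed for this variant.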

      If I update the grammar slightly, to use a custom action to recompose these for me:

          <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER action => joindot
          <hostname> ::= NAMECH+ separator => DOT action => joindot
          ...
          sub joindot { shift, join '.', @_ }

      ...the resulting parse structure no longer has the ipv4 or hostname indicators, and there appears not to be enough information in the arguments passed to joindot for it to return it:

          \[ "entry", "foo", [ "hostaddr4", "", ], ]
          ...
          \[ "entry", "foo", [ "hostaddr4", "", ], ]

      The only approach I can see is to define the grammar with two near-identical functions, one that emits 'ipv4' and one that emits 'hostname':

          <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER action => joinipv4
          <hostname> ::= NAMECH+ separator => DOT action => joinhostname
          ...
          sub joindot { join '.', @_ }
          sub joinipv4 { shift, ["ipv4", (joindot @_)] }
          sub joinhostname { shift, ["hostname", (joindot @_)] }

      But, in a larger grammar, that repetition is a pain, especially when the information is already known to the parser (and can be emitted automatically where the action is non-custom).

      Am I missing a trick here?
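
One way to cut the repetition described above, sketched in plain Perl: generate the near-identical actions from a list of rule names with closures. The `join<name>` naming simply mirrors the subs in the post. Whether Marpa::R2 picks up subs installed this way is an assumption that has not been checked here; the sketch only calls them by hand, with an empty hashref standing in for the first argument.

```perl
use strict;
use warnings;

# Install one action per rule name: joinipv4, joinhostname, ...
# Each closure remembers its own $name, so the body is written once.
for my $name (qw(ipv4 hostname)) {
    no strict 'refs';    # needed to assign to a symbol-table entry by name
    *{"main::join$name"} = sub {
        shift;                             # discard the first argument
        return [ $name, join '.', @_ ];    # same shape as [name,values]
    };
}

# Called by hand; $ppo is a stand-in for the object Marpa would pass.
my $ppo  = {};
my $ip   = main->can('joinipv4')->($ppo, 192, 0, 2, 1);
my $host = main->can('joinhostname')->($ppo, qw(www example org));
print "$ip->[0] $ip->[1]\n";        # ipv4 192.0.2.1
print "$host->[0] $host->[1]\n";    # hostname www.example.org
```

Adding another dotted rule then means adding one word to the `qw(...)` list and one `action => join<name>` adverb in the grammar.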

        This is Perl. Hack around!

            <ipv4> ::= NUMBER ('.') NUMBER ('.') NUMBER ('.') NUMBER action => joindot
            <hostname> ::= NAMECH+ separator => DOT action => joindot
            <hostaddr4> ::= <ipv4> action => add_ip
                          | <hostname> action => add_host
            ...
            # There should be a semicolon, not a comma.
            #                  |
            #                  v
            sub joindot { shift; join '.', @_ }

            sub AUTOLOAD {
                # $AUTOLOAD holds the fully qualified name (e.g. main::add_ip),
                # so allow for a package prefix before add_.
                die 'Invalid action' unless $::AUTOLOAD =~ /(?:^|::)add_(ip|host)$/;
                {type => $1, $1 => $_[1]}
            }

