Re: [Marpa::R2] More Help With Grammar (:discard whitespace fa fa fa)

by Anonymous Monk
on Nov 17, 2013 at 02:25 UTC ( #1062958=note )


in reply to [Marpa::R2] More Help With Grammar

One problem is that you discard whitespace, but whitespace includes \n, so the \n is discarded before you can recognize it. This might be a bug in Marpa, or there may be some configurable option for it (ask on the list :)

One workaround is to redefine what you mean by whitespace, like

ws ~ [ \f\r\t]+
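To see why the narrower class matters, here is a small plain-Perl check. It uses ordinary regexes rather than Marpa itself, on the assumption that Marpa character classes follow Perl character-class syntax, and it assumes the original ws was something like [\s]+ (the original grammar is not shown in this reply). The helper name class_matches_newline is made up for the illustration.

```perl
#!/usr/bin/perl --
use strict;
use warnings;

## hypothetical helper: does this character class match a lone newline?
sub class_matches_newline {
    my( $class ) = @_;
    ## \A and \z anchor the whole string, so "\n" must be matched by the class
    return "\n" =~ /\A(?:$class)\z/ ? 1 : 0;
}

## [\s] is an assumed stand-in for the original, broader ws definition
print '[\s]      matches \n: ', class_matches_newline( '[\s]' ), "\n";
print '[ \f\r\t] matches \n: ', class_matches_newline( '[ \f\r\t]' ), "\n";
```

The first line reports 1 and the second 0: with the narrower class a newline is no longer swallowed as whitespace, so it stays available as a token of its own.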

Basically, I need to figure out how to say "if the line ends in "\n" stop processing this line, if it ends in "\\n" then the next line is part of the expression."

I kinda thought you had already figured this out, with something like

blahblah ::= stuff continuationcharacter blahblah   ## recurse to self
    | stuff endoflinestop

See unicharproptoregexrange.pl, a program to expand PosixPunct into an explicit character range:

'[:punct:]' => ['punctuation characters',, '\p{PosixPunct}', ],
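The linked program is not reproduced here, so the following is only a minimal sketch of the idea, not its actual code: walk the ASCII code points, keep those matching the property, and fold consecutive ones into \N{U+...} ranges. The function name prop_to_ranges and the ASCII-only limit are assumptions made for this illustration.

```perl
#!/usr/bin/perl --
use strict;
use warnings;

## hypothetical stand-in for the linked program (ASCII only)
sub prop_to_ranges {
    my( $prop ) = @_;                    ## e.g. 'PosixPunct'
    my $pattern = '\p{' . $prop . '}';   ## build \p{PosixPunct} as a string
    my $re      = qr/$pattern/;

    my @ranges;                          ## list of [ first, last ] code points
    for my $cp ( 0x00 .. 0x7F ) {
        next unless chr( $cp ) =~ $re;
        if( @ranges and $ranges[-1][1] == $cp - 1 ) {
            $ranges[-1][1] = $cp;        ## extend the current run
        } else {
            push @ranges, [ $cp, $cp ];  ## start a new run
        }
    }

    my $out = '';
    for my $range ( @ranges ) {
        $out .= sprintf '\N{U+%04X}-\N{U+%04X}', $range->[0], $range->[1];
    }
    return $out;
}

print '[', prop_to_ranges( 'PosixPunct' ), "]\n";
```

For PosixPunct this yields the same four ranges that appear in the PosixPunct lexeme of the grammar in this post.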

An example program follows, with its output after __END__. The output includes "\\\n" as part of a line ... gee, I should probably just discard it

#!/usr/bin/perl --
use strict;
use warnings;
use Data::Dump qw/ dd pp /;
use Marpa::R2;

Main( @ARGV );
exit( 0 );

sub Main {
    my $grammar_spec = get_grammar();
    my $test_input   = test_input();
    my $grammar = Marpa::R2::Scanless::G->new({
        bless_package => 'Ast',
        source        => \$grammar_spec,
    });
    my $recce = Marpa::R2::Scanless::R->new({ grammar => $grammar });
    $recce->read(\$test_input);
    my $val = $recce->value;
    dd( $val );
}

## no definition means "Unproductive lexical symbols: <ws>"
sub get_grammar {
    return q{
:default ::= action => [values]
:start ::= Lines
Lines ::= Line*
Line ::= Words <continuation> <Line>
    | Words <new line>
    | <new line>
Words ::= <Word>*
Word ::= <wordchars><punctuationwordchars>
    | <wordchars>
punctuationwordchars ::= <PosixPunct><wordchars>
    | <PosixPunct>
continuation ~ '\\' [\n]
PosixPunct ~ [\N{U+0021}-\N{U+002F}\N{U+003A}-\N{U+0040}\N{U+005B}-\N{U+0060}\N{U+007B}-\N{U+007E}]+
wordchars ~ [\w]+
<new line> ~ [\n]+
:discard ~ ws
ws ~ [ \f\r\t]+
}
}

sub test_input {
#~ I can't seem to face up to the facts
#~ I'm tense and nervous and I\
#~ Can't relax
    return q{
I can't sleep 'cause my bed's on fire
Don't touch me I'm a real live wire
a b c\
1 2 3\
do re mi
you and me girl
}
}
__END__
\[
  ["\n"],
  [
    [
      ["I"],
      ["can", ["'", "t"]],
      ["sleep", ["'", "cause"]],
      ["my"],
      ["bed", ["'", "s"]],
      ["on"],
      ["fire"],
    ],
    "\n",
  ],
  [
    [
      ["Don", ["'", "t"]],
      ["touch"],
      ["me"],
      ["I", ["'", "m"]],
      ["a"],
      ["real"],
      ["live"],
      ["wire"],
    ],
    "\n",
  ],
  [
    [["a"], ["b"], ["c"]],
    "\\\n",
    [[[1], [2], [3]], "\\\n", [[["do"], ["re"], ["mi"]], "\n"]],
  ],
  [[["you"], ["and"], ["me"], ["girl"]], "\n"],
]

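discardedededit: the same program with the continuation discarded (a second :discard rule, and no <continuation> alternative in Line)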

#!/usr/bin/perl --
use strict;
use warnings;
use Data::Dump qw/ dd pp /;
use Marpa::R2;

Main( @ARGV );
exit( 0 );

sub Main {
    my $grammar_spec = get_grammar();
    my $test_input   = test_input();
    my $grammar = Marpa::R2::Scanless::G->new({
        bless_package => 'Ast',
        source        => \$grammar_spec,
    });
    my $recce = Marpa::R2::Scanless::R->new({ grammar => $grammar });
    $recce->read(\$test_input);
    my $val = $recce->value;
    dd( $val );
}

## no definition means "Unproductive lexical symbols: <ws>"
sub get_grammar {
    return q{
:default ::= action => [values]
:start ::= Lines
Lines ::= Line*
Line ::= Words <new line>
    | <new line>
Words ::= <Word>*
Word ::= <wordchars><punctuationwordchars>
    | <wordchars>
punctuationwordchars ::= <PosixPunct><wordchars>
    | <PosixPunct>
continuation ~ '\\' [\n]
PosixPunct ~ [\N{U+0021}-\N{U+002F}\N{U+003A}-\N{U+0040}\N{U+005B}-\N{U+0060}\N{U+007B}-\N{U+007E}]+
wordchars ~ [\w]+
<new line> ~ [\n]+
:discard ~ ws
:discard ~ continuation
ws ~ [ \f\r\t]+
}
}

sub test_input {
#~ I can't seem to face up to the facts
#~ I'm tense and nervous and I\
#~ Can't relax
    return q{
I can't sleep 'cause my bed's on fire
Don't touch me I'm a real live wire
a b c\
1 2 3\
do re mi
you and me girl
}
}
__END__
\[
  ["\n"],
  [
    [
      ["I"],
      ["can", ["'", "t"]],
      ["sleep", ["'", "cause"]],
      ["my"],
      ["bed", ["'", "s"]],
      ["on"],
      ["fire"],
    ],
    "\n",
  ],
  [
    [
      ["Don", ["'", "t"]],
      ["touch"],
      ["me"],
      ["I", ["'", "m"]],
      ["a"],
      ["real"],
      ["live"],
      ["wire"],
    ],
    "\n",
  ],
  [
    [["a"], ["b"], ["c"], [1], [2], [3], ["do"], ["re"], ["mi"]],
    "\n",
  ],
  [[["you"], ["and"], ["me"], ["girl"]], "\n"],
]
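For comparison, the effect of discarding the continuation can be roughly imitated in plain Perl, without Marpa, by replacing each backslash-newline pair before splitting into lines. This is only an analogue, not what Marpa does internally. The name logical_lines is hypothetical, and the pair is replaced by a space (an assumption) so that the words on either side stay separate, as the lexemes do in the output above.

```perl
#!/usr/bin/perl --
use strict;
use warnings;

## rough plain-Perl analogue of ":discard ~ continuation"
sub logical_lines {
    my( $text ) = @_;
    $text =~ s/\\\n/ /g;                      ## backslash + newline joins lines
    return grep { /\S/ } split /\n/, $text;   ## skip empty lines
}

my $input = "a b c\\\n1 2 3\\\ndo re mi\nyou and me girl\n";
print "$_\n" for logical_lines( $input );
```

With this input the three continued lines come back as one logical line, followed by "you and me girl" as a second one.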

