Re: Simple Parser Combinator Implementation

in reply to Simple Parser Combinator Implementation

That looks like it might be very interesting--I like playing with parsing, so I typically read all nodes that talk about it. Unfortunately, I'm not conversant with functional programming, nor familiar with the Hutton paper. So I can't really dig into it unless I'm willing to do a bit of digging.

If you want people to take a closer look at it you might need to add a couple of links to your post to the relevent background information. A sentence or two describing why and how we would use such a thing would be even better.

Having said that, I find the first block of code to be mostly clear and easy to read. If I were going to try to work on it, it doesn't look like it would be too difficult to maintain. There are a few odd variable names, but the way you're using other variable names, I'm guessing that they're relatively obvious contractions to someone familiar with the problem domain.

The only criticisms I can offer at this time are:

In some functions the variable names are far too short to be meaningful. Giving more meaningful names to the variables here and there would make the code even easier to read.
I may be missing something, but it appears that you're trying to bless some code references, but not providing the class to bless the code reference into. I don't do much object-oriented coding in perl, so it could easily be that I'm simply missing something.

That's the best I can do. Without enough background, I'm afraid the second code block is rather difficult for me to read. I can decipher bits of the syntax, but I can't see why or how you're tying it all together.

I hope you find this somewhat helpful.

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Comment on Re: Simple Parser Combinator Implementation

Replies are listed 'Best First'.
Re^2: Simple Parser Combinator Implementation by withering (Monk) on Sep 08, 2013 at 03:23 UTC
Thanks a lot for your criticisms! Though the paper can be found through Google Scholar, I will add some simple introduction for the combinators later -- try to make them short :D The odd or short variable names, as you said, are just mathematic-style names, whose meaning is clear to people familiar with the parser combinator theory. I will rename them in the later version of my local clone. As for the blessings, please refer to perldoc bless. There is the second form 'bless REF', where CLASSNAME is omitted and the current package is used. The former test case I used is a expression calculator: my ($expn, $term, $factor, $num); wraith_rule->makerules(\$expn, \$term, \$factor, \$num); $expn = ( (\$term >> $wraith::token->('\+') >> \$expn) sub { [ $_[0 +]->[0] + $_[0]->[2] ] } ) \| ( (\$term >> $wraith::token->('-') >> \$expn) sub { [ $_[0] +->[0] - $_[0]->[2] ] } ) \| ( \$term ); $term = ( (\$factor >> $wraith::token->('\') >> \$term) * sub { [ $_ +[0]->[0] * $_[0]->[2] ] } ) \| ( (\$factor >> $wraith::token->('\/') >> \$term) sub { $_[0]->[2] ? [ $_[0]->[0] / $_[0]->[2] ] : [] } ) \| ( \$factor ); $factor = ( (\$num) sub { my $args = $_[0]; my $val = undef; for my + $elt (@$args) { $val .= $elt; } [ $val ] } ) \| ( ( $wraith::token->('\(') >> \$expn >> $wraith::token->('\) +') ) ** sub { my $args = $_[0]; [ $args->[1] ] } ); $num = $wraith::token->('[1-9][0-9]'); print $expn->('2 + (4 - 1) 3 + 4 -2')->[0]->[0]->[0], "\n"; [download] The corresponding BNFs are E -> T + E \| T - E \| T, T -> F * T \| F / T \| F, F -> num \| ( E ), where num is a terminal symbol (a natural number). The overloaded operator >> means sequence, i.e, A >> B means the concatenation of A and B. Operator \| has the same meaning with BNF operator \| (alternative). Perl operator is overloaded for semantic action, e.g, ALPHA sub { ... } means that when product ALPHA is correctly matched, the second operand of ** is executed, with its only argument being a reference to a list of return values of each term (terminal or nonterminal symbol) in product ALPHA. Just like what we use in YAPP except there is only one argument to the semantic action sub.	[reply] [d/l]

Replies are listed 'Best First'.

Re^2: Simple Parser Combinator Implementation
by withering (Monk) on Sep 08, 2013 at 03:23 UTC

Thanks a lot for your criticisms!

Though the paper can be found through Google Scholar, I will add some simple introduction for the combinators later -- try to make them short :D

The odd or short variable names, as you said, are just mathematic-style names, whose meaning is clear to people familiar with the parser combinator theory. I will rename them in the later version of my local clone.

As for the blessings, please refer to perldoc bless. There is the second form 'bless REF', where CLASSNAME is omitted and the current package is used.

The former test case I used is a expression calculator:

my ($expn, $term, $factor, $num);
wraith_rule->makerules(\$expn, \$term, \$factor, \$num);

$expn = ( (\$term >> $wraith::token->('\+') >> \$expn) ** sub { [ $_[0
+]->[0] + $_[0]->[2] ] } ) |
        ( (\$term >> $wraith::token->('-') >> \$expn) ** sub { [ $_[0]
+->[0] - $_[0]->[2] ] } ) |
        ( \$term );

$term = ( (\$factor >> $wraith::token->('\*') >> \$term) ** sub { [ $_
+[0]->[0] * $_[0]->[2] ] } ) |
        ( (\$factor >> $wraith::token->('\/') >> \$term) **
            sub { $_[0]->[2] ? [ $_[0]->[0] / $_[0]->[2] ] : [] } ) |
        ( \$factor );

$factor = ( (\$num) ** sub { my $args = $_[0]; my $val = undef; for my
+ $elt (@$args) { $val .= $elt; } [ $val ] } ) |
          ( ( $wraith::token->('\(') >> \$expn >> $wraith::token->('\)
+') ) ** sub { my $args = $_[0]; [ $args->[1] ] } );

$num = $wraith::token->('[1-9][0-9]*');

print $expn->('2 + (4 - 1) * 3 + 4 -2')->[0]->[0]->[0], "\n";
[download]

The corresponding BNFs are

E -> T + E | T - E | T,

T -> F * T | F / T | F,

F -> num | ( E ),

where num is a terminal symbol (a natural number).

The overloaded operator >> means sequence, i.e, A >> B means the concatenation of A and B. Operator | has the same meaning with BNF operator | (alternative). Perl operator ** is overloaded for semantic action, e.g, ALPHA ** sub { ... } means that when product ALPHA is correctly matched, the second operand of ** is executed, with its only argument being a reference to a list of return values of each term (terminal or nonterminal symbol) in product ALPHA. Just like what we use in YAPP except there is only one argument to the semantic action sub.

[reply]
[d/l]

In Section Meditations