Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic

Re^2: Perl Parsing Based on Supplied Precedence

by protist (Monk)
on Nov 07, 2012 at 03:33 UTC ( #1002622=note: print w/replies, xml ) Need Help??

in reply to Re: Perl Parsing Based on Supplied Precedence
in thread Perl Parsing Based on Supplied Precedence

Your change in the math operator precedence looks good. I did not test the logic of the math operators so much as that the precedence supplied was correctly represented in the tree. My "&&", as described by precedence, is closer to "and" (in Perl). Keep in mind, that the idea here was not to properly parse Perl, but to properly parse grammars (in general) by defined operator precedence.

Thank you for your feedback, I'm happy to see that someone was interested enough to play with it. :)

  • Comment on Re^2: Perl Parsing Based on Supplied Precedence

Replies are listed 'Best First'.
Re^3: Perl Parsing Based on Supplied Precedence
by wirito (Acolyte) on Nov 07, 2012 at 10:52 UTC
    In fact, I find this pretty interesting:
    #!/usr/bin/perl use warnings; use strict; use Data::Dumper; $Data::Dumper::Terse=1; my $precedence_perlop=[ qr/(?:\/|\*|\%|x)/, qr/(?:\+|-|\.)/, qr/(?:<=|>=|<|>lt|gt|le|ge)/, qr/&/, qr/(?:\||\^)/, qr/&&/, qr/(?:\|\||\/\/)/, qr/(not)/, qr/(and)/, qr/(or|xor)/ ]; sub parse{ my ($regex,$input)=@_; $input=~s/\s//g; for(reverse @$regex){ if($input=~m/(.+)($_)(.+)/){ my ($before,$op,$after,$node)=($1,$2,$3); $node->{$op}=[parse($regex,$before),parse($regex,$after)]; return $node; } } return $input; } sub evaluate { my $tree = shift; return $tree unless ref $tree; foreach my $op ( keys %{$tree} ) { my @terms = map { evaluate($_) } @{ $tree->{$op} }; my $result; eval "\$result = $terms[0] $op $terms[1]"; return undef if $@; return $result; } } while(<>){ my $tree = parse($precedence_perlop,$_); print Dumper($tree); my $ev = evaluate( $tree ); print "Evaluate to: " . $ev . "\n" if defined $ev; }


      That was an amusing diversion. I played around with it and added a couple of features for fun:

      • Parenthesis now work for grouping
      • Now we have variables
      • It emits the RPN version of the expression

      I tried monkeying with regexes to add parenthesis, and then remembered Text::Balanced, so I used that to extract the subexpressions. The parsed subexpressions are then stored in the %temps hash. The expression is then rewritten, replacing the subexpression with the temp name. Finally, when we wind up with a single token (variable, value or temp name), we check the temps hash, and if we get a match, we return the parse tree we stowed away there.

      In order to prevent collisions between temp names and variable names, I map the expression to lower case before parsing. So I split the parse function into two bits.

      I built a to_RPN() function by whacking on your original evaluate() routine. Then I made an evaluator for the RPN string. I didn't update your evaluate() function, so I removed it for now. It's a simple modification to bring it up-to-date with the variable assignments, but I gotta get back to work.

      Finally, I handle variables by simply looking at any scalar I pop from the stack to see if it has any alpha characters and if it's in the %vars hash. If so, I look up the appropriate value.

      There's still plenty of room for cleanup, simplification and such, but I thought I'd stop for now. I hope someone finds it amusing...

      A sample run gives:

      $ ./ a:=1*(2+(3/5+2)) RPN: a 1 2 3 5 / 2 + + * := (a set to <4.6>) ====> 4.6 Variables: a:4.6 b:=a+15 RPN: b a 15 + := (b set to <19.6>) ====> 19.6 Variables: a:4.6, b:19.6 c:=(b-a)*(5+3+(9-6)*3) RPN: c b a - 5 3 + 9 6 - 3 * + * := (c set to <255>) ====> 255 Variables: a:4.6, b:19.6, c:255


      When your only tool is a hammer, all problems look like your thumb.

        I was playing to add parenthesis too. But as I have a 'difficult' access to CPAN I didn't tried Text::Balanced. However I reach the regex monkey way to deal with it, just adding a two globals and a new loop on parse():
        my $par; my $par_count = 0; sub parse{ my ($regex,$input)=@_; $input =~ /^__PAREN(\d+)__$/ and return $par->{$1}; $input=~s/\s//g; while( $input =~ /\(([^()]+)\)/ ) { my $sub = $1; $par->{$par_count} = parse($regex, $sub); my $tag = '__PAREN' . $par_count++ . '__'; $input =~ s/\(([^()]+)\)/$tag/; } for(reverse @$regex){ if($input=~m/(.+)($_)(.+)/) { my ($before,$op,$after,$node) = ($1, $2, $3); $node->{$op}=[parse($regex,$before),parse($regex,$after)]; return $node; } } return $input; }
        Your variable handling has amazed me :)

      That is cool. :) I may come back to this parser and make something "smarter" or easier to play with.

Re^3: Perl Parsing Based on Supplied Precedence
by wirito (Acolyte) on Nov 07, 2012 at 09:06 UTC
    With boolean operators there is the same issue. '&&' and 'and' operator has higher precedence that '||' and 'or' (check perlop link on my first post). My approach gives accurate results.
    Then 'COND1&&COND2||COND3' is the same as '(COND1&&COND2)||COND3':
    $ perl COND1&&COND2||COND3 ORIG: { '&&' => [ 'COND1', { '||' => [ 'COND2', 'COND3' ] } ] } NEW: { '||' => [ { '&&' => [ 'COND1', 'COND2' ] }, 'COND3' ] }

      "&&" and "and" have differing precedences. "and" is lower precedence than "&&". The same goes for "||" and "or"; "or" has lower precedence than "||". "and" actually has lower precedence than "||", as I will demonstrate.

      code with "&&":

      perl -e 'if(0&& 0||1){print"hello\n"}'



      code with "and":

      perl -e 'if(0and 0||1){print"hello\n"}'

      (outputs nothing)

        Yes, you are right. I meant to say:
        '&&' and 'and' operator has higher precedence than '||' and 'or' respectively
        English is not my first language as you can notice :)

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1002622]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (4)
As of 2017-10-21 13:21 GMT
Find Nodes?
    Voting Booth?
    My fridge is mostly full of:

    Results (270 votes). Check out past polls.