Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re^4: Perl Parsing Based on Supplied Precedence

by roboticus (Chancellor)
on Nov 07, 2012 at 16:23 UTC ( #1002689=note: print w/ replies, xml ) Need Help??


in reply to Re^3: Perl Parsing Based on Supplied Precedence
in thread Perl Parsing Based on Supplied Precedence

wirito:

That was an amusing diversion. I played around with it and added a couple of features for fun:

  • Parenthesis now work for grouping
  • Now we have variables
  • It emits the RPN version of the expression

I tried monkeying with regexes to add parenthesis, and then remembered Text::Balanced, so I used that to extract the subexpressions. The parsed subexpressions are then stored in the %temps hash. The expression is then rewritten, replacing the subexpression with the temp name. Finally, when we wind up with a single token (variable, value or temp name), we check the temps hash, and if we get a match, we return the parse tree we stowed away there.

In order to prevent collisions between temp names and variable names, I map the expression to lower case before parsing. So I split the parse function into two bits.

I built a to_RPN() function by whacking on your original evaluate() routine. Then I made an evaluator for the RPN string. I didn't update your evaluate() function, so I removed it for now. It's a simple modification to bring it up-to-date with the variable assignments, but I gotta get back to work.

Finally, I handle variables by simply looking at any scalar I pop from the stack to see if it has any alpha characters and if it's in the %vars hash. If so, I look up the appropriate value.

There's still plenty of room for cleanup, simplification and such, but I thought I'd stop for now. I hope someone finds it amusing...

$ cat expression_evaluator.pl #!/usr/bin/perl use warnings; use strict; use Data::Dumper; use Scalar::Util qw( looks_like_number ); use Text::Balanced qw( extract_bracketed ); $Data::Dumper::Terse=1; my $precedence_perlop=[ qr/(?:\/|\*|\%|x)/, qr/(?:\+|-|\.)/, qr/(?:<=|>=|<|>lt|gt|le|ge)/, qr/&/, qr/(?:\||\^)/, qr/&&/, qr/(?:\|\||\/\/)/, qr/(not)/, qr/(and)/, qr/(or|xor)/, qr/(?::=)/, ]; my %vars; my %temps; my $last_temp="TEMP000"; sub parse { # Map all variable names to lower case, so temp values in upper ca +se # won't collide my ($regex,$input)=@_; $input = lc($input); # Reset temporaries $last_temp="TEMP000"; return parse_helper($regex, $input); } sub parse_helper{ my ($regex,$input)=@_; $input=~s/\s//g; # parse subexpressions into temps if ($input=~m/(.*?)(\(.*)/) { my $cur_temp = $last_temp++; # get next temp # Split into "$before ($subexpr) $after" my ($before, $nested, $node) = ($1, $2); my ($subexpr, $after) = Text::Balanced::extract_bracketed($nested, +"()"); $subexpr = substr($subexpr,1,length($subexpr)-2); # clip parens # Parse & store temporary variable $temps{$cur_temp} = parse_helper($regex, $subexpr); # Rewrite expression with temporary variable name $input = $before . $cur_temp . $after; } # parse as usual for(reverse @$regex){ if($input=~m/(.+)($_)(.+)/){ my ($before,$op,$after,$node)=($1,$2,$3); $node->{$op}=[parse_helper($regex,$before),parse_helper($regex,$ +after)]; return $node; } } # Return temporary expression tree or scalar value if DNE return exists $temps{$input} ? $temps{$input} : $input; } sub to_RPN { my $tree = shift; return $tree unless ref $tree; foreach my $op ( keys %{$tree} ) { my @terms = map { to_RPN($_) } @{ $tree->{$op} }; return join(" ", @terms, $op); } } sub evaluate_RPN { my $RPN = shift; my @stack; my $pop = sub { die "EMPTY STACK! (malformed expression...)" unless @s +tack; my $t = pop @stack; if ($t =~ /^[a-z][a-z0-9]*$/) { $t = $vars{$t} // 0; } return $t; }; for (split /\s+/, $RPN) { if ($_ =~ /^[a-z0-9.]*$/) { # Value or variable name push @stack, $_; } elsif ($_ eq ':=') { # Variable assignment my $value = $pop->(); my $varname = pop @stack; $vars{$varname} //= $value; print " ($varname set to <$value>)"; return $value; } else { my $R = $pop->(); #pop @stack; my $L = $pop->(); #pop @stack; push @stack, eval "$L $_ $R"; } } die "EXTRA JUNK ON STACK! (malformed expression)" unless @stack == 1 +; return $stack[-1]; } while(<>){ s/^\s+//; s/\s+$//; my $tree = parse($precedence_perlop,$_); my $RPN = to_RPN($tree); print "RPN: $RPN"; print " ====> ", evaluate_RPN($RPN), "\n"; print "Variables: ", join(", ", map { "$_:$vars{$_}" } sort keys %va +rs), "\n"; }

A sample run gives:

$ ./expression_evaluator.pl a:=1*(2+(3/5+2)) RPN: a 1 2 3 5 / 2 + + * := (a set to <4.6>) ====> 4.6 Variables: a:4.6 b:=a+15 RPN: b a 15 + := (b set to <19.6>) ====> 19.6 Variables: a:4.6, b:19.6 c:=(b-a)*(5+3+(9-6)*3) RPN: c b a - 5 3 + 9 6 - 3 * + * := (c set to <255>) ====> 255 Variables: a:4.6, b:19.6, c:255

...roboticus

When your only tool is a hammer, all problems look like your thumb.


Comment on Re^4: Perl Parsing Based on Supplied Precedence
Select or Download Code
Replies are listed 'Best First'.
Re^5: Perl Parsing Based on Supplied Precedence
by wirito (Acolyte) on Nov 07, 2012 at 16:42 UTC
    Cool!
    I was playing to add parenthesis too. But as I have a 'difficult' access to CPAN I didn't tried Text::Balanced. However I reach the regex monkey way to deal with it, just adding a two globals and a new loop on parse():
    my $par; my $par_count = 0; sub parse{ my ($regex,$input)=@_; $input =~ /^__PAREN(\d+)__$/ and return $par->{$1}; $input=~s/\s//g; while( $input =~ /\(([^()]+)\)/ ) { my $sub = $1; $par->{$par_count} = parse($regex, $sub); my $tag = '__PAREN' . $par_count++ . '__'; $input =~ s/\(([^()]+)\)/$tag/; } for(reverse @$regex){ if($input=~m/(.+)($_)(.+)/) { my ($before,$op,$after,$node) = ($1, $2, $3); $node->{$op}=[parse($regex,$before),parse($regex,$after)]; return $node; } } return $input; }
    Your variable handling has amazed me :)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1002689]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (15)
As of 2015-07-28 18:41 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (258 votes), past polls