comment on

I have something similar in an interpreter I started to write ages ago in perl, but has abandonned it. The interpreter works but cannot interpret user-defined functions which makes it unsuitable for most purposes. I am now not motivated to continue it.

You can download the interpreter but I also copy some relevant parts of the code here. You can download the full version and run it with -p to dump the optree instead of interpretting it. (Piping the dump through | sed 's/\bline( /\n&/g' may make it more readable.)

For this, you have to know that the get_tok function gets the next token as a string, and unget_tok pushes a token back so that it's the next token get_tok will read. The following code will parse expressions, and is called at various places from the code that parses statements.

{
my($pars_expr1,$pars_expr2);
my %binops= (
    # fun(args), var[index], and var.elt will be handled specially in 
+code
    "+", "add", "-", "sub", "*", "mul", 
    "/", "div", "div", "idiv", "mod", "mod",
    "<", "lt", "=", "eq", ">", "gt", "<=", "le", "<>", "ne", ">=", "ge
+",
    "es", "and", "vagy", "or", 
    );
my %binoppr= (
    "+", 50, "-", 50, "*", 40, 
    "/", 40, "div", 40, "mod", 40,
    "<", 60, "=", 60, ">", 60, "<=", 60, "<>", 60, ">=", 60,
    "es", 40, "vagy", 50, 
    # all operators with the same precedence are left-to-right associa
+tive,
    # eg: a-b-c ==> (a-b)-c
    );
my %unops= (
    # (expr), +expr are handled specially by code
    "-", "neg", "nem", "not", 
    );
my %unoppr= (
    "-", 30, "nem", 30,
    );
my %unfunc= (    
    "kerekit", "round", "round", "round", "egeszresz", "floor",
    "sqr", "sqr", "sqrt", "sqrt",
    );
# More precedences hard-coded in the code below:
# maximal:137, unary+:30, unary():-INF, unary functions:30,
# binary[]:20, binary.:20, binary():20
$pars_expr1= sub {
    my($t,$f,$x,$u,$p,$g);
    ($p,)= @_;
    $t= get_tok ();
    $t=~ m!^\d! and
        return [$t=~/[.eE]/?"qf":"qi", 0+$t];
    is_id $t and
        return ["var", $t];
    $f= $unops{$t} and do {
        $g= $unoppr{$t};
        return [$f, $pars_expr2->($g<$p?$g:$p)];
        };
    $t eq "(" and do {
        $x= $pars_expr2->(137);
        ($u= get_tok) eq ")" or
            die qq!error parsing expr at "$u": ")" expected!;
        return $x;
        };
    $t eq "+" and
        return $pars_expr2->(30<$p?30:$p);
    $f= $unfunc{$t} and 30<=$p and do {
        ($u= get_tok) eq "(" or
            die qq!parse error at "$u": "(" expected!;
        $x= $pars_expr2->(137);
        ($u= get_tok) eq ")" or
            die qq!parse error at "$u": ")" expected!;
        return [$f, $x];
        };
    $t eq "'" and
        return ["qs", get_tok];
    die qq(error parsing expr at "$t": expr expected);
    };
$pars_expr2= sub {
    my($o,$f,$y,$x,$u,$p,$g);
    ($p,)= @_;
    $x= $pars_expr1->($p);
    {
    $o= get_tok;
    $f= $binops{$o} and
        ($g= $binoppr{$o})<=$p and 
        do {
        $y= $pars_expr2->($g-1);
        $x= [$f, $x, $y];
        redo;
        };
    $o eq "." and 20<=$p and
        do {
        $$x[0] eq "var" or
            die qq(parse error at ".": expression cannot be dotted);
        $y= get_tok;
        is_id $y or
            die qq(parse error at "$y": record field expected);
        push @$x, $y;
        redo;
        };
    $o eq "[" and 20<=$p and
        do {
        $$x[0] eq "var" or
            die qq(parse error at "[": expression cannot be subscripte
+d);
        $y= $pars_expr2->(137);
        ($u= get_tok) eq "]" or
            die qq(parse error at "$u": "]" expected);
        push @$x, $y;
        redo;
        };
    }; 
    unget_tok $o;
    $x;
    };
sub pars_expr () {
    $pars_expr2->(137);
    };
sub pars_lvalue () {
    $pars_expr2->(20);
    };
};
[download]

Another interpreter I wrote in C++ as a school project also has a similar parsing routine.

Oh well, I don't think anyone would want to read this post this way. I think I'll have to make a simple example parser that works the same way as these two and post it as a meditation if I want any attention.

In reply to Re: Operator Precedence Parser by ambrus
in thread Operator Precedence Parser by bart

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


The stupid question is the question not asked
	PerlMonks