<?xml version="1.0" encoding="windows-1252"?>
<node id="991006" title="Re^5: Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges" created="2012-08-31 09:36:06" updated="2012-08-31 09:36:06">
<type id="11">
note</type>
<author id="341121">
dave_the_m</author>
<data>
<field name="doctext">
Perhaps if I make my question a bit more specific, it will help clarify things?
(I'll explain the perl runtime in detail for anyone following along at
home).
&lt;p&gt;
Consider the expression &lt;c&gt;$x + $y * 3&lt;/c&gt;. This is compiled into an op
tree, where (amongst other things) each op struct holds the info needed
for that op, but also a pointer (op_next) to the next op in the execution
sequence, and a pointer (op_ppaddr) to the C function that knows how to
"execute" that op. So the op sequence for the above looks a bit like:
&lt;c&gt;
1: OP_PADSV
   op_targ   = 1
   op_ppaddr = Perl_pp_padsv
   op_next   = 2

2: OP_PADSV
   op_targ   = 2
   op_ppaddr = Perl_pp_padsv
   op_next   = 3

3: OP_CONST
   op_sv     = [an SV holding the value 3]
   op_ppaddr = Perl_pp_const
   op_next   = 4

4: OP_MULTIPLY
   op_ppaddr = Perl_pp_multiply
   op_next   = 5

5: OP_ADD
   op_ppaddr = Perl_pp_add
   op_next   = 6
&lt;/c&gt;
&lt;p&gt;
The pp_functions themselves look a bit like the following (hugely
over-simplified of course):
&lt;p&gt;
&lt;c&gt;
OP * Perl_pp_padsv(void) {
    /* push the lexical variable held in pad slot op_targ */
    *PL_stack_sp++ = PL_curpad[PL_op-&gt;op_targ];
    return PL_op-&gt;op_next;
}
OP * Perl_pp_const(void) {
    /* push the constant SV attached to the op */
    *PL_stack_sp++ = PL_op-&gt;op_sv;
    return PL_op-&gt;op_next;
}
OP * Perl_pp_multiply(void) {
    /* pop both operands, push their product */
    SV *s1 = *--PL_stack_sp;
    SV *s2 = *--PL_stack_sp;
    SV *s3 = (a new or reused SV of some description);
    SvIVX(s3) = SvIVX(s1) * SvIVX(s2);
    *PL_stack_sp++ = s3;
    return PL_op-&gt;op_next;
}
OP * Perl_pp_add(void) {
    /* pop both operands, push their sum */
    SV *s1 = *--PL_stack_sp;
    SV *s2 = *--PL_stack_sp;
    SV *s3 = (a new or reused SV of some description);
    SvIVX(s3) = SvIVX(s1) + SvIVX(s2);
    *PL_stack_sp++ = s3;
    return PL_op-&gt;op_next;
}
&lt;/c&gt;
&lt;p&gt;
And finally, the runops loop looks a bit like:
&lt;p&gt;
&lt;c&gt;
Perl_runops_standard {
    PL_op = ...;
    while (PL_op) {
        PL_op = PL_op-&gt;op_ppaddr();
    }
}
&lt;/c&gt;
&lt;p&gt;
So the net effect is that perl is in a loop, calling various pp_*
functions, whose job is to push useful values onto the perl stack and
pop them off again.
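&lt;p&gt;
For anyone who wants to poke at this, the loop-plus-pp-functions scheme
above can be mimicked in a few lines of standalone C. This is only a toy
sketch, not perl source: plain ints stand in for SVs, array indexes stand
in for the op_next pointers, and every name in it (toy_op, run_expr and so
on) is made up for illustration.
&lt;p&gt;
```c
/* Toy model of the op chain for $x + $y * 3.
   Plain ints stand in for SVs and array indexes stand in for op_next
   pointers. All names here are illustrative, not perl's. */
enum { OP_END = -1, STACK_MAX = 16 };

typedef struct toy_op {
    int (*ppaddr)(void);  /* function that executes this op */
    int targ;             /* pad slot, used by padsv only */
    int value;            /* constant value, used by const only */
    int next;             /* index of the next op, or OP_END */
} toy_op;

static int pp_padsv(void);
static int pp_const(void);
static int pp_multiply(void);
static int pp_add(void);

static const toy_op ops[] = {
    { pp_padsv,    1, 0, 1 },       /* 0: push $x */
    { pp_padsv,    2, 0, 2 },       /* 1: push $y */
    { pp_const,    0, 3, 3 },       /* 2: push 3 */
    { pp_multiply, 0, 0, 4 },       /* 3: $y * 3 */
    { pp_add,      0, 0, OP_END },  /* 4: $x + ($y * 3) */
};

static int pad[3];            /* pad[1] is $x, pad[2] is $y */
static int stack[STACK_MAX];  /* stand-in for the perl stack */
static int sp;                /* index of the next free stack slot */
static int cur;               /* index of the current op, like PL_op */

static int pp_padsv(void)
{
    stack[sp++] = pad[ops[cur].targ];
    return ops[cur].next;
}

static int pp_const(void)
{
    stack[sp++] = ops[cur].value;
    return ops[cur].next;
}

static int pp_multiply(void)
{
    int right = stack[--sp];
    int left = stack[--sp];
    stack[sp++] = left * right;
    return ops[cur].next;
}

static int pp_add(void)
{
    int right = stack[--sp];
    int left = stack[--sp];
    stack[sp++] = left + right;
    return ops[cur].next;
}

/* The runops loop: keep calling the current op's function until the
   chain ends, then return whatever is left on top of the stack. */
int run_expr(int x, int y)
{
    pad[1] = x;
    pad[2] = y;
    sp = 0;
    cur = 0;
    while (cur != OP_END)
        cur = ops[cur].ppaddr();
    return stack[--sp];
}
```
&lt;p&gt;
Calling run_expr(2, 5) walks all five ops in turn and yields 2 + 5 * 3.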
&lt;p&gt;
Now, under your proposal, you'll have all the pp functions (and all the
functions they depend upon, such as the hash library) compiled into IR and
available to you. What do you do with the op tree? Yuval's proposal is to
effectively compile the following C into IR:
&lt;p&gt;
&lt;c&gt;
    PL_op = Perl_pp_padsv(PL_op);
    PL_op = Perl_pp_padsv(PL_op);
    PL_op = Perl_pp_const(PL_op);
    PL_op = Perl_pp_multiply(PL_op);
    PL_op = Perl_pp_add(PL_op);
&lt;/c&gt;
&lt;p&gt;
except that he would use modified versions of the pp functions that take
and return their args directly, rather than getting them on and off the
stack. Then LLVM has access to an IR version of the unrolled runops loop
and to all the functions in IR, and can do its funky stuff with them.
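&lt;p&gt;
As a rough illustration of what that unrolled, stack-free form might boil
down to, here is a hypothetical C sketch. The *_direct functions and
compiled_expr are invented names, and plain ints stand in for SVs; it says
nothing about how Yuval would actually structure the real thing.
&lt;p&gt;
```c
/* Hypothetical direct-argument variants of the pp functions: operands
   arrive as parameters and the result is returned, so no perl stack
   is involved. Plain ints stand in for SVs. */
static int pp_multiply_direct(int left, int right)
{
    return left * right;
}

static int pp_add_direct(int left, int right)
{
    return left + right;
}

/* What the unrolled op sequence for $x + $y * 3 might reduce to once
   each op is a direct call instead of a trip round the runops loop. */
int compiled_expr(int x, int y)
{
    int product = pp_multiply_direct(y, 3);
    return pp_add_direct(x, product);
}
```
&lt;p&gt;
With everything visible as straight-line calls like this, an optimizer is
free to inline the two helpers and fold the whole thing into a multiply
and an add.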
&lt;p&gt;
So, with that background, what would your IR generated from the op tree
look like? Is it just an unrolled runops loop for a single sub, with lots
of explicit calls to pp-ish functions, or would you try and unroll the pp
functions themselves, or something completely different?
&lt;p&gt;
Dave
</field>
<field name="root_node">
990666</field>
<field name="parent_node">
990881</field>
</data>
</node>
