Re^5: Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges

in reply to Re^4: Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges
in thread Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges

Perhaps making my question a bit more specific will help clarify things. (I'll explain the perl runtime in detail for anyone following along at home.)

Consider the expression $x + $y * 3. This is compiled into an op tree, where (amongst other things) each op struct holds the info needed for that op, but also a pointer (op_next) to the next op in the execution sequence, and a pointer (op_ppaddr) to the C function that knows how to "execute" that op. So the op sequence for the above looks a bit like:

1: OP_PADSV
   op_targ   = 1
   op_ppaddr = Perl_pp_padsv
   op_next   = 2

2: OP_PADSV
   op_targ   = 2
   op_ppaddr = Perl_pp_padsv
   op_next   = 3

3: OP_CONST
   op_sv     = [an SV holding the value 3]
   op_ppaddr = Perl_pp_const
   op_next   = 4

4: OP_MULTIPLY
   op_ppaddr = Perl_pp_multiply
   op_next   = 5

5: OP_ADD
   op_ppaddr = Perl_pp_add
   op_next   = 6

The pp_functions themselves look a bit like the following (hugely over-simplified of course):

OP * Perl_pp_padsv(void) {
    /* push the lexical variable stored in pad slot op_targ */
    *PL_stack_sp++ = PL_curpad[PL_op->op_targ];
    return PL_op->op_next;
}
OP * Perl_pp_const(void) {
    /* push the constant SV attached to this op */
    *PL_stack_sp++ = PL_op->op_sv;
    return PL_op->op_next;
}
OP * Perl_pp_multiply(void) {
    /* pop both operands, push their product */
    SV *s1 = *--PL_stack_sp;
    SV *s2 = *--PL_stack_sp;
    SV *s3 = (a new or reused SV of some description);
    SvIVX(s3) = SvIVX(s1) * SvIVX(s2);
    *PL_stack_sp++ = s3;
    return PL_op->op_next;
}
OP * Perl_pp_add(void) {
    /* pop both operands, push their sum */
    SV *s1 = *--PL_stack_sp;
    SV *s2 = *--PL_stack_sp;
    SV *s3 = (a new or reused SV of some description);
    SvIVX(s3) = SvIVX(s1) + SvIVX(s2);
    *PL_stack_sp++ = s3;
    return PL_op->op_next;
}

And finally, the runops loop looks a bit like:

int Perl_runops_standard(void) {
    PL_op = ...;
    /* each pp function returns the next op to run; NULL ends the loop */
    while (PL_op) {
        PL_op = PL_op->op_ppaddr();
    }
    return 0;
}

So the net effect is that perl sits in a loop, calling various pp_* functions, whose job is to push useful values onto the perl stack and pop them off again.
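For anyone who wants to poke at this, here is a minimal self-contained sketch of the same three pieces (op structs, pp functions, runops loop) for $x + $y * 3. It is a toy model, not perl source: every toy_* name, the ToySV and ToyOP types, the fixed-size stack and the pool of temporary SVs are assumptions made up for illustration, and it only handles integers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for perl's SV and OP; every toy_* name is hypothetical. */
typedef struct { long iv; } ToySV;

typedef struct toy_op ToyOP;
struct toy_op {
    ToyOP *(*op_ppaddr)(void); /* C function that executes this op */
    ToyOP  *op_next;           /* next op in execution order */
    size_t  op_targ;           /* pad slot, used by padsv only */
    ToySV  *op_sv;             /* constant value, used by const only */
};

enum { TOY_STACK_MAX = 16, TOY_TEMPS_MAX = 8 };

static ToySV   toy_pad[3];               /* slots 1 and 2 hold $x and $y */
static ToySV   toy_temps[TOY_TEMPS_MAX]; /* pool of result SVs */
static size_t  toy_ntemps;
static ToySV  *toy_stack[TOY_STACK_MAX];
static ToySV **toy_sp = toy_stack;       /* plays the role of PL_stack_sp */
static ToyOP  *toy_op;                   /* plays the role of PL_op */

static void toy_push(ToySV *sv) {
    assert(toy_sp < toy_stack + TOY_STACK_MAX);
    *toy_sp++ = sv;
}

static ToySV *toy_pop(void) {
    assert(toy_sp > toy_stack);
    return *--toy_sp;
}

static ToySV *toy_new_temp(void) {
    assert(toy_ntemps < TOY_TEMPS_MAX);
    return &toy_temps[toy_ntemps++];
}

static ToyOP *toy_pp_padsv(void) {
    toy_push(&toy_pad[toy_op->op_targ]);
    return toy_op->op_next;
}

static ToyOP *toy_pp_const(void) {
    toy_push(toy_op->op_sv);
    return toy_op->op_next;
}

static ToyOP *toy_pp_multiply(void) {
    ToySV *right = toy_pop();
    ToySV *left  = toy_pop();
    ToySV *res   = toy_new_temp();
    res->iv = left->iv * right->iv;
    toy_push(res);
    return toy_op->op_next;
}

static ToyOP *toy_pp_add(void) {
    ToySV *right = toy_pop();
    ToySV *left  = toy_pop();
    ToySV *res   = toy_new_temp();
    res->iv = left->iv + right->iv;
    toy_push(res);
    return toy_op->op_next;
}

/* The runops loop: keep calling op_ppaddr until an op returns NULL. */
static void toy_runops(ToyOP *start) {
    toy_op = start;
    while (toy_op)
        toy_op = toy_op->op_ppaddr();
}

/* Build the five-op chain for $x + $y * 3 and execute it. */
long toy_eval(long x, long y) {
    static ToySV three = { 3 };
    ToyOP ops[5] = {
        { toy_pp_padsv,    NULL, 1, NULL   },
        { toy_pp_padsv,    NULL, 2, NULL   },
        { toy_pp_const,    NULL, 0, &three },
        { toy_pp_multiply, NULL, 0, NULL   },
        { toy_pp_add,      NULL, 0, NULL   },
    };
    size_t i;

    for (i = 0; i + 1 < 5; i++)      /* last op keeps op_next == NULL */
        ops[i].op_next = &ops[i + 1];

    toy_pad[1].iv = x;
    toy_pad[2].iv = y;
    toy_sp = toy_stack;
    toy_ntemps = 0;

    toy_runops(&ops[0]);
    return toy_pop()->iv;
}
```

Calling toy_eval(10, 4) walks the chain padsv, padsv, const, multiply, add and returns 22. The only thing connecting one op to the next is the op_next pointer and whatever has been left on the stack.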

Now, under your proposal, you'll have all the pp functions (and all the functions they depend upon, such as the hash library) compiled into IR and available to you. What do you do with the op tree? Yuval's proposal is to effectively compile the following C into IR:

    PL_op = Perl_pp_padsv(PL_op);
    PL_op = Perl_pp_padsv(PL_op);
    PL_op = Perl_pp_const(PL_op);
    PL_op = Perl_pp_multiply(PL_op);
    PL_op = Perl_pp_add(PL_op);

except that he would use modified versions of the pp functions that take and return their args directly, rather than pushing them onto and popping them off the stack. Then LLVM has access to an IR version of the unrolled runops loop and to all the functions in IR, and can do its funky stuff with them.
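To make "take and return their args directly" concrete, here is a rough sketch of what the unrolled, stack-free form of $x + $y * 3 might look like in C. This is my guess at the general shape, not Yuval's actual design: the DirectSV type, the direct_* names and the by-value argument passing are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical integer-only value type; not perl's SV. */
typedef struct { long iv; } DirectSV;

/* pp-ish functions that receive operands as arguments and return results,
 * with no shared stack and no op_next pointer to follow. */
static DirectSV direct_pp_padsv(const DirectSV *pad, size_t targ) {
    assert(pad != NULL);
    return pad[targ];
}

static DirectSV direct_pp_const(long value) {
    DirectSV sv;
    sv.iv = value;
    return sv;
}

static DirectSV direct_pp_multiply(DirectSV left, DirectSV right) {
    DirectSV res;
    res.iv = left.iv * right.iv;
    return res;
}

static DirectSV direct_pp_add(DirectSV left, DirectSV right) {
    DirectSV res;
    res.iv = left.iv + right.iv;
    return res;
}

/* The unrolled "runops loop" for $x + $y * 3: one straight-line sequence
 * of calls, so every data dependency is visible to the compiler. */
long direct_eval(long x, long y) {
    DirectSV pad[3];
    DirectSV sx, sy, three, product, sum;

    pad[0].iv = 0;   /* slot 0 unused, mirroring op_targ = 1 and 2 */
    pad[1].iv = x;
    pad[2].iv = y;

    sx      = direct_pp_padsv(pad, 1);
    sy      = direct_pp_padsv(pad, 2);
    three   = direct_pp_const(3);
    product = direct_pp_multiply(sy, three);
    sum     = direct_pp_add(sx, product);
    return sum.iv;
}
```

direct_eval(10, 4) returns 22, same as the stack-based version would. The difference that matters for an optimizer is that the operands now flow through ordinary function arguments and return values, so inlining and constant folding can see straight through the whole expression.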

So, with that background, what would your IR generated from the op tree look like? Is it just an unrolled runops loop for a single sub, with lots of explicit calls to pp-ish functions, or would you try and unroll the pp functions themselves, or something completely different?

Dave
