PerlMonks  

Re^6: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

by dave_the_m (Parson)
on Aug 29, 2012 at 09:22 UTC (#990413)


in reply to Re^5: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
in thread Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

Perl internally parses the code into a tree of OP structures (the "optree"). This is sort of like an AST without the "A" part. But it was never designed to be easily manipulable in the way an AST is supposed to be. Also, the structure and specification of the parsed program isn't wholly contained within the optree; it's also spread out among stashes, globs, pads etc.

When runtime is reached, each OP contains a pointer to the next op in execution sequence (op_next), plus a pointer to a C function that can carry out the action of that op (op_ppaddr). The main execution loop of perl (the "runloop") consists of calling the op_ppaddr function of the current op, then setting the current op to be whatever that function returns; repeat until a NULL is returned.
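As a rough illustration of the loop just described, here is a minimal C sketch. It is not perl's actual source: the `struct op` below keeps only the two fields named above plus an invented `arg` payload, and `pp_add_arg`, `accumulator` and `run_demo` are hypothetical names made up for the example. In perl itself the current op lives in interpreter state rather than being passed in as an argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for perl's OP structure. */
typedef struct op OP;
struct op {
    OP *op_next;                 /* next op in execution sequence */
    OP *(*op_ppaddr)(OP *self);  /* C function that carries out this op */
    int arg;                     /* illustrative payload, not a real field */
};

static int accumulator = 0;      /* toy interpreter state */

/* A toy "pp" function: do the op's work, then return the op to run next. */
static OP *pp_add_arg(OP *self) {
    accumulator += self->arg;
    return self->op_next;
}

/* The runloop: call the current op's function, make whatever it returns
 * the new current op, and repeat until NULL is returned. */
static void runloop(OP *start) {
    OP *op = start;
    while (op != NULL) {
        op = op->op_ppaddr(op);
    }
}

/* Builds a three-op chain (add 1, add 2, add 3), runs it, returns the sum. */
static int run_demo(void) {
    OP c = { NULL, pp_add_arg, 3 };
    OP b = { &c,   pp_add_arg, 2 };
    OP a = { &b,   pp_add_arg, 1 };
    assert(a.op_ppaddr != NULL);
    accumulator = 0;
    runloop(&a);
    return accumulator;
}
```

Note that the loop itself knows nothing about what any op does; all of the behaviour, including which op runs next, is decided inside the function each op points at.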

OPs (and the pp* functions which implement them) are unlike bytecode: bytecode consists of small, lightweight ops, with the expectation that they can be easily converted into equivalent native machine code via JIT etc. Or to put it another way, they don't contain much switching logic themselves: switching is done by adding in switching bytecodes. A tracing JIT can then follow which paths are taken through the bytecode, pick a certain path (no switching), and convert that sequence of bytecode into a sequence of machine instructions.
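To make the contrast concrete, here is a small sketch of what "lightweight ops plus separate switching bytecodes" can look like. It is a toy stack machine invented for this example (the opcodes `BC_PUSH`, `BC_ADD`, `BC_JNZ`, `BC_HALT` and the functions `run` and `run_branch_demo` are all hypothetical), not any real VM. Each arithmetic op does one fixed thing; the only decision point is the explicit `BC_JNZ` instruction. The interpreter also records the program counters it visits, which is roughly the raw material a tracing JIT would work from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine. */
typedef enum { BC_PUSH, BC_ADD, BC_JNZ, BC_HALT } Opcode;
typedef struct { Opcode code; int operand; } Insn;

/* Runs prog, storing the pc of each executed instruction in trace
 * (at most trace_cap entries). Returns the number of instructions
 * executed; the top of stack at BC_HALT is stored in *result. */
static size_t run(const Insn *prog, int *result,
                  size_t *trace, size_t trace_cap) {
    int stack[16];
    size_t sp = 0, pc = 0, n = 0;
    for (;;) {
        const Insn in = prog[pc];
        if (n < trace_cap) trace[n] = pc;
        n++;
        switch (in.code) {
        case BC_PUSH:           /* no decisions: just push */
            assert(sp < 16);
            stack[sp++] = in.operand;
            pc++;
            break;
        case BC_ADD:            /* no decisions: just add two ints */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            pc++;
            break;
        case BC_JNZ:            /* the only switching: pop, jump if nonzero */
            assert(sp >= 1);
            pc = (stack[--sp] != 0) ? (size_t)in.operand : pc + 1;
            break;
        case BC_HALT:
            *result = (sp > 0) ? stack[sp - 1] : 0;
            return n;
        default:                /* unknown opcode: stop */
            *result = 0;
            return n;
        }
    }
}

/* Pushes 5, then either adds 10 (flag == 0) or skips the add (flag != 0). */
static int run_branch_demo(int flag, size_t *trace, size_t cap, size_t *len) {
    Insn prog[] = {
        { BC_PUSH, 5 }, { BC_PUSH, flag }, { BC_JNZ, 5 },
        { BC_PUSH, 10 }, { BC_ADD, 0 }, { BC_HALT, 0 },
    };
    int result = 0;
    *len = run(prog, &result, trace, cap);
    return result;
}
```

The two values of `flag` produce two different recorded paths through the same program. Once a path has been picked, the instructions on it contain no branching of their own, which is what makes a straight-line translation to machine code plausible.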

Perl ops are heavyweight: within each one there may be a lot of switching action. For example, the perl "add" op examines its two args: it checks if they're overloaded, or tied, or have other magic associated with them, and if so handles that. Then it checks whether the args are strings or other non-num things, and if so numifies them. Then it adds them. Then it checks for overflow: if overflow has occurred, it sees whether the overflow could be avoided by upgrading from integer to float, and if so, does so.
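The sequence of checks listed above can be sketched in C roughly as follows. This is an illustration of the shape of a heavyweight op only, not perl's real add implementation: the `Value` type, the `add_hook` pointer standing in for overloading, ties and magic, and the helpers `numify`, `as_double`, `make_int` and `make_str` are all assumptions made for the example, and string conversion is simplified to always go through `strtod`.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

typedef enum { V_INT, V_NUM, V_STR } Kind;

/* Hypothetical scalar value; real perl scalars are far richer. */
typedef struct Value Value;
struct Value {
    Kind kind;
    long iv;            /* valid when kind == V_INT */
    double nv;          /* valid when kind == V_NUM */
    const char *pv;     /* valid when kind == V_STR */
    /* Stand-in for overloading/tie/magic; NULL when there is none. */
    Value (*add_hook)(const Value *l, const Value *r);
};

static Value make_int(long i)        { Value v = { V_INT, i, 0.0, NULL, NULL }; return v; }
static Value make_str(const char *s) { Value v = { V_STR, 0, 0.0, s, NULL };    return v; }

/* Simplification: any string becomes a float via strtod. */
static Value numify(const Value *v) {
    Value out = *v;
    if (v->kind == V_STR) {
        out.kind = V_NUM;
        out.nv = strtod(v->pv, NULL);
    }
    return out;
}

static double as_double(const Value *v) {
    return v->kind == V_INT ? (double)v->iv : v->nv;
}

static Value heavy_add(const Value *l, const Value *r) {
    /* 1. Overloaded, tied or otherwise magical? Hand off to the hook. */
    if (l->add_hook) return l->add_hook(l, r);
    if (r->add_hook) return r->add_hook(l, r);

    /* 2. Strings and other non-numbers get numified. */
    Value a = numify(l), b = numify(r);
    Value out = { V_INT, 0, 0.0, NULL, NULL };

    /* 3. Integer add, with an overflow check made before adding. */
    if (a.kind == V_INT && b.kind == V_INT) {
        int overflow = (b.iv > 0 && a.iv > LONG_MAX - b.iv) ||
                       (b.iv < 0 && a.iv < LONG_MIN - b.iv);
        if (!overflow) {
            out.iv = a.iv + b.iv;
            return out;
        }
        /* 4. Overflow: fall through and upgrade to float instead. */
    }
    out.kind = V_NUM;
    out.nv = as_double(&a) + as_double(&b);
    return out;
}
```

Every one of those branches lives inside a single op, and which ones are taken depends on the runtime values rather than on anything visible in the op sequence.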

(update: so what I meant to say at this point is that the perl optree is a bastard hybrid of an AST and bytecode, and has to serve both purposes, not always comfortably)

B::Bytecode is just a way to serialise the optree (and associated state) into a platform-neutral file. To execute it, the file is read in and used to reconstruct the optree and state, then execution continues as before. It provides no runtime speedup; it was intended to speed up startup by skipping the compilation phase, but in practice few gains were seen.
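The general idea of serialising a pointer-linked op structure and rebuilding it later can be sketched like this. It is not B::Bytecode's actual file format or code: the `struct op` fields, the `Record` layout and the names `freeze` and `thaw` are invented for illustration, and a genuinely platform-neutral format would also have to pin down byte order, which this sketch ignores. The key step is replacing each in-memory pointer with an index that still means something after the data has been written out and read back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory op: linked by raw pointers. */
typedef struct op OP;
struct op {
    OP *op_next;      /* next op in execution sequence */
    int op_type;      /* which operation this is */
    int32_t arg;      /* illustrative payload */
};

/* Hypothetical on-disk record: fixed-width fields, no pointers. */
typedef struct {
    int32_t type;
    int32_t arg;
    int32_t next;     /* index of the next op, or -1 for "none" */
} Record;

/* Serialise n ops; every op_next must point into ops[] or be NULL. */
static void freeze(const OP *ops, size_t n, Record *out) {
    for (size_t i = 0; i < n; i++) {
        out[i].type = (int32_t)ops[i].op_type;
        out[i].arg  = ops[i].arg;
        out[i].next = ops[i].op_next
                          ? (int32_t)(ops[i].op_next - ops)
                          : -1;
    }
}

/* Reconstruct the ops, turning indices back into pointers. */
static void thaw(const Record *in, size_t n, OP *ops) {
    for (size_t i = 0; i < n; i++) {
        assert(in[i].next < (int32_t)n);
        ops[i].op_type = (int)in[i].type;
        ops[i].arg     = in[i].arg;
        ops[i].op_next = in[i].next >= 0 ? &ops[in[i].next] : NULL;
    }
}
```

After `thaw` the ops are ordinary in-memory structures again, so they run through exactly the same execution path as freshly compiled ones, which is consistent with there being no runtime speedup.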

Dave


Re^7: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by Will_the_Chill (Pilgrim) on Aug 29, 2012 at 10:28 UTC
    Dave,

    I appreciate the optree lesson! :)

    I've used B::Bytecode, so I know that it doesn't always improve startup time (but sometimes it does).

    Now I think I understand a bit more about the importance of moving the magic ("switching"?) from the ops to the SVs, as well as the difficulty of converting from optree to some intermediate form like an actual AST or bytecode or anything useful outside of Perlguts.

    I'm sure this info will help in our development planning.

    Thanks,
    ~ Will
