Re^2: Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emergesby BrowserUk (Pope)
|on Aug 30, 2012 at 15:59 UTC||Need Help??|
The third thing to do with LLVM, (which is what I think BrowserUK is advocating, but I may be wrong), is the wholesale replacement of perl's current runtime, making perl's parser convert the op tree to LLVM IR "bytecode", and thus making a perl a first-class Perl-to-LLVM compiler, in the same way that clang is a first-class C-to-LLVM compiler. This would be a massive undertaking, involving understanding all the subtleties and intricacies of 25,000 lines of pp_* functions, and writing code that will emit equivalent IR - even assuming that you kept the rest of perl the same (i.e. still used SVs, HVs, the context stack, pads etc).
You're about 1/3rd of the way to what I was trying to suggest as a possibility. I'm going to try again. I hope you have the patience to read it. I'm going to start with an unrealistic scenario for simplicity and try to fill in the gaps later.
Starting with a syntactically correct perl source that is entirely self-contained -- uses no modules or pragmas; no evals; no runtime code generation of any kind -- there are (notionally, no matter how hard it is linguistically to describe them; or practically to separate them ), three parts involved in the running of that program:
Part 2. The only new code that needs to be written. But even this already exists in the form of -MO-Deparse.
New code is adapted from Deparse. It processes the PCG in the normal way, but instead of (re)generating the Perl source code, it generates a LLVM IF "copy" of the PCG it is given. Let's call that the LLCG.
The LLCG is now the program we started with, but in a platform independent, optimisible (also platform independent) form that can be
I hope that is clearer than my previous attempts at description.
I'm fully aware that perl frequently reinvokes the parser and runloop in the process of compiling the source of a program, in order to deal with used modules and pragmas and BEGIN/CHECK/UNITCHECK/INIT/END blocks. Effectively, each alternation or recursion would be processed the same way as the above standalone program. If the module has previously been save in .bc form, the parsing and PCG->LLCG conversion can be skipped.
The first step, and perhaps the hardest part getting started, would be the re-engineering the existing build process -- and a little tweaking of the source files -- to break apart the code needed for parts 1 & 2 from part 3, so they can be built into separate dlls -- and the latter into PRT.bc. This process may result in some duplication as perl tends to use internally some of the same stuff that it provides to Perl programs as runtime.
These modifications to the build process and splitting out of the parser/PCG generation from the runtime could be done and used by the next release (or the one after that) of the existing Perl distribution. without compromising it.
It would not be trivial and it would require some one with excellent knowledge of both the internals and the build process -- ie. YOU! -- but it wouldn't be a huge job, and it needn't be a throwaway if all the rest failed or went nowhere. It might even benefit the existing code base and build system in its own right.
I'm done. If that fails to clarify or persuade, so be it. I'll respond to direct questions should there be any, but no more attempts to change anyones mind :)
In the unlikely event you read to here, thank you for your time and courtesy.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".