http://www.perlmonks.org?node_id=990790


in reply to Re: Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges
in thread Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges

The third thing to do with LLVM, (which is what I think BrowserUK is advocating, but I may be wrong), is the wholesale replacement of perl's current runtime, making perl's parser convert the op tree to LLVM IR "bytecode", and thus making perl a first-class Perl-to-LLVM compiler, in the same way that clang is a first-class C-to-LLVM compiler. This would be a massive undertaking, involving understanding all the subtleties and intricacies of 25,000 lines of pp_* functions, and writing code that will emit equivalent IR - even assuming that you kept the rest of perl the same (i.e. still used SVs, HVs, the context stack, pads etc).

You're about 1/3rd of the way to what I was trying to suggest as a possibility. I'm going to try again. I hope you have the patience to read it. I'm going to start with an unrealistic scenario for simplicity and try to fill in the gaps later.

Starting with a syntactically correct perl source that is entirely self-contained -- uses no modules or pragmas; no evals; no runtime code generation of any kind -- there are (notionally, no matter how hard it is linguistically to describe them, or practically to separate them) three parts involved in the running of that program:

  1. The parsing of the source code and construction of the perl internal form -- call it a tree or graph; bytecode or opcodes -- for want of a term and some short-hand, the Perl Code Graph (PCG).

    Part 1 cannot be changed. It *is* Perl. So segregate it (I know; I know) out into a separately compiled and linked, native code unit.

    A dll (loaded by the minimal perl.exe much as perl5.x.dll is today), that reads the source file and builds exactly whatever it builds now, and then gets the hell out of dodge, leaving the PCG behind in memory.

  2. The interpreter proper -- the runloop -- that processes the PCG and dispatches to the Perl runtime (PRT).

    Moved below, because I need you to understand the context above and below, before the description of this middle bit will make sense.

  3. The Perl runtime -- the functions that do the actual work.

    Part 3 is very hard to re-code, as much of the behavioral semantics of perl is encapsulated entirely within it.

    So, give the whole kit&caboodle -- all the pp_* source code and dependencies -- to LLVM using its C front end, to process into LLVM intermediate form (IF), and then pass that through the various IF optimising stages until it can do no more, and then write it in its optimised IF form to a file (PRT.bc).

    This process is done once (for each release) by "the devs". The optimised PRT.bc file is platform independent and can be distributed as part of the build -- at the risk of the hackles it will raise including mine -- a bit like MSCRT.dll, but platform independent.

    This single binary file contains all the 'dispatched to' functions and their dependencies, pre-optimised as far as that can go, but still in portable IF form.

Part 2. The only new code that needs to be written. But even this already exists in the form of -MO=Deparse.

New code is adapted from Deparse. It processes the PCG in the normal way, but instead of (re)generating the Perl source code, it generates a LLVM IF "copy" of the PCG it is given. Let's call that the LLCG.
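One possible reading of that walker, as a toy sketch only: follow the ops in execution order and emit one IR-style call per op instead of Perl source. `ToyOp`, `emit_calls` and the shape of the emitted line are hypothetical names and choices made up for this illustration; the post does not specify what the generated IF would contain, and nothing here comes from B::Deparse or the LLVM API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily reduced stand-in for a perl op. */
typedef struct toy_op {
    const struct toy_op *next;   /* execution order, like op_next */
    const char *pp_name;         /* name of the pp function for this op */
} ToyOp;

/* Walk the chain and write one IR-style call line per op into buf.
 * Returns the number of ops emitted, or -1 if buf is too small. */
int emit_calls(const ToyOp *op, char *buf, size_t size)
{
    size_t used = 0;
    int n = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    for (; op != NULL; op = op->next, n++) {
        int w = snprintf(buf + used, size - used,
            "%%op%d = call %%struct.op* @%s(%%struct.interpreter* %%my_perl)\n",
            n + 1, op->pp_name);
        if (w < 0 || (size_t)w >= size - used)
            return -1;           /* output was truncated */
        used += (size_t)w;
    }
    return n;
}
```

Given a two-op chain (a padsv followed by a const), `emit_calls` fills the buffer with two call lines and returns 2. A real walker would also have to deal with branching ops, whose successor is only known at run time.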

The LLCG is now the program we started with, but in a platform independent, optimisible (also platform independent) form that can be

I hope that is clearer than my previous attempts at description.

I'm fully aware that perl frequently reinvokes the parser and runloop in the process of compiling the source of a program, in order to deal with used modules and pragmas and BEGIN/CHECK/UNITCHECK/INIT/END blocks. Effectively, each alternation or recursion would be processed the same way as the above standalone program. If the module has previously been saved in .bc form, the parsing and PCG->LLCG conversion can be skipped.

The first step, and perhaps the hardest part of getting started, would be re-engineering the existing build process -- and a little tweaking of the source files -- to break apart the code needed for parts 1 & 2 from part 3, so they can be built into separate dlls -- and the latter into PRT.bc. This process may result in some duplication, as perl tends to use internally some of the same stuff that it provides to Perl programs as runtime.

These modifications to the build process and splitting out of the parser/PCG generation from the runtime could be done and used by the next release (or the one after that) of the existing Perl distribution without compromising it.

It would not be trivial and it would require someone with excellent knowledge of both the internals and the build process -- i.e. YOU! -- but it wouldn't be a huge job, and it needn't be a throwaway if all the rest failed or went nowhere. It might even benefit the existing code base and build system in its own right.

I'm done. If that fails to clarify or persuade, so be it. I'll respond to direct questions should there be any, but no more attempts to change anyone's mind :)

In the unlikely event you read to here, thank you for your time and courtesy.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

RIP Neil Armstrong

Re^3: Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges
by chromatic (Archbishop) on Aug 30, 2012 at 16:49 UTC

    I understand that strategy, and it's probably easier to start to see results than most other approaches, but keep in mind a few drawbacks:

    • You have to compile every part of the system to LLVM bitcode, including the Perl core, any modules you use, and every XS component. This is cacheable, but you still have to do it.
    • Before running a program, you have to link all of the bitcode together into a single image. This image will be large.
    • Before running a program, you really want to run as many of LLVM's optimization stages as possible. This takes time and memory.
    • You may be able to emit native code which represents only that program and execute that. Essentially you're replicating what PAR does, with all of the caveats about program analysis and requirement discoverability.

    I expect the resulting binaries to be huge. I expect the link and optimization steps to take a long time. I expect you'll have to keep the Perl 5 interpreter around anyway, because there are things you just can't do without rewriting the language (to get around the BEGIN/import symbol declaration dance, for example). I don't know if you have to include any LLVM runtime components.

    I can imagine that you can optimize some programs with this approach, but I don't know that the intrinsic overhead in the Perl 5 VM you get is more than 10%. Maybe link-time optimization can cut out another 10%. Part of that is the inherent flexibility of Perl 5's design, and part of that is that LLVM is at heart a really good compiler for languages that act like C++.

      I'm not sure there is a direct question in there for me to respond to, so all I'll say is that it is a minimal up-front effort strategy that puts the hooks in place that would allow some real measurements to be taken.

      And even that minimal effort needn't be discarded if the LLVM experiments fail, because it leaves an intact perl distribution that works just as efficiently and effectively as it does now, with the only difference being that the compiled perl executable comes in 3 parts rather than the current two. (In windows terms: perl.exe dynalinked to perl5.x.part1&2.dll & perl5.x.part3.dll)

      And that split might even benefit the normal distribution in some way.

      Once the hooks were in there, the ability to pass the PCG to some other dll, module or process, might allow other investigations to move forward.



        If this approach could lead to a cleaner separation of the parser from the runtime, that might be a worthwhile thing in and of itself, apart from any LLVM success or failure. It's worth considering further.

Re^3: Perl 5 Optimizing Compiler, Part 5: A Vague Outline Emerges
by dave_the_m (Monsignor) on Aug 30, 2012 at 22:06 UTC
    (Damn, I said I wouldn't be responding further...)

    New code is adapted from Deparse. It processes the PCG in the normal way, but instead of (re)generating the Perl source code, it generates a LLVM IF "copy" of the PCG it is given. Let's call that the LLCG
    This is the bit I don't currently get. Take a piece of code like
    $m = ($s =~ /foo/);

    Which is compiled to an optree that looks like:

    7  <@> leave[1 ref] vKP/REFC ->(end)
    1     <0> enter ->2
    2     <;> nextstate(main 1 p:3) v:{ ->3
    6     <2> sassign vKS/2 ->7
    4        </> match(/"foo"/) sKPS/RTIME ->5
    -           <1> ex-rv2sv sK/1 ->4
    3              <#> gvsv[*s] s ->4
    -        <1> ex-rv2sv sKRM*/1 ->6
    5           <#> gvsv[*m] s ->6

    In general terms, what would the IR look like that you would convert that into?

    Dave.

      A first reaction (I'm getting punchy at this point in time.). If you use LLVM as just an alternative C compiler, then as a part of the process of compiling perl -- unchanged -- it will compile whatever code/functions/source file(s) that constitute the current "runloop" (ostensibly runops_standard in run.c).

      One of the possible variations of using LLVM in this mode, is that it (clang) can output .bc (bitcode) files, that can later be linked together -- using the LLVM linker -- to produce a native executable. What's more, is that the LLVM linker is quite happy to accept some "object files" that are in .bc format, and some that are in the normal .obj/.o format, and link them together and produce a (normal) platform dependent executable.

      It is also possible, to have clang produce LLVM IF in text form.

      So, in theory, if we ran (something like; there's a lot of documentation) clang -S -emit-llvm run.c -o run.ll and then inspected run.ll in a text editor, it would tell us exactly what the IF looks like for that source file.

      And that IF, would (in its binary form), be combinable -- with all the other normal object files produced using gcc or cl.exe -- using the LLVM linker, to produce a working, native compiled executable.

      That is not a direct answer to your question, but the point is that (as a starting point), it is possible to build a working executable by substituting any individual clang-compiled-to-bitcode source file for the native compiled object file from that source, and combining it with all the other GCC/CL produced object files; the LLVM linker will happily combine them into a native executable.

      Thus, to see what the LLVM IF looks like for any given source file, you only need to use clang to compile that individual source file to its text representation. You don't gain any performance, but you do get to see what LLVM IF looks like.

      I'll attempt to get back to you with a specific answer to your question, but given that my LLVM installation is 2 years old, and my primary perl installation about the same, it'll take a couple of days to get caught up.



        Perhaps if I make my question a bit more specific, it will help clarify things? (I'll explain the perl runtime in detail for anyone following along at home).

        Consider the expression $x + $y * 3. This is compiled into an op tree, where (amongst other things) each op struct holds the info needed for that op, but also a pointer (op_next) to the next op in the execution sequence, and a pointer (op_ppaddr) to the C function that knows how to "execute" that op. So the op sequence for the above looks a bit like:

        1: OP_PADSV
             op_targ   = 1
             op_ppaddr = Perl_pp_padsv
             op_next   = 2
        2: OP_PADSV
             op_targ   = 2
             op_ppaddr = Perl_pp_padsv
             op_next   = 3
        3: OP_CONST
             op_sv     = [an SV holding the value 3]
             op_ppaddr = Perl_pp_const
             op_next   = 4
        4: OP_MULTIPLY
             op_ppaddr = Perl_pp_multiply
             op_next   = 5
        5: OP_ADD
             op_ppaddr = Perl_pp_add
             op_next   = 6

        The pp_functions themselves look a bit like the following (hugely over-simplified of course):

        OP * Perl_pp_padsv {
            *PL_stack_sp++ = PL_curpad[PL_op->op_targ];
            return PL_op->op_next;
        }

        OP * Perl_pp_const {
            *PL_stack_sp++ = PL_op->op_sv;
            return PL_op->op_next;
        }

        OP * Perl_pp_multiply {
            SV *s1 = *--PL_stack_sp;
            SV *s2 = *--PL_stack_sp;
            SV *s3 = (a new or reused SV of some description);
            SvIVX(s3) = SvIVX(s1) * SvIVX(s2);
            *PL_stack_sp++ = s3;
            return PL_op->op_next;
        }

        OP * Perl_pp_add {
            SV *s1 = *--PL_stack_sp;
            SV *s2 = *--PL_stack_sp;
            SV *s3 = (a new or reused SV of some description);
            SvIVX(s3) = SvIVX(s1) + SvIVX(s2);
            *PL_stack_sp++ = s3;
            return PL_op->op_next;
        }

        And finally, the runops loop looks a bit like:

        Perl_runops_standard {
            PL_op = ...;
            while (PL_op) {
                PL_op = PL_op->op_ppaddr();
            }
        }

        So the net effect is that perl is in a loop, calling various pp_* functions, whose job is to push and pull useful values onto the perl stack.
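For anyone who wants to step through that loop, here is a self-contained toy version in C. It is a sketch only: values are plain longs instead of SVs, the pad and stack are fixed-size arrays, and every name (`pad`, `stack`, `cur_op`, `runops`, `eval_x_plus_y_times_3`) is invented for the illustration rather than taken from the perl source.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for perl's OP; the real struct is far richer. */
typedef struct op OP;
struct op {
    OP  *op_next;             /* next op in execution order */
    OP *(*op_ppaddr)(void);   /* function that "executes" this op */
    int  op_targ;             /* pad slot, used by padsv */
    long op_iv;               /* constant value, used by const */
};

static long  pad[4];          /* toy pad: lexicals stored as plain longs */
static long  stack[16];       /* toy argument stack */
static long *stack_sp = stack;
static OP   *cur_op;          /* plays the role of PL_op */

static void push(long v) { assert(stack_sp < stack + 16); *stack_sp++ = v; }
static long pop(void)    { assert(stack_sp > stack); return *--stack_sp; }

static OP *pp_padsv(void) { push(pad[cur_op->op_targ]); return cur_op->op_next; }
static OP *pp_const(void) { push(cur_op->op_iv); return cur_op->op_next; }

static OP *pp_multiply(void)
{
    long r = pop(), l = pop();   /* right operand is on top */
    push(l * r);
    return cur_op->op_next;
}

static OP *pp_add(void)
{
    long r = pop(), l = pop();
    push(l + r);
    return cur_op->op_next;
}

/* The run loop: call each op's function until the chain ends. */
static void runops(OP *start)
{
    cur_op = start;
    while (cur_op)
        cur_op = cur_op->op_ppaddr();
}

/* Build and run the op chain for $x + $y * 3. */
long eval_x_plus_y_times_3(long x, long y)
{
    OP add = { NULL, pp_add,      0, 0 };
    OP mul = { &add, pp_multiply, 0, 0 };
    OP c3  = { &mul, pp_const,    0, 3 };
    OP py  = { &c3,  pp_padsv,    2, 0 };
    OP px  = { &py,  pp_padsv,    1, 0 };

    pad[1] = x;
    pad[2] = y;
    stack_sp = stack;
    runops(&px);
    return pop();
}
```

Calling `eval_x_plus_y_times_3(2, 5)` walks five ops through the loop and leaves 17 on the toy stack. Every op costs one indirect call plus stack traffic, which is the overhead under discussion.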

        Now, under your proposal, you'll have all the pp functions (and all the functions they depend upon, such as the hash library) compiled into IR and available to you. What do you do with the op tree? Yuval's proposal is to effectively compile the following C into IR:

        PL_op = Perl_pp_padsv(PL_op);
        PL_op = Perl_pp_padsv(PL_op);
        PL_op = Perl_pp_const(PL_op);
        PL_op = Perl_pp_multiply(PL_op);
        PL_op = Perl_pp_add(PL_op);

        except that he would use modified versions of the pp functions that take and return their args directly, rather than getting them on and off the stack. Then LLVM has access to IR version of the unrolled runops loop and all the functions in IR, and can do its funky stuff with them.
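A toy sketch of what "take and return their args directly" could mean for $x + $y * 3, again with plain longs instead of SVs. The function names (`op_padsv`, `op_const`, `op_multiply`, `op_add`, `unrolled_x_plus_y_times_3`) are hypothetical and are not Yuval's actual code or anything in the perl source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "direct-argument" variants of the pp functions:
 * operands arrive as parameters and the result is the return value,
 * so there is no argument stack and no op_next to follow. */
static inline long op_padsv(const long *pad, int targ) { return pad[targ]; }
static inline long op_const(long v)                    { return v; }
static inline long op_multiply(long l, long r)         { return l * r; }
static inline long op_add(long l, long r)              { return l + r; }

/* The unrolled run loop for one expression: a straight-line
 * sequence of calls in execution order. */
long unrolled_x_plus_y_times_3(const long *pad)
{
    assert(pad != NULL);

    long x = op_padsv(pad, 1);
    long y = op_padsv(pad, 2);
    long c = op_const(3);
    long m = op_multiply(y, c);
    return op_add(x, m);
}
```

Because the callees are small and visible, a C compiler is free to inline all five calls in this toy. Real pp functions are much larger and handle magic, overloading and type coercion, so how much would actually inline is an open question.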

        So, with that background, what would your IR generated from the op tree look like? Is it just an unrolled runops loop for a single sub, with lots of explicit calls to pp-ish functions, or would you try and unroll the pp functions themselves, or something completely different?

        Dave

      Okay. I compiled run.c -> run.e using cl /E. It produced 30,000 lines of post-preprocessed C, most of which is unrelated to Perl, having been pulled in from a crap load of OS header files. (I won't post it here, it's too big and entirely uninteresting anyway.)

      I then threw that file at clang -- all 30,000 lines of it -- and asked it to convert it to LLVM assembler with no optimisation. It took most of the day cleaning up stuff that clang is really pedantic about -- it doesn't allow duplicate typedefs, even if they are identical except whitespace; it doesn't like MSC source code annotations or accept most of their pragmas; and it doesn't like prototypes without ; on the end ... and there were 1000s of them produced from the perl headers -- but 9 hours work and I got there.

      It produced these 572 lines:

      I then asked it to optimise it. This time it produced just these 363 lines:

      Now looking at that, I can see that it has built data descriptions for a crapload of Windows internal data structures, so I've manually (and conservatively) removed anything that I don't think is used by Perl. (I could have done this at the /E stage, but it was *much* easier reading 700 lines than 30000 lines :)

      What I've ended up with is these 112 lines of IR:

      ; ModuleID = '/tmp/webcompile/_9304_0.bc' target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i6 +4:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f +80:128:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct._TP_CALLBACK_ENVIRON = type { i64, %struct._TP_POOL*, %struct. +_TP_CLEANUP_GROUP*, void (i8*, i8*)*, i8*, %struct._ACTIVATION_CONTEX +T*, void (%struct._TP_CALLBACK_INSTANCE*, i8*)*, %union.anon } %struct._TP_POOL = type opaque %struct._TP_CLEANUP_GROUP = type opaque %struct._ACTIVATION_CONTEXT = type opaque %struct._TP_CALLBACK_INSTANCE = type opaque %union.anon = type { i64 } %struct.interpreter = type { %struct.sv**, %struct.op*, %struct.sv**, +%struct.sv**, %struct.sv**, i64*, i8**, i64, i64, %union.any*, i64, i +64, %struct.sv**, i64, i64, i64, i64, i64*, i64*, i64*, %struct.sv*, +%struct.xpv*, i64, %struct._stat64, %struct._stat64, %struct.gv*, %st +ruct.sv*, %struct.tms, %struct.pmop*, %struct.sv*, %struct.gv*, %stru +ct.gv*, %struct.gv*, i8*, %struct.sv*, %struct.sv*, %struct.sv*, %str +uct.hv*, %struct.hv*, %struct.op*, %struct.jmpenv*, %struct.cop*, %st +ruct.av*, %struct.stackinfo*, %struct.av*, %struct.jmpenv*, %struct.j +mpenv, %struct.sv*, %struct.he*, %struct.op*, %struct.op*, %struct.hv +*, %struct.gv*, %struct.gv*, i8*, i64, i64*, i64*, %struct.sv*, %stru +ct.re_save_state, %struct.regnode, i16, i8, i8, [6 x i8*], void (%str +uct.interpreter*, %struct.op*)*, void (%struct.interpreter*, %struct. 
+op*)*, void (%struct.interpreter*, %struct.op*)*, i64, i64, i8**, i8* +, %struct.regmatch_slab*, %struct.regmatch_state*, i16, i8, i8, i8, i +8, i32, i8, i64, i32, i8**, %struct.gv*, %struct.gv*, %struct.gv*, i8 +*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, i8**, i8*, i8, + i8, i8, i8, i8, i8, i8, i8, i8, i8, i8, i8, i8, i8, i8, i8*, %struct +.sv*, i64, %struct.sv*, i64, i64, i64, i32, i32*, %struct.gv*, %struc +t.gv*, %struct.gv*, %struct.gv*, %struct.gv*, %struct.av*, %struct.gv +*, %struct.gv*, %struct.gv*, %struct.gv*, %struct.gv*, %struct.sv*, % +struct.sv*, %struct.sv*, %struct.av*, %struct.hv*, %struct.hv*, %stru +ct.sv*, %struct.av*, %struct.av*, %struct.av*, %struct.av*, %struct.a +v*, %struct.hv*, i64, i32, i64, i64, %struct.sv*, %struct.sv*, %struc +t.av*, i8*, %struct.cv*, %struct.op*, %struct.op*, %struct.op*, %stru +ct.op*, %struct.cop*, i32, i32, i8*, i8**, i8*, %struct.av*, %struct. +sv*, %struct.sv*, i64, i8, i8, i16, i32, i64, %struct.exitlistentry*, + %struct.hv*, i64*, %struct.cop, %struct.cv*, %struct.av*, %struct.av +*, i64, i64, %struct.interp_intern, %struct.cv*, i32, i8, i8, i8, i8, + i64, i64, i64, i64, i64, i64, i64, i64, i8**, i8*, void (i32)*, [16 +x i8*], i64, i32, {}*, %struct.sv, %struct.sv, %struct.sv, %struct.sv +*, i64, i64, i64, i64, i64, i64, i64, i64, i64, i8*, i64, i64, i64, i +8, i8, i8, i8, i8*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv +*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, % +struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %stru +ct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.s +v*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, +%struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %struct.sv*, %str +uct.hv*, i8*, i64, [10 x i8], i8, i8, i32, %struct.yy_parser*, %struc +t.sv**, %struct.sv**, %struct.ptr_tbl*, %struct.av*, i8*, %struct.sv* +, %struct.sv**, %struct.av*, %struct.REENTR*, %struct.hv*, %struct.hv +*, 
%struct._PerlIO*, %struct.PerlIO_list_s*, %struct.PerlIO_list_s*, +%struct.sv*, %struct.perl_debug_pad, %struct.sv*, %struct.sv*, %struc +t.sv*, %struct.sv*, i64 (%struct.interpreter*, %struct.sv*, %struct.s +v*)*, %struct.av*, %struct.av*, i64, i64, i32, %struct.hv*, void (%st +ruct.interpreter*, %struct.sv*)*, void (%struct.interpreter*, %struct +.sv*)*, void (%struct.interpreter*, %struct.sv*)*, {}*, void (%struct +.interpreter*)*, i64, i64, %struct.hv*, i32, i8**, i8 (%struct.interp +reter*, %struct.sv*)*, %struct.hv*, %struct.av*, %struct.hv*, %struct +.hv*, %struct.hv* } %struct.sv = type { i8*, i64, i64, %union.anon.0 } %union.anon.0 = type { i8* } %struct.op = type { %struct.op*, %struct.op*, %struct.op* (%struct.int +erpreter*)*, i64, [2 x i8], i8, i8 } %union.any = type { i8* } %struct.xpv = type { %struct.hv*, %union._xmgu, i64, i64 } %struct.hv = type { %struct.xpvhv*, i64, i64, %union.anon.3 } %struct.xpvhv = type { %struct.hv*, %union._xmgu, i64, i64 } %union._xmgu = type { %struct.magic* } %struct.magic = type { %struct.magic*, %struct.mgvtbl*, i16, i8, i8, i +64, %struct.sv*, i8* } %struct.mgvtbl = type { i32 (%struct.interpreter*, %struct.sv*, %struc +t.magic*)*, i32 (%struct.interpreter*, %struct.sv*, %struct.magic*)*, + i64 (%struct.interpreter*, %struct.sv*, %struct.magic*)*, i32 (%stru +ct.interpreter*, %struct.sv*, %struct.magic*)*, i32 (%struct.interpre +ter*, %struct.sv*, %struct.magic*)*, i32 (%struct.interpreter*, %stru +ct.sv*, %struct.magic*, %struct.sv*, i8*, i64)*, i32 (%struct.interpr +eter*, %struct.magic*, %struct.clone_params*)*, i32 (%struct.interpre +ter*, %struct.sv*, %struct.magic*)* } %struct.clone_params = type { %struct.av*, i64, %struct.interpreter*, +%struct.interpreter*, %struct.av* } %struct.av = type { %struct.xpvav*, i64, i64, %union.anon.2 } %struct.xpvav = type { %struct.hv*, %union._xmgu, i64, i64, %struct.sv +** } %union.anon.2 = type { i8* } %union.anon.3 = type { i8* } %struct._stat64 = type { i32, i16, i16, i16, 
i16, i16, i32, i64, i64, +i64, i64 } %struct.gv = type { %struct.xpvgv*, i64, i64, %union.anon.7 } %struct.xpvgv = type { %struct.hv*, %union._xmgu, i64, i64, %union._xi +vu, %union._xnvu } %union._xivu = type { i64 } %union._xnvu = type { %struct.anon.5 } %struct.anon.5 = type { i64, i64 } %union.anon.7 = type { i8* } %struct.tms = type { i64, i64, i64, i64 } %struct.pmop = type { %struct.op*, %struct.op*, %struct.op* (%struct.i +nterpreter*)*, i64, [2 x i8], i8, i8, %struct.op*, %struct.op*, i64, +i64, %union.anon.12, %union.anon.13 } %union.anon.12 = type { %struct.op* } %union.anon.13 = type { %struct.op* } %struct.jmpenv = type { %struct.jmpenv*, [16 x i32], i32, i8 } %struct.cop = type { %struct.op*, %struct.op*, %struct.op* (%struct.in +terpreter*)*, i64, [2 x i8], i8, i8, i64, i8*, i8*, i64, i64, i64*, % +struct.refcounted_he* } %struct.refcounted_he = type opaque %struct.stackinfo = type { %struct.av*, %struct.context*, %struct.stac +kinfo*, %struct.stackinfo*, i64, i64, i64, i64 } %struct.context = type { %union.anon.14 } %union.anon.14 = type { %struct.block } %struct.block = type { i8, i8, i16, i64, %struct.cop*, i64, i64, %stru +ct.pmop*, %union.anon.15 } %union.anon.15 = type { %struct.block_sub } %struct.block_sub = type { %struct.op*, %struct.cv*, %struct.av*, %str +uct.av*, i64, %struct.av* } %struct.cv = type { %struct.xpvcv*, i64, i64, %union.anon.11 } %struct.xpvcv = type { %struct.hv*, %union._xmgu, i64, i64, %struct.hv +*, %union.anon.9, %union.anon.10, %struct.gv*, i8*, %struct.av*, %str +uct.cv*, i64, i16, i64 } %union.anon.9 = type { %struct.op* } %union.anon.10 = type { %struct.op* } %union.anon.11 = type { i8* } %struct.he = type { %struct.he*, %struct.hek*, %union.anon.1 } %struct.hek = type { i64, i64, [1 x i8] } %union.anon.1 = type { %struct.sv* } %struct.re_save_state = type { i64, i64, i64, i8, i8*, i8*, i8*, %stru +ct.regexp_paren_pair*, i64*, i64*, i8**, %struct.magic*, %struct.pmop +*, %struct.pmop*, i8*, i64, i64, i64, i64, i64, 
i64, i8*, i8* } %struct.regexp_paren_pair = type { i64, i64 } %struct.regnode = type { i8, i8, i16 } %struct.regmatch_slab = type { [42 x %struct.regmatch_state], %struct. +regmatch_slab*, %struct.regmatch_slab* } %struct.regmatch_state = type { i32, i8*, %union.anon.22 } %union.anon.22 = type { %struct.anon.26 } %struct.anon.26 = type { %struct.regmatch_state*, i64, i64, i64, i16*, + %struct.regnode*, %struct.regnode*, i8*, i64, i16, i16, i8 } %struct.exitlistentry = type { void (%struct.interpreter*, i8*)*, i8* +} %struct.interp_intern = type { i8*, i8**, i64, %struct.av*, %struct.ch +ild_tab*, i64, %struct.pseudo_child_tab*, i8*, %struct.thread_intern, + %struct.HWND__*, i32, i32, [27 x void (i32)*] } %struct.child_tab = type { i64, [64 x i64], [64 x i8*] } %struct.pseudo_child_tab = type { i64, [64 x i64], [64 x i8*], [64 x % +struct.HWND__*], [64 x i8] } %struct.HWND__ = type { i32 } %struct.thread_intern = type { [512 x i8], %struct.servent, [128 x i8] +, i32, [30 x i8], i32, i16 } %struct.servent = type { i8*, i8**, i16, i8* } %struct.yy_parser = type { %struct.yy_parser*, %union.YYSTYPE, i32, i3 +2, i32, i32, %struct.yy_stack_frame*, %struct.yy_stack_frame*, i64, i +64, i8*, i8*, i8, i8, i8, i8, i64, %struct.op*, %struct.op*, %struct. 
+sv*, i16, i16, i64, %struct.sv*, i64, i64, i8, i8, i8, i8, i64, %stru +ct._sublex_info, %struct.sv*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i64, + i16, i8, i8, %struct.hv*, %struct._PerlIO**, %struct.av*, [5 x %unio +n.YYSTYPE], [5 x i64], i64, %struct.cop*, [256 x i8], i8, i8 } %union.YYSTYPE = type { i64 } %struct.yy_stack_frame = type { %union.YYSTYPE, i16, i64, %struct.cv* +} %struct._sublex_info = type { i8, i16, %struct.op*, i8*, i8* } %struct._PerlIO = type opaque %struct.ptr_tbl = type { %struct.ptr_tbl_ent**, i64, i64, %struct.ptr_ +tbl_arena*, %struct.ptr_tbl_ent*, %struct.ptr_tbl_ent* } %struct.ptr_tbl_ent = type { %struct.ptr_tbl_ent*, i8*, i8* } %struct.ptr_tbl_arena = type opaque %struct.REENTR = type { i32 } %struct.PerlIO_list_s = type opaque %struct.perl_debug_pad = type { [3 x %struct.sv] } define i64 @HEAP_MAKE_TAG_FLAGS(i64 %TagBase, i64 %Tag) nounwind uwtab +le readnone { %1 = shl i64 %Tag, 18 %2 = add i64 %1, %TagBase ret i64 %2 } define i32 @Perl_runops_standard(%struct.interpreter* %my_perl) nounwi +nd uwtable { %1 = getelementptr inbounds %struct.interpreter* %my_perl, i64 0, i3 +2 1 %2 = load %struct.op** %1, align 8, !tbaa !3 br label %3 ; <label>:3 ; preds = %3, %0 %op.0 = phi %struct.op* [ %2, %0 ], [ %6, %3 ] %4 = getelementptr inbounds %struct.op* %op.0, i64 0, i32 2 %5 = load %struct.op* (%struct.interpreter*)** %4, align 8, !tbaa !3 %6 = tail call %struct.op* %5(%struct.interpreter* %my_perl) nounwin +d store %struct.op* %6, %struct.op** %1, align 8, !tbaa !3 %7 = icmp eq %struct.op* %6, null br i1 %7, label %8, label %3 ; <label>:8 ; preds = %3 %9 = getelementptr inbounds %struct.interpreter* %my_perl, i64 0, i3 +2 78 store i8 0, i8* %9, align 1, !tbaa !0 ret i32 0 } !0 = metadata !{metadata !"omnipotent char", metadata !1} !1 = metadata !{metadata !"Simple C/C++ TBAA", null} !2 = metadata !{metadata !"long", metadata !0} !3 = metadata !{metadata !"any pointer", metadata !0} !4 = metadata !{metadata !"long long", metadata !0}

      Take a close look at the data definitions for Interpreter, SV_any, HEK, COP etc. Isn't that the most concise (and thorough) description of the entire perl internals you've ever seen?

      Doesn't it make you wonder (just a little), what could it do with all that information?
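To make one of those data definitions concrete, here is a hedged C rendering of the `%struct.op` line from the IR above. The field names are assumptions based on the usual BASEOP layout in perl's op.h and may not match the exact perl version that was compiled; the IR itself carries only types, not names. `ppaddr_offset` is a hypothetical helper added for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct interpreter;                          /* opaque for this sketch */

typedef struct op OP;
struct op {
    OP *op_next;                             /* field 0: %struct.op* */
    OP *op_sibling;                          /* field 1: %struct.op* */
    OP *(*op_ppaddr)(struct interpreter *);  /* field 2: function pointer */
    unsigned long long op_targ;              /* field 3: i64 */
    unsigned char op_bits[2];                /* field 4: [2 x i8], bitfields in real perl */
    unsigned char op_flags;                  /* field 5: i8 */
    unsigned char op_private;                /* field 6: i8 */
};

/* Byte offset of the member that the IR reaches with
 * "getelementptr inbounds %struct.op* %op.0, i64 0, i32 2". */
size_t ppaddr_offset(void)
{
    /* Two pointers precede it, so it cannot start any earlier. */
    assert(offsetof(struct op, op_ppaddr) >= 2 * sizeof(OP *));
    return offsetof(struct op, op_ppaddr);
}
```

Read this way, the body of `Perl_runops_standard` in the IR is the familiar loop: load the function pointer at field 2 of the current op, call it with `my_perl`, store the returned op back, and repeat until it is null.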



        Take a close look at the data definitions for Interpreter, SV_any, HEK, COP etc. Isn't that the most concise (and thorough) description of the entire perl internals you've ever seen? Doesn't it make you wonder (just a little), what could it do with all that information?
        No, not in the slightest. I think this is the fundamental impedance mismatch between you and me on this subject.

        You keep asserting that if LLVM can only be given a full picture of the program, it will be able to do (unspecified) wonderful things with it. I come from the viewpoint that for perl to perform a particular action, e.g. my $x = $h{$key}, there are a certain basic number of things the underlying hardware is going to have to do at some point, such as calculate a hash index, index into the hash bucket, scan the hash bucket for a matching string, retrieve the associated SV, then copy the relevant part of the SV (e.g. its integer value, or its string buffer) into the SV stored in a pad somewhere that's associated with the name $x.
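Those steps can be sketched with a toy chained hash table in C. This is not perl's HV implementation: `Hash`, `Entry`, `hash_str`, `hash_store` and `hash_fetch` are hypothetical names, the hash function is a simple djb2-style one chosen for brevity, and values are plain longs rather than SVs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8   /* power of two, so index = hash & (NBUCKETS - 1) */

typedef struct entry {
    struct entry *next;   /* chain of entries sharing a bucket */
    const char   *key;
    long          value;  /* stands in for the stored SV */
} Entry;

typedef struct {
    Entry *buckets[NBUCKETS];
} Hash;

/* Toy string hash (djb2 style); perl uses a different function. */
static unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Link a caller-owned entry into its bucket. */
void hash_store(Hash *h, Entry *e)
{
    size_t i = hash_str(e->key) & (NBUCKETS - 1);
    e->next = h->buckets[i];
    h->buckets[i] = e;
}

/* The equivalent of "my $x = $h{$key}": returns 1 and copies the
 * value into *dst if the key is present, 0 otherwise. */
int hash_fetch(const Hash *h, const char *key, long *dst)
{
    assert(h != NULL && key != NULL && dst != NULL);

    size_t i = hash_str(key) & (NBUCKETS - 1);       /* calculate the hash index */
    for (const Entry *e = h->buckets[i]; e != NULL; e = e->next) {  /* scan the bucket */
        if (strcmp(e->key, key) == 0) {              /* matching string found */
            *dst = e->value;                         /* copy the value to the target */
            return 1;
        }
    }
    return 0;
}
```

Even in this stripped-down form, the hashing, the bucket scan and the string comparison remain however the surrounding dispatch is compiled, which is the point being made about a basic minimum of work.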

        Now at the moment, there's a certain amount of overhead associated with doing that via the perl stack and the perl runops loop, but there's still a basic minimum that needs doing. I earlier showed that the overhead is probably less than 20%. To get better than this, LLVM has got to, in some magical way, cut into basic underlying operations like hash lookup. And I really don't think it's going to do that.

        Until someone provides me with a single actual concrete example of how LLVM might do anything better than just cut out a bit of that overhead, I'm not going to believe it.

        PS I completely fail to understand the point of your showing how run.c gets converted to IR. run.c just contains a single trivial C function, operating on a few args and variables of particular types, and the IR knows the types of those args. So what?

        Dave

      Oh. And finally, I asked it to convert the optimised form and output it as C++. The interesting bits are at the top and the very bottom; there is a load of Windows API stuff that I can't be bothered to delete: It seems at 3000+ lines, it is too big to post here, but man is it ever interesting.

      Not that I'd want to maintain it in that form; but can you imagine converting all the Perl sources into C++, loading it up into a new repository and annotating it and then using that to construct a properly object-oriented Perl5 source tree?

      Proper inheritance of the SVt_types; a full cleanup of all the cruft that has built up over the years; starting anew -- from a clean, compilable and working, if weirdly structured and named -- codebase and going forward from there?


        No, I think you're talking utter nonsense. Sorry, I've been trying so, so, hard to be open minded throughout these discussions, but I've finally given up. No more time wasting for me.

        Dave.