Re^6: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

by chromatic (Archbishop)
on Aug 28, 2012 at 16:44 UTC


in reply to Re^5: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
in thread Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

You can save the OP from many blind alleys.

I don't think he's listening. If this stuff were easy, it would be more than a pipe dream by now.

I hope no one's taking my criticism as stop energy. My intent is to encourage people to solve the real problems and not pin their hopes on quick and dirty stopgaps that probably won't work.

Are you 100% certain that, under those circumstances, the link-time optimiser isn't going to find substantial gains from its supra-compile-unit view of that code?

I expect it will find some gains, but keep in mind two things. First, you have to keep around all of the IR for everything you want to optimize across. That includes things like the XS in DBI as well as the Perl 5 core. Second, LLVM tends to expect that the languages it compiles have static type systems. The last time I looked at its JIT, it didn't do any sort of tracing at runtime, so either you add that yourself, or you do without. (I stand by the assumption that the best opportunity for optimization from a JIT is rewriting and replacing basic blocks with straight line code that takes advantage of known types.)
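To make that last point concrete, here is a minimal C sketch (my illustration, not code from the perl core; the sv_t type and its fields are invented) of the difference between a generic opcode body and the straight-line, type-specialised code a tracing JIT could substitute once it has observed the operand types:

    /* Illustrative only; sv_t and its fields are invented, not the perl API. */
    typedef enum { SVt_IV, SVt_NV, SVt_PV } sv_type;

    typedef struct {
        sv_type type;
        long    iv;      /* only valid when type == SVt_IV */
        double  nv;      /* only valid when type == SVt_NV */
    } sv_t;

    /* The generic path: every addition re-checks both operand types. */
    long add_generic(sv_t *a, sv_t *b)
    {
        if (a->type == SVt_IV && b->type == SVt_IV)
            return a->iv + b->iv;
        /* ... string-to-number coercion, overflow to NV, magic, ... */
        return 0;
    }

    /* What a tracing JIT could emit once it has *observed* that both
     * operands are always integers on this hot path: no checks, just
     * straight-line code that the optimiser can work with. */
    static inline long add_specialised_iv(const sv_t *a, const sv_t *b)
    {
        return a->iv + b->iv;   /* guarded elsewhere; bail out if types change */
    }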

With that said, compiling all of Perl 5 with a compiler that knows how to do link time optimization does offer a benefit, even if you can use LTO only on the core itself. This won't be an order of magnitude improvement. If you get 5% performance improvement, be happy.
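For a sense of where that benefit comes from, here is a small, hypothetical C example (file names and function are made up) of the cross-compilation-unit inlining that LTO enables. Built with something like clang -flto -O2, the optimiser sees both translation units at link time and can inline the helper into the loop:

    /* sv_refcnt.c -- hypothetical compilation unit */
    long sv_refcnt_inc(long *refcnt) { return ++*refcnt; }

    /* caller.c -- a different compilation unit */
    extern long sv_refcnt_inc(long *refcnt);

    void bump_all(long *counts, int n)
    {
        /* Without LTO the compiler must emit n real calls here, because
         * it cannot see the body of sv_refcnt_inc at compile time.  With
         * LTO (or LLVM bitcode plus link-time optimisation) the increment
         * can be inlined into the loop. */
        for (int i = 0; i < n; i++)
            sv_refcnt_inc(&counts[i]);
    }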

Defining a language that targets a VM defined in (back then) lightweight C pre-processor macros, and throwing it at the C compilers to optimise, was very innovative.

Maybe so as far as that goes, but the implementation of Perl 5 was, from the start, flawed. Even something as obvious as keeping the reference count in the SV itself has huge problems. See, for example, the way memory pages go unshared really really fast even when reading values between COW processes.
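A rough sketch of that problem, using an invented struct rather than the real sv.h layout:

    /* Illustrative only; not the real SV layout.  The point is that the
     * reference count lives in the same structure (and so usually the
     * same memory page) as the value itself. */
    typedef struct {
        unsigned long sv_refcnt;   /* bumped whenever a new reference is taken */
        unsigned long sv_flags;
        void         *sv_any;      /* pointer to the value's body */
    } my_sv;

    /* After fork(), parent and child share pages copy-on-write.  Merely
     * *reading* a value in the child still hits paths like this (argument
     * passing, aliasing, etc.): */
    static void read_side_effect(my_sv *sv)
    {
        sv->sv_refcnt++;           /* a write! -> the kernel must copy the page */
        /* ... use the value ... */
        sv->sv_refcnt--;
    }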

The problem is that the many heavy-handed additions, extensions and overzealous "correctness" drives have turned those once-lightweight opcode macros into huge, heavyweight, scope-layered, condition-ridden lumps of unoptimisable boilerplate.

I think we're talking about different things. Macros or functions are irrelevant to my criticisms of the Perl 5 core design. My biggest objection is the fat opcode design that puts the responsibility for accessing values from SVs in the opcode bodies rather than using some sort of polymorphism (and it doesn't have to be OO!) in the SVs themselves.
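An illustrative C sketch of the contrast (the types, flags and function names are invented, not the perl API):

    #include <stdlib.h>

    enum { FLAG_IS_IV = 1, FLAG_IS_NV = 2, FLAG_IS_PV = 4 };

    typedef struct {
        unsigned flags;
        long     iv;
        double   nv;
        char    *pv;
    } fat_sv;

    /* 1. "Fat" opcode body: knows every representation, re-checks them all. */
    long fetch_iv_fat(fat_sv *sv)
    {
        if (sv->flags & FLAG_IS_IV) return sv->iv;
        if (sv->flags & FLAG_IS_NV) return (long)sv->nv;
        if (sv->flags & FLAG_IS_PV) return strtol(sv->pv, NULL, 10);
        /* ... ties, magic, overloads: all re-checked in every opcode ... */
        return 0;
    }

    /* 2. Polymorphic value: the representation carries its own accessor,
     *    so the opcode body shrinks to one indirect call that a JIT can
     *    devirtualise once the type is known. */
    typedef struct poly_sv poly_sv;
    struct poly_sv {
        long (*get_iv)(poly_sv *);  /* swapped when the representation changes */
        long  iv;
    };

    static long get_iv_direct(poly_sv *sv) { return sv->iv; }

    long fetch_iv_poly(poly_sv *sv)
    {
        return sv->get_iv(sv);      /* e.g. sv->get_iv == get_iv_direct */
    }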

Are continuations a necessary part of a Perl-targeted VM?

They were a must-have for Perl 6 back then. They simplify a lot of user-visible language constructs, and they make things like resumable exceptions possible. If implemented well, you can get a lot of great features reasonably cheaply from CPS as your control flow mechanism.
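As a toy illustration of continuation-passing style and why it makes resumable exceptions cheap (plain C, nothing to do with the perl core):

    #include <stdio.h>

    /* In CPS, instead of returning, each step is handed an explicit
     * "rest of the computation" to invoke.  Because the continuation is
     * a value, an error handler can *resume* the normal path rather
     * than unwinding -- the essence of a resumable exception. */
    typedef void (*cont_fn)(int value, void *env);

    static void divide_cps(int a, int b,
                           cont_fn on_ok, cont_fn on_err, void *env)
    {
        if (b == 0)
            on_err(a, env);          /* error handler gets enough to resume */
        else
            on_ok(a / b, env);
    }

    static void print_result(int value, void *env)
    {
        (void)env;
        printf("result: %d\n", value);
    }

    static void resume_with_default(int numerator, void *env)
    {
        /* "Resumable exception": patch up the failure and carry on with
         * the original continuation instead of unwinding. */
        divide_cps(numerator, 1, print_result, print_result, env);
    }

    int main(void)
    {
        divide_cps(10, 2, print_result, resume_with_default, NULL);  /* result: 5  */
        divide_cps(10, 0, print_result, resume_with_default, NULL);  /* result: 10 */
        return 0;
    }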

Lua uses them, and Lua uses a register architecture.

Why have RISC architectures failed to take over the world?

Windows, I suspect.


Re^7: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by BrowserUk (Pope) on Aug 28, 2012 at 18:20 UTC
    With that said, compiling all of Perl 5 with a compiler that knows how to do link time optimization does offer a benefit, even if you can use LTO only on the core itself. This won't be an order of magnitude improvement. If you get 5% performance improvement, be happy.

    If all you do is use LLVM's C front-end to compile the complete existing code base to IF (its intermediate form), optimise, and then back to C to compile, I think you are probably in the ballpark, if a little pessimistic. I'd have said low double-digit percentage improvements.

    But, if you break out the runtime components from the compile-time components -- i.e. everything before (say*) ${^GLOBAL_PHASE} = 'INIT' -- and compile that compile-time part to C and link it, but then convert the (for want of a better term) "bytecode" to IF and pass it to the LLVM JIT engine, what then?

    And what if the IF generated could be saved and then reloaded, thus avoiding the perl compilation phase for second and subsequent runs (until edits)?

    And how about combining the IF generated from the bytecode with the IF form of the core and linking it to build standalone executables?

    Is any of this possible? There is only one way to find out.

    (*) I appreciate that you would probably need to intercept, repeatedly, at the ${^GLOBAL_PHASE} = 'CHECK' or 'UNITCHECK' stages in reality.



      My instinct (after not exploring this in much detail) is that you will get some improvements, if you don't blow past the memory limits of LLVM. I believe the 16 MB limit is gone, which will help, but you're still talking about deserializing plenty of bytecode for LLVM to process.

      However much Reini and I disagree about some things, I think we both agree that improving the typefulness of Perl code to narrow down the dynamic possibilities offers more potential improvement for memory use and optimization. That is probably also compatible with LLVM, but I still think that to get real advantages out of LLVM, you have to port the VM to generate IR internally rather than compile the C components to IR with clang.
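      As a hedged sketch of what "generate IR internally" could look like, this uses the LLVM C API to build a trivial i64 add function directly, rather than running clang over the C source of an opcode (the module and function names here are made up):

        #include <llvm-c/Core.h>

        /* Build a module containing i64 pp_add_ii(i64, i64) { return a + b; } */
        LLVMModuleRef build_add_module(void)
        {
            LLVMModuleRef mod = LLVMModuleCreateWithName("perl_jit_sketch");
            LLVMTypeRef   i64 = LLVMInt64Type();
            LLVMTypeRef   params[2] = { i64, i64 };
            LLVMTypeRef   fn_ty = LLVMFunctionType(i64, params, 2, 0);
            LLVMValueRef  fn = LLVMAddFunction(mod, "pp_add_ii", fn_ty);

            LLVMBasicBlockRef entry = LLVMAppendBasicBlock(fn, "entry");
            LLVMBuilderRef    b     = LLVMCreateBuilder();
            LLVMPositionBuilderAtEnd(b, entry);

            LLVMValueRef sum = LLVMBuildAdd(b, LLVMGetParam(fn, 0),
                                               LLVMGetParam(fn, 1), "sum");
            LLVMBuildRet(b, sum);
            LLVMDisposeBuilder(b);
            return mod;   /* hand this to an execution engine to JIT it */
        }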

        It is when you get into the details that casual discussion starts to get difficult. There are so many phases that it gets very hard to make it clear what we are talking about at any given point. (It is hard to keep them distinct in your (my!) own mind.)

        I *think* we are talking about (roughly) the same things here.

        1. There is the bit of perl that parses the Perl program's source code and builds the AST.

          This would need to be left pretty much as is. The whole perl defines Perl thing. This would need to be compiled (back to) C and then linked into an executable.

        2. Then there is the bit that converts the AST into byte code.

          I believe that this would need to be separated out and converted to produce IF.

        3. Then there is the bit of perl that ordinarily runs the bytecode.

          Not just the runloop, but all the code the runloop dispatches to.

          I think that should be compiled to IF and built into an IF library (.bc)

        It would be "combined" with the IF from stage 2, at runtime, and given to the JIT to convert to machine code (see the sketch after this list).

        All total speculation of course. And possibly completely impractical or impossible. But we won't know ...
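        Purely as a speculative sketch of how stages 2 and 3 might be combined at runtime: load a previously saved program.bc, link it against a runtime.bc library, and hand the result to LLVM's MCJIT via the C API. The file names and the perl_main entry point are invented; this is not how any existing perl build works.

          #include <stdint.h>
          #include <llvm-c/Core.h>
          #include <llvm-c/BitReader.h>
          #include <llvm-c/Linker.h>
          #include <llvm-c/Target.h>
          #include <llvm-c/ExecutionEngine.h>

          /* Read a bitcode file into a module (NULL on failure). */
          static LLVMModuleRef load_bc(const char *path)
          {
              LLVMMemoryBufferRef buf;
              LLVMModuleRef       mod;
              char               *err = NULL;

              if (LLVMCreateMemoryBufferWithContentsOfFile(path, &buf, &err))
                  return NULL;          /* err describes the failure */
              if (LLVMParseBitcode2(buf, &mod))
                  return NULL;
              return mod;
          }

          int run_cached(const char *program_bc, const char *runtime_bc)
          {
              LLVMModuleRef prog = load_bc(program_bc);
              LLVMModuleRef rt   = load_bc(runtime_bc);
              if (!prog || !rt) return 1;

              if (LLVMLinkModules2(prog, rt))   /* rt is consumed on success */
                  return 1;

              LLVMLinkInMCJIT();
              LLVMInitializeNativeTarget();
              LLVMInitializeNativeAsmPrinter();

              LLVMExecutionEngineRef ee;
              char *err = NULL;
              if (LLVMCreateExecutionEngineForModule(&ee, prog, &err))
                  return 1;

              /* Look up and call the program's (hypothetical) entry point. */
              typedef int (*main_fn)(void);
              main_fn entry =
                  (main_fn)(uintptr_t)LLVMGetFunctionAddress(ee, "perl_main");
              return entry ? entry() : 1;
          }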


