http://www.perlmonks.org?node_id=990456


in reply to Re^5: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
in thread Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

What you have shown is XS code compiled using the most inefficient, backwards-compatible mode.

If you compile it with #define PERL_NO_GET_CONTEXT at the top of the file, you'll find that all those calls to Perl_get_context() are avoided. Similarly, using perl 5.14.0 or later removes all those Perl_Istack_base_ptr()-style function calls.
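For example, the top of such an XS file would look something like this (the module name is made up, purely for illustration):

    #define PERL_NO_GET_CONTEXT    /* must appear before the perl headers */
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* With the define above, the API macros use the interpreter pointer
       already passed to each XSUB (the aTHX/my_perl parameter) instead of
       each call expanding to a Perl_get_context() lookup. */

    MODULE = Example::Fast    PACKAGE = Example::Fast

    void
    hello()
        CODE:
            ST(0) = sv_2mortal(newSVpv("hello", 0));
            XSRETURN(1);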

All those checks on stack size could be replaced by using a single EXTEND(SP, 24) at the start of the function.
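That is, roughly (the values pushed here are placeholders):

    /* before: every XPUSHs() re-checks, and possibly grows, the stack */
    XPUSHs(sv_2mortal(newSViv(1)));
    XPUSHs(sv_2mortal(newSViv(2)));
    /* ... 22 more of these ... */

    /* after: one growth check up front, then unchecked pushes */
    EXTEND(SP, 24);
    PUSHs(sv_2mortal(newSViv(1)));
    PUSHs(sv_2mortal(newSViv(2)));
    /* ... 22 more of these ... */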

All those calls to create mortals could be removed by returning an array rather than a list. Etc.
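One way of doing that (an illustrative sketch only; `values` is a stand-in for whatever the function computes) is to build the results into an AV and hand back a single reference, so only the reference needs to be mortalised rather than every element:

    AV *av = newAV();
    int i;
    av_extend(av, 23);
    for (i = 0; i < 24; i++)
        av_push(av, newSViv(values[i]));          /* owned by the AV, not mortalised */
    ST(0) = sv_2mortal(newRV_noinc((SV *)av));    /* one mortal for the reference */
    XSRETURN(1);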

A lot of the macros designed for use by XS are less efficient (depending on circumstances) than the macros used in the perl core, due to the need to ensure backwards compatibility, or to insulate the XS code from the internal details of the core, etc.

If you look at the actual code in the hot pp ops, you'll find it's hardly unoptimised. And if a modern C compiler can't cope with a few extra no-op {} scopes added by perl's macros, there's something wrong with it. Show me some actual assembly of a function in pp_hot.c that's being significantly crippled by the design of perl's macros, and I'll start to take it seriously.

Other than that, everything you've talked about is wild speculation so far.

Dave.

Re^7: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by BrowserUk (Patriarch) on Aug 29, 2012 at 13:01 UTC
    What you have shown is XS code compiled using the most inefficient, backwards-compatible mode. If you compile it with ... All those checks ... All those calls ... macros designed for use by XS are less efficient ...

    So, what you are saying is, if every XS-dabbling programmer became an XS expert and learnt all the rules and tricks and techniques; and then they all modified all of their modules and programs, then things would run faster.

    Wouldn't it be nice if we had tools that took care of that?

    And if a modern C compiler can't cope with a few extra no-op {} scopes added by perl's macros, there's something wrong with it.

    If C compilers were able to optimise away all that stuff, then wouldn't #define PERL_NO_GET_CONTEXT be unnecessary? An effective noop under optimisation?

    Compile-time optimisers cannot optimise across compilation unit boundaries. C compilers are beginning to do LTO, but only at a very limited level.

    If you used LLVM simply as an alternative C compiler, it couldn't do much more than modern C compilers do, but it is capable of doing so much more.

    Show me some actual assembly of a function in pp_hot.c that's being significantly crippled by the design of perl's macros, and I'll start to take it seriously.

    You could show me ...

    But who decided what was "hot"? On the basis of tracing what code?

    If the functions in pp_hot.c have come in for some special attention that has proven demonstrably worth the effort, isn't it possible that programs that use functions outside of pp_hot.c might benefit if the functions they use came in for similar treatment?

    And isn't it just possible that a radically different (un-C-like) alternative like LLVM might be able to make pp_hot.c type changes elsewhere, in an automated manner?

    And maybe even find other changes that C compilers and programmers wouldn't even consider?

    Other than that, everything you've talked about is wild speculation so far.

    Agreed. Speculation based on some two years (on and off) of looking at what LLVM can, and is, doing, but still speculation. And clearly labeled as such.

    And it will remain that way until someone tries it. (Aren't you in the least bit intrigued?)


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

      So, what you are saying is, if every XS-dabbling programmer became an XS expert and learnt all the rules and tricks and techniques; and then they all modified all of their modules and programs, then things would run faster
      No. You were showing XS code that macro-expanded to very inefficient C, which, you argued, might benefit greatly from LLVM being clever. Whereas I was in the main trying to point out that core perl code is nothing like that; it's been heavily worked on for years, the macro expansions are much better, etc., and the gains from LLVM are not nearly as clear-cut. (And you can get most of the gains on the 'bad' XS code just by using 5.14.0 or later, and using a non-threaded (or non-multiplicity) build.)

      Note also that LLVM is very unlikely to be able to optimise away any of the get_context() calls in the XS code.

      Show me some actual assembly of a function in pp_hot.c that's being significantly crippled by the design of perl's macros, and I'll start to take it seriously.

      You could show me ...

      Well, my current belief is that most important perl ops have tight code that is not hugely penalised by poor macros. If you believe differently, the onus is on you to identify such a function.

      There's nothing particularly magical about pp_hot.c: it's just one file which contains the ops that people at some point in the past have speculated as being the most critical, and gathered them together so that (a) they just might benefit from instruction cache hits; and (b) people are alerted that if they mess with this code, they should be extra-specially careful not to make things go slower. Many of the OPs in the other pp*.c files are heavily worked too.

      Speculation based on some two years (on and off) of looking at what LLVM can, and is, doing, but still speculation. And clearly labeled as such.

      Well, my speculation, based on 11 years experience of working on the perl core, is that improvements with LLVM will come into the "10% better" category rather than the "5x better" or "perl runs at C speed" categories. Which is where this all started.

      Don't get me wrong, I'm not opposed to things like LLVM; I just haven't been convinced yet that they can offer game-changing improvements.

      Dave.

        (And you can get most of the gains on the 'bad' XS code just by using 5.14.0 or later, and using a non-threaded (or non-multiplicity) build.)

        If the gains are there to be had -- and they are -- with a sufficiently wide field of view, then they can be applied at LTO time, with argument promotion, function re-writing and SSA lifting.

        Or, for local subtrees of code, they can be applied at runtime using JIT with a sufficiently deep view.

        In other words, non-multiplicity efficiencies selectively applied to subtrees of code on multiplicity builds running single-threaded processing. Or even to subtrees of threaded code that doesn't access shared data.

        Note also that LLVM is very unlikely to be able to optimise away any of the get_context() calls in the XS code.

        Why? If they can be #defined away, why can they not be optimised away?

        I anticipate your answer is: because the compiler cannot see widely enough to know to do it. And that is very definitely true for C compilers doing compilation unit optimisations.

        But for a non-C optimiser running whole-program analysis and/or SSA analysis, there is no reason it couldn't lift the get_context call up to whatever level it last changed, and make it a single static-single-assignment (SSA) definition. And perhaps place it in a register.
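        As a concrete shape for that (the names below are stand-ins, not the real Perl API; the point is the transformation, not the identifiers):

            /* Illustrative only: get_context()/set_iv() are opaque stand-ins
               defined in another compilation unit. */
            #include <stddef.h>

            typedef struct interp interp_t;
            interp_t *get_context(void);
            void      set_iv(interp_t *ctx, long *slot, long value);

            /* Before: every call re-fetches the context, and a per-unit C
               optimiser has to assume the result might change each time. */
            void set_all_before(long **slots, size_t n)
            {
                for (size_t i = 0; i < n; i++)
                    set_iv(get_context(), slots[i], (long)i);
            }

            /* After whole-program/SSA analysis shows the result is loop-invariant:
               one definition, hoisted out of the loop, likely held in a register. */
            void set_all_after(long **slots, size_t n)
            {
                interp_t *ctx = get_context();
                for (size_t i = 0; i < n; i++)
                    set_iv(ctx, slots[i], (long)i);
            }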

        If you believe differently, the onus ...
        • It would be the work of minutes for you; and (probably) days for me.

          Besides which, I do not feel either of us should feel any "onus", because ... see next bullet point.

          But as you suggested that seeing it might clarify something ...

        • All it would prove, is that on one compilation, with one configuration, on one platform something did or did not get optimised.

        The beauty of the LLVM approach to optimisation is that it (can) produce optimisations in a platform- and compiler-independent manner -- at its IR level -- that then get translated into whatever is the best machine-dependent implementation at compile time, link time or runtime.

        Well, my speculation, based on 11 years experience of working on the perl core, is that improvements with LLVM will come into the "10% better" category rather than the "5x better" or "perl runs at C speed" categories. Which is where this all started. Don't get me wrong, I'm not opposed to things like LLVM; I just haven't been convinced yet that they can offer game-changing improvements.

        I only joined this thread belatedly, to counter the negative energy of "it couldn't work", "it wouldn't work", "it cannot be done", and "it would be a waste of energy to try".

        And please ignore any possible negative implications of this, but I did so because I saw -- and see -- people thinking of LLVM as simply another C compiler. And citing "failed" projects where it has been used in exactly that way. All I'm trying to do is to get people to stop thinking in C; read a little of the tutorial information, and consider the possibilities.

        Your knowledge and experience are -- and would be -- invaluable to such a project, even if only in an advisory capacity if that's all you have the time and interest for. And all I'm trying to do here is persuade you (and others) to consider the possibilities. If I can speculate with my dearth of knowledge of the internals, what might someone like you see, if you were of a mind to understand that LLVM is nothing like any other "compiler"?

        Another mind's-eye speculation for you to consider:

        Much of the costly code in pp_* routines is there to deal with the 5 or 10% cases. Imagine if -- manually for now -- each SV was born marked (or better, unmarked) as carrying magic or not. Say a single bit in the (unused) least-significant 3 or 4 bits of the SV's address was used to flag when an SV had magic attached to it.

        And then imagine a whole duplicate set of pp_* routines that never even considered the possibility of magic, and only called other functions that similarly never considered magic.

        And then in the runloop, a one-time inspection was made and the appropriate chain of pp_no_mg_* or normal pp_* routines was dispatched to.

        Do you see any gains to be had from running the 90% or 95% (?) of code that doesn't use magic through a set of opcodes that not only don't test for magic, but also don't carry all the extra variables and conditional branch code that can confuse and overpower the C optimiser and/or the CPU branch predictor?
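        To make the shape of that concrete (everything below is hypothetical; the real op and SV structures, and the real runloop, are considerably more involved):

            #include <stdint.h>

            typedef struct sv SV;     /* opaque here; only pointers are used */
            typedef struct op OP;

            /* Borrow one spare low-order bit of the (aligned) SV pointer as the
               "carries magic" flag described above. */
            #define SV_MAGIC_TAG ((uintptr_t)0x1)

            static int sv_tagged_magical(const SV *sv)
            {
                return ((uintptr_t)sv & SV_MAGIC_TAG) != 0;
            }

            /* Each op carries two bodies: today's magic-aware pp_* and a
               duplicate pp_no_mg_* that never tests for or handles magic. */
            struct op {
                OP *(*pp_full)(OP *self, SV **sp);
                OP *(*pp_no_mg)(OP *self, SV **sp);
                int  argc;            /* how many stack items the op reads */
            };

            /* One-time inspection at dispatch: if nothing the op will read is
               flagged magical, run the stripped-down body. */
            static OP *dispatch(OP *op, SV **sp)
            {
                int magical = 0;
                int i;
                for (i = 0; i < op->argc; i++)
                    magical |= sv_tagged_magical(sp[-i]);
                return magical ? op->pp_full(op, sp) : op->pp_no_mg(op, sp);
            }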

        Please read http://llvm.org/pubs/2004-09-22-LCPCLLVMTutorial.pdf, and see that LLVM was able to do stuff not unlike the above 8 years ago. And things have moved on. A lot!

        (There is a fairly detailed walk through of a moderately complex example starting at slide 28, but you would probably need to read the preceding to understand the nomenclature.)

        Again, no offense is meant by this, but maybe you have been so close to the Perl sources for so long that you view them the way a C compiler does. And so cannot imagine the possibilities of looking at them not as C source, or even as C-compiler-generated machine code, but rather with explicit SSA dataflow; an explicit control-flow graph; explicit language-independent type information; and (most importantly in the case of Perl) explicitly typed pointer arithmetic, in conjunction with an infinite SSA register set and loads/stores through typed pointers.

        I can see that I do not have the gravitas to convince you. So, please, put aside my speculations and read the .pdf, and allow it to engender your own speculations about the possibilities. Because your speculations would be so much more insightful and therefore useful than mine.

        And, if you read it, it will engender your speculations.

