PerlMonks  

Re^3: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

by BrowserUk (Pope)
on Aug 27, 2012 at 23:39 UTC (#990087=note)


in reply to Re^2: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
in thread Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

I suspect that the performance of this hypothetical application would be determined almost exclusively by the performance of the hash-table code within the “guts,” such that time spent generating the parse-trees and then iterating through them would be negligible.

And I suspect that, once again, you haven't a clue what you are talking about. Have you ever bothered to look into hv.c?

Each Perl "opcode" has to deal with a complex variety of different possibilities.

  • Is the (hash) tied?
  • Are the hash keys unicode?
  • Does it have shared keys?
  • Is it a stash?
  • Is it a glob?
  • Does it have magic attached?
  • More ...

With runtime compilation (or JIT), it would be possible for 'simple hash' accesses/inserts/updates to bypass all of the myriad checks and balances that are required for the general case, which could yield significant gains in hash-heavy code. Ditto for arrays. Ditto for strings. Ditto for numbers. (Do a super search for "use integer" to see some of the possibilities that can yield.)
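To make the fast-path idea concrete, here is a minimal sketch in C. Everything in it is hypothetical: the `toy_hash` struct, its three flag bits and the `toy_fetch_*` functions are invented for illustration and are not taken from hv.c. It only shows the shape of the optimisation: test the special cases once, as a guard, then run a loop that no longer re-tests them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; a real Perl hash carries far more state. */
enum { TOY_TIED = 1, TOY_UTF8_KEYS = 2, TOY_MAGIC = 4 };

typedef struct { const char *key; int value; } toy_entry;

typedef struct {
    unsigned      flags;        /* which special behaviours are switched on */
    toy_entry    *entries;      /* linear table: enough for a sketch */
    size_t        count;
    unsigned long slow_lookups; /* trips through the special-case path */
} toy_hash;

/* The bare lookup, with no special-case handling at all. */
static int *toy_fetch_plain(toy_hash *h, const char *key)
{
    for (size_t i = 0; i < h->count; i++)
        if (strcmp(h->entries[i].key, key) == 0)
            return &h->entries[i].value;
    return NULL;
}

/* General-purpose fetch: every call re-tests the flags. */
static int *toy_fetch_general(toy_hash *h, const char *key)
{
    if (h->flags & (TOY_TIED | TOY_UTF8_KEYS | TOY_MAGIC))
        h->slow_lookups++;      /* stand-in for tie/unicode/magic handling */
    return toy_fetch_plain(h, key);
}

/* What a specialiser might emit for a hot loop: test once, outside the
 * loop, then run the version that matches what the guard found. */
static long toy_sum(toy_hash *h, const char **keys, size_t nkeys)
{
    long total = 0;
    assert(h != NULL && keys != NULL);
    if (h->flags == 0) {                       /* guard: "just a hash" */
        for (size_t i = 0; i < nkeys; i++) {
            int *v = toy_fetch_plain(h, keys[i]);
            if (v != NULL) total += *v;
        }
    } else {                                   /* fall back to the general case */
        for (size_t i = 0; i < nkeys; i++) {
            int *v = toy_fetch_general(h, keys[i]);
            if (v != NULL) total += *v;
        }
    }
    return total;
}
```

Whether a real JIT could prove the guard holds for the whole loop (nothing ties the hash or attaches magic halfway through) is exactly the hard part, and the sketch does not address it.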

Then there is the simple fact that perl's subroutines/methods are -- even by interpreter standards -- very slow. (See: 488791 for a few salient facts about Perl's subcall performance.)

Much of this stems from the fact that the way the perl sources are structured, C compilers cannot easily optimise across compilation unit boundaries, because they mostly(*) do compile-time optimisations. However, there is a whole class of optimisations that can be done at either link-time or runtime, that would hugely benefit Perl code.

(*)MS compilers have the ability to do some link-time optimisations, and it would surprise me greatly if gcc doesn't have similar features. It would also surprise me if these have ever been enabled for the compilation of Perl. They would need to be specifically tested on so many platforms, it would be very hard to do.

But something like LLVM can do link-time & runtime optimisations, because it can target not specific processors, but a virtual processor (a "VM") which allows its optimiser to operate in that virtual environment. And only once the VM code has been optimised is it finally translated into processor-specific machine code. That means you only need to test each optimisation (to the VM) once; and independently, the translation to each processor.

Not the combinatorial product of all optimisations on all processors.

What would these gains be worth? It is very hard to say, but if it gave 50% of the difference between (interpreted, non-JITed) Java & perl running a recursive algorithm (Ackermann), that does a few simple additions (11 million times):

So 83.6 seconds for Perl, and 1.031 seconds for Java!

Perl's productivity and (1/2) Java's performance!

That would be something worth having for a vast array of genomists, physicists, data miners et al.

Heck. It might even mean that hacks like mod_perl might become redundant; making a whole bunch of web monkeys happy. Moose might even become usable for interactive applications. Parse::RecDescent might be able to process documents in real time rather than geological time. DateTime might be able to calculate deltas as they happen rather than historically.

There are three fundamental limitations on an interpreter's performance:

  1. Subroutine/method call performance.
  2. Memory allocation/deallocation performance.
  3. Parameter passing performance.

Whilst Perl is faster than (native code) Python & Ruby, it sucks badly when compared to Java, Lua etc. And the reasons are:

  1. Perl's particular brand of preprocessor macro-based, Virtual Machine was innovative and way ahead of its time when it was first written.

    But many different hands have been laid upon that tiller in the interim, without (in many cases) understanding the virtues of simplicity that it had for in-lining, and the optimisations that came from that.

    The result is that now, many of the macros are so complex, and so heavily nested and asserted, that the best compiler in the world cannot optimise the twisted morass of code that results, from what looks like a few simple lines, prior to the preprocessor doing its thing.

    The result is that each line of the perl source is so heavily macroised, that very few people realise that the five or six lines of code that are required to construct a minimal subroutine at the perl source level, expand to dozens -- even hundreds of lines once the preprocessor has run. And that those expanded lines are so heavily blocked and nested, that they blow the optimiser stack of pretty much every C compiler available.

  2. Perl's current memory allocator has so many layers to it, that it is nigh impossible to switch in something modern, tried and tested, like the Boehm allocator.

    Much less enable the recognition that the future is 64-bit, and utilise the 8TB of virtual address space to allow each class (of fundamental and user) object to use a dedicated array of exact-sized objects with a simple free list chain.

  3. It eschews the hardware-optimised push/pop of parameters to the processor (C) stack (1 opcode each) in favour of (literally) hundreds of opcodes required by each push and pop of its software emulation of those hardware instructions using multiple heap-based stacks.

    Parrot suffers from a similar problem. Someone decided that rather than use the hardware optimised (and constantly re-optimised with each new generation) hardware stack for parameter passing, it was a good idea to emulate the (failed) hardware-based, register-renaming architecture of (the now almost obsolete) RISC processors, in software.
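Point 3 is easier to see side by side. The sketch below is a deliberately tiny model, not Perl's actual argument stack: `toy_stacks`, `toy_push`, `toy_pushmark` and the two `add_*` functions are hypothetical names. It contrasts a native C call, where the compiler passes the arguments in registers or on the hardware stack, with a call that goes through a heap-allocated value stack plus a separate mark stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Native call: argument passing is left entirely to the compiler. */
static long add_native(long a, long b) { return a + b; }

/* Emulated call: arguments travel on a heap-based value stack, and a
 * second stack of marks records where each call's arguments begin. */
typedef struct {
    long   *vals;       /* value stack, grown on demand */
    size_t  top, cap;
    size_t  marks[16];  /* start index of each pending argument list */
    size_t  mtop;
} toy_stacks;

static void toy_push(toy_stacks *s, long v)
{
    if (s->top == s->cap) {                 /* bounds check on every push */
        size_t ncap = s->cap ? s->cap * 2 : 8;
        long *n = realloc(s->vals, ncap * sizeof *n);
        assert(n != NULL);
        s->vals = n;
        s->cap = ncap;
    }
    s->vals[s->top++] = v;
}

static void toy_pushmark(toy_stacks *s)
{
    assert(s->mtop < 16);
    s->marks[s->mtop++] = s->top;
}

/* Callee: locate the arguments via the mark, replace them with the result. */
static void add_emulated(toy_stacks *s)
{
    assert(s->mtop > 0);
    size_t base = s->marks[--s->mtop];
    assert(s->top - base == 2);
    long sum = s->vals[base] + s->vals[base + 1];
    s->top = base;
    toy_push(s, sum);
}

/* Caller side of the emulated convention. */
static long call_add_emulated(toy_stacks *s, long a, long b)
{
    toy_pushmark(s);
    toy_push(s, a);
    toy_push(s, b);
    add_emulated(s);
    return s->vals[--s->top];
}
```

Even this stripped-down version does a mark push, two bounds-checked pushes, a mark pop and a result push for one addition. It says nothing about how many machine instructions the real thing costs; that would need measuring.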

In a nutshell, your "suspicions" are so out of touch with reality, and so founded upon little more than supposition, that they are valueless.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

RIP Neil Armstrong


Re^4: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by chromatic (Archbishop) on Aug 28, 2012 at 00:00 UTC

    You're half right. hv.c does demonstrate one of the big problems in optimizing Perl 5, but no amount of static optimization fairy dust magic will help.

    The biggest problem with regard to SVs and unoptimizability is that the responsibility for determining how to access what's in the SV is in every op. That's why every op that does a read has to check for read magic and every op that does a write has to check for write magic. That's why ops are fat in Perl 5. (That's why ops are fat in Parrot, which still has a design that would make it a much faster VM for Perl 5.6.)

    Moving magic into SVs from ops would help, as would porting the Perl 5 VM to C++ which can optimize for the type of static dispatch this would enable.
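A rough sketch of the difference, in C rather than C++ and with hypothetical names throughout (`toy_sv`, `toy_vtbl`, `TOY_GMAGICAL` and the two `op_add_*` functions are not Perl internals). In the "fat" version the op tests each operand for magic itself; in the "thin" version the value carries its own accessor and the op just calls it. Note that this uses run-time dispatch through a function pointer, which is a different mechanism from the static dispatch a C++ compiler could resolve at compile time.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_sv toy_sv;

/* Per-value behaviour table: the value knows how it should be read. */
typedef struct {
    long (*get_iv)(toy_sv *sv);
} toy_vtbl;

enum { TOY_GMAGICAL = 1 };  /* hypothetical "has read magic" flag */

struct toy_sv {
    unsigned        flags;
    const toy_vtbl *vtbl;
    long            iv;
    unsigned long   reads;  /* stand-in side effect of read magic */
};

static long plain_get(toy_sv *sv) { return sv->iv; }
static long magic_get(toy_sv *sv) { sv->reads++; return sv->iv; }

static const toy_vtbl plain_vtbl = { plain_get };
static const toy_vtbl magic_vtbl = { magic_get };

/* Fat op: the op body is responsible for checking every operand. */
static long op_add_fat(toy_sv *a, toy_sv *b)
{
    if (a->flags & TOY_GMAGICAL) a->reads++;
    if (b->flags & TOY_GMAGICAL) b->reads++;
    return a->iv + b->iv;
}

/* Thin op: the check has moved into the value's own accessor. */
static long op_add_thin(toy_sv *a, toy_sv *b)
{
    assert(a->vtbl != NULL && b->vtbl != NULL);
    return a->vtbl->get_iv(a) + b->vtbl->get_iv(b);
}
```

With the thin form, every new kind of value adds an accessor rather than another branch in every op that reads values.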

    LLVM would only really help if you could compile all of Perl 5 and the XS you want for any given program to LLVM IR and let LLVM optimize across the whole program there, but even then you still have to move magic into the SVs themselves (or spend a lot of time and memory tracing types and program flow at runtime) to be able to optimize down to a handful of processor ops.

    I suspect no one's going to do that for a 10% performance improvement at the cost of 10x memory use.

    With that said, I must disagree with:

    Perl's particular brand of preprocessor macro-based, Virtual Machine was innovative and way ahead of its time when it was first written.

    Not if you look at a good Forth implementation or a decent Smalltalk implementation, both of which you could find back in 1993.

    Perl's current memory allocator has so many layers to it, that it is nigh impossible to switch in something modern, tried and tested, like the Boehm allocator.

    I don't see how Boehm would help. The two biggest memory problems I've measured are that everything must be an SV (even a simple value like an integer) and that there's no sense of heap versus stack allocation. Yes, there's the TARG optimization, and that helps a lot, but if you want an order of magnitude speed improvement, you have to avoid allocating memory where you don't need it.

    Someone decided that rather than use the hardware optimised (and constantly re-optimised with each new generation) hardware stack for parameter passing, it was a good idea to emulate the (failed) hardware-based, register-renaming architecture of (the now almost obsolete) RISC processors, in software.

    You're overlooking two things. First, you can't do anything interesting with continuations if you're tied to the hardware stack. Second, several research papers have shown that a good implementation of a register machine (I know the Dis VM for Inferno has a great paper on this, and the Lua 5.0 implementation paper has a small discussion) is faster than the equivalent stack machine. I think there's a paper somewhere about a variant of the JVM which saw a measurable speed improvement by going to a register machine too. (Found it in the bibliography of the Lua 5.0 paper: B. Davis, A. Beatty, K. Casey, D. Gregg, and J. Waldron. The case for virtual register machines.)

    ... but with all that said, Parrot's lousy calling convention system is not a good example of a register machine. A good register machine lets you go faster by avoiding moving memory around. Parrot's calling conventions move way too much memory around to go fast.
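The usual argument for register machines can be shown with two toy interpreters. The instruction sets below are made up for this sketch and are not Parrot, Lua or JVM bytecode. Both compute `c = a + b`; the stack machine needs four dispatched instructions to do it, the register machine one. Each `run_*` function returns the number of instructions it dispatched (not counting the halt).

```c
#include <assert.h>
#include <stddef.h>

/* Stack machine: operands are shuffled through an operand stack. */
enum { S_LOAD, S_ADD, S_STORE, S_HALT };
typedef struct { int op; int arg; } s_insn;

static size_t run_stack(const s_insn *code, long *locals)
{
    long stack[16];
    size_t sp = 0, n = 0;
    for (;; code++) {
        switch (code->op) {
        case S_LOAD:  assert(sp < 16); stack[sp++] = locals[code->arg]; break;
        case S_ADD:   assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; break;
        case S_STORE: assert(sp >= 1); locals[code->arg] = stack[--sp]; break;
        default:      return n;                    /* S_HALT */
        }
        n++;
    }
}

/* Register machine: each instruction names its operands directly. */
enum { R_ADD, R_HALT };
typedef struct { int op; int dst, a, b; } r_insn;

static size_t run_reg(const r_insn *code, long *regs)
{
    size_t n = 0;
    for (;; code++) {
        switch (code->op) {
        case R_ADD: regs[code->dst] = regs[code->a] + regs[code->b]; break;
        default:    return n;                      /* R_HALT */
        }
        n++;
    }
}
```

Fewer dispatches is only half the story: the register instructions are wider and cost more to decode, so which design wins on a real workload is an empirical question.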

      Up front. When you nay-say the OP's discussion, I, like many others I suspect, read each sentence twice, consider it thrice, and then stay stum. You have the knowledge and experience to contribute to the OP's endeavors, even when you do so with negative energy. You can save the OP from many blind alleys.

      When sundial "contributes" his 'stop energy' (skip directly to 33:24) there is no knowledge, no experience, nothing but the negative energy of his groundless suppositions.

      LLVM would only really help if you could compile all of Perl 5 and the XS you want for any given program to LLVM IR and let LLVM optimize across the whole program there, but even then you still have to move magic into the SVs themselves (or spend a lot of time and memory tracing types and program flow at runtime) to be able to optimize down to a handful of processor ops.

      Are you 100% sure there would be no gains?

      Just for a minute suspend your disbelief and imagine that all of perl5.x.x.dll/.so was compiled (otherwise unmodified wherever possible) to LLVM's IR. And then when that .dll/.so is linked, all the macros have been expanded and in-lined, all the do{...}while(0) blocks are in-situ; all the external dependencies of all the compile-time scopes are available.

      Are you 100% certain that under those circumstances, that the link-time optimiser isn't going to find substantial gains from its supra compile-unit view of that code?

      Now suspend your disbelief a little further and imagine that someone had the energy and time to use LLVM's amazingly flexible, platform-independent, language-independent type system (it can do 33-bit integers or 91-bit floats if you see the need for them), to re-cast Perl's internal struct-based type inheritance mechanism into a concrete type-inheritance hierarchy.

      What optimisation might it find then?

      C treats structs as opaque lumps of storage, and has no mechanisms for objects, inheritance or any extensions of its storage-based types. But (for example) C++ has these concepts, and as you say:

      porting the Perl 5 VM to C++ which can optimize for the type of static dispatch

      if you could port Perl's type hierarchy to C++, then its compilers should be able to do more by way of optimising them.

      But porting perl to C++ would be a monumental task because it would require re-writing everything to be proper, standards-compliant, C++. Properly OO with all that entails.

      LLVM doesn't impose any particular HLL's view of the world on the code. LL stands for low-level. It doesn't impose any particular type mechanism on the code, it will happily allow you to define a virtual machine (VM) that uses 9-bit words and 3-word registers.

      Isn't it just possible that it might allow the Perl VM to be modeled directly, such that -- with the overview that link-time optimisation has -- it can produce some substantial runtime benefits?

      And just maybe allow the pick-up-sticks nature of the Perl internals to be cleaned up along the way?

      And finally, there is the possibility that its JIT capabilities may be able to recognise (at runtime) when a hash(ref) is 'just a hash', and optimise away all the tests for magic, stashes, globs and other variations, and so fast path critical sections of code at runtime.

      What percentage of Perl's opcode usage actually uses those alternate paths? 10%? 5%? Doesn't that leave a substantial amount of Perl code as potentially JITable to good effect?

      Whether LLVM JIT is up to the task is a different question -- one that would be answered if we could try it.

      Not if you look at a good Forth implementation or a decent Smalltalk implementation, both of which you could find back in 1993.

      I was using Digitalk's SmallTalk/VPM at around that time, and it was dog slow.

      Forth compilers were making strides using their interlaced opcodes technology (called threaded interpreted code back then, but that has different connotations these days), but a) those interpreters were in large part hand-coded in assembler; b) you had to write your programs in Forth. Like Haskell, it's a different mindset, largely out-of-reach of the sysadmins, shell & casual programmers that Perl targeted.

      Defining a language that targets a VM defined in (back then) lightweight C pre-processor macros, and throwing it at the C compilers to optimise, was very innovative.

      The problem is that the many heavy-handed additions, extensions and overzealous "correctness" drives, have turned those once lightweight opcode macros into huge, heavyweight, scope-layered, condition-ridden lumps of unoptimisable boiler-plate. Most of which very few people have ever even taken the time to expand out and look at. Basically, no one really knows what the Perl sources actually look like.

      Too many heavy-hands on the tiller pulling it every which way as the latest greatest fads come and go, have left us with an opaque morass of nearly untouchable code. (That is in no way to belittle the mighty efforts of the current (and past) maintainers; but rather to acknowledge the enormity of their chosen task!)

      I don't see how Boehm would help.

      I'm not sure that it would either, but the main problem is that it would be nigh impossible to try it. Somewhere here (at PM), I documented my attempts to track through the myriad #definitions and redefinitions that make up Perl's memory manager -- it ran to (from memory, literally) hundreds of *alloc/*free names. Impossible to fathom.

      On Windows, as built by default (and AS; and to my knowledge Strawberry), the allocator that gets used can quite easily (using pretty standard perl code), be flipped into a pathological mode where almost every scalar allocation or reallocation results in a page fault. Documented here two (seemed like longer) years ago, it is still there, despite my posting a 1-line patch to fix it.

      Much of my knowledge of using Perl in a memory-efficient manner has come about simply as a result of finding ways to avoid that pathological behaviour.

      Another big part of the memory problem is the mixing of different allocation sizes within a single heap. Whilst the allocator uses buckets for different sized entities, mixing fixed-sized entities -- scalars, rvs, ints, floats etc. -- and variable sized entities -- strings, AVs etc. -- in the same heap means that you inevitably end up with creeping fragmentation.

      Imagine an allocator that used different heaps for each fixed-sized allocation; and another two heaps for variable-sized allocations that it flip-flops between when it needs to expand the variable-sized heap. Instead of reallocing in-place, it grabs a fresh chunk of VM from the OS and copies the existing strings over to the new heap and discards the old one thereby automatically reclaiming fragmentation.
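The fixed-size half of that idea is small enough to sketch. What follows is a generic free-list pool, not Perl's allocator and not a patch for it; `toy_pool`, `toy_slot` and the 32-byte slot size are assumptions chosen for the example. One contiguous array is dedicated to one object size, and free slots are chained through their own storage, so allocation and release are a couple of pointer moves and there is nothing to fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One slot: either live payload, or a link in the free list. */
typedef union toy_slot {
    union toy_slot *next_free;    /* valid only while the slot is free */
    unsigned char   payload[32];  /* hypothetical fixed object size */
} toy_slot;

typedef struct {
    toy_slot *slots;      /* contiguous array dedicated to this size */
    toy_slot *free_list;  /* chain of currently unused slots */
    size_t    nslots;
} toy_pool;

/* Returns 0 on success, -1 if the backing array cannot be allocated. */
static int toy_pool_init(toy_pool *p, size_t nslots)
{
    assert(p != NULL && nslots > 0);
    p->slots = malloc(nslots * sizeof *p->slots);
    if (p->slots == NULL)
        return -1;
    p->nslots = nslots;
    for (size_t i = 0; i + 1 < nslots; i++)   /* thread all slots together */
        p->slots[i].next_free = &p->slots[i + 1];
    p->slots[nslots - 1].next_free = NULL;
    p->free_list = &p->slots[0];
    return 0;
}

/* Pop the head of the free list; NULL when the pool is exhausted. */
static void *toy_pool_alloc(toy_pool *p)
{
    toy_slot *s = p->free_list;
    if (s == NULL)
        return NULL;      /* a real allocator would grow here */
    p->free_list = s->next_free;
    return s;
}

/* Push the slot back onto the free list. */
static void toy_pool_free(toy_pool *p, void *ptr)
{
    toy_slot *s = ptr;
    assert(s != NULL);
    s->next_free = p->free_list;
    p->free_list = s;
}

static void toy_pool_destroy(toy_pool *p)
{
    free(p->slots);
    p->slots = NULL;
    p->free_list = NULL;
}
```

The variable-sized half (two heaps, copying live strings across when one fills up) needs a way to find and update every pointer to a moved string, which is where the real difficulty lies and why it is left out here.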

      Don't argue the case here, I've omitted much detail. But the point is that as-is, it is simply too hard to try bolting a different allocator underneath Perl, because what is there is so intertwined.

      You're overlooking two things. First, you can't do anything interesting with continuations if you're tied to the hardware stack.

      Are continuations a necessary part of a Perl-targeted VM? Or just a theoretically interesting research topic-du-jour?

      From my viewpoint, the fundamental issue with the Parrot VM was and is the notion that it should be all things to all men. Every theoretical nice-to-have and every cool research topic of the day, was to be incorporated in order to support the plethora of languages that were going to magically inter-operate atop it.

      Cool stuff if you have Master's level researchers on research budgets and academia's open-ended time frames to play with. But as a solution to the (original) primary goal of supporting Perl6 ...

      Second, several research papers have shown that a good implementation of a register machine (I know the Dis VM for Inferno has a great paper on this, and the Lua 5.0 implementation paper has a small discussion) is faster than the equivalent stack machine.

      Research papers often have a very particular notion of equivalence.

      Often as not, such comparisons are done using custom interpreters that assume unlimited memory (no garbage collection required), supporting integer-only baby-languages running contrived benchmarks for strictly limited periods on otherwise quiescent machines that are simply switched off when memory starts to exhaust.

      So unrepresentative of running real languages on real workloads on real-world hardware environments, that their notion of equivalence has to be taken very much in the light of the research they are conducting.

      Is there a single, major real-world language that uses continuations?

      Is there a single, real-world, production use VM that emulates a register machine in software?

      Why have RISC architectures failed to take over the world?



        You can save the OP from many blind alleys.

        I don't think he's listening. If this stuff were easy, it would be more than a pipe dream by now.

        I hope no one's taking my criticism as stop energy. My intent is to encourage people to solve the real problems and not pin their hopes on quick and dirty stopgaps that probably won't work.

        Are you 100% certain that under those circumstances, that the link-time optimiser isn't going to find substantial gains from its supra compile-unit view of that code?

        I expect it will find some gains, but keep in mind two things. First, you have to keep around all of the IR for everything you want to optimize across. That includes things like the XS in DBI as well as the Perl 5 core. Second, LLVM tends to expect the languages it compiles have static type systems. The last time I looked at its JIT, it didn't do any sort of tracing at runtime, so either you add that yourself, or you do without. (I stand by the assumption that the best opportunity for optimization from a JIT is rewriting and replacing basic blocks with straight line code that takes advantage of known types.)

        With that said, compiling all of Perl 5 with a compiler that knows how to do link time optimization does offer a benefit, even if you can use LTO only on the core itself. This won't be an order of magnitude improvement. If you get 5% performance improvement, be happy.

        Defining a language that targets a VM defined in (back then) lightweight C pre-processor macros, and throwing it at the C compilers to optimise, was very innovative.

        Maybe so as far as that goes, but the implementation of Perl 5 was, from the start, flawed. Even something as obvious as keeping the reference count in the SV itself has huge problems. See, for example, the way memory pages go unshared really really fast even when reading values between COW processes.
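The mechanism behind the unsharing is simple to show. In the sketch below (hypothetical types, nothing to do with the real SV layout) `inline_val` keeps its reference count next to the data, so merely taking a reference stores into the value's own memory; after a fork, that store is what forces the kernel to copy the page. `plain_val` keeps the count in a separate array, one possible alternative shown only for contrast and not something proposed in this thread, so the value's memory can stay read-only.

```c
#include <assert.h>
#include <stddef.h>

/* Count stored with the data: sharing the value writes to the value. */
typedef struct { unsigned refcnt; long iv; } inline_val;

/* Count stored elsewhere: the value itself is never written to. */
typedef struct { long iv; } plain_val;

/* Taking a reference dirties the memory that holds the value. */
static const inline_val *inline_ref(inline_val *v)
{
    assert(v != NULL);
    v->refcnt++;
    return v;
}

/* Taking a reference touches only the side table of counts. */
static const plain_val *side_ref(const plain_val *vals, unsigned *counts,
                                 size_t i)
{
    assert(vals != NULL && counts != NULL);
    counts[i]++;
    return &vals[i];
}
```

Note that `side_ref` can take its values as `const`, which `inline_ref` cannot. The side table has costs of its own (an extra lookup, and its pages still get dirtied), so this is an illustration of the problem rather than a recommendation.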

        The problem is that the many heavy-handed additions, extensions and overzealous "correctness" drives, have turned those once lightweight opcode macros into huge, heavyweight, scope-layered, condition-ridden lumps of unoptimisable boiler-plate.

        I think we're talking about different things. Macros or functions are irrelevant to my criticisms of the Perl 5 core design. My biggest objection is the fat opcode design that puts the responsibility for accessing values from SVs in the opcode bodies rather than using some sort of polymorphism (and it doesn't have to be OO!) in the SVs themselves.

        Are continuations a necessary part of a Perl-targeted VM?

        They were a must-have from Perl 6 back then. They simplify a lot of user-visible language constructs, and they make things like resumable exceptions possible. If implemented well, you can get a lot of great features reasonably cheaply from CPS as your control flow mechanism.
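For readers who have not met continuation-passing style, here is a minimal sketch of a resumable exception in C. The names (`cont_t`, `handler_t`, `div_cps` and the helper functions) are invented for the example. Instead of returning, `div_cps` hands its result to a continuation; on division by zero it hands that same continuation to an error handler, which may either resume the computation with a substitute value or abandon it.

```c
#include <assert.h>
#include <stddef.h>

/* "The rest of the computation", taking the value produced so far. */
typedef long (*cont_t)(long value, void *env);

/* An error handler receives the continuation so that it can resume. */
typedef long (*handler_t)(cont_t resume, void *env);

/* Division in CPS: the result goes to `ok`; failure goes to `on_error`. */
static long div_cps(long a, long b, cont_t ok, handler_t on_error, void *env)
{
    assert(ok != NULL && on_error != NULL);
    if (b == 0)
        return on_error(ok, env);
    return ok(a / b, env);
}

/* A continuation: whatever was going to happen to the quotient next. */
static long add_one(long v, void *env)
{
    (void)env;
    return v + 1;
}

/* Handler that resumes the interrupted computation with 0. */
static long resume_with_zero(cont_t resume, void *env)
{
    return resume(0, env);
}

/* Handler that gives up and does not resume. */
static long abort_with_minus_one(cont_t resume, void *env)
{
    (void)resume;
    (void)env;
    return -1;
}
```

This version still runs on the C stack, so it shows the control flow only. First-class continuations that can be stored and re-entered later need the frames to live somewhere other than the hardware stack, which is the point made earlier in the thread.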

        Lua uses them, and Lua uses a register architecture.

        Why have RISC architectures failed to take over the world?

        Windows, I suspect.

        where almost every scalar allocation or reallocation results in a page fault. Documented here two (seemed like longer) years ago, it is still there, despite my posting a 1-line patch to fix it.

        We (the perl5 committers) can easily overlook things. The best approach is to create an RT ticket, then if we ignore that, prod us from time to time by updating the ticket by replying to it.

        Dave.

Re^4: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by sundialsvc4 (Abbot) on Aug 28, 2012 at 00:44 UTC

    It is the fundamental nature of a programming language like Perl that the opcodes can be presented with many different situations, as you described. And it knows how to deal with them, so that the programmer does not have to.

    In order to avoid having the interpreter have to do all of these things, you must introduce strong-typing into the language. Which Perl emphatically does not have. You must restrict the type of parameters that can be passed into a given subroutine, so that the compiler can make the correct determination(s), statically. You must also be able to prove that the operation of the compiler and therefore of the generated code is correct: that your statically-determined checks are both complete and correct; that no other program behavior is possible.

    I argue that the Perl language does not possess the necessary semantics, and it was purposely designed not to require them. As a language, it is a product of its intended implementation-method; of DWIM and all of that. And I argue that these characteristics impose that implementation method to the exclusion of all others.

    If you want strong typing, use any one of many languages that provide it. Those languages provide the semantic detail that your compiler will require. Without them, you will find that you can’t do it. The Perl language does not possess them and it never did. And I think that this is what the professor was saying, when he said it would be a good project for an intern where you could always stop at any time and say you won.

    As I have politely said before, each of us has different core competencies, and language/compiler/interpreters happen to be one of mine.

Re^4: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by Anonymous Monk on Aug 28, 2012 at 02:17 UTC
Re^4: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by bulk88 (Priest) on Aug 28, 2012 at 10:22 UTC

    Much of this stems from the fact that the way the perl sources are structured, C compilers cannot easily optimise across compilation unit boundaries, because they mostly(*) do compile-time optimisations. However, there is a whole class of optimisations that can be done at either link-time or runtime, that would hugely benefit Perl code.

    (*)MS compilers have the ability to do some link-time optimisations, and it would surprise me greatly if gcc doesn't have similar features. It would also surprise me if these have ever been enabled for the compilation of Perl. They would need to be specifically tested on so many platforms, it would be very hard to do.

    But something like LLVM can do link-time & runtime optimisations, because it can target not specific processors, but a virtual processor (a "VM") which allows its optimiser to operate in that virtual environment. And only once the VM code has been optimised is it finally translated into processor-specific machine code. That means you only need to test each optimisation (to the VM) once; and independently, the translation to each processor.

    64 bit VC builds have been LTCG from basically day 1 http://perl5.git.perl.org/perl.git/commit/d921a5fbd57e5a5e78de0c6f237dd9ef3d71323c?f=win32/Makefile. A couple of months ago I compiled a Perl with LTCG in 32 bit mode; unfortunately I don't remember the VC version, whether it was my 2003 or my 2008. The DLL got slightly (I don't remember how many KB) fatter from inlining, but the inlined functions still existed as separate function calls, and the assembly looked the same everywhere, and I didn't find anything (looking around randomly by hand) that got a non-standard calling convention except what already was static functions. I wrote it off as useless. 2003 vs 2008 for 32bit code might make all the difference though. I decided it wasn't worth writing a patch up for and submitting to P5P to change the makefile.

    With Will's LLVM proposal, I believe nothing will come of it unless some or all of the pp_ opcode functions, along with runops are rewritten in "not C", or perl opcodes are statically analyzed and converted to native machine data types with SVs being gone. All the "inter procedure optimizations" mentioned in this thread are gone the moment you create a function pointer, it is simply the rules of C and C's ABI on that OS http://msdn.microsoft.com/en-us/library/xbf3tbeh%28v=vs.80%29.aspx.

    I went searching through perl's pre and post preprocessor headers. I found some interesting things which prove that automatic IPO, on Perl, in C with any compiler is simply impossible.
    /* Enable variables which are pointers to functions */
    typedef void (*peep_t)(pTHX_ OP* o);
    typedef regexp* (*regcomp_t) (pTHX_ char* exp, char* xend, PMOP* pm);
    typedef I32 (*regexec_t) (pTHX_ regexp* prog, char* stringarg,
                              char* strend, char* strbeg, I32 minend,
                              SV* screamer, void* data, U32 flags);
    typedef char* (*re_intuit_start_t) (pTHX_ regexp *prog, SV *sv,
                                        char *strpos, char *strend,
                                        U32 flags,
                                        re_scream_pos_data *d);
    typedef SV* (*re_intuit_string_t) (pTHX_ regexp *prog);
    typedef void (*regfree_t) (pTHX_ struct regexp* r);
    typedef regexp* (*regdupe_t) (pTHX_ const regexp* r, CLONE_PARAMS *param);
    typedef I32 (*re_fold_t)(const char *, char const *, I32);

    typedef void (*DESTRUCTORFUNC_NOCONTEXT_t) (void*);
    typedef void (*DESTRUCTORFUNC_t) (pTHX_ void*);
    typedef void (*SVFUNC_t) (pTHX_ SV* const);
    typedef I32 (*SVCOMPARE_t) (pTHX_ SV* const, SV* const);
    typedef void (*XSINIT_t) (pTHX);
    typedef void (*ATEXIT_t) (pTHX_ void*);
    typedef void (*XSUBADDR_t) (pTHX_ CV *);

    typedef OP* (*Perl_ppaddr_t)(pTHX);
    typedef OP* (*Perl_check_t) (pTHX_ OP*);
    typedef void(*Perl_ophook_t)(pTHX_ OP*);
    typedef int (*Perl_keyword_plugin_t)(pTHX_ char*, STRLEN, OP**);
    typedef void(*Perl_cpeep_t)(pTHX_ OP *, OP *);

    typedef void(*globhook_t)(pTHX);

    //////////////////////////////////////////

    /* dummy variables that hold pointers to both runops functions, thus forcing
     * them *both* to get linked in (useful for Peek.xs, debugging etc) */

    EXTCONST runops_proc_t PL_runops_std INIT(Perl_runops_standard);
    EXTCONST runops_proc_t PL_runops_dbg INIT(Perl_runops_debug);

    ////////////////////////////////////////////

    START_EXTERN_C

    #ifdef PERL_GLOBAL_STRUCT_INIT
    #  define PERL_PPADDR_INITED
    static const Perl_ppaddr_t Gppaddr[]
    #else
    #  ifndef PERL_GLOBAL_STRUCT
    #    define PERL_PPADDR_INITED
    EXT Perl_ppaddr_t PL_ppaddr[] /* or perlvars.h */
    #  endif
    #endif /* PERL_GLOBAL_STRUCT */
    #if (defined(DOINIT) && !defined(PERL_GLOBAL_STRUCT)) || defined(PERL_GLOBAL_STRUCT_INIT)
    #  define PERL_PPADDR_INITED
    = {
        Perl_pp_null,
        Perl_pp_stub,
        Perl_pp_scalar,    /* implemented by Perl_pp_null */
        Perl_pp_pushmark,
        Perl_pp_wantarray,
        Perl_pp_const,

    ////////////////////////////////////////////
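What those typedefs and the `PL_ppaddr` table add up to is dispatch through function pointers. A stripped-down sketch of that shape follows; it is loosely modelled on the idea of a runops loop, but `toy_op`, `toy_runops` and the two `pp_*` functions are hypothetical and much simpler than the real thing. At the call site in the loop the compiler sees only "call whatever this pointer holds", so in the general case it cannot inline the callee or pick a cheaper calling convention for it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_op toy_op;

/* Every op body has the same signature and returns the next op to run. */
typedef toy_op *(*toy_ppaddr_t)(toy_op *op, long *acc);

struct toy_op {
    toy_ppaddr_t ppaddr;  /* which function implements this op */
    toy_op      *next;
    long         arg;
};

static toy_op *pp_add(toy_op *op, long *acc) { *acc += op->arg; return op->next; }
static toy_op *pp_mul(toy_op *op, long *acc) { *acc *= op->arg; return op->next; }

/* The dispatch loop: one indirect call per op executed. */
static long toy_runops(toy_op *op)
{
    long acc = 0;
    while (op != NULL) {
        assert(op->ppaddr != NULL);
        op = op->ppaddr(op, &acc);  /* target not known at compile time */
    }
    return acc;
}
```

For example, chaining an add of 6 to a multiply by 7 runs two indirect calls and leaves 42 in the accumulator.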
    Now in C++, in theory, calling conventions don't exist unless you explicitly force one. The compiler is free to choose how it wants to implement vtables/etc. MS's Visual C for static C functions does do some pretty good "random" calling conventions for 32bit X86 IMHO. For 64 bit X86, Visual C never deviated from the 1 and only calling convention. The question is, are there any compilers daring enough to create a whole DLL/SO which contains exactly 1 function call in C?

    Not any professional compiler. On some OSes (x64 windows), ABI is enforced through OS parsing of assembly code (x64 MS SEH, technically not true, if you are careful, the OS will never have a reason to parse your ASM). And on some CPUs (SPARC) calling conventions are enforced in hardware.

    Another danger, there is a fine line between inlining/loop unrolling, and making your L1 and L2 Caches useless. Blindly inlining away all function calls will cause a multi MB object file per Perl script that won't solve anything.
      I went searching through perl's pre and post preprocessor headers. I found some interesting things which prove that automatic IPO, on Perl, in C with any compiler is simply impossible.

      But LLVM isn't a C compiler. It can compile C (amongst many other languages), but it doesn't (have to) follow C conventions.

      LLVM is far more an assembler targeting a user-definable virtual processor. As an illustration of the sorts of things it can and does do, can you think of any other compiler technology that will generate 832-bit integers as a part of its optimisation pass?

      You have to stop thinking of LLVM as a C compiler before you can even begin to appreciate what it is potentially capable of. It is weird, and to my knowledge unique.

      In a world where everything -- processors, memory, disk, networking et al. -- are being virtualised; why not virtualise the compiler, have it target a (user configurable) virtual processor, and produce not just platform independence, but processor architecture independence, and source language independence?

      Can it really tackle a hoary ol' dynamic language and apply those principles to it successfully? The simple answer is: I do not know. But neither does anyone else!

      Stop nay-saying based upon your knowledge of what C compilers do, and follow the mantra: (Let someone else) Try it!

      I first installed LLVM here (going by the date on the subdirectory) on the 6th May 2010:

      C:\test>dir /t:c .| find "llvm"
      06/05/2010  20:35    <DIR>          llvm
      06/05/2010  20:55    <DIR>          llvm-2.7

      I've been playing with it and reading about it on and off ever since, and I still keep learning new things about it all the time. It is unlike anything I've come across before, and defies my attempts at description. Virtual compiler, virtual interpreter, virtual assembler. Take your pick; or all 3.

      Give it a try (or at least a read) before you summarily dismiss it out of hand.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      RIP Neil Armstrong

      Sorry for the second reply, but I responded before seeing the stuff below the code block.

      For 64 bit X86, Visual C never deviated from the 1 and only calling convention.
      1. Firstly, that is a good thing. Much better than the previous world of cdecl, pascal, fastcall et al.
      2. Whilst the MSC compiler won't vary the calling convention, it provides the hooks to allow the programmer to do so.

        See the FRAME attribute of the PROC directive and the .ENDPROLOG directive.

        The information is scant, but the possibility is there.

      The question is, are there any compilers daring enough to create a whole DLL/SO which contains exactly 1 function call in C?

      Just this morning I read that the LLVM JIT had, until recently, a 16MB limit on its JIT'ed code size, which has now been lifted.

      Besides which, I don't believe that you need to optimise across function boundaries to get some significant gains (over C compilers) out of the Perl sources.

      You pointed out that many of perl's opcodes and functions are huge. Much of the problem is not just that they are huge, but also that they are not linear. The macros that generate them are so heavily nested, and so frequently introduce new scopes and unwieldy asserts, that C compilers pretty much give up trying to optimise them because they run out of whatever resources they use when optimising. Too many levels of scope is a known inhibitor of optimisers. That's where inlining can help.

      Will LLVM fare any better? Once again, we won't know for sure unless someone tries it.

      Another danger, there is a fine line between inlining/loop unrolling, and making your L1 and L2 Caches useless. Blindly inlining away all function calls will cause a multi MB object file per Perl script that won't solve anything.

      Once again I ask: are you sure?

      If the JIT can determine that this variable -- hash(ref), array(ref) or scalar -- is not tied, has no magic, and never changes its type -- IV or NV to PV or vice versa -- within a particular loop, then it can throw away huge chunks of conditional code. Similarly for utf/non-utf string manipulations; similarly for all the context stuff for non-threaded code on threaded builds.

      Note: I say "can", not will. The only way we collectively will know for sure if it will, is to try it.
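      To make the idea concrete, here is a minimal sketch in C of the kind of specialisation I mean: a general-case op that re-checks the flags on every call, next to a loop that checks them once and then runs a bare add. The Scalar struct, the F_* flags and the function names are all hypothetical stand-ins invented for this illustration; they are not the real SV layout or anything from the perl sources, and the "slow path" is just a placeholder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for a Perl scalar.
 * Names and layout are illustrative only, not the real SV. */
enum {
    F_IOK   = 1 << 0,  /* holds a valid integer value */
    F_TIED  = 1 << 1,  /* access must go through a tie handler */
    F_MAGIC = 1 << 2   /* has magic attached */
};

typedef struct {
    unsigned flags;
    long     iv;
} Scalar;

/* Counts how often the flags are inspected, so the two loops can be compared. */
static unsigned long flag_checks = 0;

/* Placeholder for the slow path (tie handler, magic, conversion).
 * Here it simply performs the add so both loops give the same answer. */
static void add_slow(Scalar *s, long step) {
    s->iv += step;
    s->flags |= F_IOK;
}

/* General-case op: every call re-checks every possibility. */
static void add_general(Scalar *s, long step) {
    flag_checks++;
    if (s->flags != F_IOK) {        /* tied, magical, or not a plain integer */
        add_slow(s, step);
        return;
    }
    s->iv += step;
}

/* Interpreter-style loop: the checks run on every iteration. */
long loop_general(Scalar *s, long step, size_t n) {
    assert(s != NULL);
    for (size_t i = 0; i < n; i++)
        add_general(s, step);
    return s->iv;
}

/* Specialised loop: one guard up front, then a bare add.
 * Only valid if nothing in the loop body can change s->flags,
 * which is exactly the property a JIT would have to establish. */
long loop_specialised(Scalar *s, long step, size_t n) {
    assert(s != NULL);
    flag_checks++;
    if (s->flags != F_IOK)
        return loop_general(s, step, n);   /* guard failed: fall back */
    long iv = s->iv;
    for (size_t i = 0; i < n; i++)
        iv += step;
    s->iv = iv;
    return iv;
}
```

      In this toy, a plain integer scalar run through loop_general for 1000 iterations inspects the flags 1000 times, whereas loop_specialised inspects them once; a tied scalar fails the guard and takes the general route. Whether anything like that carries over to the real opcodes is, as I said, something that can only be settled by trying it.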


