http://www.perlmonks.org?node_id=987037


in reply to Perl 5 Optimizing Compiler

What are your thoughts about how this might most effectively and efficiently be achieved?

Add compact, typed arrays to Perl 5.

Remove polymorphism from the Perl 5 ops and add it to SV and descendants. (C++ may be a better option here than C.)

Add a tracing JIT. Iterate on that for a few years.

I'd love to get it done and released at YAPC 2013, which will hopefully be held here in Austin.

Invent a time machine and send the code back from 2023.

Who else should I try to contact directly?

Reini Urban.

Re^2: Perl 5 Optimizing Compiler
by Will_the_Chill (Pilgrim) on Aug 13, 2012 at 06:20 UTC
    Chromatic,

    Thanks for the reply. Per your note, I've e-mailed Reini Urban, as well as Sisyphus and Neil Watkiss.

    You mention typed arrays, polymorphism, and a tracing JIT mechanism. I assume these are serious recommendations (not sarcastic)?

    Thanks,
    ~ Will

      I just want to tell you not to lose heart over what chromatic wrote.

      2013 isn't far away. If you sprint fast enough you can get to something, and you will probably even fail. But you will learn tons, fail fast, and end up with something substantial.

      Or you may even get there; you never know. How many times has it happened that people were convinced something was impossible, until somebody came along and proved them wrong?

      If you are determined, start first. Don't get bogged down by details. Remember: well begun is half done.

Re^2: Perl 5 Optimizing Compiler
by xiaoyafeng (Deacon) on Aug 13, 2012 at 17:12 UTC

    Invent a time machine and send the code back from 2023.

    lol, Perl 5 is untyped and too flexible; these two things make it hard to optimize. If we could add a brand-new type and const system to Perl 5 and restrict Perl's flexibility, it could be sped up the way we hope.




    I am trying to improve my English skills, if you see a mistake please feel free to reply or /msg me a correction

      I agree. Many optimizations that can be performed in C or other languages without runtime eval can't be done in Perl, since so much context and metadata has to be kept around in case a rare string eval or magic happens. Dereferencing has to be done every time -> or {} appears in code, without exception, in case it's a magic variable. $root{l1}{l2}{l3} really takes 4 dereference ops every time it is written, and it can be written 7 times in a sub. Very few people (I am one of them) will take a lexical reference to the hash element, "\$root{l1}{l2}{l3}", to avoid all the reference ops. I still have a dereference op every time I write "${}", but it's better than "load constant on stack, do deref" times 3 opcodes. Sometimes flexibility isn't so flexible.

      hv_common_key_len could use some refactoring to split out all the magic support into separate function calls, to keep it out of the CPU cache.

      Here are the top 18 fattest functions (actually all functions over 4 KB long) in ActivePerl 5.12 (Visual C -O1), by machine-code size.
      Function                  Size (hex bytes)
      _Perl_re_compile          00001021
      _Perl_gv_fetchpvn_flags   0000105F
      S_scan_const              000010EB
      S_regatom                 000012FB
      S_make_trie               0000148C
      _Perl_sv_vcatpvfn         00001620
      _Perl_yyparse             000016C2
      S_regclass                0000177A
      _Perl_do_sv_dump          00001A4D
      S_reg                     00001B5F
      _perl_clone_using         00001D3C
      S_unpack_rec              0000231F
      S_study_chunk             00002668
      S_pack_rec                00002729
      S_find_byclass            00002928
      _Perl_keyword             0000359D
      S_regmatch                0000438A
      _Perl_yylex               00008095
      I'm surprised un/pack is so fat. clone_using could use a couple of strategically placed memcpy calls rather than hundreds of double/quadword copies.
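The reference-caching trick described in this post can be sketched as follows. This is only an illustration: the %root structure, the sub names, and the loop counts are made up for the example, and it shows the shape of the technique rather than measuring any speedup.

```perl
use strict;
use warnings;

# Hypothetical nested hash, named after the example in the post.
my %root = ( l1 => { l2 => { l3 => 0 } } );

# Walks the full chain of hash lookups on every access.
sub bump_slow {
    my ($n) = @_;
    $root{l1}{l2}{l3}++ for 1 .. $n;
    return $root{l1}{l2}{l3};
}

# Takes one reference to the innermost element up front,
# then pays only a single scalar dereference per access.
sub bump_cached {
    my ($n) = @_;
    my $slot = \$root{l1}{l2}{l3};
    $$slot++ for 1 .. $n;
    return $$slot;
}

my $after_slow   = bump_slow(1000);
my $after_cached = bump_cached(1000);
print "$after_slow $after_cached\n";
```

Both subs update the same element, so the second call continues from where the first left off. To see whether the cached form is actually faster on a given perl, the core Benchmark module's cmpthese function is one way to compare the two.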

      Use B::Deparse and look at your opcode trees. Reduce the number of opcodes and your code is faster. Each ; has overhead (line-number switching for a debugger that isn't present). Use the comma operator sometimes to reduce nextstates.
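One way to check the nextstate claim is to count the ops directly with the core B module. The sketch below is an assumption-laden illustration: count_ops, with_semicolons, and with_commas are hypothetical names, and the counter only follows the main execution chain, so it is suitable for straight-line subs like these and not for code with branches. For a printed view of the op tree itself, B::Concise (for example perl -MO=Concise,-exec script.pl) is the core tool; B::Deparse prints reconstructed source instead.

```perl
use strict;
use warnings;
use B ();

# Count ops with the given name along a sub's main execution chain.
# Straight-line code only: alternate branches are not followed.
sub count_ops {
    my ($code, $name) = @_;
    my $count = 0;
    my $op = B::svref_2object($code)->START;
    while ($$op) {                  # a B::NULL object marks the end
        $count++ if $op->name eq $name;
        $op = $op->next;
    }
    return $count;
}

# One statement (and so one nextstate) per assignment.
sub with_semicolons {
    my ($x, $y, $z);
    $x = 1;
    $y = 2;
    $z = 3;
    return $x + $y + $z;
}

# The three assignments are joined into a single statement.
sub with_commas {
    my ($x, $y, $z);
    $x = 1, $y = 2, $z = 3;
    return $x + $y + $z;
}

my $semis  = count_ops(\&with_semicolons, 'nextstate');
my $commas = count_ops(\&with_commas,     'nextstate');
print "semicolons: $semis nextstate ops, commas: $commas\n";
```

The exact counts can vary between perl versions because of peephole optimizations, but the comma version should report fewer nextstate ops. As the reply below points out, those ops also carry the line numbers used in warnings and errors, so this is a trade-off rather than a free win.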

        In all fairness, the line number accounting is also important for warnings and errors.

        Your list would probably also be more useful if you dropped most of the compilation-related functions: at least yylex, scan_const, and yyparse. Worrying about regexp compilation probably isn't that useful either, so those functions could go, too. What are the runners-up?