|Keep It Simple, Stupid|
You are lucky, because I am free to work on this for cPanel for the next few years :) cPanel has been compiling perl for more than 10 years, and most of their speedup has come from optimizing the perl code, not the compiler. The problem is that perl, the language, is too dynamic to be optimizable. There is too much slowdown (magic, module imports, ...), and too much can change at run-time. The language is also not clear enough: the ops carry the information the optimizer needs, not the data. So an optimizer has to deal with a lot of run-time behaviour and side effects.
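To make "changes can happen at run-time" concrete, here is a minimal sketch in plain perl. The sub names `area` and `compute` are made up for illustration. The call inside `compute` looks statically bindable, but the sub behind the name can be swapped at any time, so a compiler cannot safely bind or inline it ahead of time.

```perl
use strict;
use warnings;

# A small sub that looks like an ideal inlining candidate.
sub area { my ($r) = @_; return 3 * $r * $r }

# The call to area() is resolved through the symbol table at run-time.
sub compute { return area(2) }

my $before = compute();

{
    no warnings 'redefine';
    # Any code, at any time, can replace the sub behind the name.
    *area = sub { my ($r) = @_; return 4 * $r * $r };
}

# Same call site, different result: compute() now reaches the new sub.
my $after = compute();

print "before: $before, after: $after\n";
```

If `area(2)` had been inlined into `compute` at compile-time, the second call would silently return the stale result.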
If you only need better startup time, B::C is already stable and good enough. For optimized run-time more is needed.
I did some research and lobbying over the last few years and came up with these approaches:
Did you see my unofficial (not yet announced) proposals on github? I'm working on const and a type system to enable future optimizations. Time-frame: 1-2 years.
perl design draft - types and const
perl types pod
perldata.pod: add const and coretypes
B::CC or another optimizing compiler can only profit from const and types.
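As a rough illustration of why compile-time constness helps an optimizer, here is a sketch using the existing `constant` pragma and B::Deparse. It is not the const syntax from the proposals above. The names `DEBUG` and `$runtime_debug` are made up for the example. A branch guarded by a compile-time constant can be dropped before the code ever runs, while a branch guarded by an ordinary variable has to stay.

```perl
use strict;
use warnings;
use B::Deparse;

use constant DEBUG => 0;    # value is fixed at compile-time

my $runtime_debug = 0;      # ordinary variable, may change at any moment

my $with_const = sub {
    if (DEBUG) { print "expensive const diagnostics\n" }
    return 1;
};

my $with_var = sub {
    if ($runtime_debug) { print "expensive var diagnostics\n" }
    return 1;
};

# Deparse turns the compiled op tree back into source text,
# which shows what survived compilation.
my $deparse    = B::Deparse->new;
my $const_text = $deparse->coderef2text($with_const);
my $var_text   = $deparse->coderef2text($with_var);

print "constant condition:\n$const_text\n\n";
print "variable condition:\n$var_text\n";
```

The deparsed text of the first sub should no longer contain the print statement, while the second keeps its test and its branch.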
After the YAPC we'll gather in Stavanger, Norway to discuss how to get the MOP proposals into perl.
Of course a MOP will slow down perl (others disagree), but as perl is currently hardly optimizable at all and more and more slowdown has been introduced over the years, the MOP could be used to allow more compile-time optimizations and more efficient class, method and property handling.
Getting an optimizing compiler ready for YAPC::US sounds like a good plan to me, but is of course way too optimistic. B::CC already exists and passes most tests. Fixing the remaining failures is where I would start.
Function calls are also super-slow with perl. Getting faster function calls by analyzing the code at compile-time and omitting unnecessary exception handling, @_ handling, scope setup and so on would also be a worthwhile goal, up to inlining. Unfortunately it is not possible to discuss such things on p5p. Not even the much simpler, feasible XS call improvements.
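A quick way to see the call overhead for yourself is a small benchmark with the core Benchmark module. This is only a sketch: the sub names and the loop size are arbitrary, and the numbers will vary by perl version and machine. Both subs compute the same sum, one through a sub call per iteration and one with the addition written out by hand.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Trivial sub: almost all of its cost is the call itself
# (argument passing via @_, scope entry and exit).
sub add { my ($x, $y) = @_; return $x + $y }

my $n = 100_000;

sub sum_with_calls {
    my $s = 0;
    $s = add($s, $_) for 1 .. $n;
    return $s;
}

# Same computation with the body of add() inlined by hand.
sub sum_inlined {
    my $s = 0;
    $s = $s + $_ for 1 .. $n;
    return $s;
}

# Run each variant for at least one CPU second and compare the rates.
cmpthese(-1, {
    with_calls => \&sum_with_calls,
    inlined    => \&sum_inlined,
});
```

The gap between the two rows is roughly what a compiler could win back by inlining such small subs, provided it could prove the sub is never redefined.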
The next idea will be Go-like concurrency in p5, with a changed runloop to saturate multiple native threads on multiple CPUs automatically, and some Go-like language extensions for IPC. But I guess this vision is way too much for p5p. And even parrot is not there yet. In the current p5p process environment hardly anything can get done. It's management anti-patterns all over, straight from the textbook.