http://www.perlmonks.org?node_id=990512


in reply to Re^9: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
in thread Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

(I'll read the pdf later and come back and comment on it further, but for now):

Imagine if -- manually for now -- each SV was born marked (or, better, unmarked) as carrying magic or not. Say a single bit in the (unused) least-significant 3 or 4 bits of the SV's address was used to flag when an SV had magic attached to it.
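If I've understood the suggestion, that's the classic pointer-tagging trick: stash a flag in the otherwise-unused low bits of an aligned pointer. Here's a minimal, self-contained C sketch of that idea -- purely illustrative, with made-up names; it is not how perl actually stores or looks up SVs:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical illustration only: use bit 0 of an aligned pointer
 * to mean "this object has magic attached". */
#define TAG_MAGIC ((uintptr_t)1)

typedef struct { int dummy; } FakeSV;   /* stand-in, not a real SV */

static FakeSV *mark_magic(FakeSV *sv) {
    return (FakeSV *)((uintptr_t)sv | TAG_MAGIC);
}

static int has_magic(FakeSV *sv) {
    return ((uintptr_t)sv & TAG_MAGIC) != 0;
}

static FakeSV *untag(FakeSV *sv) {
    return (FakeSV *)((uintptr_t)sv & ~TAG_MAGIC);
}

int main(void) {
    /* malloc() returns suitably aligned memory, so bit 0 starts out clear */
    FakeSV *sv = malloc(sizeof *sv);
    assert(((uintptr_t)sv & TAG_MAGIC) == 0);

    printf("magic? %d\n", has_magic(sv));   /* 0 */
    sv = mark_magic(sv);
    printf("magic? %d\n", has_magic(sv));   /* 1 */

    free(untag(sv));
    return 0;
}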

But magic is a dynamic, runtime thing, not something that can be determined at the time an SV is created. For example:

use Devel::Peek;
my $x = "abc";
Dump $x;     # here $x is not magic
$x =~ /./g;
Dump $x;     # now it is

So I don't see how that could work.

Also, most binary perl ops have a macro call, tryAMAGICbin_MG(), at the top of the function. It ORs together the flags fields of the top two SVs on the stack, then checks whether the combined flags word indicates that either SV is magic or overloaded; if so, it calls a function that handles this. The rest of the pp_ function can then proceed on the assumption that its args aren't special:

#define tryAMAGICbin_MG(method, flags) STMT_START { \
        if ( ((SvFLAGS(TOPm1s)|SvFLAGS(TOPs)) & (SVf_ROK|SVs_GMG)) \
                && Perl_try_amagic_bin(aTHX_ method, flags)) \
            return NORMAL; \
    } STMT_END
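For anyone not fluent in the perl internals macros, here is a stripped-down, self-contained C sketch of the same pattern -- OR the two operands' flag words together and take a single branch to a slow path if any "special" bit is set. The type and flag names below are invented for illustration; only the macro above is perl's actual code:

#include <stdio.h>

/* Invented flag bits, standing in for SVf_ROK / SVs_GMG etc. */
enum {
    FLAG_OVERLOADED = 1 << 0,
    FLAG_MAGIC      = 1 << 1,
};

typedef struct {
    unsigned flags;
    int      iv;      /* the plain integer value */
} Val;

/* Rarely-taken slow path: would handle magic and overloading. */
static int add_slow(const Val *a, const Val *b) {
    /* ... fetch magic values, call an overloaded "+" handler, etc. ... */
    return a->iv + b->iv;
}

/* The common case: one OR and one branch, then plain arithmetic. */
static int add(const Val *a, const Val *b) {
    if ((a->flags | b->flags) & (FLAG_OVERLOADED | FLAG_MAGIC))
        return add_slow(a, b);
    return a->iv + b->iv;
}

int main(void) {
    Val x = { 0, 1 };
    Val y = { FLAG_MAGIC, 2 };
    printf("%d %d\n", add(&x, &x), add(&x, &y));   /* fast path, slow path */
    return 0;
}

The point being that the common, non-magic case costs just one OR and one well-predicted branch before the real work starts.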

Consider the following trivial code:

my $x = 1;
my $y = 2;
my $z;
for (1..1000) {
    $z = $x + $y;
}

cachegrind shows that the tryAMAGICbin_MG() line at the top of pp_add() does 7 instruction reads, 3 data reads and 1 branch per call (with no significant cache or branch-predictor misses), while the pp_add() function as a whole does 85 instruction reads. So even if there were some way to know in advance to call a non-magic / non-overload version of pp_add (and I don't think there is), the best improvement you could theoretically achieve is 9% (85 instruction reads down to 78).

So you've shown me an example that (a) can't work, and (b) even if it did, doesn't show me that much could be gained.

Don't get me wrong, I want to be inspired; but I just haven't seen anything yet that indicates we could get more than that 10%.

Dave