Re^4: Detecting whether UV fits into an NV
by syphilis (Bishop)
on Mar 05, 2020 at 01:38 UTC
My first guess is that the overhead in SvUV(ST(i)) is causing twiddle to be slower
Yes, I thought of replacing them with a variable, but decided there wouldn't be that much difference between looking at the value of an SV's IV slot and looking at the value of an IV.
I guess for a few calls there's not much difference, but when you're making 36 million of them it's not hard to believe that the overhead adds up, and I should have thought that through a little better. (Actually, a "lot better".)
Fixing that alone makes uv_fits_double_bitfiddle almost twice as fast as uv_fits_double3 for me:
This is pretty much the type of approach whose existence I had wondered about.
It had never been pointed out to me that iv & -iv would identify the least significant set bit, and I'm certainly not sharp enough to have ever realized it myself.
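For anyone else who hasn't met the trick, here is a minimal sketch in plain C of how isolating the lowest set bit can drive the fits-in-a-double test. It is not the uv_fits_double_bitfiddle code discussed in this thread: the names lowest_set_bit and uv_fits_double_sketch are made up for illustration, uint64_t stands in for a 64-bit UV, and it assumes an NV is an IEEE 754 double with a 53-bit significand.

```c
#include <stdint.h>

/* Isolate the least significant set bit of v (returns 0 when v is 0).
 * Written as v & (0 - v), which is the unsigned form of iv & -iv. */
static uint64_t lowest_set_bit(uint64_t v)
{
    return v & (0 - v);
}

/* Hypothetical helper: returns 1 if v can be stored exactly in a double.
 * Assumption: a double carries a 53-bit significand (IEEE 754 binary64). */
static int uv_fits_double_sketch(uint64_t v)
{
    if (v == 0)
        return 1;

    /* Dividing by the lowest set bit (a power of two) strips the
     * trailing zero bits, leaving the odd part of v. */
    uint64_t odd_part = v / lowest_set_bit(v);

    /* v is exactly representable when its odd part needs at most 53 bits. */
    return odd_part < ((uint64_t)1 << 53);
}
```

Under those assumptions, 2**53 + 2 passes (its odd part is 2**52 + 1) while 2**53 + 1 does not, because it needs 54 significant bits.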
This method is just brilliant ... and it's great that it turns out to be faster, too!
I'll certainly be using it (with due credit to you) unless further testing, contrary to my expectations, reveals some problem with it.