Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Re: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

by Corion (Patriarch)
on Sep 12, 2007 at 14:16 UTC ( [id://638562]=note: print w/replies, xml ) Need Help??


in reply to Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

I prefer to let the C part of Perl to do the chore of the work. Especially, if you can chuck work down to the regex engine. Or, even better, if you can pretend to use the regex engine but don't fire it up but get a simple character scan to work for you:

I split $s1 on \0 and then join it back with the corresponding substrings taken from $s2. That reduces the number of comparisons and substrings done in Perl.

sub map_split { my $ofs = 0; join "", map { $ofs += length; $_ => substr $s2, $ofs++, 1 } split + /\0/, $s1, -1; };

On this machine I get:

Rate split1 substr1 using_str_bit_ops_and_ +tr map_split split1 1.85/s -- -94% -10 +0% -100% substr1 32.9/s 1676% -- -9 +5% -98% using_str_bit_ops_and_tr 714/s 38390% 2067% +-- -62% map_split 1898/s 102263% 5663% 16 +6% --

Update: Weird - I would have imagined ikegami's method of using the binary AND of strings to be way faster than mine. And it's even faster than avar's, even though I think that avar's method does far less copying of data (and the use of @+ is far more elegant than my approach).

Replies are listed 'Best First'.
Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
by Corion (Patriarch) on Sep 12, 2007 at 14:35 UTC

    I was curious - my "goal" when jotting down the code was to use the minimal number of Perl ops and to let Perl do as much work as possible in the C parts of Perl. So I returned a list from map instead of returning a string:

    This returns the list and joins all the fragments later:

    sub map_split { my $ofs = 0; join "", map { $ofs += length; $_ => substr $s2, $ofs++, 1 } split + /\0/, $s1, -1; };

    This does a partial fragment concatenation in the map and returns a single string, but still does the concatenation later, but of half as many elements - one additional op in the map call:

    sub map_split_join { my $ofs = 0; join "", map { $ofs += length; $_ . substr $s2, $ofs++, 1 } split +/\0/, $s1, -1; };

    I didn't expect the difference to be that big:

    Rate map_split_join subst map_split map_split_join 1697/s -- -7% -13% subst 1828/s 8% -- -6% map_split 1949/s 15% 7% --

    So, I can only recommend reading/listening to When Perl Is Not Quite Fast Enough. Undoubtedly, there is still lots of potential in optimizing this.

Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
by moritz (Cardinal) on Sep 12, 2007 at 14:50 UTC
    Since there are not so many bytes to substitute it pays off to search for these rare cases instead of working "blindly" with bit ops.

    Creating $mask takes one pass through the string, and then there are three binary bit ops on the string(s). That makes four passes, while only 1/256 of the bytes being zero bytes.

    Thus a solution that actually looks for zeros (one pass) may have an overhead of 3*256 compared to the bit operations to be still as fast. (Rough estimation only)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://638562]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (3)
As of 2024-04-24 01:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found