go ahead... be a heretic PerlMonks

Loops loosing Performance

by PerlingTheUK (Hermit)
 on Jun 09, 2005 at 23:07 UTC Need Help??
PerlingTheUK has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have currently a set of values for weekdays, that is stored as integer. To clarify here are example binary values:
Mon = 0b0000001 = 1
Tue = 0b0000010 = 2
Mon to Fri = 0b0011111 = 31
Sat to Sun = 0b1100000 = 96
I need to know how many days of running two sets of data have in common. This is a simple task, however I got stuck in a vicious circle between simplicity and performance. I st up five test functions and compared their speed. It did not surprise me much, that the fastes way is a bitwise evaluation. But if I ran through all days with a loop, the function is a lot slower. Can anyone suggest a better way to approach this? Please find the current code below.
```sub test4 {

my ( \$iDays, \$iCmp ) = @_;

my \$iRes = 0;
for ( 1, 2, 4, 8, 16, 32, 64 ){
if ( \$iDays & \$iCmp & \$_ ){
\$iRes++;
}
}

return \$iRes;
}

sub test5 {

my ( \$iDays, \$iCmp ) = @_;

my \$iRes = 0;
\$iRes++ if ( \$iDays &  \$iCmp &  1 );
\$iRes++ if ( \$iDays &  \$iCmp &  2 );
\$iRes++ if ( \$iDays &  \$iCmp &  4 );
\$iRes++ if ( \$iDays &  \$iCmp &  8 );
\$iRes++ if ( \$iDays &  \$iCmp & 16 );
\$iRes++ if ( \$iDays &  \$iCmp & 32 );
\$iRes++ if ( \$iDays &  \$iCmp & 64 );

return \$iRes;
}

cmpthese(\$count, {
'Test4' => sub { test4( int ( rand(127) ), int( rand(127) ) ) },
'Test5' => sub { test5( int ( rand(127) ), int( rand(127) ) ) }
});

__DATA__
Rate Test4 Test5
Test4 162690/s    --  -46%
Test5 300000/s   84%    --
The performance lost by the loop is quite steep, but the code nicer.

Cheers,
PerlingTheUK

Replies are listed 'Best First'.
Re: Loops loosing Performance
by kaif (Friar) on Jun 09, 2005 at 23:53 UTC

Are you a hacker? Are you not worried if your code looks like crazy old magic? Then you'll love this solution:

```sub test0 {
return (((\$_[0] & \$_[1]) * 2113665) & 17895697) % 15;
}

(Go ahead! Check that it's correct for all input values!) Benchmarks on my computer, against your code and hv's:

```          Rate Test4 Test5 Test7 Test6 Test0
Test4  64233/s    --  -33%  -33%  -52%  -72%
Test5  95523/s   49%    --   -1%  -29%  -58%
Test7  96399/s   50%    1%    --  -28%  -58%
Test6 133602/s  108%   40%   39%    --  -41%
Test0 227756/s  255%  138%  136%   70%    --

Are you a hacker? Don't you know the C precedence table by heart? Then why do you need those crazy parentheses around the multiplication?

Isn't this enough:

```sub test1 {
((\$_[0] & \$_[1]) * 2113665 & 17895697) % 15;
}

Anyway, I like your method. Nice idea to search in HAKMEM. As you can see below, I've given a less concise solution.

Tried and tested it, but next year by this time, I still want to understand what the heck I did. Unless set a link in my comments that would be impossible.

Cheers,
PerlingTheUK
Kaif, I'd looove to understand what's going on there in test0. I'm not going to ask you for an explanation because I reckon that'd take some time, but would you mind pointing me in the right direction? What do I need to learn first in order to take that apart? Thanks.

Here's a hint: 0b1000000100000010000001 == 2113665 and 0b1000100010001000100010001 == 17895697. I'd gladly explain what's going on, but it seems that some people have already done it online, so I'll link to them instead. In short, multiplying by the first value "repeats" the input four times, while the second knocks out all but every fourth bit. Finally, we "cast out 15s".

Online resources: spurperl, below, linked to Bit Twiddling Hacks by Sean Eron Anderson of Stanford which contains this and many other bit hacks. Anderson, in turn, links to A Modulo Based Bit Counting Approach For 64 Bit Systems by Roshan James, an explanation of my solution above (more or less). Simply note the differences in integer size (they have 64 bits, while most of us only have 32) and requirements (our input is at most 7 bits, and they allow 16). I personally was inspired by a section of the HAKMEM, as I linked above (this also has a difference in size --- the PDP-1110 had 36-bit integers --- and requirements --- 9 bit input).

Re: Loops loosing Performance
by hv (Parson) on Jun 09, 2005 at 23:18 UTC

For speed, I'd suggest:

```  sub test6 {
my \$n = \$_[0] & \$_[1];
my \$count = 0;
++\$count, \$n &= \$n - 1 while \$n;
\$count;
}

For beauty I'd suggest hiding that behind a suitable function name - it costs an extra function call, but you can probably lose that in the way you call it when not benchmarking:

```  sub countbits {
my(\$n, \$count) = (\$_[0], 0);
++\$count, \$n &= \$n - 1 while \$n;
return \$count;
}
sub test7 {
my(\$iDays, \$iCmp) = @_;
return countbits(\$iDays & \$iCmp);
}

Benchmarks:

```          Rate Test4 Test5 Test7 Test6
Test4 106192/s    --  -38%  -40%  -57%
Test5 172463/s   62%    --   -3%  -31%
Test7 176987/s   67%    3%    --  -29%
Test6 248686/s  134%   44%   41%    --

Hugo

Thank you, test6 is what I was looking for. Quick but not a line for each bit of byte.

Cheers,
PerlingTheUK
Re: Loops loosing Performance
by spurperl (Priest) on Jun 10, 2005 at 05:02 UTC
What you have here is the classical "bit counting problem". By ANDing the two numbers all you need is find the amount of 1s in the result. The bit counting problem is a terrific example of the tradeoffs between space and time efficiency, and is one my my favorite interview questions.

You can find a lot of info about it online, for example here.

To make a long story short, the fastest technique is table lookup. Precompute the counts for all bytes (256 of them) in a table and do a simple lookup to find results later.

What you've been shown by kaif and others are the more arcane and enjoyable methods, but table lookup will be faster.

spurperl is right, table lookup is fastest. Using the notation op for original poster, lt for "lookup table", and the rest obvious, the overall benchmark looks something like this:

```         Rate  op4  op5  hv7  hv6 roy4 kaif   lt
op4   64233/s   -- -33% -33% -52% -57% -72% -76%
op5   96098/s  50%   --  -0% -28% -36% -58% -64%
hv7   96399/s  50%   0%   -- -28% -35% -58% -64%
hv6  134032/s 109%  39%  39%   -- -10% -41% -50%
roy4 149447/s 133%  56%  55%  12%   -- -34% -45%
kaif 227019/s 253% 136% 135%  69%  52%   -- -16%
lt   270514/s 321% 181% 181% 102%  81%  19%   --

Just be sure to initialize your table outside of your function, that is, initialize only once.

Re: Loops losing Performance
by Roy Johnson (Monsignor) on Jun 10, 2005 at 01:47 UTC
Yet another way to do it. Somewhat slower and less arcane than kaif's, but maybe adaptable to other situations. And it saves a few keystrokes. :-)
```# Reused your sub name
sub test4 {
return unpack("%4b*", pack('C', \$_[0] & \$_[1]));
}
__END__
Rate Test5 Test4 Test0
Test5  88345/s    --  -34%  -47%
Test4 133240/s   51%    --  -20%
Test0 167395/s   89%   26%    --

Caution: Contents may have been coded under pressure.
Re: Loops loosing Performance
by ambrus (Abbot) on Jun 10, 2005 at 10:08 UTC

The problem is essentially to count the number of bits in an integer. There's a good solution for this, which I'll show now.

This subroutine from MMIXware is supposed to do exactly this. This is for a 32-bit word though, not just 7 bits.

Now if you understand how this works, you can probably convert it to efficent perl. I'm not sure I understand it in full, so the solution I'll write here may not be the same, thus may be slower.
```#!perl
use warnings;
use strict;

# the following subroutines couns the number of bits in an 8-bit integ
+er

sub sadd8_a { # my first solution -- without looking closely to your c
+ode
my(\$x) = @_;
my \$n = 0;
for (my \$k = 1; \$k < 0x100; \$k <<= 1) {
0 != (\$x & \$k) and \$n++;
}
\$n;
}

my ( \$iDays ) = @_;

my \$iRes = 0;
for ( 1, 2, 4, 8, 16, 32, 64, 128 ){
if ( \$iDays & \$_ ){
\$iRes++;
}
}

return \$iRes;
}

my(\$x) = @_;
\$x = (\$x & 0b01010101) + (\$x >> 1 & 0b01010101);
\$x = (\$x & 0b00110011) + (\$x >> 2 & 0b00110011);
(\$x & 0b00001111) + (\$x >> 4);
}

# let's print an example
for my \$n (0 .. 31) {
}
print "\n";

# test
for my \$n (0 .. 255) {
\$a == \$b && \$b == \$c or
die "wrong results: \$n \$a \$b \$c";
}

__END__

Create A New User
Node Status?
node history
Node Type: perlquestion [id://465349]
Approved by fauria
Front-paged by fauria
help
Chatterbox?
 [hippo]: Welcome. Tell me, have you read perlintro? That's a great place to start. [thao4]: I seach to do that since a week but I can't found the answer in google. Thank for your help [Eily]: hello thao4 (and hello all :D), you should [id://http:// perlmonks.org/ index.pl?node_id= 479#post|open a question on Seekers of Perl Wisdom], be sure to include whatever you have tried that didn't work, or only does the job partially [thao4]: yes, I do a little perl since some years but not fréquently, I can correct the triggers in Clearcase of IBM, but write the code : no [Eily]: meh, such good link ... [Eily]: here [Eily]: still not xD there

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (9)
As of 2018-02-21 09:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
When it is dark outside I am happiest to see ...

Results (276 votes). Check out past polls.

Notices?