http://www.perlmonks.org?node_id=1015564
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

Given a bit vector \$v, this: my \$c = unpack '%32b*', \$v; will count the set bits very efficiently.

However, I need to count the set bits preceding (not including) a given bit position within \$v.

I can do that using:

```## \$p = position
my \$c = 0; vec( \$v, \$_, 1 ) and ++\$c for 0 .. \$p - 1;
[download]```

But that is grossly inefficient for large vectors.

Can you do it more efficiently?

Note:The rest of the code is pure perl, so I'd rather avoid going to XS to do this if possible.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re: Efficient bit counting with a twist.
by AnomalousMonk (Chancellor) on Jan 27, 2013 at 09:48 UTC

Maybe I just don't understand the question, but assuming  unpack '%32b*', ...   is sufficiently efficient, wouldn't a count on the  'b' template specifier serve to count the set bits from the beginning of the string up to but not including a given bit offset?

```>perl -wMstrict -le
"my \$b = qq{\x01\x03\x07\xff\xfe\xfc};
;;
print '          1111111111222222222233333333334444444444';
print '0123456789' x 5;
print unpack 'b*', \$b;
print ''
;;
for my \$offset (0, 1, 7 .. 10, 15 .. 20, 24 .. 28, 30 .. 32) {
my \$c = unpack qq{%32b\$offset}, \$b;
my \$vc = vec_sum(\$b, \$offset);
die qq{pack sum \$c != vec sum \$vc} if \$c != \$vc;
printf qq{before bit at offset %2d: %2d bits set \n}, \$offset, \$c;
}
;;
sub vec_sum {
my (\$v, \$p) = @_;
;;
my \$c = 0;
vec(\$v, \$_, 1) and ++\$c for 0 .. \$p - 1;
return \$c;
}
"
1111111111222222222233333333334444444444
01234567890123456789012345678901234567890123456789
100000001100000011100000111111110111111100111111

before bit at offset  0:  0 bits set
before bit at offset  1:  1 bits set
before bit at offset  7:  1 bits set
before bit at offset  8:  1 bits set
before bit at offset  9:  2 bits set
before bit at offset 10:  3 bits set
before bit at offset 15:  3 bits set
before bit at offset 16:  3 bits set
before bit at offset 17:  4 bits set
before bit at offset 18:  5 bits set
before bit at offset 19:  6 bits set
before bit at offset 20:  6 bits set
before bit at offset 24:  6 bits set
before bit at offset 25:  7 bits set
before bit at offset 26:  8 bits set
before bit at offset 27:  9 bits set
before bit at offset 28: 10 bits set
before bit at offset 30: 12 bits set
before bit at offset 31: 13 bits set
before bit at offset 32: 14 bits set
[download]```

Update: Example code changed to include check against  vec sum algorithm given in OP; also made things a bit prettier.

Whatever makes you think it could possibly be that obviously simple? (<<< a link!)

Re: Efficient bit counting with a twist.
by davido (Archbishop) on Jan 27, 2013 at 07:23 UTC

I'm hesitant to post this because I haven't had the time to turn an idea into a viable solution (and won't have time tonight). I haven't tested this against your code or even on a large vector. It's just an idea that hit me. Then as I was looking for other ideas on algorithms I came across Hamming Weight, which mentions this:

With unlimited memory, we could simply create a large lookup table of the Hamming weight of every 64 bit integer.

Well, obviously we don't have unlimited memory, but using an array as a lookup for 16-bit integers would allow you to count 16 bits at a time rather than a single bit at a time. So this is a quick rough-draft. It doesn't take into consideration bit strings that are of lengths that aren't evenly divisible by 16. It's just an incomplete proof of concept before I go to bed... there's work to be done on it before it's a solution, and of course the work needs to be followed by benchmarks. :)

```my @lookup;

for( 0 .. 65535 ) {
my \$bits = sprintf "%b", \$_;
my \$count = \$bits =~ tr/1/1/;
push @lookup, \$count;
}

my \$bitstring = '';

vec( \$bitstring, 0, 32 ) = 1234567891;

my \$count = 0;

for( 0 .. length( \$bitstring ) / 2 - 1 ) {
\$count += \$lookup[ vec( \$bitstring, \$_, 16 ) ];
}

print \$count, "\n";
[download]```

Dave

Re: Efficient bit counting with a twist.
by quester (Vicar) on Jan 27, 2013 at 07:59 UTC

Build a bit mask for the byte that contains the position. Then count the bits in the bytes before the position, and do a bitwise and for the byte that contains the position.

```        my \$mask = pack "V", 2 ** (\$p % 8) - 1;
my \$c = unpack( '%32b*', substr( \$v, 0, \$p / 8 ) )
+ unpack( '%32b*', substr( \$v, \$p / 8, 1 ) & \$mask );
[download]```

(Update: removed a superfluous pair of parens in the last line.)

Why pack 'V'?

substr( \$v, \$p / 8, 1 ) is a byte, why mask with 4 bytes?

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
div class=Why
Oh. That. It's just force of habit, I adapted this bit of code from the way I code bitmasks for IPv4 subnets. It would make more sense to code it as pack 'C',... rather than pack 'V', although it works either way.
Re: Efficient bit counting with a twist.
by johngg (Abbot) on Jan 27, 2013 at 23:27 UTC

I'm not sure how efficient substr is with large strings but it generally seems pretty fast. Using it to count set bits in the whole bytes before your position and then, if necessary, those bits in the partial byte up to but not including it via a mask might be viable.

```use strict;
use warnings;

use 5.014;

say join q{}, map { sprintf q{%-10s}, \$_ } 0 .. 8;
say qq{@{ [ join q{}, 0 .. 9 ] }} x 9;

my \$vec = pack q{C*}, map ord, q{A} .. q{K};
say unpack q{B*}, \$vec;
say qq{Total set bits - @{ [ unpack q{%32b*}, \$vec ] }};

say qq{Set bits to \$_ - @{ [ setBitsB4pos( \ \$vec, \$_ ) ] }}
for 70 .. 87;

sub setBitsB4pos
{
my( \$rsVec, \$pos ) = @_;
my \$wholeBytes = int \$pos / 8;
my \$oddBits    = \$pos % 8;
my \$count      = unpack q{%32b*},
substr \${ \$rsVec }, 0, \$wholeBytes;

return \$count unless \$oddBits;

my \$mask = pack q{C*}, ( 0 ) x \$wholeBytes, do {
my \$acc;
\$acc += 2 ** ( 8 - \$_ ) for 1 .. \$oddBits;
\$acc;
};
\$count += unpack q{%32b*},
substr( \${ \$rsVec }, 0, \$wholeBytes + 1 ) & \$mask;

return \$count;
}
[download]```

The output.

```0         1         2         3         4         5         6
+7         8
0123456789012345678901234567890123456789012345678901234567890123456789
+01234567890123456789
0100000101000010010000110100010001000101010001100100011101001000010010
+010100101001001011
Total set bits - 31
Set bits to 70 - 23
Set bits to 71 - 23
Set bits to 72 - 24
Set bits to 73 - 24
Set bits to 74 - 25
Set bits to 75 - 25
Set bits to 76 - 25
Set bits to 77 - 26
Set bits to 78 - 26
Set bits to 79 - 27
Set bits to 80 - 27
Set bits to 81 - 27
Set bits to 82 - 28
Set bits to 83 - 28
Set bits to 84 - 28
Set bits to 85 - 29
Set bits to 86 - 29
Set bits to 87 - 30
[download]```

I hope this is useful.

Cheers,

JohnGG

Using it to count set bits in the whole bytes before your position and then, if necessary, those bits in the partial byte up to but not including it via a mask might be viable.

Yes. That works and is the same scheme questor came up with (rather more compactly:) in Re: Efficient bit counting with a twist..

But the best answer is the one AnomalousMonk pointed out in Re: Efficient bit counting with a twist..

Ie. To recognise that the unpack template '%32b*' is not a indivisible token saying 'count the bits', but actually contains 3 parts:

1. %32 accumulate a 32-bit count ...
2. b of the binary bits set ...
3. * for all the bits in the string.

And that by simply interpolating \$p into the template, in place of *, it counts the set bits within the first \$p bits of the string.

Perl had the problem solved; I just didn't recognise it :) (Hence my: D'oh! D'oh! D'oh!..... moment. :)

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Thanks for the explanation. I wasn't getting his solution at first until I saw your description of the steps. For those who need a little more (as I did): perlpacktut. It's funny, I'm sure we've all read that document numerous times, and yet before this I never noticed the % feature.

Update: Fixed perlunpacktut: sb perlpacktut.

Dave

Re: Efficient bit counting with a twist.
by jethro (Monsignor) on Jan 27, 2013 at 11:23 UTC

NOP. Idea was mentioned already

Re: Efficient bit counting with a twist.
by trizen (Hermit) on Jan 28, 2013 at 13:45 UTC
my \$c = unpack("%32b*", pack "B\$p", unpack "B\$p", \$v);

Note that \$p start with zero.

Note that that sequence of unpack/pack/unpack achieves exactly the same thing as AnomalousMonk's unpack "%32b\$p".

But the latter avoids turning a 500MB string into a 4 billion element list of 1s and 0s before converting it back to the 500MB string; which is what that sequence of unpack/pack/unpack does.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Maybe I'm crazy, but this doesn't seem to work:
```#!/usr/bin/perl

use strict;
use warnings;

my \$mask = hex(\$ARGV[0]);

my \$setbits = unpack("%32b*", \$mask);

printf("0x%x has %d set\n", \$mask, \$setbits);
[download]```

Then I run it:
./bittest 0xff
0xff has 11 set

Or worse:
./bittest 0x1001
0x1001 has 14 set

Even the simple case:
./bittest 0
0x0 has 2 set

What is going on here?