I'm not sure how efficient substr is with large strings, but it generally seems pretty fast. One approach that might be viable: use substr to pull out the whole bytes before your position and count their set bits with unpack's checksum template (%32b*). Then, if the position falls inside a byte, mask that partial byte so that only the bits up to but not including the position are counted.

`use strict;
use warnings;
use 5.014;

# Column rulers: tens on the first line, units on the second.
say join q{}, map { sprintf q{%-10s}, $_ } 0 .. 8;
say qq{@{ [ join q{}, 0 .. 9 ] }} x 9;

# Eleven bytes, 'A' .. 'K', displayed most significant bit first.
my $vec = pack q{C*}, map ord, q{A} .. q{K};
say unpack q{B*}, $vec;
say qq{Total set bits - @{ [ unpack q{%32b*}, $vec ] }};
say qq{Set bits to $_ - @{ [ setBitsB4pos( \ $vec, $_ ) ] }}
    for 70 .. 87;

# Count the set bits before (not including) bit position $pos.
sub setBitsB4pos
{
    my( $rsVec, $pos ) = @_;
    my $wholeBytes = int $pos / 8;
    my $oddBits    = $pos % 8;

    # Set bits in the whole bytes preceding the position.
    my $count = unpack q{%32b*},
        substr ${ $rsVec }, 0, $wholeBytes;
    return $count unless $oddBits;

    # Mask: zero bytes for the part already counted, then the top
    # $oddBits bits of the partial byte.
    my $mask = pack q{C*}, ( 0 ) x $wholeBytes, do {
        my $acc;
        $acc += 2 ** ( 8 - $_ ) for 1 .. $oddBits;
        $acc;
    };
    $count += unpack q{%32b*},
        substr( ${ $rsVec }, 0, $wholeBytes + 1 ) & $mask;
    return $count;
}
`

The output.

`0         1         2         3         4         5         6         7         8
012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
0100000101000010010000110100010001000101010001100100011101001000010010010100101001001011
Total set bits - 31
Set bits to 70 - 23
Set bits to 71 - 23
Set bits to 72 - 24
Set bits to 73 - 24
Set bits to 74 - 25
Set bits to 75 - 25
Set bits to 76 - 25
Set bits to 77 - 26
Set bits to 78 - 26
Set bits to 79 - 27
Set bits to 80 - 27
Set bits to 81 - 27
Set bits to 82 - 28
Set bits to 83 - 28
Set bits to 84 - 28
Set bits to 85 - 29
Set bits to 86 - 29
Set bits to 87 - 30
`
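For very long strings, the mask above is as long as the prefix being examined. A possible variant, sketched here under the same most-significant-bit-first numbering, masks only the one partial byte. The name set_bits_before is hypothetical and not part of the code above, and I have not benchmarked it against the original.

```perl
use strict;
use warnings;

# Hypothetical variant of setBitsB4pos: only the single partial byte is
# masked, so no mask string as long as the prefix is built.
sub set_bits_before {
    my ( $rs_vec, $pos ) = @_;
    my $whole_bytes = int( $pos / 8 );
    my $odd_bits    = $pos % 8;

    # The checksum template counts set bits in the whole-byte prefix.
    my $count = unpack( q{%32b*}, substr( ${$rs_vec}, 0, $whole_bytes ) );
    return $count unless $odd_bits;

    # Keep the top $odd_bits bits of the next byte, then count them.
    my $byte = ord( substr( ${$rs_vec}, $whole_bytes, 1 ) );
    my $mask = ( 0xFF << ( 8 - $odd_bits ) ) & 0xFF;
    return $count + unpack( q{%8b*}, chr( $byte & $mask ) );
}

my $vec = join q{}, q{A} .. q{K};
print set_bits_before( \$vec, 70 ), "\n";    # 23, matching the output above
```

This assumes $pos lies within the string; a position past the end would need its own check.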

I hope this is useful.
