Amendil has asked for the wisdom of the Perl Monks concerning the following question:
Hello Perl Monks,
I'm working on a TSV file in which one column holds a comma-separated list of keywords (28 unique values). I'd like to compute the Jaccard index (intersection / union) for these keyword lists. To do so efficiently, I'd like to use a bit array to represent each list of keywords.
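As a minimal sketch of the representation described above, each keyword can be given a fixed bit position so that one keyword field becomes a single integer. The vocabulary and the helper name `keywords_to_mask` are hypothetical, chosen only for illustration; the real data has 28 unique keywords, which still fits in a 32-bit integer.

```perl
use strict;
use warnings;

# Hypothetical keyword vocabulary; the real file has 28 unique values.
my @vocabulary = qw(red green blue yellow);

# Assign each keyword a fixed bit position: red => 0, green => 1, ...
my %bit_for;
@bit_for{@vocabulary} = 0 .. $#vocabulary;

# Turn one comma-separated keyword field into an integer bitmask.
sub keywords_to_mask {
    my ($field) = @_;
    my $mask = 0;    # start from a number, not an empty string
    for my $keyword (split /,/, $field) {
        die "unknown keyword '$keyword'" unless exists $bit_for{$keyword};
        $mask |= 1 << $bit_for{$keyword};
    }
    return $mask;
}

printf "%04b\n", keywords_to_mask('red,green');     # 0011
printf "%04b\n", keywords_to_mask('green,blue');    # 0110
```

With masks built this way, intersection and union reduce to `&` and `|` on two integers.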
I tried reading a few articles on PerlMonks and Stack Overflow, but so far I feel I'm missing something completely obvious.
Here is what I wrote:
    use common::sense;

    my $a = '';
    my $b = '';

    $a += 1 << 0;
    $a += 1 << 1;
    $b += 1 << 1;
    $b += 1 << 2;

    my $i = $a & $b;
    my $u = $a | $b;

    my $i_cnt = unpack '%32b*', $i;
    my $u_cnt = unpack '%32b*', $u;

    printf "a is %#032b %d\n", $a, $a;
    printf "b is %#032b %d\n", $b, $b;
    printf "intersection is %#032b %d\n", $i, $i;
    printf "union is %#032b %d\n", $u, $u;

    say "set bit count in intersection: $i_cnt";
    say "set bit count in union: $u_cnt";
Actual result:
    a is 0b000000000000000000000000000011 3
    b is 0b000000000000000000000000000110 6
    intersection is 0b000000000000000000000000000010 2
    union is 0b000000000000000000000000000111 7
    set bit count in intersection: 3
    set bit count in union: 5
Expected result:
    a is 0b000000000000000000000000000011 3
    b is 0b000000000000000000000000000110 6
    intersection is 0b000000000000000000000000000010 2
    union is 0b000000000000000000000000000111 7
    set bit count in intersection: 1
    set bit count in union: 3
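A likely explanation for the mismatch: `unpack '%32b*'` counts the bits of the scalar's string form. `$i` holds the number 2, which stringifies to the character "2" (byte 0x32, three bits set), and `$u` stringifies to "7" (byte 0x37, five bits set). That matches the observed counts of 3 and 5. The sketch below packs the integer into raw bytes before counting. The helper names `popcount` and `jaccard` are illustrative, and the `N` format assumes the masks fit in 32 bits, which holds for 28 keywords. It also uses `$x` and `$y` rather than `$a` and `$b`, since those two are the globals that `sort` uses.

```perl
use strict;
use warnings;

# Count the set bits of an integer that fits in 32 bits.
# pack 'N' yields the number's 4 raw bytes, so the unpack checksum
# sees the actual bits rather than the ASCII digits of "2" or "7".
sub popcount {
    my ($n) = @_;
    return unpack('%32b*', pack('N', $n));
}

# Jaccard index of two bitmasks: |intersection| / |union|.
sub jaccard {
    my ($x, $y) = @_;
    my $union = popcount($x | $y);
    return $union ? popcount($x & $y) / $union : 0;    # two empty sets: defined as 0 here
}

my ($x, $y) = (0b011, 0b110);
printf "set bit count in intersection: %d\n", popcount($x & $y);    # 1
printf "set bit count in union: %d\n",        popcount($x | $y);    # 3
printf "jaccard index: %.3f\n",               jaccard($x, $y);      # 0.333
```

Initialising the masks to `0` instead of `''` also keeps `&` and `|` as numeric operators; if both operands are strings, Perl applies them bytewise to the string contents.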
Replies are listed 'Best First'.
Re: bit array comparison
by rjt (Curate) on Oct 22, 2019 at 16:32 UTC
by Amendil (Novice) on Oct 22, 2019 at 17:46 UTC
by rjt (Curate) on Oct 23, 2019 at 01:16 UTC
Re: bit array comparison
by tybalt89 (Monsignor) on Oct 22, 2019 at 16:50 UTC
by Amendil (Novice) on Oct 22, 2019 at 17:29 UTC
Re: bit array comparison
by ikegami (Patriarch) on Oct 22, 2019 at 14:19 UTC
Re: bit array comparison
by rsFalse (Chaplain) on Oct 22, 2019 at 14:08 UTC
by Amendil (Novice) on Oct 22, 2019 at 14:28 UTC
by Athanasius (Archbishop) on Oct 23, 2019 at 03:05 UTC