PerlMonks
It will depend on whether you have longish sequences of contiguous ones or zeros.
The one set (of 25 sets) of indexes that I've analyzed consists of 88 x 31MB vectors. They vary between 86% and 98% sparse (measured by zero bytes rather than zero bits). The largest 0 runs range between 8 and 12 million bits; the largest 1 run is 67 bits. By packing the run counts as 0/1 pairs into 32-bit words, 24 bits for the 0 run and 8 bits for the 1 run, I can reduce the size by more than two-thirds and am still able to perform boolean operations without decompressing first. For the underlying principles see http://crd.lbl.gov/~kewu/ps/LBNL-49626.pdf.
In reply to Re^2: Run length encode a bit vector
by Anonymous Monk
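To make the packing described above concrete, here is a minimal sketch in Python (not Perl, and not the poster's actual code). It assumes the 0-run count sits in the high 24 bits of each word and the 1-run count in the low 8 bits; that layout, the overflow handling (a run too long for its field spills into the next word), and all function names are illustrative assumptions. `and_compressed` shows one way a boolean AND might walk two compressed streams run by run, assuming both vectors have the same length.

```python
from itertools import groupby

ZERO_BITS, ONE_BITS = 24, 8          # field widths taken from the post
MAX_ZERO = (1 << ZERO_BITS) - 1      # longest 0 run one word can hold
MAX_ONE = (1 << ONE_BITS) - 1        # longest 1 run one word can hold

def pack_runs(runs):
    """Pack (bit, length) runs into 32-bit words: zeros in the high 24 bits,
    ones in the low 8 bits (assumed layout). Adjacent same-bit runs merge."""
    words = []
    zeros = ones = 0

    def flush():
        nonlocal zeros, ones
        if zeros or ones:
            words.append((zeros << ONE_BITS) | ones)
        zeros = ones = 0

    for bit, length in runs:
        while length:
            if bit == 0:
                if ones:                      # a 0 run after ones starts a new word
                    flush()
                take = min(length, MAX_ZERO - zeros)
                if take == 0:                 # zero field is full: spill over
                    flush()
                    continue
                zeros += take
            else:
                take = min(length, MAX_ONE - ones)
                if take == 0:                 # one field is full: spill over
                    flush()
                    continue
                ones += take
            length -= take
    flush()
    return words

def unpack_runs(words):
    """Yield (bit, length) runs from packed words, skipping empty fields."""
    for w in words:
        zeros, ones = w >> ONE_BITS, w & MAX_ONE
        if zeros:
            yield 0, zeros
        if ones:
            yield 1, ones

def encode(bits):
    """Run-length encode an iterable of 0/1 values."""
    return pack_runs((b, sum(1 for _ in g)) for b, g in groupby(bits))

def decode(words):
    """Expand packed words back into a list of 0/1 values."""
    out = []
    for bit, n in unpack_runs(words):
        out.extend([bit] * n)
    return out

def and_compressed(a, b):
    """AND two packed vectors of equal length without expanding them to bits."""
    ra, rb = unpack_runs(a), unpack_runs(b)

    def result():
        bit_a, len_a = next(ra, (0, 0))
        bit_b, len_b = next(rb, (0, 0))
        while len_a and len_b:
            n = min(len_a, len_b)             # overlap of the two current runs
            yield bit_a & bit_b, n
            len_a -= n
            len_b -= n
            if not len_a:
                bit_a, len_a = next(ra, (0, 0))
            if not len_b:
                bit_b, len_b = next(rb, (0, 0))

    return pack_runs(result())

a = [0] * 10 + [1] * 3 + [0] * 5 + [1] * 2
b = [1] * 12 + [0] * 8
packed = and_compressed(encode(a), encode(b))
assert decode(packed) == [x & y for x, y in zip(a, b)]
```

With these field widths a single word covers a 0 run of up to 16,777,215 bits and a 1 run of up to 255 bits, which is enough for the 8 to 12 million bit 0 runs and the 67-bit 1 run mentioned above. OR and XOR could follow the same run-walking pattern with a different operator in `result()`.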