puterboy has asked for the wisdom of the Perl Monks concerning the following question:
I am using a large hash (millions of entries) as a cache.
The keys are (sparsely spaced) unsigned integers.
The values are a 32-hex-character string followed by an optional unsigned integer.
Storing the hash normally as $hash{<u_integer>}=<32 hex> takes about 190 bytes per entry (as measured with Devel::Size).
However, knowing the format, it seems like I should be able to pack the 32 hex characters into 16 bytes. Similarly, the optional unsigned integer string should pack more efficiently than a character string.
Perhaps I could even save on the key storage, knowing that the keys are unsigned integers.
I am looking for better "packing", not a fancy compression scheme. Note that I tried various combinations of pack, such as pack("H32l", <32hex>, <uint>), but that gave only about a 25% saving. There must be a better way of packing (assuming I am willing to sacrifice a little speed). If the key is on the order of 4 bytes and the value on the order of 16-20 bytes, I would think I could do better than 150-200 bytes per entry, which is almost 90% overhead. Or maybe hashes are by necessity that inefficient...
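For concreteness, here is a minimal sketch of the kind of value packing described above. The helper names pack_value and unpack_value are hypothetical, and the sketch assumes the optional integer fits in 32 bits, so it uses the unsigned N format rather than the l from the attempt quoted above.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only (not from any module).
# Pack a 32-hex-character string into 16 bytes, plus an optional
# unsigned 32-bit integer (assumption: the integer fits in 32 bits).
sub pack_value {
    my ($hex, $uint) = @_;
    return defined $uint
        ? pack('H32 N', $hex, $uint)   # 16 + 4 = 20 bytes
        : pack('H32', $hex);           # 16 bytes
}

# Reverse of pack_value: returns ($hex) or ($hex, $uint) in list context.
sub unpack_value {
    my ($packed) = @_;
    return length($packed) > 16
        ? unpack('H32 N', $packed)
        : unpack('H32', $packed);
}

my %cache;
$cache{123456} = pack_value('0123456789abcdef0123456789abcdef', 42);

my ($hex, $uint) = unpack_value($cache{123456});
printf "%s %s (%d bytes packed)\n", $hex, $uint, length $cache{123456};
```

The packed payload here is only 16 or 20 bytes, so whatever remains of the per-entry figure reported by Devel::Size would be the hash's own bookkeeping for each key and each value scalar, which pack by itself does not touch.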
Any suggestions?