PerlMonks  

Re: Space Efficiency of Hashes

by Stevie-O (Friar)
on Mar 16, 2005 at 00:52 UTC


in reply to Space Efficiency of Hashes

Well, nothing beats a good TIAS (Try It And See) with Devel::Size in the mix:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @printables = map chr, 33 .. 126;

    # random string 30 characters long
    sub randstr { join '', map { $printables[rand(@printables)] } 1..30 }

    use Devel::Size qw(size total_size);

    # format an integer with thousands separators
    sub commas {
        my $str = ''.reverse shift;
        return scalar reverse join ',', grep length, split /(.{3})/, $str
    }

    # note: "=>" quotes the bareword on its left, so this first key
    # is the literal string 'randstr', not a random one
    my %hash = ( randstr => randstr );

    for (0..5) {
        my $count = scalar keys %hash;
        print commas($count), " elements: ", commas(total_size\%hash), " bytes\n";
        # add 9 * $count new entries, growing the hash tenfold
        for (1..9 * $count) {
            my $s = randstr;
            $hash{$s} = randstr;
        }
    }
    my $count = scalar keys %hash;
    print commas($count), " elements: ", commas(total_size\%hash), " bytes\n";

These are the results it printed out on my machine, for v5.8.4:

            1 elements:         176 bytes
           10 elements:       1,171 bytes
          100 elements:      11,249 bytes
        1,000 elements:     111,133 bytes
       10,000 elements:   1,135,573 bytes
      100,000 elements:  11,224,325 bytes
    1,000,000 elements: 111,194,341 bytes

(That last one took several minutes to complete on my system...)

Taking my last result and multiplying by five, it seems that 5 million {acc, build} pairs will take only slightly more than half a gig of RAM to store all those hash entries.

So, if your machine has 2GB of RAM -- you should be good to go.
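
The extrapolation above is plain proportional scaling of the 1,000,000-element measurement. Here is a minimal sketch of that arithmetic, assuming memory grows linearly with the number of entries and that the real {acc, build} pairs are about the size of the 30-character test strings (both are assumptions, and the variable names are only illustrative):

```perl
use strict;
use warnings;

# Measured above on perl v5.8.4: total_size of the 1,000,000-element hash.
my $measured_count = 1_000_000;
my $measured_bytes = 111_194_341;

# Target size from the discussion: 5 million pairs.
my $target_count = 5_000_000;

# Linear scaling; assumes the per-entry cost stays constant as the hash grows.
my $bytes_per_entry = $measured_bytes / $measured_count;
my $estimated_bytes = $measured_bytes * $target_count / $measured_count;

printf "%.1f bytes per entry\n", $bytes_per_entry;
printf "%d bytes for %d entries (%.0f MiB)\n",
    $estimated_bytes, $target_count, $estimated_bytes / 2**20;
```

The smaller measurements give similar per-entry figures (roughly 111 to 117 bytes), which is what makes the linear assumption look reasonable here.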

--Stevie-O
$"=$,,$_=q>|\p4<6 8p<M/_|<('=> .q>.<4-KI<l|2$<6%s!<qn#F<>;$, .=pack'N*',"@{[unpack'C*',$_] }"for split/</;$_=$,,y[A-Z a-z] {}cd;print lc


Re^2: Space Efficiency of Hashes
by Anonymous Monk on Mar 16, 2005 at 03:45 UTC

    Thanks (and to Joost as well): this indicates that I'll max out under 2 GB of RAM, leaving plenty free for other stuff running. I didn't think of just going ahead and testing overhead like that, but it seems a really obvious way to approach it in retrospect! Thanks for the enlightenment.

      You can also use Tie::SubsrHash, and the maximum memory required for your program will be around 300 MB.
        Nice approach. Fixed-size records would never have occurred to me. The correct name is Tie::SubstrHash.

        2share!2flame...
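
For reference, a minimal sketch of the Tie::SubstrHash approach mentioned above. The module (part of the standard Perl distribution) keeps fixed-length keys and values in one preallocated string, so the key length, value length, and table capacity are declared up front. The lengths, the capacity, the sample data, and the pad helper below are illustrative assumptions, not the original poster's data:

```perl
use strict;
use warnings;
use Tie::SubstrHash;

# Illustrative sizes only; real key/value lengths depend on your data.
my ($key_len, $val_len, $capacity) = (8, 12, 1_000);

tie my %pairs, 'Tie::SubstrHash', $key_len, $val_len, $capacity;

# Hypothetical helper: space-pad or truncate a string to exactly $len
# characters, since the module rejects keys and values of any other length.
sub pad { my ($str, $len) = @_; return sprintf "%-${len}.${len}s", $str }

$pairs{ pad('AB000001', $key_len) } = pad('build 35', $val_len);

my $value = $pairs{ pad('AB000001', $key_len) };
$value =~ s/\s+\z//;    # strip the padding again
print "$value\n";

# Payload-only estimate for 5 million 30-character keys and values;
# the module's per-record and spare-slot overhead comes on top of this.
my $payload_bytes = 5_000_000 * (30 + 30);
printf "%d MB of raw key/value data\n", $payload_bytes / 1_000_000;
```

A raw payload of 300 MB is consistent with the "around 300 MB" figure quoted above if keys and values are about 30 characters each, but that is an inference rather than something stated in the thread.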
