
Re: Space Efficiency of Hashes

by Stevie-O (Friar)
on Mar 16, 2005 at 00:52 UTC (#439830)


in reply to Space Efficiency of Hashes

Well, nothing beats a good TIAS with Devel::Size in the mix:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @printables = map chr, 33 .. 126;

    # random string 30 characters long
    sub randstr { join '', map { $printables[rand(@printables)] } 1..30 }

    use Devel::Size qw(size total_size);

    # format an integer with thousands separators
    sub commas {
        my $str = ''.reverse shift;
        return scalar reverse join ',', grep length, split /(.{3})/, $str
    }

    # the fat comma quotes its left operand, so this first key is the
    # literal string 'randstr'; only the value is a random string
    my %hash = ( randstr => randstr );

    for (0..5) {
        my $count = scalar keys %hash;
        print commas($count), " elements: ", commas(total_size\%hash), " bytes\n";
        # grow the hash tenfold before the next measurement
        for (1..9 * $count) {
            my $s = randstr;
            $hash{$s} = randstr;
        }
    }
    my $count = scalar keys %hash;
    print commas($count), " elements: ", commas(total_size\%hash), " bytes\n";

These are the results it printed out on my machine, for v5.8.4:

    1 elements: 176 bytes
    10 elements: 1,171 bytes
    100 elements: 11,249 bytes
    1,000 elements: 111,133 bytes
    10,000 elements: 1,135,573 bytes
    100,000 elements: 11,224,325 bytes
    1,000,000 elements: 111,194,341 bytes

(That last one took several minutes to complete on my system...)

Taking my last result (about 111 bytes per entry) and multiplying by five, it seems that 5 million {acc, build} pairs will take roughly 556 MB, only slightly more than half a gig of RAM, to store all those hash entries.
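
For anyone who wants to redo that arithmetic with their own numbers, here is a minimal sketch. It assumes memory keeps growing linearly past the last measurement, and that the real keys and values are about the same size as the 30-character test strings; the variable names are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Measured above: total_size for a hash with 1,000,000 entries.
my $bytes_per_million = 111_194_341;

# Assumption: linear growth, so 5 million entries cost five times as much.
my $entries   = 5_000_000;
my $estimate  = $bytes_per_million * ($entries / 1_000_000);
my $per_entry = $bytes_per_million / 1_000_000;

printf "per entry: %.1f bytes\n", $per_entry;
printf "estimate for %d entries: %d bytes (%.1f MB, %.1f MiB)\n",
    $entries, $estimate, $estimate / 1e6, $estimate / 2**20;
```

If the real records are longer or shorter than 30 characters, rerunning the benchmark with representative data is a safer bet than scaling this figure.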

So, if your machine has 2GB of RAM -- you should be good to go.

--Stevie-O
$"=$,,$_=q>|\p4<6 8p<M/_|<('=> .q>.<4-KI<l|2$<6%s!<qn#F<>;$, .=pack'N*',"@{[unpack'C*',$_] }"for split/</;$_=$,,y[A-Z a-z] {}cd;print lc


Re^2: Space Efficiency of Hashes
by Anonymous Monk on Mar 16, 2005 at 03:45 UTC

    Thanks (and to Joost as well): this indicates that I'll max out under 2 GB of RAM, leaving plenty free for other stuff running. I didn't think of just going ahead and testing overhead like that, but it seems a really obvious way to approach it in retrospect! Thanks for the enlightenment.

      You can also use Tie::SubsrHash, and the maximum memory required for your program will be around 300 MB.
        Nice approach. Fixed-size records would never have occurred to me. The correct name is Tie::SubstrHash.
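
        A minimal sketch of what that could look like, assuming 30-byte keys and 30-byte values as in the benchmark above. The sample key and value, the padding with sprintf, and the small table size are illustrative assumptions only; Tie::SubstrHash requires every key and every value to have exactly the declared length, and the table cannot grow past the size given at tie time.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Tie::SubstrHash;

# Assumed record layout: fixed 30-byte keys and 30-byte values.
my ($key_len, $value_len) = (30, 30);
my $table_size = 1_000;    # small for the demo; the real case would need 5_000_000

tie my %fixed, 'Tie::SubstrHash', $key_len, $value_len, $table_size;

# Pad the (hypothetical) key and value to the declared lengths;
# storing a string of any other length makes the module croak.
my $key   = sprintf '%-30s', 'example-key';
my $value = sprintf '%-30s', 'example-value';

$fixed{$key} = $value;
my $got = $fixed{$key};
print "fetched: '$got'\n";

# Raw payload for 5 million records, before the module's own per-record overhead.
my $payload = ($key_len + $value_len) * 5_000_000;
print "payload for 5,000,000 records: $payload bytes\n";
```

        The 60 bytes of key plus value per record account for the "around 300 MB" figure; the module adds some bookkeeping on top of that, so the real footprint will be somewhat higher.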

        2share!2flame...
