Re: Array vs. Hash (newbie perlguts)
by tachyon (Chancellor) on Aug 30, 2001 at 18:42 UTC
As I understand it, a hash is more expensive than an array, and two hashes are more expensive than one hash due to the cost of setting up the buckets. In your case you want to waste some elements in the array, so the answer is GOK. The solution is suck it and see: run this sort of code and watch real memory usage. Make either the hash or the array and look at the memory hit.
use strict;
use warnings;

my $size = 20_000;

# make a hash, using every second integer as a key
my %hash;
for (my $key = 0; $key < $size; $key += 2) {
    $hash{$key} = [ 0 .. 100 ];
}

# make an array instead, wasting 50% of the slots
#my @ary;
#for (my $key = 0; $key < $size; $key += 2) {
#    $ary[$key] = [ 0 .. 100 ];
#}
print "Made big data structure, sleeping 20 seconds!\n";
sleep 20;
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
Re: Array vs. Hash (newbie perlguts)
by dragonchild (Archbishop) on Aug 30, 2001 at 18:24 UTC
perchance - this sounds like a case of premature optimization to me. While, yes, more knowledge is always a good thing, this isn't something you really need to worry about unless you're running short of RAM on your system. If that's happening, then you need to either buy more RAM or pick another implementation. *shrugs*
The best answer is to use the implementation that best fits your mental model. Don't change your model to fit what you think is optimal for the computer. Change the computer's model to fit what is optimal for you.
The way to think about this is that the computer's cycles are cheap and yours are expensive. The computer's cycles will never make a mistake, but yours will. The computer will never get bored; you will.
Unless you enjoy the mental gymnastics (and some of us do), then just be lazy, get the job done, and get on with your life. :)
------
We are the carpenters and bricklayers of the Information Age.
Vote paco for President!
Re: Array vs. Hash (newbie perlguts)
by runrig (Abbot) on Aug 31, 2001 at 02:45 UTC
The key to the hash is an integer of a reasonable size...
It all depends on what you mean by 'reasonable': how many unique keys you have, how much memory you have, how often you'll be iterating through non-existent array elements (with the array-of-arrays approach that won't matter memory-wise, but it will speed-wise), and whether you really need to worry about it at all. If you're only using half the elements, an array of arrays will still win on memory: hashes take roughly 5-10 times (a very rough estimate) more space for an equal number of keys.
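As a rough way to check that estimate yourself, something like the following sketch compares the two directly. This assumes the Devel::Size module from CPAN is installed (it is not in core), and the exact numbers will vary by perl version and platform:

use strict;
use warnings;
use Devel::Size qw(total_size);   # CPAN module, not in core

my (%hash, @ary);
for (my $key = 0; $key < 20_000; $key += 2) {
    $hash{$key} = 1;    # 10,000 keys
    $ary[$key]  = 1;    # 20,000 slots, half of them undef
}

printf "hash:  %d bytes\n", total_size(\%hash);
printf "array: %d bytes\n", total_size(\@ary);

Even with half its slots unused, the array usually comes out well ahead, since every hash key carries its own string storage and bucket overhead.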
lookup times for a hash are O(1) and for an array O(N).
That's ridiculous. If values are going to be indexed by integers anyway, then if anything it takes a bit longer to look up a key in a hash than to look it up as an array index. Maybe you're thinking of looking up things in a linked list, a la Lisp? Or, if you're strictly searching for a value without a key, then they're both O(n).
Update: Re: your reply: I figured that was probably what you were talking about, but the original poster seemed to want to index things by integers, and was just wondering whether to use a hash or a regular array, and the issues with doing one or the other. And I've actually seen code like the below in Perl programs, which made me want to whack the programmer over the head with a hash array :)
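For the lookup-speed side of the argument, Perl's core Benchmark module can compare the two cases directly. A minimal sketch (sizes and the probed index are arbitrary):

use strict;
use warnings;
use Benchmark qw(cmpthese);

my (%h, @a);
$h{$_} = $_ for 0 .. 9999;
$a[$_] = $_ for 0 .. 9999;

# run each sub for at least 1 CPU second and print a comparison table
cmpthese(-1, {
    hash  => sub { my $x = $h{5000} },
    array => sub { my $x = $a[5000] },
});

Both lookups are constant-time; the array access is typically somewhat faster because it skips hashing the key and probing a bucket.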
Re: Array vs. Hash (newbie perlguts)
by Ryszard (Priest) on Aug 31, 2001 at 02:20 UTC
I'm thinking the solution relies on the sparsity. If all your array elements are populated (i.e. your data is dense), I think the economies of scale in fine-tuning your model will outweigh any benefits you will get in memory saving.
If you know all the indices beforehand, I wouldn't go for a hash of arrays; why complicate things? $array[0] is easier to read than $hash{1}[0].
Since you've got 1..m dimensions in your data (represented by the hash element), why not split it out into 3 or 4 arrays, and only populate an array slot if there is a value to populate it with? This way you can make your data denser while conserving the space a hash doesn't really need. Of course this makes it more difficult to pass the data around, as you have to pass four structures (or references to them) rather than a single reference.

Without more specifics of the project and data model, it's a little too difficult to make any specific suggestions. As always, it looks like a trade-off; this time around it's resources against complexity.
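A minimal sketch of that parallel-arrays idea (the dimension names and the set_point helper are made up here for illustration):

use strict;
use warnings;

# Instead of $hash{$i} = [ $x, $y, $z ], keep one plain array per
# dimension and fill a slot only when there is data for it.
my (@x, @y, @z);    # hypothetical dimension names

sub set_point {
    my ($i, $x, $y, $z) = @_;
    ($x[$i], $y[$i], $z[$i]) = ($x, $y, $z);
}

set_point(4, 1.0, 2.0, 3.0);
print "y at index 4 is $y[4]\n";   # slots 0..3 stay undef, costing little

The trade-off is exactly as described above: denser storage per dimension, but three references to pass around instead of one.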