Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling

Re: Why are hashes so much faster?

by plaid (Chaplain)
on Jun 18, 2000 at 01:53 UTC ( #18684=note: print w/replies, xml ) Need Help??

in reply to Why are hashes so much faster?

For a simple lookup at a given index, an array cannot be beaten for pure speed. I'm assuming what you were trying to do in the original script was check to see if a certain value was contained in an array. I will use a bit of terminology from lhoward's excellent post here, namely the big-O notation.

Hashing is done by taking the key of the hash, performing a "hashing" function on this key, and obtaining a number from it. Then, this number is used as an index into an array. A good hashing function will produce an array that is fairly well populated. For instance, if your key is 'name', a function will be called with the argument of 'name'. Some pseudocode might look like

my $random = hashing_function('name'); $array_of_hash_values[$random] = $value;
The hashing_function is usually something that calculates a value based on the string, and performs a modulo (%) by the fixed size of the array. Then, for any subsequent lookups into the hash table, you run the key through the hashing_function again, and take whatever is at that index.
sub lookup_value { my $index = hashing_function(shift()); return $array_of_hash_values[$index]; }
If more than one key hashes to the same index, usually an array is built off of that array, which is why hashes are -ideally- O(1), but not necessarily.

Array lookups are always O(1), and faster than hash lookups since they don't have to perform any hashing function beforehand. My guess is that you were having to do some traversing of the array to look for a certain value, in which case, your lookup function would be O(n), where n is the size of the array. By switching to a hash table, you were able to increase the speed of this lookup function. Other places where a hash would help would be in a delete function, as you could just get rid of the value, instead of having to worry about shifting all values down if the deleted index is in the middle of the array.

In general, hashes are much more efficient than arrays for anything that needs to have a certain relationship between index and its value, but arrays are much faster than hashes if all you want is a list of values where the index is unimportant.

Hope this all makes sense.

Replies are listed 'Best First'.
RE: Re: Why are hashes so much faster?
by Anonymous Monk on Jun 18, 2000 at 04:19 UTC
    (I'm the one who posted the question...I can't get my logins here
    to "take," and everything I post ends up under the name
    "Anonymous Monk" whether I log in or not. Sigh...)

    > My guess is that you were having to do some traversing
    > of the array to look for a certain value

    Yes, it was a case of<KBD> if ( $_ eq 'whatever' ) </KBD>while looping through the array.
    I thought perhaps it'd be faster via "grep" (don't know why I thought so)--
    it wasn't. The solution via hash was stunningly faster.

    Thanks for these replies. I don't get all of it yet, but slowly, slowly...

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://18684]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (3)
As of 2022-07-05 13:10 GMT
Find Nodes?
    Voting Booth?

    No recent polls found