Indeed. The impetus for the random key hashing scheme was the potential for a DoS attack when a fixed key hashing scheme was used. So 5.8.1 introduced a random seed for hashing keys. However, for long-running perl processes (think mod_perl), it was conceivable that the hash seed could be guessed from the performance of the program on various inputs. Since there was a binary compatibility issue as well, schemes were tried out to fix both.
Once people realized you're really talking about a general performance issue, it started to make sense to make the algorithm self-adapting, depending on the length of the lists of keys that end up with the same hash key.
Abigail-II did a lot of benchmarking on it. Maybe Abigail-II would like to elaborate?
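To make the "self-adapting" idea concrete, here is a minimal toy model of a chained hash table that doubles its bucket array whenever the list for one bucket grows past a threshold. It is a sketch only, written in Python rather than taken from perl's C source: the class name `ChainedTable`, the threshold `max_chain=4`, the `cap` safety limit and the use of CRC32 as the hash function are all assumptions made for illustration, not what perl actually does.

```python
import zlib


class ChainedTable:
    """Toy separate-chaining hash table that doubles when a chain gets too long."""

    def __init__(self, buckets=8, max_chain=4, cap=1 << 16):
        self.buckets = [[] for _ in range(buckets)]
        self.max_chain = max_chain  # hypothetical threshold, not perl's value
        self.cap = cap              # safety limit on the number of buckets
        self.moves = 0              # keys re-inserted while growing

    def _chain(self, key):
        # CRC32 stands in for the real hash function (an assumption).
        return self.buckets[zlib.crc32(key.encode()) % len(self.buckets)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))
        # Adapt: keep doubling while the new key's chain is over the limit.
        while len(self._chain(key)) > self.max_chain and len(self.buckets) < self.cap:
            self._grow()

    def _grow(self):
        # Re-hash every existing key into a bucket array twice the size.
        old = self.buckets
        self.buckets = [[] for _ in range(len(old) * 2)]
        for chain in old:
            for k, v in chain:
                self._chain(k).append((k, v))
                self.moves += 1

    def get(self, key, default=None):
        for k, v in self._chain(key):
            if k == key:
                return v
        return default

    def longest_chain(self):
        return max(len(c) for c in self.buckets)


table = ChainedTable()
for i in range(1000):
    table.insert(f"key{i}", i)
print(len(table.buckets), table.longest_chain(), table.moves)
```

The `moves` counter is the interesting part: every doubling touches all keys already stored, which is the re-hashing cost that has to be weighed against shorter lists.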
(If that's the case, the document liz provided shall not be there, as the queue length would always be 1 ...
A same-hash-key list length of 1 for all hash keys would be optimal if there were no other "costs" involved. However, re-hashing the existing keys is not something to be done lightly, especially if the number of existing keys is high. So you need to find the best possible combination of same-hash-key list length and re-hashing. In that respect, the ideal same-hash-key list length is not 1!
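The trade-off can be tried out with a small simulation. The sketch below (Python, a toy model and not perl's implementation) inserts the same keys under different list-length limits and reports how many buckets were needed and how many keys had to be moved by re-hashing. The function name `simulate`, the starting size of 8 buckets, the `cap` limit and CRC32 as the hash function are all assumptions for illustration.

```python
import zlib


def simulate(keys, max_chain, start_buckets=8, cap=1 << 18):
    """Insert distinct keys into a chained table; return (bucket_count, keys_moved)."""
    buckets = [[] for _ in range(start_buckets)]
    moved = 0
    for key in keys:
        h = zlib.crc32(key.encode())  # stand-in hash function (an assumption)
        buckets[h % len(buckets)].append(h)
        # Double the table until the new key's list is within the limit.
        while len(buckets[h % len(buckets)]) > max_chain and len(buckets) < cap:
            old = buckets
            buckets = [[] for _ in range(len(old) * 2)]
            for chain in old:
                for stored in chain:
                    buckets[stored % len(buckets)].append(stored)
                    moved += 1  # one existing key re-hashed
    return len(buckets), moved


keys = [f"key{i}" for i in range(200)]
for limit in (1, 2, 4, 8):
    size, moved = simulate(keys, limit)
    print(f"max list length {limit}: {size} buckets, {moved} keys moved")
```

Insisting on a limit of 1 never needs fewer buckets than a looser limit, and it forces at least one bucket per key, typically many more. How much re-hashing each limit causes depends on the keys, so the numbers are worth measuring rather than guessing.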
Re: A short meditation about hash search performance
by Abigail-II (Bishop) on Nov 16, 2003 at 02:54 UTC