Re: Re: Re: A short meditation about hash search performance
by liz (Monsignor) on Nov 15, 2003 at 21:57 UTC
So what it says is that the chance of running into the worst case in the analysis I gave is probably reduced.
Indeed. The impetus for the random key hashing scheme was the potential for a DoS attack when a fixed key hashing scheme was used. So 5.8.1 introduced a random seed for hashing keys. However, for long-running perl processes (think mod_perl), it was conceivable that the hash seed could be guessed from the program's performance on various inputs. Since there was a binary compatibility issue as well, several schemes were tried out to fix both problems.
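To make the attack concrete, here is a minimal sketch in Python (used only as a neutral illustration language). Everything in it is an assumption for the sake of the example: `toy_hash` is a one-at-a-time style mixer I picked for illustration and is not claimed to be Perl's actual hash function, and `BUCKETS`, `bucket` and `colliding_keys` are hypothetical names. It shows that keys chosen to collide under a known seed are only guaranteed to collide under that seed.

```python
BUCKETS = 64  # hypothetical fixed table size for the demonstration


def toy_hash(key: str, seed: int) -> int:
    """One-at-a-time style mixing with the seed as the initial state (toy only)."""
    h = seed & 0xFFFFFFFF
    for byte in key.encode("utf-8"):
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def bucket(key: str, seed: int) -> int:
    """Bucket index of a key for a given seed."""
    return toy_hash(key, seed) % BUCKETS


def colliding_keys(seed: int, count: int) -> list:
    """Brute-force `count` keys that all land in bucket 0 for a KNOWN seed."""
    found = []
    i = 0
    while len(found) < count:
        key = f"k{i}"
        if bucket(key, seed) == 0:
            found.append(key)
        i += 1
    return found


# The attacker knows (or has guessed) seed 0 and prepares 50 colliding keys.
attack = colliding_keys(seed=0, count=50)

# Buckets those keys occupy under the known seed and under a different one.
fixed = {bucket(k, 0) for k in attack}
reseeded = {bucket(k, 0x9E3779B9) for k in attack}

print("distinct buckets with known seed:", len(fixed))
print("distinct buckets with other seed:", len(reseeded))
```

With the known seed, all 50 keys sit in a single chain by construction. With a different seed, the same keys would normally be expected to spread over many buckets, which is why a seed that can be guessed puts the attacker back in business.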
Once people realized that this is really a general performance issue, it started to make sense to make the algorithm self-adapting, depending on the length of the lists of keys that hash to the same bucket.
Abigail-II did a lot of benchmarking on it. Maybe Abigail-II would like to elaborate?
A list length of 1 for every bucket (that is, no two keys sharing the same hash bucket) would be optimal if there were no other "costs" involved. However, re-hashing the existing keys is not something to be done lightly, especially if the number of existing keys is high. So you need to find the best possible balance between list length and re-hashing. In that respect, the ideal list length is not 1!
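The trade-off can be sketched with a toy chained hash table, again in Python and again purely as an illustration. This is not how perl's internals are written: `ChainedTable`, `max_chain`, `moves` and `build` are hypothetical names, `zlib.crc32` merely stands in for a hash function, and the table doubles at most once per insert to keep things simple. The sketch counts how many keys have to be moved during re-hashing for a strict limit (1) and a more relaxed one (4).

```python
import zlib


class ChainedTable:
    """Toy separate-chaining table that doubles when a chain gets too long."""

    def __init__(self, max_chain: int, buckets: int = 8):
        self.max_chain = max_chain            # hypothetical tuning knob
        self.buckets = [[] for _ in range(buckets)]
        self.moves = 0                        # keys re-inserted while growing

    def _index(self, key: str, size: int) -> int:
        # crc32 is a stand-in hash so that runs are reproducible.
        return zlib.crc32(key.encode("utf-8")) % size

    def insert(self, key: str) -> None:
        chain = self.buckets[self._index(key, len(self.buckets))]
        if key in chain:
            return
        chain.append(key)
        if len(chain) > self.max_chain:
            self._grow()                      # at most one doubling per insert

    def _grow(self) -> None:
        old = self.buckets
        self.buckets = [[] for _ in range(2 * len(old))]
        for chain in old:
            for key in chain:
                self.buckets[self._index(key, len(self.buckets))].append(key)
                self.moves += 1               # the cost of re-hashing

    def longest_chain(self) -> int:
        return max(len(chain) for chain in self.buckets)


def build(max_chain: int, n: int = 200) -> ChainedTable:
    """Insert n distinct keys into a fresh table with the given chain limit."""
    table = ChainedTable(max_chain)
    for i in range(n):
        table.insert(f"key{i}")
    return table


strict = build(max_chain=1)
relaxed = build(max_chain=4)

for name, t in (("strict", strict), ("relaxed", relaxed)):
    print(name, "buckets:", len(t.buckets),
          "longest chain:", t.longest_chain(), "keys moved:", t.moves)
```

The strict table never ends up with fewer buckets or longer chains than the relaxed one, but it pays for that in memory and in re-hashing work. Where the sweet spot lies depends on the real costs, which is exactly what benchmarking has to settle.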