Re: '%hash = ()' is slower than 'undef %hash'
by roboticus (Chancellor) on May 18, 2018 at 14:20 UTC
For many applications, that's a false economy. If you're going to reuse the hash container, then it seems to me to be cheaper to clear the hash (%hash = ()) than to destroy the hash container (undef %hash) and then recreate it. There are three cases to compare:
Overwriting the hash is obviously the fastest, as you needn't clear or destroy the container. Of course for many applications you'd have the added headache of ensuring that old data and current data don't mix.
Clearing the hash container allows you to reuse the hash without mixing old and current data, but might appear to be slower than simply deleting the hash container.
Deleting the hash container might appear to be faster until you also account for the time it takes to recreate the hash container when you next use it. It matters a little more than it appears, though: clearing the hash leaves the container the same size, so reusing a cleared hash is slightly faster than destroying and recreating it, because perl can avoid many of the container resize operations as it adds the keys. On the positive side, though, destroying the container may allow your application to reclaim some memory in the event that some datasets have significantly more keys than are ordinarily needed. (Although I expect that would be as insignificant as the savings from the resizes just mentioned.)
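The point about the container keeping its size can be checked directly. Here is a minimal sketch, assuming a perl recent enough that Hash::Util provides bucket_stats; the helper bucket_count and the 1000-key test size are made up for illustration:

```perl
use strict;
use warnings;
use Hash::Util ();

# Hypothetical helper: number of allocated buckets, which is the
# second value returned by Hash::Util::bucket_stats.
sub bucket_count {
    my ($href) = @_;
    my (undef, $buckets) = Hash::Util::bucket_stats($href);
    return $buckets;
}

my %h;
$h{"key$_"} = 1 for 1 .. 1000;    # arbitrary test size
my $after_fill = bucket_count(\%h);

%h = ();                          # clear: keys go, bucket array stays
my $after_clear = bucket_count(\%h);

undef %h;                         # destroy: bucket array is released
my $after_undef = bucket_count(\%h);

print "filled: $after_fill, cleared: $after_clear, undef'd: $after_undef\n";
```

The cleared hash should report the same bucket count as the filled one, while the undef'd hash should drop back to a small default, so refilling it has to grow the bucket array again.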
At least that's how I see it... I'm providing the benchmark so you can point out what I may be missing...
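The benchmark itself is not reproduced in this copy of the post. As a stand-in, here is a minimal sketch of how the three cases could be compared with the core Benchmark module. It is not the original code: the key list, the fill helper, and the 1000-key size are all assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data set: 1000 distinct keys.
my @keys = map { "key$_" } 1 .. 1000;

# One long-lived hash per strategy, so each test reuses its own container.
my (%over, %clear, %undef);

# Fill the given hash and return its key count.
sub fill {
    my ($h) = @_;
    $h->{$_} = 1 for @keys;
    return scalar keys %$h;
}

my %tests = (
    overwrite => sub { fill(\%over) },                  # no reset at all
    clear     => sub { %clear = ();   fill(\%clear) },  # keep the container
    undef     => sub { undef %undef;  fill(\%undef) },  # destroy, then rebuild
);

# Run each test for about one CPU second and print a comparison table.
cmpthese(-1, \%tests);
```

Each test includes the refill, so the cost of recreating the container after undef is counted along with the cost of destroying it. The timings will vary with perl version, key count, and machine.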
Update: It seems that the old behavior of scalar(%h) changed in version 5.25.3 from displaying "buckets used/bucket count" to simply the number of keys in the hash. (A poor idea, in my opinion.) Anyway, with the Hash::Util function bucket_stats we can still get the information. I've edited the text and benchmark accordingly, and rearranged things a little for readability.
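For anyone who wants the old-style figure back, here is a small sketch of rebuilding it from bucket_stats, assuming only the first three return values (key count, bucket count, buckets used) are needed; the 100-key hash is an arbitrary example:

```perl
use strict;
use warnings;
use Hash::Util ();

my %h = map { $_ => 1 } 1 .. 100;    # arbitrary example data

# First three values: number of keys, allocated buckets, buckets in use.
my ($keys, $buckets, $used) = Hash::Util::bucket_stats(\%h);

# Same shape as the string older perls returned from scalar(%h).
my $ratio = "$used/$buckets";

print "keys: $keys, used/total buckets: $ratio, scalar(%h): ", scalar(%h), "\n";
```

On newer perls the last field prints a plain number, while older ones print a used/total string there.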
Update 2: Added the bit about how destroying the container allows you to reclaim memory, as a possible benefit.
When your only tool is a hammer, all problems look like your thumb.