http://www.perlmonks.org?node_id=1034723


in reply to Re: performance of counting keys in a big hash
in thread performance of counting keys in a big hash

A good piece of research, however the possibility exists that the two results are related and generating the first one also pre-generates the second one. I suggest that tests like these are best done in separate methods on separate data just to be sure.

#!/usr/bin/perl use strict; use warnings; use Benchmark qw(:all) ; my $n = 100000; our %hash1; @hash1{1..$n} = 1 x $n; my $n2 = 100000; our %hash2; @hash2{1..$n2} = 1 x $n2; my $n3 = 1000000; our %hash3; @hash3{1..$n3} = 1 x $n3; cmpthese( 1000, { 'Method 1' => sub { my @k = keys %hash1; my $v = scalar @k; }, '100K' => sub { my $v = scalar keys %hash2; }, '1 Mill' => sub { my $v = scalar keys %hash3; } } );
Results:
Rate Method 1 1 Mill + 100K Method 1 15.0/s -- -100% + -100% 1 Mill 999999999999999872/s 6645700000000000000% -- + 0% 100K 999999999999999872/s 6645700000000000000% 0% + --
Note result appears to be same for 100k and 1Million entries in hash.
If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)

Replies are listed 'Best First'.
Re^3: performance of counting keys in a big hash
by hdb (Monsignor) on May 22, 2013 at 12:13 UTC

    This is a valuable comment, I had not thought of that. I have reversed the order and get the same results. Does this provide enough proof or could there be a caveat as well?

    To be really sure one would have to run separate scripts?

      I'm just in the process of tidying up my comment by putting in some code that actually works, as opposed to that I first thought of! :-).

      I think reversing the tests is probably enough proof; I don't believe separate scripts would prove anything extra.

      I am using Benchmark to compare my results instead of using manual timing.

      If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)