Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Re^2: performance of counting keys in a big hash

by space_monk (Chaplain)
on May 22, 2013 at 11:54 UTC ( #1034723=note: print w/replies, xml ) Need Help??


in reply to Re: performance of counting keys in a big hash
in thread performance of counting keys in a big hash

A good piece of research, however the possibility exists that the two results are related and generating the first one also pre-generates the second one. I suggest that tests like these are best done in separate methods on separate data just to be sure.

#!/usr/bin/perl use strict; use warnings; use Benchmark qw(:all) ; my $n = 100000; our %hash1; @hash1{1..$n} = 1 x $n; my $n2 = 100000; our %hash2; @hash2{1..$n2} = 1 x $n2; my $n3 = 1000000; our %hash3; @hash3{1..$n3} = 1 x $n3; cmpthese( 1000, { 'Method 1' => sub { my @k = keys %hash1; my $v = scalar @k; }, '100K' => sub { my $v = scalar keys %hash2; }, '1 Mill' => sub { my $v = scalar keys %hash3; } } );
Results:
Rate Method 1 1 Mill + 100K Method 1 15.0/s -- -100% + -100% 1 Mill 999999999999999872/s 6645700000000000000% -- + 0% 100K 999999999999999872/s 6645700000000000000% 0% + --
Note result appears to be same for 100k and 1Million entries in hash.
If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)

Replies are listed 'Best First'.
Re^3: performance of counting keys in a big hash
by hdb (Monsignor) on May 22, 2013 at 12:13 UTC

    This is a valuable comment, I had not thought of that. I have reversed the order and get the same results. Does this provide enough proof or could there be a caveat as well?

    To be really sure one would have to run separate scripts?

      I'm just in the process of tidying up my comment by putting in some code that actually works, as opposed to that I first thought of! :-).

      I think reversing the tests is probably enough proof; I don't believe separate scripts would prove anything extra.

      I am using Benchmark to compare my results instead of using manual timing.

      If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1034723]
help
Chatterbox?
[Corion]: Hurr. Yesterday I played around with ffmpeg as a new toy and found its "scene" filter great - it detects scene changes. Now I could write a module that splits a given video on its cuts into different scenes. Except I have no use case for that :)
[Corion]: (and also, writing yet another FFmpeg module just to wrap system() and grep through its output isn't all that great ...)
[erix]: cut out advertisements from movies? :)
[erix]: robably not possible (or it would have been done already)

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (11)
As of 2018-05-24 11:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Notices?