Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

Re^4: What is the most efficient way to see if a hash is empty?

by shmem (Chancellor)
on Apr 28, 2009 at 15:42 UTC ( [id://760647]=note: print w/replies, xml ) Need Help??


in reply to Re^3: What is the most efficient way to see if a hash is empty?
in thread What is the most efficient way to see if a hash is empty?

The least bias would be

%a = 1..2e6; %b = 1..2e6; cmpthese -1, { A=>q[ scalar keys %a, ++$i for 1 .. 10000 ], B=>q[ scalar keys %b and ++$i for 1 .. 10000 ], C=>q[ my $x = keys %a, ++$i for 1 .. 10000 ], D=>q[ my $x = keys %b and ++$i for 1 .. 10000 ], } __END__ Rate D C A B D 280/s -- -4% -31% -31% C 290/s 4% -- -28% -28% A 404/s 44% 39% -- -1% B 406/s 45% 40% 1% --

Testing with empty %b gives

%a = 1..2e6; %b = (); cmpthese -1, { A=>q[ scalar keys %a, ++$i for 1 .. 10000 ], B=>q[ scalar keys %b, ++$i for 1 .. 10000 ], C=>q[ my $x = keys %a, ++$i for 1 .. 10000 ], D=>q[ my $x = keys %b, ++$i for 1 .. 10000 ], } __END__ Rate C D A B C 288/s -- -2% -30% -31% D 293/s 2% -- -28% -29% A 410/s 43% 40% -- -1% B 415/s 44% 41% 1% --

from which I would conclude that there's a minimal difference between testing an empty and a full hash.

update: moritz code using these instead of sub {die}:

use strict; use warnings; use Benchmark qw(cmpthese); for my $len (2..6) { our (%full, %empty); my $a = 'A'; $full{$a++} = $a while length($a) < $len; print "\nLength $len with ", scalar(%full), "items\n"; cmpthese -1, { full => q[scalar keys %full, ++$i for 1 .. 10000], empty => q[scalar keys %empty, ++$i for 1 .. 10000], } } __END__ Length 2 with 22/32items Rate empty full empty 350/s -- -15% full 411/s 17% -- Length 3 with 510/1024items Rate full empty full 406/s -- -2% empty 415/s 2% -- Length 4 with 13961/32768items Rate full empty full 409/s -- -1% empty 414/s 1% -- Length 5 with 310721/524288items Rate empty full empty 414/s -- 0% full 414/s 0% -- Length 6 with 8655839/16777216items Rate empty full empty 407/s -- -4% full 426/s 5% --

But! If I introduce a third test condition which also tests for %full, I get:

... cmpthese -1, { full => q[scalar keys %full, ++$i for 1 .. 10000], empty => q[scalar keys %empty, ++$i for 1 .. 10000], fake => q[scalar keys %full, ++$i for 1 .. 10000], } } __END__ Length 2 with 22/32items Rate fake empty full fake 403/s -- -2% -3% empty 411/s 2% -- -2% full 417/s 4% 2% -- Length 3 with 510/1024items Rate fake full empty fake 415/s -- -1% -1% full 418/s 1% -- -0% empty 419/s 1% 0% -- Length 4 with 13961/32768items Rate full empty fake full 406/s -- -2% -2% empty 414/s 2% -- 0% fake 414/s 2% 0% -- Length 5 with 310721/524288items Rate fake full empty fake 406/s -- -0% -1% full 407/s 0% -- -1% empty 411/s 1% 1% -- Length 6 with 8655839/16777216items Rate fake full empty fake 400/s -- -3% -3% full 411/s 3% -- -0% empty 411/s 3% 0% --

using a sub instead of a quoted string

... cmpthese -1, { full => sub{scalar keys %full, ++$i }, empty => sub{scalar keys %empty, ++$i }, fake => sub{scalar keys %full, ++$i }, } } __END__ Length 2 with 22/32items Rate empty full fake empty 4955017/s -- -4% -5% full 5167154/s 4% -- -1% fake 5215901/s 5% 1% -- Length 3 with 510/1024items Rate full empty fake full 5038923/s -- -4% -5% empty 5265576/s 4% -- -1% fake 5293291/s 5% 1% -- Length 4 with 13961/32768items Rate empty full fake empty 4633858/s -- -1% -5% full 4677165/s 1% -- -4% fake 4858803/s 5% 4% -- Length 5 with 310721/524288items Rate full fake empty full 4812084/s -- -5% -7% fake 5041231/s 5% -- -3% empty 5193418/s 8% 3% -- Length 6 with 8655839/16777216items Rate fake full empty fake 4689731/s -- -4% -16% full 4906438/s 5% -- -12% empty 5604762/s 20% 14% --

Hmm... let's use moritz's code with strings instead of hashes:

for my $len (2..6) { our ($full, $empty); my $a = 'A'; $full .= $a while length($full) < $len; print "\nLength $len with ", scalar($full), " items\n"; cmpthese -1, { full => sub { die unless $full }, empty => sub { die if $empty }, fake => sub { die unless $full }, } } __END__ Length 2 with AA items Rate empty full fake empty 29399327/s -- -5% -35% full 31036076/s 6% -- -31% fake 44964665/s 53% 45% -- ...

It looks like the iteration/sub call overhead is shadowing the scalar %hash lookup, and moritz has been testing something else...

I'm re-reading No More Meaningless Benchmarks! ;-)

Replies are listed 'Best First'.
Re^5: What is the most efficient way to see if a hash is empty?
by BrowserUk (Patriarch) on Apr 28, 2009 at 19:59 UTC

    The trouble is, the comma operator does not prevent the "Useless use of ... in a void context:

    C:\test>perl -wle"substr( $i, 0, 1 ), ++$i for 1 .. 100" Useless use of substr in void context at -e line 1. Use of uninitialized value $i in substr at -e line 1.

    And I'm never sure whether that means it will always be optimised away as with:

    C:\test>perl -MO=Deparse -wle"$x = 1,2,3" Useless use of a constant in void context at -e line 1. Useless use of a constant in void context at -e line 1. Name "main::x" used only once: possible typo at -e line 1. BEGIN { $^W = 1; } BEGIN { $/ = "\n"; $\ = "\n"; } $x = 1, '???', '???'; -e syntax OK

    Or not? And with Benchmark not enabling warnings when it compiles stringified test code, it is very easy to find yourself benchmarking empty statements and drawing conclusions on that basis.

    And as you've demonstrated with your version of moritz benchmark, when attempting to benchmark micro-optimisations like the OPs question, the overhead of calling two levels of sub (benchmark wraps the user supplied code in it's own wrapper internally), can have a significant effect upon the perceived results.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://760647]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (6)
As of 2024-04-16 18:00 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found