http://www.perlmonks.org?node_id=502236


in reply to Re^2: Count non-empty hash values
in thread Count non-empty hash values

and probably a tiny bit faster

Yep, you've got me there, and then some:

           Rate      OP   other fishbot   Joost
OP        293/s      --    -21%    -76%    -92%
other     369/s     26%      --    -69%    -90%
fishbot  1205/s    312%    226%      --    -68%
Joost    3723/s   1173%    908%    209%      --

"other" was a solution using values with a regex. "OP" was a corrected key and regex version of the OP's code.

Update: Added OP-style solution to benchmark.

Benchmark code used:

use strict;
use warnings;
use Benchmark qw{ cmpthese };

our %a;
$a{ $_ } = "foo" for 1..1000;
$a{ 1 + int rand 1000 } = '' for 1..10;

cmpthese( -3, {
    Joost   => 'scalar grep { $_ ne "" } values %a',
    fishbot => 'scalar grep { $a{$_} ne "" } keys %a',
    other   => 'scalar grep { ! /^$/ } values %a',
    OP      => 'scalar grep { $a{$_} !~ /^$/ } keys %a',
} );

Re^4: Count non-empty hash values
by ikegami (Patriarch) on Oct 22, 2005 at 21:23 UTC
    grep length, values %a
    is even faster.
               Rate      OP   other fishbot   Joost ikegami
    OP        186/s      --    -34%    -67%    -88%    -90%
    other     280/s     51%      --    -51%    -82%    -84%
    fishbot   571/s    207%    104%      --    -64%    -68%
    Joost    1578/s    750%    463%    176%      --    -12%
    ikegami  1798/s    868%    542%    215%     14%      --
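
    In context, that check counts the values in one line (a minimal sketch,
    using the %a hash from the benchmark above):

        my $count = grep length, values %a;   # undef and "" both count as empty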

        That catches undefined values (most likely good) but also catches 0 and "0" (most likely bad). Checking for truth had occurred to me, but I decided to be consistent with the other solutions, and whether "0" should count as empty was too big a question mark. Checking for truth is 22% faster than checking the length, though:

                 Rate      OP   other fishbot   Joost ikegami   truth
        OP       193/s      --    -32%    -67%    -88%    -89%    -91%
        other    284/s     47%      --    -52%    -82%    -84%    -87%
        fishbot  592/s    207%    109%      --    -62%    -67%    -73%
        Joost   1578/s    719%    456%    166%      --    -13%    -29%
        ikegami 1818/s    843%    540%    207%     15%      --    -18%
        truth   2221/s   1052%    682%    275%     41%     22%      --
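
        For reference, the two extra cases compared here might be written in
        the same style as the earlier benchmark (a sketch; the exact
        expressions behind the "ikegami" and "truth" labels are my assumption):

            ikegami => 'scalar grep length, values %a',  # "" and undef are false
            truth   => 'scalar grep $_, values %a',      # 0 and "0" also count as empty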
Re^4: Count non-empty hash values
by saskaqueer (Friar) on Oct 22, 2005 at 18:33 UTC

    Whenever you post a benchmark, it is best to include the exact script you used to generate it, so that others can verify that each benchmark case was implemented properly and in its most efficient form. I have seen many cases where the benchmark results a monk provided were completely bogus because of implementation problems in one or more of the cases. There is even the odd node where the supposed best case is wrong, and that "best" case is actually the worst thing you could possibly use in real code.
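
    As a starting point, a complete, self-contained benchmark can be as small
    as this (a sketch; the two cases here are only illustrative):

        use strict;
        use warnings;
        use Benchmark qw{ cmpthese };

        my @data = ( 1 .. 1000 );

        cmpthese( -3, {    # negative count: run each case for ~3 CPU seconds
            grep_block => sub { scalar grep { $_ > 500 } @data },
            grep_expr  => sub { scalar grep $_ > 500, @data },
        } );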