Re^2: Is there a difference in this declaration? (insignificant)

by tye (Sage)
on May 09, 2014 at 14:57 UTC (#1085590)

in reply to Re: Is there a difference in this declaration?
in thread Is there a difference in this declaration?

"but may be significant in looping code"

No, not really. You've fallen for the classic fallacy that Benchmark's overblown attempts to "eliminate overhead" can often lead to. The huge values in the "rate" column are a good indicator.

Let's test your theory by actually writing looping code and seeing how "significant" this difference can be. We'll need a loop that contains a useful declaration of a hash, that can still complete something close to 6 million iterations each second, and that gets enough useful stuff done that almost no other code is required to get a useful result (other code would further dilute the relative speed-up and thus reduce its significance).

When talking about a Perl operation that can happen 6 million times each second, it is pretty much impossible to make such a single operation be a non-trivial percentage of a useful script's run time. This is classic "micro optimization", a fool's errand.

So, for a declaration of a hash to be useful, surely you have to insert something into the hash. Since it is a fresh declaration, you're also going to need to use the hash or else you'll be building up close to 6 million new hashes each second and will quickly run out of memory. And this needs to somewhat simulate useful code as speeding up useless code is not "significant", it is theory at best and more often just pointless. :)

So, here is looping code that does nothing but add two entries to the hash. It isn't useful, but it is pretty darn minimal. Truly useful code is surely going to have to do more than this for the hash declaration to be a useful part of it.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw{cmpthese};

    cmpthese( -1 => {
        no_assignment => sub {
            for( 1..10_000 ) {
                my %hash;
                $hash{$_} = $_;
                $hash{-$_} = -$_;
            }
        },
        assignment => sub {
            for( 1..10_000 ) {
                my %hash = ();
                $hash{$_} = $_;
                $hash{-$_} = -$_;
            }
        },
    } );
    __END__
                    Rate    assignment no_assignment
    assignment    99.4/s            --           -8%
    no_assignment  108/s            9%            --

Above is a typical result from a run of the script. In my experience, a 10% speed-up would be characterized as "something I'm quite unlikely to even notice" which falls a long way from "significant".

The speed difference is small enough that I even got this result when I ran the script a few times to verify that my first results weren't atypical:

                    Rate no_assignment    assignment
    no_assignment 96.6/s            --           -3%
    assignment    99.4/s            3%            --

Note that the "with assignment" code is the one that ran faster that time.
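If you want to see that run-to-run noise directly, here is a rough sketch that times each variant several times and reports the fastest and slowest run. The helper names time_once and spread are mine, made up for illustration, and the count of 20 runs is arbitrary. It measures wall-clock time with Time::HiRes rather than the CPU time that Benchmark reports, so the numbers won't match the tables above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw{time};

# Hypothetical helper: wall-clock seconds for a single call of $code.
sub time_once {
    my( $code ) = @_;
    my $start = time();
    $code->();
    return time() - $start;
}

# Hypothetical helper: time $code $runs times and
# return the fastest and the slowest of those runs.
sub spread {
    my( $code, $runs ) = @_;
    my @times = sort { $a <=> $b } map { time_once( $code ) } 1..$runs;
    return( $times[0], $times[-1] );
}

# The same two loop bodies as in the benchmark above.
my %variant = (
    no_assignment => sub {
        for( 1..10_000 ) {
            my %hash;
            $hash{$_} = $_;
            $hash{-$_} = -$_;
        }
    },
    assignment => sub {
        for( 1..10_000 ) {
            my %hash = ();
            $hash{$_} = $_;
            $hash{-$_} = -$_;
        }
    },
);

for my $name ( sort keys %variant ) {
    my( $min, $max ) = spread( $variant{$name}, 20 );
    printf "%-14s fastest %.6fs  slowest %.6fs\n", $name, $min, $max;
}
```

If the fastest-to-slowest range of one variant overlaps the range of the other, a single comparison between them can easily come out either way.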

Finally, a quick demonstration of why I think Benchmark's attempts to "eliminate overhead" are overblown. With all of the insertions commented out, a typical result is:

                    Rate    assignment no_assignment
    assignment    1068/s            --          -37%
    no_assignment 1685/s           58%            --

While your original code on my computer gives:

                        Rate    assignment no_assignment
    assignment    11967704/s            --          -49%
    no_assignment 23642004/s           98%            --

...and takes noticeably longer to run. Benchmark has to over and over again try running the code in a tight loop with increasing repetition counts because it gets back time measurements that are too close to "the time it takes to run empty code" for the result to be considered meaningful enough to even be reported.

When that happens, the results are nearly guaranteed to have no practical value.
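To make the "eliminate overhead" idea concrete, here is a simplified illustration of it: time an empty sub, time the real sub, and subtract. This is only a sketch of the concept, not Benchmark's actual implementation. The helper name time_loop and the count of one million calls are my own choices, and it uses wall-clock time from Time::HiRes where Benchmark uses CPU time.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw{time};

# Hypothetical helper: wall-clock seconds to call $code $count times.
sub time_loop {
    my( $code, $count ) = @_;
    my $start = time();
    $code->() for 1..$count;
    return time() - $start;
}

my $count = 1_000_000;

# The "overhead": calling a sub that does nothing at all.
my $empty = time_loop( sub { return }, $count );

# The two declarations, with nothing else in the sub.
my $plain = time_loop( sub { my %hash;      return }, $count );
my $init  = time_loop( sub { my %hash = (); return }, $count );

# "net" is what remains after subtracting the empty-sub time.
for my $row (
    [ 'empty',         $empty ],
    [ 'no_assignment', $plain ],
    [ 'assignment',    $init  ],
) {
    my( $name, $raw ) = @$row;
    printf "%-14s raw %.4fs  net %+.4fs\n", $name, $raw, $raw - $empty;
}
```

The "net" column is a small difference between two much larger, noisy measurements, so a little jitter in either one moves it a lot. Comparing two such "net" values is how you end up with an impressive percentage that no real script can reproduce.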

Note that none of this is meant as much of a criticism of what you wrote. Based on the numbers you got, it certainly might have been possible to have a significant impact. Your statement was quite conservative. But my experience led me to doubt that such could happen, so I did a quick test to verify it.

This case is actually rather close to the edge of it being possible for a real, useful script to end up 20% faster (a minimum to be noticeable, IME) with only this change (though likely still rather contrived). Certainly extremely unlikely.
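A bit of arithmetic shows how much of a script's run time the declaration would have to occupy to reach that 20%. This is a sketch under assumptions: it takes the two rates from the original benchmark at face value (which, per the above, is generous), it reads "20% faster" as a rate 1.2 times higher, and the sub names overall_speedup and fraction_needed are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Overall rate ratio when a part taking fraction $f of the original
# run time is made $s times faster and everything else is untouched.
sub overall_speedup {
    my( $f, $s ) = @_;
    return 1 / ( ( 1 - $f ) + $f / $s );
}

# Fraction of the original run time that the sped-up part must
# occupy for the whole script to reach an overall ratio of $target.
sub fraction_needed {
    my( $target, $s ) = @_;
    return ( 1 - 1 / $target ) / ( 1 - 1 / $s );
}

# Ratio of the two rates reported for the original code (about 1.98).
my $s = 23_642_004 / 11_967_704;

printf "share of run time needed for a 20%% gain: %.0f%%\n",
    100 * fraction_needed( 1.2, $s );

# Overall gain if the declaration is a given share of the run time.
printf "overall gain at a %2d%% share: %.1f%%\n",
    $_, 100 * ( overall_speedup( $_ / 100, $s ) - 1 )
    for 1, 5, 10, 34;
```

With a speed-up of about 2x on the declaration alone, the version with "= ()" would have to spend roughly a third of its total run time doing nothing but declaring hashes for the change to be worth 20% overall.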

The speed difference certainly looks to be insignificant to me.

- tye        

Re^3: Is there a difference in this declaration? (insignificant)
by kcott (Chancellor) on May 10, 2014 at 01:24 UTC
    "You've fallen for the classic fallacy ..."

    Utter rubbish! I've "fallen" for no such thing.

    Before posting, I'd assumed the assignment incurred some overhead but also considered that an optimisation might have been applied to negate this. I chose to check it.

    The benchmark code indicated the overhead did exist: I posted the code and results to show this. I made no inferences nor offered any conclusions about the benchmark results.

    I wrote that the overhead was "typically negligible". I see that you excluded that from your opening quote.

    Anything, no matter how small, when multiplied enough times will become a bigger thing: that bigger thing "may be significant".

    -- Ken

      "I see that you excluded that from your opening quote."

      I see that you excluded part of what I said from your opening quote:

      "Note that none of this is meant as much of a criticism of what you wrote."

      But you certainly seem to have taken it that way. Understandable, though, despite the disclaimer.

      But I will object to one new thing you added:

      "Anything, no matter how small, when multiplied enough times will become a bigger thing: that bigger thing "may be significant"."

      That math doesn't actually work very well when talking about code optimization. The more you multiply the code, the more the significance ends up being divided.

      Heck, you can't even multiply the results from Benchmark by just 1. Benchmark starts by telling you that something takes 150% (or 100%) more time, and then I multiply the code with a 10,000-iteration loop and the difference is divided by 2 or 3 (down close to the maximum possible difference you can actually achieve even with completely contrived code, because real code doesn't have the option of ignoring overhead, like Benchmark tries to do).

      - tye        
