Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re^4: Concerning hash operations (appending, concatenating)

by RazorbladeBidet (Friar)
on Mar 24, 2005 at 17:53 UTC ( #442136=note: print w/ replies, xml ) Need Help??


in reply to Re^3: Concerning hash operations (appending, concatenating)
in thread Concerning hash operations (appending, concatenating)

But isn't that creating arrays and going through loops that are unnecessary?

--------------
"But what of all those sweet words you spoke in private?"
"Oh that's just what we call pillow talk, baby, that's all."


Comment on Re^4: Concerning hash operations (appending, concatenating)
Re^5: Concerning hash operations (appending, concatenating)
by japhy (Canon) on Mar 24, 2005 at 18:24 UTC
    Here's my benchmark:
    use Benchmark 'cmpthese'; use strict; my %hash = 1 .. 10_000; my %add = 9_001 .. 11_000; cmpthese(-5, { BASE => sub { my %copy = %hash; }, JAPHY => sub { my %copy = %hash; my @new = grep !exists $copy{$_}, keys %add; @copy{@new} = @add{@new}; }, ZAXO => sub { my %copy = %hash; %copy = (%add, %copy); }, NULL => sub { my %copy = %hash; %copy = (%copy); }, });
    Here are the results I get:
    Rate ZAXO NULL JAPHY BASE ZAXO 15.1/s -- -22% -61% -76% NULL 19.3/s 28% -- -50% -69% JAPHY 38.9/s 157% 101% -- -38% BASE 62.4/s 312% 223% 60% --
    The two cases 'BASE' and 'NULL' show that Perl does not optimize %hash = %hash at all, since 'NULL' is three times slower than 'BASE'. 'JAPHY' is 2.5 times faster than 'ZAXO' in this run, with 500 old keys and 500 new keys. When I change %add to be (5_001 .. 12_000), the run times are:
    Rate ZAXO NULL JAPHY BASE ZAXO 11.4/s -- -39% -58% -81% NULL 18.6/s 63% -- -31% -69% JAPHY 26.9/s 136% 45% -- -56% BASE 60.5/s 431% 225% 125% --
    The point is that %a = (%b, %a) is slow because it is O(N+M) where N is the size of %a and M is the size of %b. My code is O(M) -- specifically, O(M(1+x)) where x is the fraction of keys in %b that aren't in %a -- because it doesn't iterate over %a at all, only %b.
    _____________________________________________________
    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart

      The grep is superfluous and, IMO, only distracts from the code. Here's the sub to add to the above benchmark:

      TANKTALUS => sub { my %copy = %hash; @copy{keys %add} = values %add; },
      And the benchmarks. First, the original 500/500 overlap/new keys:
      Rate ZAXO NULL TANKTALUS JAPHY BASE ZAXO 12.3/s -- -23% -59% -59% -76% NULL 16.1/s 30% -- -46% -47% -68% TANKTALUS 29.8/s 142% 85% -- -2% -41% JAPHY 30.4/s 147% 89% 2% -- -40% BASE 50.9/s 313% 216% 71% 67% --
      As you can see, the difference is pretty insignificant compared to the extra "line noise" that needs to be parsed by the human who maintains it. I didn't think it'd change much with a larger sample, but I tried it anyway (using your 5_001..12_000 hash):
      Rate ZAXO NULL JAPHY TANKTALUS BASE ZAXO 9.48/s -- -39% -55% -55% -81% NULL 15.7/s 65% -- -26% -26% -69% JAPHY 21.1/s 122% 35% -- -0% -59% TANKTALUS 21.1/s 123% 35% 0% -- -59% BASE 51.1/s 439% 226% 142% 142% --
      Somewhere, deep down in the lost precision, the non-grep version comes out ahead. Which we'll attribute to statistical insigificance, just as I did for the 2% in the other direction. ;-)

      The other point is that your answer does not do the same thing as the rest of the examples here. In all the rest of the cases, the new hash always overrides the values in the old hash. That is, if the old hash had a => "b", while the new hash had a => "c", the behaviour of yours vs the rest is different. Which one is correct is dependant on the demands of the situation, although I usually lean towards new hashes winning over old hashes the way I write the code. In cases where I want "first come, first served" policies, then I will definitely keep your grep in mind - it's probably a lot more elegant than the way I have been doing it (if I have - I'd have to look through old code to be sure). Thanks ;-)

        You missed the point, Tanktalus. We're only adding NEW key-values pairs to the hash, thus the grep() is required. The node by Zaxo was showing how to have the original values override the new values if the keys matched.
        _____________________________________________________
        Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
        How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://442136]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (7)
As of 2014-12-20 04:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (95 votes), past polls