PerlMonks  

Re^7: Curious find while comparing grep, map, and smart match...

by BrowserUk (Pope)
on Mar 27, 2013 at 17:45 UTC (#1025769=note)


in reply to Re^6: Curious find while comparing grep, map, and smart match...
in thread Curious find while comparing grep, map, and smart match...

those "shuffled" results relate to what the other results represent?

They achieve the same result -- an array of 100 random integers in the range 1 .. 120 -- 20 times more efficiently than your best attempt and nearly 100 times more efficiently than your worst, whilst saving 100MB of memory and the setup costs.

And the random selection of the values in that array is statistically fair with my method and not with yours.
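For readers following along, the shuffle approach referred to here can be sketched roughly as below. This is an assumption about its general shape, not the code posted earlier in the thread: the sub name pick_unique and its parameters are hypothetical, and the only library call used is shuffle from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw( shuffle );

# Hypothetical helper: return $count distinct integers drawn from 1 .. $max.
# The whole range is shuffled once and a prefix is kept, so no candidate is
# ever rejected and redrawn, and no large pool of random numbers is needed.
sub pick_unique {
    my( $count, $max ) = @_;
    die "cannot pick $count unique values from 1 .. $max\n" if $count > $max;
    return ( shuffle 1 .. $max )[ 0 .. $count - 1 ];
}

my @picked = pick_unique( 100, 120 );
print scalar( @picked ), " values, starting with @picked[ 0 .. 4 ]\n";
```

Because every value in the range takes part in the shuffle exactly once, duplicates cannot occur and nothing has to be filtered afterwards.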


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^8: Curious find while comparing grep, map, and smart match...
by dbuckhal (Monk) on Mar 27, 2013 at 18:50 UTC

    If my goal were simply to generate an array of unique random numbers, then you are correct, and I completely understand and appreciate your solution. The memory savings from shuffle are definitely valuable. But that was not my goal. My goal was to benchmark the filtering/processing methods of grep, map, and ~~ against the same data set.

    Can you see where I did not think your results directly related to the other results? If not, then that's fine.

      But that was not my goal. My goal was to benchmark the filtering/processing methods of grep, map, and ~~ against the same data set.

      That implies that you have an application for filtering values against an array that is currently too slow, so you chose to benchmark alternatives. That's good.

      But rather than benchmarking the actual application, you made up this 'unique random number selection' problem and used that as the basis of your benchmark. That's less good.

      The chances are that if you posted a benchmark for the actual application, then one of the monks would see an alternative approach to that application that would similarly avoid the need to do O(N) processing of a huge list.

      For example, for simple unique filtering of small lists of values, using a hash is way more efficient:

      sub hashGen {
          my $idx = 0;
          my %mArray;
          ++$mArray{ $nums[ ++$idx ] } while keys %mArray < $uSize;
          return keys %mArray;
      }
      __END__
      C:\test>junk
                   Rate grepGen  mapGen firstGen smartGen hashGen shuffleEm
      grepGen    45.2/s      --    -54%     -79%     -96%    -99%     -100%
      mapGen     97.9/s    116%      --     -54%     -91%    -98%     -100%
      firstGen    214/s    374%    119%       --     -80%    -97%      -99%
      smartGen   1074/s   2274%    997%     401%       --    -83%      -96%
      hashGen    6500/s  14275%   6540%    2932%     506%      --      -75%
      shuffleEm 25619/s  56551%  26068%   11849%    2286%    294%        --
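      The reason a hash helps is that testing whether a value has been seen is a single lookup rather than a scan of everything collected so far. A minimal sketch of the same idea as a general-purpose filter follows; the sub name first_unique and its arguments are hypothetical, not taken from the benchmark script:

```perl
use strict;
use warnings;

# Hypothetical helper: return the first $want distinct values from
# @candidates, in the order in which they were first seen.
sub first_unique {
    my( $want, @candidates ) = @_;
    return () unless $want > 0;
    my( %seen, @unique );
    for my $value ( @candidates ) {
        next if $seen{ $value }++;    # one hash lookup, no scan of @unique
        push @unique, $value;
        last if @unique == $want;     # stop as soon as enough are collected
    }
    return @unique;
}

my @unique = first_unique( 3, 5, 3, 5, 9, 3, 7 );
print "@unique\n";    # prints "5 3 9"
```

      Unlike hashGen, which returns keys %mArray in hash order, this sketch keeps the values in first-seen order. It also starts from the first candidate, whereas hashGen as posted pre-increments $idx and so never looks at $nums[0].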


        Wow, screaming fast! Makes sense because of the nature of hashes: no duplicates. Thanks for the info.

                   Rate grepGen mapGen smartGen hashGen
        grepGen  29.8/s      --    -3%     -89%    -97%
        mapGen   30.6/s      3%     --     -89%    -97%
        smartGen  278/s    833%   807%       --    -75%
        hashGen  1096/s   3583%  3480%     295%      --
