PerlMonks  

Re: Re: Find unique elements in an array

by ysth (Canon)
on Apr 04, 2004 at 09:35 UTC (#342467)


in reply to Re: Find unique elements in an array
in thread Find unique elements in an array

Saying = () instead of = 1 is slightly faster, but the hash slice approach may not scale well with a large @a, since the whole array needs to be placed on the stack at once.
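For readers who have not seen the idiom, here is a minimal sketch of the hash slice approach next to a plain loop that handles one element at a time. The sample data and the variable names (%seen, %seen_loop, @unique, @unique_loop) are made up for illustration and are not taken from the thread.

```perl
use strict;
use warnings;

my @a = qw(b a c a b d);    # sample data, assumed for this sketch

# Hash slice: every element of @a becomes a key; all values are undef.
my %seen;
@seen{@a} = ();
my @unique = sort keys %seen;

# Loop alternative: assigns one key per iteration instead of
# handing the whole list to a single slice assignment.
my %seen_loop;
$seen_loop{$_} = undef for @a;
my @unique_loop = sort keys %seen_loop;

print "@unique\n";
print "@unique_loop\n";
```

Because the values are undef, membership has to be tested with exists $seen{$x} rather than by truth of $seen{$x}.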

Re: Re: Re: Find unique elements in an array
by kvale (Monsignor) on Apr 04, 2004 at 11:52 UTC
    If there is a stack penalty, it is not terrible, as the routine gets faster with large arrays:
    use Benchmark qw(:all);

    my @a;
    push @a, int(rand(100)) foreach 1..2_000_000;
    my %unique;
    my (@awd1, @awd2, @awd3);

    cmpthese(5, {
        'jc' => sub {
            foreach my $thingy (@a) {
                $unique{$thingy} = 1;
            }
            @awd1 = keys %unique;
        },
        'mk' => sub {
            @unique{@a} = 1;
            @awd2 = keys %unique;
        },
        'ys' => sub {
            @unique{@a} = ();
            @awd3 = keys %unique;
        },
    });
    yields
    Benchmark: timing 5 iterations of jc, mk, ys...
            jc: 19 wallclock secs (16.75 usr +  0.35 sys = 17.10 CPU) @  0.29/s (n=5)
            mk:  6 wallclock secs ( 6.00 usr +  0.01 sys =  6.01 CPU) @  0.83/s (n=5)
            ys:  7 wallclock secs ( 6.00 usr +  0.00 sys =  6.00 CPU) @  0.83/s (n=5)
       s/iter   jc   mk   ys
    jc   3.42   -- -65% -65%
    mk   1.20 185%   --  -0%
    ys   1.20 185%   0%   --
    The = () optimization does not seem to make much difference.

    -Mark

      With your original benchmark, I saw a consistent 4-5% increase for (). Obviously this is a constant difference that disappears into the woodwork with larger slices.
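      Anyway, @unique{@a} = 1; looks a bit curious: the list on the right has a single element, so only the first key in the slice gets 1 and the rest get undef. Why set just one value in a huge hash to 1?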
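A small sketch of what that slice assignment does, using made-up sample data with no duplicates (the names %h, %all and @defined are only for illustration):

```perl
use strict;
use warnings;

my @a = qw(x y z);    # sample data, assumed for this sketch

# List assignment with a one-element list on the right:
# the first key gets 1, the remaining keys get undef.
my %h;
@h{@a} = 1;

my $key_count = keys %h;
my @defined   = grep { defined $h{$_} } @a;

print "keys: $key_count\n";
print "keys with a defined value: @defined\n";

# If every value really should be 1, repeat the 1 once per element.
my %all;
@all{@a} = (1) x @a;
```

For finding unique elements only the keys matter, so the undef values are harmless, which is why = () does the same job.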
