Finding unique elements in an array

by Eyck (Priest)
What's the best way to implement uniq for arrays?

What I've got is this:

#!/usr/bin/perl -w
@a=qw(ala ma kota tytus ma kolty a pies ma ale);
use Data::Dumper;
print Dumper(@a);
print "Sorted:\n";
print Dumper(sort @a);
print "Uniq:\n";
print Dumper(uniq(@a));
print "Uniq2:\n";
print Dumper(uniq(sort @a));

sub uniq {
    my @out=();
    my ($a,$lasta);
    foreach $a (@_) {
        push @out,$a unless (defined($lasta) && ($lasta eq $a));
        $lasta=$a;
    };
    return @out;
};
This works, but it seems a bit ugly, and it looks like it's copying stuff too much (I would prefer to work on the original array), etc...

by holli (Monsignor) on Mar 15, 2005 at 13:03 UTC
Re: Finding unique elements in an array
by pelagic (Priest) on Mar 15, 2005 at 13:28 UTC
    Please ignore:
    I don't remember where I pinched this one, it might have been around here somewhere ;-)
    This shows some different ways to do it and benchmarks:
    use Benchmark;

    my @list;
    for ( 0..9999 ) {
        push @list, sprintf "%d", 100 * rand;
    }

    timethese( 1000, {
        'keys_map_1'     => sub { my @uniq = keys %{{ map {$_ => 1} @list }}; },
        'keys_map_undef' => sub { my @uniq = keys %{{ map {$_ => undef} @list }}; },
        'grep_seen'      => sub { my %seen; my @uniq = grep ! $seen{$_}++, @list; },
    } );

    __END__
    Benchmark: timing 1000 iterations of grep_seen, keys_map_1, keys_map_undef...
         grep_seen: 15 wallclock secs (14.86 usr + 0.01 sys = 14.87 CPU) @ 67.23/s (n=1000)
        keys_map_1: 50 wallclock secs (46.78 usr + 0.83 sys = 47.61 CPU) @ 21.00/s (n=1000)
    keys_map_undef: 43 wallclock secs (42.16 usr + 0.94 sys = 43.09 CPU) @ 23.21/s (n=1000)
    The benchmark results depend very much on the size of the array. I took a 10,000-item array as an example.

    To post something reasonable that does implement uniq as in OP's question, here's my solution (somewhat similar to Joost's):
    # the $_ < $#a check keeps the last element from being compared with undef
    my @b = map { $_ < $#a && $a[$_] eq $a[$_ + 1] ? () : $a[$_] } 0..$#a;

      None of these perform the function of uniq as the parent requested. uniq only looks at the previous item, not all previous items.
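To make that distinction concrete, here is a minimal sketch that runs both behaviours on the OP's word list. The sub names uniq_adjacent and uniq_global are made up for this illustration, and the code assumes every element is defined:

```perl
use strict;
use warnings;

# Hypothetical helper: drop an element only when it equals its
# immediate predecessor, the way uniq(1) does.
sub uniq_adjacent {
    my @out;
    for my $item (@_) {
        push @out, $item unless @out && $out[-1] eq $item;
    }
    return @out;
}

# Hypothetical helper: drop an element if it has appeared anywhere
# earlier in the list (the grep/%seen idiom from the benchmark).
sub uniq_global {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @words = qw(ala ma kota tytus ma kolty a pies ma ale);

# No two neighbours are equal, so the list comes back unchanged.
print join(' ', uniq_adjacent(@words)), "\n";

# The second and third "ma" are dropped; original order is kept.
print join(' ', uniq_global(@words)), "\n";

# Sorting first makes the "ma"s adjacent, so one pass removes them.
print join(' ', uniq_adjacent(sort @words)), "\n";
```

On unsorted input the two give different answers, which is why the hash-based versions are not a drop-in replacement for the OP's sub.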
Re: Finding unique elements in an array
by Joost (Canon) on Mar 15, 2005 at 13:08 UTC
Re: Finding unique elements in an array
by eyepopslikeamosquito (Chancellor) on Mar 15, 2005 at 13:04 UTC

    It's in perlfaq4 "How can I remove duplicate elements from a list or array?".

    Also, your line 19:

    {push @out,$a;} unless (defined($lasta) && ($lasta eq $a));
    gives me a syntax error (a statement modifier like unless can't follow a bare block).

        But that's not what uniq does: it only removes elements that are the same as the previous element in the list.
        Are you looking at a different version of perlfaq4? My perl 5.8.6 version of perlfaq4 "How can I remove duplicate elements from a list or array?" gives 5 options, option a) of which states:
        If @in is sorted, and you want @out to be sorted: (this assumes all true values in the array)
        $prev = "not equal to $in[0]";
        @out = grep($_ ne $prev && ($prev = $_, 1), @in);
        This is nice in that it doesn't use much extra memory, simulating uniq(1)'s behavior of removing only adjacent duplicates. The ", 1" guarantees that the expression is true (so that grep picks it up) even if the $_ is 0, "", or undef.
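A small sketch of why the ", 1" matters, using the block form of grep on an assumed sorted input that contains the false values "" and "0" (the sample data and the @lossy variable are illustrative, not from perlfaq4):

```perl
use strict;
use warnings;

# Assumed sample input: already sorted, with false values in it.
my @in = ('', '', '0', '0', 'a', 'a', 'b');

# perlfaq4 option a): the ", 1" makes the test true after every update.
my $prev = "not equal to $in[0]";
my @out  = grep { $_ ne $prev && ( $prev = $_, 1 ) } @in;

# Same thing without ", 1": the assignment's own value is the test,
# so "" and "0" count as false and are silently lost.
$prev = "not equal to $in[0]";
my @lossy = grep { $_ ne $prev && ( $prev = $_ ) } @in;

print scalar(@out),   " kept with ', 1':    ", join(', ', map { "'$_'" } @out),   "\n";
print scalar(@lossy), " kept without ', 1': ", join(', ', map { "'$_'" } @lossy), "\n";
```

With the ", 1" all four distinct values survive; without it only 'a' and 'b' do.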
Re: Finding unique elements in an array
by RazorbladeBidet (Friar) on Mar 15, 2005 at 13:14 UTC
    I think that is probably the clearest way to do it. Here's a way (which I'm not 100% sure works 100% of the time) that operates on the original list... but as you can see, it's nowhere near as clear:
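    sub uniq2 {
        my $lasta;
        for ( my $i = 0; $i < @_; $i++ ) {
            # drop the current element if it repeats the previous one,
            # then step back so the next element is not skipped
            splice @_, $i--, 1 if defined($lasta) && $_[$i] eq $lasta;
            $lasta = $_[$i];
        }
        # note: splicing @_ shortens only this sub's argument list,
        # not the caller's array, so the result still has to be returned
        return @_;
    }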
    I would stick with something similar to what you have above - it's easy to read, follow and maintain.
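One caveat on the splice version: splicing @_ changes the sub's own argument list, not the array the caller passed in. If the goal really is to work on the original array, one option is to pass a reference. The following is only a sketch under that assumption; the name uniq_in_place is made up here, and it assumes all elements are defined:

```perl
use strict;
use warnings;

# Hypothetical helper: remove adjacent duplicates from the array that
# $aref points to, modifying that array in place.
sub uniq_in_place {
    my ($aref) = @_;
    return unless @$aref;

    my $keep = 0;    # index of the last element kept so far
    for my $i ( 1 .. $#$aref ) {
        next if $aref->[$i] eq $aref->[$keep];    # repeat of the kept one
        $aref->[ ++$keep ] = $aref->[$i];         # move it down into place
    }
    $#$aref = $keep;    # truncate the leftover tail
    return;
}

my @words = sort qw(ala ma kota tytus ma kolty a pies ma ale);
uniq_in_place( \@words );
print "@words\n";
```

This makes a single pass and moves each kept element at most once, whereas repeated splice calls shift the tail of the array every time a duplicate is removed.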

    Although I'm sure someone else can think of an uber-elegant solution :)
Re: Finding unique elements in an array
by Limbic~Region (Chancellor) on Mar 15, 2005 at 13:39 UTC
    I will offer the same solution as I have to similar questions in the past. In my opinion, if you need to do things more than once in a program like sort a hash or get unique elements in an array, it is best to have a work-horse do the heavy lifting for you. While you haven't said this is the case here, it is always nice to have it in your back pocket.

    Cheers - L~R

