http://www.perlmonks.org?node_id=571744

ps has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I am relatively new to Perl and I am just learning to handle arrays. I have an array of numbers and I need only the unique elements to be retained. Can anyone tell me how to do it? Note that the numbers are not in order.


Replies are listed 'Best First'.
Re: Extracting Unique Characters from a Array
by mreece (Friar) on Sep 07, 2006 at 17:41 UTC
    array elements are not necessarily unique, but hash keys are, so a common approach is to use the uniqueness of hash keys to remove the duplicates from your array.

    if you want to preserve the original order, you can do this:

        my @nums = qw(2 1 3 5 4 5 4 3 2 1); # order is not important
        my @unique;                         # new list of unique elements
        my %seen;                           # numbers i have seen so far
        foreach my $x ( @nums ) {           # check each number
            if ( ! $seen{$x} ) {            # skip if we already saw this one
                $seen{$x} = 1;              # note we have seen this one now
                push @unique, $x;           # and store to new list
            }
        }
        ## TODO: do something with @unique

    if you want to sort the resulting list, you can do this:

        my @nums = qw(2 1 3 5 4 5 4 3 2 1); # order is not important
        my %seen = map { $_ => 1 } @nums;   # build a hash; keys will be unique!
        @nums = sort keys %seen;            # replace old list with new sorted list
        ## TODO: do something with @nums

    another approach, as rsriram demonstrated elsewhere in this thread, is to sort first, then loop and look for repeating values:

        my @nums = qw(2 1 3 5 4 5 4 3 2 1);
        @nums = sort @nums;
        my @unique;
        my $previous;
        foreach my $current ( @nums ) {   # check each number
            # duplicate! ($previous is still undefined on the first pass)
            next if defined $previous && $current eq $previous;
            push @unique, $current;       # store to new list
            $previous = $current;         # 'current' becomes 'previous'
        }
        ## TODO: do something with @unique

    (updated to fix some silly issues with untested code)
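
The order-preserving loop in the reply above is often compressed into a single grep over a %seen hash (the same idiom appears in the benchmark further down the thread). Here is a minimal sketch of that idiom; the helper name uniq_in_order is hypothetical and only for illustration, it is not something any poster used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# uniq_in_order is a hypothetical helper name, not from the thread.
# It returns the first occurrence of each value, keeping input order.
sub uniq_in_order {
    my %seen;
    # $seen{$_}++ is 0 (false) the first time a value appears, so
    # grep keeps it; on later appearances it is true and is skipped.
    return grep { !$seen{$_}++ } @_;
}

my @nums   = qw(2 1 3 5 4 5 4 3 2 1);
my @unique = uniq_in_order(@nums);
print "@unique\n";    # prints "2 1 3 5 4"
```

Call the helper in list context, as here; in scalar context grep would return a count instead of the list.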
Re: Extracting Unique Characters from a Array
by VSarkiss (Monsignor) on Sep 07, 2006 at 17:06 UTC
Re: Extracting Unique Characters from a Array
by kwaping (Priest) on Sep 07, 2006 at 19:56 UTC
    This solution uses some moderately-advanced techniques, but I recommend you study the solution and perldoc what you don't understand. You will learn some really useful techniques!
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Carp;
        use Data::Dumper::Simple;

        my @array = qw(
            7742.8858 7748.5855 5581.2272 1248.2257
            6589.5586 5680.1886 8762.1898 5581.2272
            5535.5795 6896.6571 6573.2575 6589.5586
            5680.1886 5487.3275 5489.5633 6589.5586
            5680.1886
        );

        my %unique = map { $_ => '' } @array;
        my @sorted_unique = sort { $a <=> $b } keys %unique;

        print Dumper(%unique,@sorted_unique);

    ---
    It's all fine and dandy until someone has to look at the code.
Re: Extracting Unique Characters from a Array
by sub_chick (Hermit) on Sep 07, 2006 at 17:05 UTC
    I have an array of numbers and I need only the unique elements to be retained.

    Can you post the array you are working with?
    And if you want your array to be in numerical order, you should read up on sort.
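
On the pointer to sort: Perl's default sort compares elements as strings, which is probably the detail worth knowing here. The following is a small sketch with made-up sample values (not the asker's data) showing the difference between the default order and a numeric comparison block.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample values chosen for illustration only.
my @nums = qw(10 9 100 22);

# Default sort compares elements as strings, so "10" and "100" come before "22".
my @as_strings = sort @nums;

# A comparison block using <=> sorts by numeric value instead.
my @as_numbers = sort { $a <=> $b } @nums;

print "@as_strings\n";    # prints "10 100 22 9"
print "@as_numbers\n";    # prints "9 10 22 100"
```

When every value has the same width and format, the two orders coincide, but the numeric block is the safer habit for numbers.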


    There's more to life than books, you know. But not much more. - (The Smiths)

      The numerical data I am working with are part numbers. Each one is a 4-digit number, followed by a decimal point, followed by another 4 digits. Here they are:

      7742.8858
      7748.5855
      5581.2272
      1248.2257
      6589.5586
      5680.1886
      8762.1898
      5535.5795
      6896.6571
      6573.2575
      5487.3275
      5489.5633

      I am pasting just a few for your reference; the list goes on and on. I know this array has to be sorted, but I cannot work out how to remove the repeated entries.

      Thank you for your immediate response.

Re: Extracting Unique Characters from a Array
by rsriram (Hermit) on Sep 07, 2006 at 17:08 UTC

    Hi, Assuming that your array of numbers is:

    @array= qw/10 10 48 22 34 54 23 65 22/;

    The easiest way to do it is to sort the array first:

    @sarray=sort(@array);

    Now I loop over every element of the sorted array. The current element is held in $i, and $last_element holds the value most recently printed (it starts out as an empty string). In the if condition, I check whether the current element differs from the last element. If it does, I print the element and store it in $last_element; if it does not, it is a duplicate and is skipped. Here is the code:

        @array= qw/10 10 48 22 34 54 23 65 22/;
        @sarray=sort(@array);
        my $last_element = "";
        foreach $i (@sarray) {
            if ($last_element ne $i) {
                print "$i\n";
                $last_element = $i;
            }
        }

    You can find answers to questions like this one with Super Search.

      Yeah, let's turn a single O(n) scan with a hash (the right way to do it) into an O(n log n) sort followed by another O(n) scan. Aside from the added inefficiency you've lost the original ordering if that needed to be preserved.

      Update: Ah, a followup did note that the list was to be sorted. Still makes more sense to do the O(n) cull of duplicates first and then the O(n log n) sort of the smaller list.

      When in doubt:

          #!/usr/bin/perl
          use Benchmark qw( timethese cmpthese );

          use constant SIZE  => 6_000;
          use constant COUNT => 500;

          my $count = shift || COUNT;
          my $size  = shift || SIZE;

          my @source = map { int( rand($size) ) } 1 .. $size;

          cmpthese(
              $count,
              {
                  # sort everything, then drop adjacent duplicates
                  sort_first => sub {
                      my @sorted = sort @source;
                      my $le     = undef;
                      my @uniq;
                      for (@sorted) {
                          # the defined check keeps a leading 0,
                          # since undef != 0 is false
                          if ( !defined $le || $le != $_ ) {
                              $le = $_;
                              push @uniq, $_;
                          }
                      }
                  },
                  # drop duplicates with a hash, then sort the smaller list
                  cull_first => sub {
                      my %seen;
                      my @uniq   = grep { !$seen{$_}++ } @source;
                      my @sorted = sort @uniq;
                  }
              }
          );

          exit 0;
          __END__

      The cull_first version is 8-20% faster for lists of 5 to several thousand items.

      Hi,

      Thank you, your code works well! But I still do not understand how you are using $i and $last_element. Can you explain it further?