Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses

Efficient Comparison of Array elements

by aging acolyte (Pilgrim)
on Jan 08, 2003 at 15:42 UTC ( [id://225276] : perlquestion . print w/replies, xml ) Need Help??

aging acolyte has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

In my journey to enlightenment I am looking for efficiency.

My problem is I have two arrays and I want to compare them and list those elements that are common to both and those that are unique.


my @one = [1,2,3,4]; my @two = [2,4,6,8]; sub cf{...} common = [2,4]; unique_one = [1,3]; unique_two = [6,8];
Now I am sure that this is a common problem and I know that a set of nested for loops would work - comparing each element in turn. But there must be a better way. Is it time for my understanding of map to grow beyond the "little black book" description?



Replies are listed 'Best First'.
Re: Efficient Comparison of Array elements
by davorg (Chancellor) on Jan 08, 2003 at 15:54 UTC

      Thanks a lot, that is roughly what I want. But a couple of follow ups:

      Is this efficient/practical if both arrays get to sizes of 10K elements?

      The array generated for difference contains elements from both arrays. I wanted those unique to @one AND those unique to @two. Again I can do this by comparing @difference with @one and @difference with @two. But I am running the same code three times. Is it practical?


      BTW - as for the syntax thing - a good general rule for SOPW would be "first engage brain then type...."

Re: Efficient Comparison of Array elements
by LTjake (Prior) on Jan 08, 2003 at 16:10 UTC
    I think Set::Array will do the job. NOTE: UNTESTED CODE (will be updated)
    use Set::Array; my $sao1 = Set::Array->new(1, 2, 3, 4); my $sao2 = Set::Array->new(2, 4, 6, 8); my @common = $sao1->intersection($sao2); my @u_one = $sao1->difference($sao2); my @u_two = $sao2->difference($sao1);
    Update: Code updated, and tested.

    "To err is human, but to really foul things up you need a computer." --Paul Ehrlich
Re: Efficient Comparison of Array elements
by rir (Vicar) on Jan 08, 2003 at 16:36 UTC
    I would not call this efficient. I have concerns regarding this scaling up, but I've not yet had reason to check it out.
    This does seen to be the kind of idiomatic solution you implied you wanted.

    UPDATE: More details on the data set being manipulated make me more cautious about the suitability of this code. If efficiency is really a problem and the differences and intersection of the arrays are all desired my first question would be Are these arrays ordered, as in your example?

    #!/usr/bin/perl use warnings; use strict; my @one = ( 1, 2, 3, 4); my @two = ( 2, 4, 6, 8); my (%in_one, %in_two, @in_both, @only_in_one, @only_in_two); @in_one{ @one} = 1; @in_two{ @two} = 1; @in_both = grep { exists $in_one{$_} } @two; @only_in_one = grep { not exists $in_two{$_}} @one; @only_in_two = grep { not exists $in_one{$_}} @two; local $, = " "; print "\@one ", @one, "\n"; print "\@two ", @two, "\n"; print "\@only_in_one ", @only_in_one, "\n"; print "\@only_in_two ", @only_in_two, "\n"; print "\@in_both ", @in_both, "\n"; __DATA__ @one 1 2 3 4 @two 2 4 6 8 @only_in_one 1 3 @only_in_two 6 8 @in_both 2 4
Don't re-invent the wheel.
by dragonchild (Archbishop) on Jan 08, 2003 at 18:02 UTC
    Set::Array is good. Another is Quantum::Superpositions. Most important is to not re-invent the wheel.

    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

Re: Efficient Comparison of Array elements
by jdporter (Chancellor) on Jan 08, 2003 at 16:22 UTC
    Just for fun. Feel free to ignore.
    sub set_comm # named after the unix utility "comm" { my @ar = @_; # two sets, passed as array(ref)s. # make the corresponding sets: my @hr = map { my %h; @h{@$_} = (); \%h } @ar; # figure out which is shorter, since we'll be iterating # over that list of keys: my( $shorter, $longer ) = keys %{$h[0]} < keys %{$h[1]} ? ( 0, 1 ) : ( 1, 0 ); # determine common keys: my @common = grep { exists $hr[$longer]{$_} } keys %{$hr[$shorter]}; # remove those from the sets: delete @{$hr[0]}{@common}; delete @{$hr[1]}{@common}; # return "in first only, in second only, in both": return( [ keys %{$hr[0]} ], [ keys %{$hr[1]} ], \@common ); }

    The 6th Rule of Perl Club is -- There is no Rule #6.

Re: Efficient Comparison of Array elements
by runrig (Abbot) on Jan 08, 2003 at 18:46 UTC
    Yet another way, not necessarily the best way:
    my @arr1 = qw(a b b c d); my @arr2 = qw(c c d e e f); my ($unique1, $common, $unique2) = common_unique(\@arr1, \@arr2); sub common_unique { my (@u_c, %union); exists $union{$_} or $union{$_}-- for @{$_[0]}; no warnings 'uninitialized'; $union{$_} <= 0 and $union{$_}++ for @{$_[1]}; while (my ($key, $value) = each %union) { push @{$u_c[$value+1]}, $key; } return @u_c; }