http://www.perlmonks.org?node_id=599277

mdunnbass has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone,

Yet another question from me. I am trying to combine the non-duplicate elements of 2 arrays into 1. I have read the "difference/intersection/union of 2 arrays" FAQ, and I've read the bit by vroom here, but I felt the need to be difficult. *grins*

Here's the bit I've got going so far:

my @group1 = ('A','B','C','D','E');
my @group2 = ('F','G','H','I','J');
foreach my $hit (@group2) {
    push (@group1, $hit) unless grep { $hit } @group1;
}

Ideally, I want @group1 to contain all the elements from groups 1 and 2. Or, if for instance group2 also had a D and an E, it would contain everything but the duplicates. But, after running the above code, @group1 remains unchanged. Any glaring errors that pop out at you?

Something tells me I got this bit of code a few weeks ago here, but looking back through all of my recent questions, I must be missing it somewhere. So, sorry if this is redundant.

Thanks
Matt

Replies are listed 'Best First'.
Re: Combining arrays with grep/unless?
by kyle (Abbot) on Feb 09, 2007 at 21:05 UTC

    As others have already pointed out, there's a better way. That said, the reason your code fails is the grep isn't doing any kind of test. You want to actually check whether $hit is among the elements in @group1, not whether it is itself true or not.

    use Data::Dumper;

    my @group1 = ('A','B','C','D','E');
    my @group2 = ('F','G','H','I','J');
    foreach my $hit (@group2) {
        push (@group1, $hit) unless grep { $_ eq $hit } @group1;
    }
    print Dumper( \@group1 );

    Output:

    $VAR1 = [
              'A',
              'B',
              'C',
              'D',
              'E',
              'F',
              'G',
              'H',
              'I',
              'J'
            ];

    As long as we're pretending hashes don't exist, I'll mention also that List::Util has a better function for testing list membership called first. It's similar to grep, but it will stop searching as soon as it finds a match.
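
A minimal sketch of the first-based membership test described above, assuming every element is a defined string. The overlapping sample data is made up for illustration. Since first returns the matching element itself (or undef), the result is wrapped in defined() so that a false-but-present value such as '0' still counts as found.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @group1 = ('A','B','C','D','E');
my @group2 = ('D','E','F','G');    # illustrative data with two duplicates

foreach my $hit (@group2) {
    # first stops scanning at the first match; grep would test every element
    push @group1, $hit
        unless defined( first { $_ eq $hit } @group1 );
}

print "@group1\n";    # A B C D E F G
```

This is still a linear scan per element, so it only trims the constant factor; the hash-based answers in this thread avoid the scan altogether.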

Re: Combining arrays with grep/unless?
by Fletch (Bishop) on Feb 09, 2007 at 20:52 UTC

    One would hope you didn't get that code here since it's doing a linear scan of @group1 for every element of @group2. The way that's in the FAQ is in the FAQ for a reason.

    Update: Aaah, you want the union to have only distinct copies of the elements.

    my (@distinct_union, %seen);
    for my $e ( @group1, @group2 ) {
        push @distinct_union, $e unless $seen{ $e }++;
    }
Re: Combining arrays with grep/unless?
by ikegami (Patriarch) on Feb 09, 2007 at 20:52 UTC

    Ow, O(N^2) time!

    If the items in @group1 and @group2 are strings or can be represented by an id, it can be done in O(N) time.

    my %seen; my @all = grep !$seen{$_}++, @group1, @group2;
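
For anyone who wants to see the %seen idiom run on overlapping input, here is a small self-contained sketch. The sample data is made up for illustration (group2 repeats D and E); the idiom itself is the one shown above.

```perl
use strict;
use warnings;

my @group1 = ('A','B','C','D','E');
my @group2 = ('D','E','F','G','H');    # illustrative data: D and E repeat

my %seen;
# $seen{$_}++ is false only the first time a value is met, so grep
# keeps the first occurrence of each value and drops later ones.
my @all = grep { !$seen{$_}++ } @group1, @group2;

print "@all\n";    # A B C D E F G H
```

The order of first appearance is preserved, and duplicates inside a single array are dropped as well.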

    Update:

    If the items in @group1 and @group2 can be compared, it can be done in O(N log_2 N) time (although order is lost).

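    # compare_func is a user-supplied comparator. Declare it with a ($$)
    # prototype so it works both as a sort sub and as a direct call.
    my @all;
    my @group1_s = sort compare_func @group1;
    my @group2_s = sort compare_func @group2;
    while (@group1_s && @group2_s) {
        my $cmp = compare_func($group1_s[0], $group2_s[0]);
        if ($cmp < 0) {
            push(@all, shift(@group1_s));
        } elsif ($cmp > 0) {
            push(@all, shift(@group2_s));
        } else {
            # Equal items: drop one from each list, keep a single copy.
            push(@all, scalar(shift(@group1_s), shift(@group2_s)));
        }
    }
    push(@all, @group1_s, @group2_s);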
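
A runnable sketch of the sort-and-merge idea, under a few assumptions: the items are plain strings, compare_func is a hypothetical comparator using cmp, and merge_union is a helper name invented here. The ($$) prototype matters because sort then passes the two items in @_, which lets the same sub be called directly during the merge.

```perl
use strict;
use warnings;

# Hypothetical comparator; with the ($$) prototype, sort passes the
# two items in @_ instead of $a and $b.
sub compare_func($$) {
    my ($x, $y) = @_;
    return $x cmp $y;
}

# Hypothetical helper: merge two lists, keeping one copy of any item
# that appears in both. Input order is not preserved.
sub merge_union {
    my ($list1, $list2) = @_;
    my @s1 = sort compare_func @$list1;
    my @s2 = sort compare_func @$list2;
    my @all;
    while (@s1 && @s2) {
        my $cmp = compare_func($s1[0], $s2[0]);
        if    ($cmp < 0) { push @all, shift @s1 }
        elsif ($cmp > 0) { push @all, shift @s2 }
        else             { push @all, shift @s1; shift @s2 }
    }
    push @all, @s1, @s2;    # whatever remains in either list
    return @all;
}

my @union = merge_union([qw(D A E B C)], [qw(J D F E G)]);
print "@union\n";    # A B C D E F G J
```

Like the code above, this only collapses items shared between the two lists; a value repeated within one list would still appear more than once.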
      The first approach isn't O(N) because of the lookup in the hash. Update: for the impatient who don't want to look further in the subthread, this comment turned out to be incorrect.

      Flavio
      perl -ple'$_=reverse' <<<ti.xittelop@oivalf

      Don't fool yourself.

        True, and the OP's isn't really O(N^2) either. Both should factor in the length of the string. Both should factor the growth of the array/hash. Mine should also factor in bucket collisions when talking about worst case.

        So, the OP's is O(N^2 * L * O(Growth(N))) and mine is O(N * L * O(Growth(N))). I chose to ignore the common (and thus irrelevant) factor, even if accuracy suffered a little.
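
To put rough numbers on the N^2 versus N discussion, the sketch below counts how many times each approach evaluates its test for two disjoint 100-element lists. The data and the counter variables are made up for illustration, and the counts ignore string length, array/hash growth, and bucket collisions, just like the simplified figures above.

```perl
use strict;
use warnings;

my @group1 = map "a$_", 1 .. 100;    # illustrative data, no overlap
my @group2 = map "b$_", 1 .. 100;

# Nested scan: grep tests every current element for each new item.
my $comparisons = 0;
my @nested = @group1;
for my $hit (@group2) {
    push @nested, $hit
        unless grep { $comparisons++; $_ eq $hit } @nested;
}

# Hash approach: one lookup per item.
my $lookups = 0;
my %seen;
my @hashed = grep { $lookups++; !$seen{$_}++ } @group1, @group2;

print "grep-in-loop comparisons: $comparisons\n";    # 14950
print "hash lookups:             $lookups\n";        # 200
```

The nested count is 100 + 101 + ... + 199 = 14950, because the scanned array grows by one with every item that gets pushed.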

Re: Combining arrays with grep/unless?
by dragonchild (Archbishop) on Feb 09, 2007 at 21:17 UTC
    You're dealing with uniqueness. Unless you have a specific reason for not using hashes (such as your items don't stringify correctly), use hashes. They exist for a reason. (Several, actually, but who's counting?)

    Alternately, someone has already solved this with Set::Object.


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Combining arrays with grep/unless?
by davidrw (Prior) on Feb 10, 2007 at 04:42 UTC
    didn't see this type of solution in this thread yet, so here goes:
    my @group1 = ('A','B','C','D','E');
    my @group2 = ('F','G','H','I','J');
    @group1 = sort keys %{{map {$_ => 1} @group1, @group2}};
    Note that the sort is optional, but otherwise the order will be random.
Re: Combining arrays with grep/unless?
by dorko (Prior) on Feb 09, 2007 at 23:24 UTC
    List::Compare was built for working with two, three or more arrays at once.

    use List::Compare;

    my @Llist = ('A','B','C','D','E');
    my @Rlist = ('D','E','F','G','H','I','J');
    my $lc = List::Compare->new(\@Llist, \@Rlist);

    # If I read your post correctly,
    # this is what you want.
    my @LorRonly = $lc->get_symmetric_difference;
    print join ", ", @LorRonly;

    # Also does...
    my @intersection = $lc->get_intersection;
    my @union = $lc->get_union;

    # And much, much more.
    Output:

    A, B, C, F, G, H, I, J

    Cheers,

    Brent

    -- Yeah, I'm a Delt.
Re: Combining arrays with grep/unless?
by polettix (Vicar) on Feb 10, 2007 at 00:16 UTC
    Update: d'oh, I completely missed the first part of this node by kyle!

    Beyond the fact that you should use one of the more efficient solutions given by others, your error lies in the test given to grep:

    ... grep { $hit } @group1;

    This tests whether $hit is true, which is always the case for your input data. You should have written:

    ... grep { $_ eq $hit } @group1;

    which does compare each item in @group1 with $hit.
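
A small sketch of the difference between the two tests, using the OP's @group1 and a single made-up $hit value. In scalar context grep returns the number of elements for which the block was true, which is what unless ends up testing.

```perl
use strict;
use warnings;

my @group1 = ('A','B','C','D','E');
my $hit    = 'F';    # illustrative value that is not in @group1

# The block ignores $_ and 'F' is a true value, so every element passes.
my $always_true = grep { $hit } @group1;

# Here each element is actually compared with $hit, so nothing passes.
my $real_test = grep { $_ eq $hit } @group1;

print "grep { \$hit }:        $always_true\n";    # 5
print "grep { \$_ eq \$hit }: $real_test\n";      # 0
```

With the first form the count is never zero for a non-empty @group1 and a true $hit, so the unless guard always blocks the push and @group1 stays unchanged.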

    Flavio
    perl -ple'$_=reverse' <<<ti.xittelop@oivalf

    Don't fool yourself.