Re^4: Fast, Efficient Union and Intersection on arrays

in reply to Re^3: Fast, Efficient Union and Intersection on arrays
in thread Fast, Efficient Union and Intersection on arrays

Good catch on the bum algorithm! I was able to fix the error by adding some parentheses. However, the two-grep method still comes out ahead in my benchmarks.

sub two_greps {
        my %hash;
        @hash{@a} = (1) x @a;    # mark every element of @a as seen

        # the parentheses matter: = binds more tightly than ,
        my @union        =  (@a, grep { !$hash{$_} } @b);
        my @intersection =  (    grep {  $hash{$_} } @b);
}
__DATA__
Range (1,3) vs (3,9)
             Rate  one_cfor   one_for two_greps
one_cfor  55648/s        --      -18%      -23%
one_for   68027/s       22%        --       -5%
two_greps 71891/s       29%        6%        --
Range (1,5) vs (3,8)
             Rate  one_cfor   one_for two_greps
one_cfor  50378/s        --      -13%      -15%
one_for   58140/s       15%        --       -2%
two_greps 59277/s       18%        2%        --

Update: the reason for the error is that = has a higher precedence than the comma operator. The bad version was therefore parsed as (my @union = @a), grep { !$hash{$_} } @b; so the grep was executed, but its results were discarded.
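The precedence issue described above can be reproduced in a few lines. This is only a sketch: the sample values for @a and @b and the names @bad and @good are made up for illustration, and the void warning is silenced just to keep the output clean.

```perl
use strict;
use warnings;

my @a = (1, 2, 3);
my @b = (3, 4, 5);

my %hash;
@hash{@a} = (1) x @a;

my @bad;
{
    # Silence "Useless use of grep in void context" for the demonstration.
    no warnings 'void';
    # Parsed as: (@bad = @a), grep { ... } @b;
    @bad = @a, grep { !$hash{$_} } @b;
}

# With parentheses the whole list is assigned.
my @good = (@a, grep { !$hash{$_} } @b);

print "bad:  @bad\n";     # bad:  1 2 3
print "good: @good\n";    # good: 1 2 3 4 5
```

With warnings left on, perl flags the unparenthesized form at compile time, which is a cheap way to catch this kind of slip.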

A more serious error with the greps is that we test the value in the hash, not its existence. Whenever a key's stored value is false, that element would never be in the intersection and would always be added to the union, and 0 is the obvious candidate. Two greps still wins by a small amount after that is fixed, too.

sub two_greps {
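        my %hash;
        @hash{@a} = (1) x @a;    # mark every element of @a as seen

        # test for the key's existence rather than the truth of its value
        my @union        =  (@a, grep { ! exists $hash{$_} } @b);
        my @intersection =  (    grep {   exists $hash{$_} } @b);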
}

Range (1,3) vs (0,9)
             Rate  one_cfor   one_for two_greps
one_cfor  50787/s        --      -20%      -25%
one_for   63371/s       25%        --       -7%
two_greps 68027/s       34%        7%        --
Range (1,5) vs (3,8)
             Rate  one_cfor   one_for two_greps
one_cfor  50813/s        --      -12%      -18%
one_for   57670/s       13%        --       -7%
two_greps 62150/s       22%        8%        --
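The difference between testing a hash value and testing a key's existence can be seen in a small sketch. The assumption here, which is not taken from the benchmarked code, is a hypothetical variant that stores each element as its own value, so that the key 0 maps to a false value. With (1) x @a, as in the subs above, every stored value is 1, so the two tests should agree; exists is simply the safer habit.

```perl
use strict;
use warnings;

my @a = (0, 1, 2);
my @b = (0, 2, 4);

# Hypothetical variant: each element is stored as its own value,
# so the key 0 maps to the false value 0.
my %hash;
@hash{@a} = @a;

my @by_value  = grep {        $hash{$_} } @b;   # drops 0: its value is false
my @by_exists = grep { exists $hash{$_} } @b;   # keeps 0: the key is present

print "by value:  @by_value\n";     # by value:  2
print "by exists: @by_exists\n";    # by exists: 0 2
```

The same trap applies if the hash is populated with @hash{@a} = (), which stores undef for every key.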

TGI says moo
