The stupid question is the question not asked PerlMonks

### Re^3: Fast, Efficient Union and Intersection on arrays

 on Nov 22, 2008 at 10:44 UTC ( #725318=note: print w/replies, xml ) Need Help??

But there is flaw in grep-algorithm: it does not union correctly, if first array is shorter than second. If you consider this, then the for-algorithm will about 6-7% faster.

Nõnda, WK
• Comment on Re^3: Fast, Efficient Union and Intersection on arrays

Replies are listed 'Best First'.
Re^4: Fast, Efficient Union and Intersection on arrays
by TGI (Parson) on Nov 24, 2008 at 19:35 UTC

Good catch on the bum algorithm! I was able to fix the error by adding some parentheses. However, I still get a good advantage for the two grep method.

```sub two_greps {
my %hash;
@hash{@a} = (1) x @a;

my @union        =  (@a, grep { !\$hash{\$_} } @b);
my @intersection =  (    grep {  \$hash{\$_} } @b);
}
__DATA__
Range (1,3) vs (3,9)
Rate  one_cfor   one_for two_greps
one_cfor  55648/s        --      -18%      -23%
one_for   68027/s       22%        --       -5%
two_greps 71891/s       29%        6%        --
Range (1,5) vs (3,8)
Rate  one_cfor   one_for two_greps
one_cfor  50378/s        --      -13%      -15%
one_for   58140/s       15%        --       -2%
two_greps 59277/s       18%        2%        --

Update: the reason for the error is that = has a higher precedence than ,. Therefore the bad version was like doing: (my @union = @a), grep { !\$hash{\$_} } @b;. The grep was executed, but the results were discarded.

A more serious error with the greps is that we test the value in the hash, not its existence. So 0 would never be in the intersection of a set, and always in the union. Two greps still wins by a small amount after that is fixed, too.

```sub two_greps {
my %hash;
@hash{@a} = (1) x @a;

my @union        =  (@a, grep { ! exists \$hash{\$_} } @b);
my @intersection =  (    grep {   exists \$hash{\$_} } @b);

}

Range (1,3) vs (0,9)
Rate  one_cfor   one_for two_greps
one_cfor  50787/s        --      -20%      -25%
one_for   63371/s       25%        --       -7%
two_greps 68027/s       34%        7%        --
Range (1,5) vs (3,8)
Rate  one_cfor   one_for two_greps
one_cfor  50813/s        --      -12%      -18%
one_for   57670/s       13%        --       -7%
two_greps 62150/s       22%        8%        --

TGI says moo

Create A New User
Node Status?
node history
Node Type: note [id://725318]
help
Chatterbox?
 [karlgoethebier]: [shmem]: karlgoethebier: es ist theologisch umstritten, ob jesus an christi himmelfahrt tatsächlich besoffen und mit bollerwagen zu seinem vater aufgebrochen ist [karlgoethebier]: shmem: Ja ischweis [karlgoethebier]: shmem: Vielleicht hat er in den Bollerwagen gebrochen? [karlgoethebier]: shmem:Ach so. Bollerwagen ist eigentlich was für Kinder. Bei uns fahren sie mit Traktoren und Feuerwehrwagen rum...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (11)
As of 2017-05-25 12:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
My favorite model of computation is ...

Results (187 votes). Check out past polls.