LanX has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to design a faster and more robust alternative to perlfaq: intersection, union, difference of two arrays.
One of the things I don't like is that many such examples in the FAQ use the keys of a %count hash as results.
This is fundamentally wrong because keys are stringifications, so any reference type will not be reproduced (just think about arrays of objects or an AoH or ...).
So I thought about using $set{STRINGIFICATION}=REALVALUE as the fundamental data structure for set operations.
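To make the stringification point concrete, here is a minimal sketch. The sample records and all variable names are invented for illustration; only the $set{STRINGIFICATION}=REALVALUE layout comes from the idea above.

```perl
use strict;
use warnings;

# Hypothetical sample data: an array of hashes (AoH).
my @records = ( { name => 'alice' }, { name => 'bob' } );

# FAQ-style counting hash: only the stringified reference survives as a key.
my %count;
$count{$_}++ for @records;
my ($key) = keys %count;              # a string such as "HASH(0x...)"
my $key_is_ref = ref($key) ? 1 : 0;   # 0: the key is a plain string

# Proposed layout: stringification as key, real value as value.
my %set;
@set{@records} = @records;
my ($value) = values %set;
my $value_is_ref = ref($value) eq 'HASH' ? 1 : 0;   # 1: still a usable hash ref

print "key is ref: $key_is_ref, value is ref: $value_is_ref\n";
```

The key cannot be turned back into the original hash, while the stored value can still be dereferenced (for example $value->{name}).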
And to significantly speed things up I also wanted to avoid any loop constructs, restricting myself to hash slices and list flattening.
use Data::Dump qw(pp);

my @array1=(1..5);
my @array2=(3..7);
my (%union,%intersection,%difference,@tmp);
my (%set1,%set2);

@set1{@array1}=@array1;
@set2{@array2}=@array2;

@tmp= @set1{ keys %set2 };
@intersection{@tmp}=@tmp;   # warning: use of uninitialized values
delete $intersection{""} ;  # dirty hack

%union=(%set1,%set2);

%difference=%union;
delete @difference{keys %intersection};

pp \(%set1,%set2,%union,%intersection,%difference);
prints
Use of uninitialized value $tmp[0] in hash slice at /home/lanx/B/PL/PM/set.pl line 17.
Use of uninitialized value $tmp[3] in hash slice at /home/lanx/B/PL/PM/set.pl line 17.
(
  { 1 => 1, 2 => 2, 3 => 3, 4 => 4, 5 => 5 },
  { 3 => 3, 4 => 4, 5 => 5, 6 => 6, 7 => 7 },
  { 1 => 1, 2 => 2, 3 => 3, 4 => 4, 5 => 5, 6 => 6, 7 => 7 },
  { 3 => 3, 4 => 4, 5 => 5 },
  { 1 => 1, 2 => 2, 6 => 6, 7 => 7 },
)
My problem is the "dirty hack": a hash slice returns undef for nonexistent keys, and undef is later stringified to an empty string. (Muting the warnings isn't the problem.)
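The behaviour described above can be reproduced in isolation. This is only a small sketch with made-up data, with the warning silenced locally so that just the effect on the hash is visible:

```perl
use strict;
use warnings;

my %set1 = ( 1 => 1, 2 => 2, 3 => 3 );

# Slicing with a key that does not exist yields undef in that position.
my @tmp = @set1{ 3, 4 };                       # (3, undef)
my $missing_is_undef = defined $tmp[1] ? 0 : 1;

# Using that list as slice keys turns undef into the key "".
my %intersection;
{
    no warnings 'uninitialized';               # mute "Use of uninitialized value"
    @intersection{@tmp} = @tmp;
}
my $has_empty_key = exists $intersection{""} ? 1 : 0;

print "undef from slice: $missing_is_undef, empty-string key: $has_empty_key\n";
```

One consequence worth keeping in mind: if "" is a legitimate member of both sets, delete $intersection{""} would remove it as well, so the hack is not only unsightly but can also lose data.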
Any elegant idea how to solve this?
Or do I have to fall back to something like
%intersection = map { exists $set1{$_} ? ($_=>$set1{$_}) : () } keys %set2;
or
@tmp= grep {defined} @set1{ keys %set2 };
???
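For comparison, here is a sketch that runs both fallbacks on the sample arrays from above and checks that they agree. The names %by_map and %by_grep are invented for this example, and neither variant is claimed to be the faster one.

```perl
use strict;
use warnings;

my @array1 = (1..5);
my @array2 = (3..7);
my (%set1, %set2);
@set1{@array1} = @array1;
@set2{@array2} = @array2;

# Fallback 1: map with exists, building key/value pairs directly.
my %by_map = map { exists $set1{$_} ? ($_ => $set1{$_}) : () } keys %set2;

# Fallback 2: slice first, then drop the undef placeholders.
my @tmp = grep { defined } @set1{ keys %set2 };
my %by_grep;
@by_grep{@tmp} = @tmp;

my $map_keys  = join ',', sort { $a <=> $b } keys %by_map;
my $grep_keys = join ',', sort { $a <=> $b } keys %by_grep;
print "map:  $map_keys\ngrep: $grep_keys\n";
```

Both give 3, 4, 5 here. They differ once a set legitimately contains an undef value: the grep {defined} variant drops it, while the exists test keeps it.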
Cheers Rolf