Re: most efficient way to implement "A without B"

In addition to Zaxo's solution, consider implementing your "lists" as the keys of hashes to begin with (actually this is just packaging Zaxo's solution for easy reuse). Then all these membership questions become relatively easy to code. E.g., your "@A without @B" is basically a relative complement:

use strict;
use warnings;

sub relative_complement {
  return list_2_hashref( grep !exists $_[ 1 ]->{ $_ }, keys %{ $_[ 0 ] } );
}

sub list_2_hashref {
  return +{ map +( $_ => 1 ), @_ };
}

# @A and @B are the two lists from the original question
my $diff = relative_complement( list_2_hashref( @A ),
                                list_2_hashref( @B ) );

From here it's easy to code similar operations such as intersections:

sub intersection {
  my %tally;
  $tally{ $_ }++ for map keys %$_, @_;
  return list_2_hashref( grep $tally{ $_ } == @_, keys %tally );
}

or the "symmetric difference" (in A but not B, or in B but not A; akin to xor):

sub sym_diff {
  my %tally;
  $tally{ $_ }++ for map keys %$_, @_[0, 1];
  return list_2_hashref( grep $tally{ $_ } == 1, keys %tally );
}

With the above:

my $A = list_2_hashref( qw( a b c ) );
my $B = list_2_hashref( qw( b c d ) );
my $C = list_2_hashref( qw( c d e ) );

$| = 1;
use Dumpvalue;
my $dumper = Dumpvalue->new();
print "\nA:\n";
$dumper->dumpValue( $A );
print "\nB:\n";
$dumper->dumpValue( $B );
print "\nC:\n";
$dumper->dumpValue( $C );
print "\nrelative complement A, B:\n";
$dumper->dumpValue( relative_complement( $A, $B ) );
print "\nintersection A, B, C:\n";
$dumper->dumpValue( intersection( $A, $B, $C ) );
print "\nsymmetric difference A, B:\n";
$dumper->dumpValue( sym_diff( $A, $B ) );
__END__
A:
'a' => 1
'b' => 1
'c' => 1

B:
'b' => 1
'c' => 1
'd' => 1

C:
'c' => 1
'd' => 1
'e' => 1

relative complement A, B:
'a' => 1

intersection A, B, C:
'c' => 1

symmetric difference A, B:
'a' => 1
'd' => 1
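
Other set operations follow the same pattern. As a rough sketch only (the names union and is_subset are made up here for illustration and are not part of the code above), a union and a subset test on the same hashref "sets" might look like this:

```perl
use strict;
use warnings;

# Same helper as above, repeated so this sketch runs on its own.
sub list_2_hashref {
  return +{ map +( $_ => 1 ), @_ };
}

# Hypothetical helper: union of any number of hashref "sets".
# With no arguments it returns the empty set.
sub union {
  return list_2_hashref( map keys %$_, @_ );
}

# Hypothetical helper: true if every key of the first set
# is also a key of the second.
sub is_subset {
  my ( $x, $y ) = @_;
  my $missing = grep !exists $y->{ $_ }, keys %$x;
  return $missing == 0;
}

my $A = list_2_hashref( qw( a b c ) );
my $B = list_2_hashref( qw( b c d ) );

print join( ' ', sort keys %{ union( $A, $B ) } ), "\n";    # a b c d
print is_subset( list_2_hashref( qw( b c ) ), $A )
  ? "subset\n" : "not a subset\n";                          # subset
```

Because the sets are plain hashes, each of these costs one pass over the keys involved, with a constant-time exists lookup per key.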

Alternatively, you can roll out the big guns and use one of the set implementations on CPAN, such as Jarkko Hietaniemi's Set::Scalar. With it, the above reduces to:

use strict;
use warnings;
use Set::Scalar;

my $A = Set::Scalar->new( qw( a b c ) );
my $B = Set::Scalar->new( qw( b c d ) );
my $C = Set::Scalar->new( qw( c d e ) );

print "A: ", $A->as_string, "\n";
print "B: ", $B->as_string, "\n";
print "C: ", $C->as_string, "\n";
print "relative complement A, B: ",
  $A->difference( $B )->as_string, "\n";
print "intersection A, B, C: ",
  $A->intersection( $B, $C )->as_string, "\n";
print "symmetric difference A, B: ",
  $A->symmetric_difference( $B )->as_string, "\n";

__END__
A: (a b c)
B: (b c d)
C: (c d e)
relative complement A, B: (a)
intersection A, B, C: (c)
symmetric difference A, B: (a d)

Tangential mini-rant: As much as I like Set::Scalar, I have a major mathematical nit to pick with it: the standard OO $instance->method( @args ) interface is not the right one for set operations, because for too many of these operations, the arguments all have equal standing, something that is obscured by this interface (this is somewhat of a fixation of mine). It is just as stilted to say $A->union( $B ) as it is to say $A->sum( $B ). Moreover, this interface excludes the edge cases in which the operations are applied to no arguments (for operations, such as union, in which this is mathematically well-defined). When the operations on the instances of a class have these symmetry properties, I think it is better to implement them as class methods.
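
To make the class-method idea concrete, here is a minimal sketch. It assumes a hypothetical package My::SetOps, invented purely for illustration; it is not part of Set::Scalar or any other CPAN module, and it is only one way such an interface could look:

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only.
package My::SetOps;

# Constructor: store the members as the keys of a blessed hash.
sub new {
  my ( $class, @members ) = @_;
  my %self = map +( $_ => 1 ), @members;
  return bless \%self, $class;
}

# Members in sorted order (call in list context).
sub members { return sort keys %{ $_[ 0 ] } }

# Class method: My::SetOps->union( @sets ).
# Every operand is an ordinary argument, so none is singled out
# as the invocant, and zero operands yields the empty set.
sub union {
  my ( $class, @sets ) = @_;
  return $class->new( map keys %$_, @sets );
}

package main;

my $A = My::SetOps->new( qw( a b c ) );
my $B = My::SetOps->new( qw( b c d ) );

print join( ' ', My::SetOps->union( $A, $B )->members ), "\n";   # a b c d

my @none = My::SetOps->union()->members;
print scalar( @none ), "\n";                                     # 0
```

Here My::SetOps->union( $A, $B ) and My::SetOps->union( $B, $A ) read the same way, and the no-argument call is simply the empty union.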

the lowliest monk
