http://www.perlmonks.org?node_id=863843

walkingthecow has asked for the wisdom of the Perl Monks concerning the following question:

I have an array that contains a bunch of email addresses like so:
bob@foo.com joe@foo.com jane@foo.com
I then have a hash of arrays that contains distribution lists and the emails associated with each. It is built like so:
push @{ $user_hash{$dlist_name} }, $user_email;
and that builds something that basically says:
dlist@foo.com => jane@foo.com
dlist@foo.com => bob@foo.com
dlist@foo.com => joe@foo.com
I then want to check whether all values in the array exist in the hash of arrays for each dlist.

So, for example, say there is a dlist with the name sales@foo.com, and it looks like this:
sales@foo.com => joe@foo.com
I don't want that dlist to be flagged as having all users because it does not contain all users in the array. However, if it does contain all users in the array, then for that dlist I want it to be flagged as containing all users. I really don't know where to start with this, and so I don't have code to provide.

Replies are listed 'Best First'.
Re: Checking contents of array against contents of a hash
by moritz (Cardinal) on Oct 06, 2010 at 18:09 UTC
Re: Checking contents of array against contents of a hash
by hbm (Hermit) on Oct 06, 2010 at 18:33 UTC

    Can you make the dlists hashes instead of arrays? Then you could check something like this:

    if (!grep { !exists $user_hash{$dlist_name}{$_} } @emails) {
        # flag 'all users'
    }

    That would have the advantage of deduping your dlists, whereas with an array you probably ought to check for duplicates before pushing.
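A minimal sketch of the hash-based check hbm describes, assuming the dlists are stored as a hash of hashes instead of a hash of arrays. The sample data, the @pairs list and the has_all_users helper are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; the names below are illustrative only.
my @emails = qw(bob@foo.com joe@foo.com jane@foo.com);
my @pairs  = (
    [ 'dlist@foo.com', 'jane@foo.com' ],
    [ 'dlist@foo.com', 'bob@foo.com'  ],
    [ 'dlist@foo.com', 'joe@foo.com'  ],
    [ 'dlist@foo.com', 'joe@foo.com'  ],   # duplicate, collapses in the hash
    [ 'sales@foo.com', 'joe@foo.com'  ],
);

# Hash of hashes: $user_hash{$dlist_name}{$user_email} = 1
my %user_hash;
for my $pair (@pairs) {
    my ($dlist_name, $user_email) = @$pair;
    $user_hash{$dlist_name}{$user_email} = 1;
}

# A dlist has "all users" when no address in @$wanted is missing from it.
sub has_all_users {
    my ($members, $wanted) = @_;
    return !grep { !exists $members->{$_} } @$wanted;
}

for my $dlist_name (sort keys %user_hash) {
    my $flag = has_all_users($user_hash{$dlist_name}, \@emails)
             ? 'all users'
             : 'partial';
    print "$dlist_name: $flag\n";
}
```

Because each member is a hash key, pushing the same address twice is harmless, which is the deduping advantage mentioned above.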

Re: Checking contents of array against contents of a hash
by GrandFather (Saint) on Oct 06, 2010 at 19:31 UTC

    A very small amount of real data and a sample script, plus its output and the expected output that demonstrates what you are trying to do, would help a lot.

    Without any extra information I suspect you need to build a reverse lookup hash from your array of hashes.
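One way the reverse lookup suggestion might look, as a sketch only: it assumes the hash-of-arrays shape from the question, made-up sample data, and no duplicate addresses in @emails. The names %lists_for and %hits are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data in the shape the question describes.
my @emails    = qw(bob@foo.com joe@foo.com jane@foo.com);
my %user_hash = (
    'dlist@foo.com' => [ qw(jane@foo.com bob@foo.com joe@foo.com) ],
    'sales@foo.com' => [ qw(joe@foo.com) ],
);

# Reverse lookup: $lists_for{$email}{$dlist_name} = 1
my %lists_for;
for my $dlist_name (keys %user_hash) {
    $lists_for{$_}{$dlist_name} = 1 for @{ $user_hash{$dlist_name} };
}

# Count, per dlist, how many of the wanted addresses it holds.
my %hits;
for my $email (@emails) {
    $hits{$_}++ for keys %{ $lists_for{$email} || {} };
}

# A dlist holding every wanted address scores exactly scalar(@emails)
# (this relies on @emails containing no duplicates).
my @complete = grep { ($hits{$_} || 0) == @emails } sort keys %user_hash;
print "Contains all users: @complete\n";
```

The reverse hash costs one pass over all dlists up front, after which each address in @emails is a single lookup.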

    True laziness is hard work
Re: Checking contents of array against contents of a hash
by Xiong (Hermit) on Oct 06, 2010 at 20:29 UTC

    #!/run/bin/perl
    #       for-walkingthecow-balls.pl
    #       = Copyright 2010 Xiong Changnian <xiong@cpan.org> =
    #       = Free Software = Artistic License 2.0 = NO WARRANTY =

    use 5.010;
    use strict;
    use warnings;

    use List::Compare;      # Compare elements of two or more lists
    #~ use Devel::Comments '###', '####';

    #----------------------------------------------------------------------------#

    # Declare hypothetical data set.

    # These are the things that may be distributed to containers, non-uniquely.
    my @balls       = qw( red green blue yellow );

    # These are the containers. Note the spares for further fooling around.
    my %boxes       = (
    #~     round       => [ qw( red green blue yellow ) ],
    #~     cube        => [ qw( red green blue yellow ) ],
    #~     flat        => [ qw( red green blue yellow ) ],
        round       => [ qw( red ) ],
        cube        => [ qw( green blue ) ],
        flat        => [ qw( red green blue yellow ) ],
    );

    # This is the goal state.
    my $want        = 1;        # How many boxes contain all balls?

    #----------------------------------------------------------------------------#

    # Invoke code under test.
    say qq{Want: $want, Got: }, do_compare (\@balls, values %boxes );

    exit(0);

    #----------------------------------------------------------------------------#

    # Uses List::Compare. Calling syntax is highly flexible and orthogonal.
    sub do_compare {
        my $unbox_ref       = shift;
        my @boxes_refs      = @_;
        my $got             ;       # accumulate "finds" here

        # Construct a work-object.
        my $lc  = List::Compare->new( '-u', $unbox_ref, @boxes_refs ); # unsorted

        # Pretty-print for debug.
        $lc->print_subset_chart;

        # Find out if any (other) box contains all the elements in the unbox.
        my $ixL     = 0;        # the unbox; index of first ref to new()
        for my $ixR ( 1..scalar @boxes_refs ) {
            my $bool    = $lc->is_LsubsetR( $ixL, $ixR );   # true if all L in R
            ### $bool
            $got    += $bool;
        };

        return $got;
    };

    __END__

    Subset Relationships

               Right:    0    1    2    3

      Left:  0:          1    0    0    1
             1:          1    1    0    1
             2:          1    0    1    1
             3:          1    0    0    1

    Want: 1, Got: 1

    Feste: Misprision in the highest degree. Lady, cucullus non facit monachum. That's as much to say as, I wear not motley in my brain....
Re: Checking contents of array against contents of a hash
by ssandv (Hermit) on Oct 06, 2010 at 23:58 UTC

    Before you do anything else, you can count the number of emails on the dlist. If it's not at least as big as the master list of users, you can short circuit. After that, you can sort both lists and walk the dlist using the master list. As soon as you find a master list element that's skipped on the dlist, you're done for that dlist. If you get to the end of the master list and nothing was skipped, flag it. Or something like that.
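The count-then-walk idea above could be sketched roughly as follows. The contains_all name and the sample addresses are hypothetical, and the sketch assumes the master list holds no duplicate addresses.

```perl
use strict;
use warnings;

# Returns true when every address in @$master also appears in @$dlist.
# Assumes @$master contains no duplicates.
sub contains_all {
    my ($master, $dlist) = @_;

    # Short circuit: a shorter dlist cannot hold every master address.
    return 0 if @$dlist < @$master;

    my @m = sort @$master;
    my @d = sort @$dlist;

    my $i = 0;    # current position in the sorted dlist
    for my $want (@m) {
        # Skip dlist entries that sort before the address we want.
        $i++ while $i < @d && $d[$i] lt $want;
        # Ran off the end, or stepped past it: the address is missing.
        return 0 if $i >= @d || $d[$i] ne $want;
        $i++;
    }
    return 1;
}

my @all = qw(bob@foo.com joe@foo.com jane@foo.com);
for my $dlist ( [qw(joe@foo.com)],
                [qw(jane@foo.com billy@foo.com bob@foo.com joe@foo.com)] ) {
    print "@$dlist: ", ( contains_all(\@all, $dlist) ? "all users" : "not all users" ), "\n";
}
```

Sorting costs more than a hash lookup for large lists, so this is mainly attractive when the lists are already kept in sorted order.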

Re: Checking contents of array against contents of a hash
by Marshall (Canon) on Oct 07, 2010 at 20:50 UTC
    Another approach is shown below. Basically you are trying to find out if the @all array is a subset of each of the dlist arrays. I assumed that perhaps the dlist array might contain more e-mail addresses than the @all array.

    This type of "check off the list" comparison is often done with hash tables. My comparison function makes a hash out of the dlist from the HoA. Then it iterates over each e-mail in @all and "checks off" in the hash as each one is "seen". If I find an e-mail address from the @all list that isn't in the array from the dlist hash, then @all can't be a subset and the function returns a failure (not a subset).

    Once it has been determined that the dlist (like "listC") contains all of the e-mail addresses in @all, I collect any "extra" ones and return that list. This would allow, say, the compression of "listC" to be "all@foo.com, billy@foo.com".

    A noteworthy point here is how returning the @extra array was handled. I return () to mean "nothing", NOT undef. undef is a value. Returning () means "nothing" which is "less than" returning an undef value. The first time I did this, it took me many hours to figure out how to do it! So at some point in the future just having seen this one point may save you a lot of grief! Have fun...

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my @all = qw(bob@foo.com joe@foo.com jane@foo.com);

    my %dlists = (
        'listA' => [ qw (bob@foo.com jane@foo.com joe@foo.com) ],
        'listB' => [ qw (bob@foo.com) ],
        'listC' => [ qw (bob@foo.com joe@foo.com jane@foo.com billy@foo.com) ],
    );

    foreach my $dlist (keys %dlists)
    {
        my ($subset, @extra) = is_array_subset(\@all, \@{$dlists{$dlist}});
        if ($subset)
        {
            print "$dlist contains all users ";
            if (@extra)
            {
                print "and ".@extra." additional addresses\n";
            }
            else
            {
                print "\n";
            }
        }
    }

    sub is_array_subset   #returns (yes|no status, @extra)
    {
        my ($ref_all, $ref_dlist) = @_;
        my %dlist = map { $_ => 0 } @$ref_dlist;
        foreach (@$ref_all)
        {
            return (0, ()) if !exists($dlist{$_});   #give up! not subset
            $dlist{$_}++;
        }
        my @extra = ();   #is subset, now see if any "extra's"
        foreach (keys %dlist)
        {
            push @extra, $_ if $dlist{$_} == 0;
        }
        return (1, @extra);
    }
    __END__
    prints:
    listC contains all users and 1 additional addresses
    listA contains all users
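
Marshall's point about returning () instead of undef can be seen in isolation with a small sketch. The two subroutine names are hypothetical; the only thing being demonstrated is how each return value lands in a `my ($status, @extra) = ...` list assignment.

```perl
use strict;
use warnings;

# Two hypothetical variants of a "status plus extras" return.
sub fail_with_empty_list { return (0, ()); }      # flattens to just (0)
sub fail_with_undef      { return (0, undef); }   # two elements: 0 and undef

my ($status_a, @extra_a) = fail_with_empty_list();
my ($status_b, @extra_b) = fail_with_undef();

# @extra_a is empty, so "if (@extra_a)" is false.
print "empty list: ", scalar(@extra_a), " extra element(s)\n";

# @extra_b holds one undef element, so "if (@extra_b)" is true.
print "undef:      ", scalar(@extra_b), " extra element(s)\n";
```

With the undef variant, a caller testing `if (@extra)` would wrongly conclude that there are extra addresses, which is the grief the empty list avoids.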