http://www.perlmonks.org?node_id=1024555


in reply to Re: compare a list against multiple lists
in thread compare a list against multiple lists

That's an impressive piece of code and so small. Thanks a lot.

It is almost perfect for what I'm looking for, and I get the output. Now I need to restrict it to a single reference file that I compare against the others, here al.txt for example.

After a bit of customization, this is what I have:

## only display for referenced file
my %counting_list;
for my $key (keys %{ $name{$ref} }) {
    my $count = $name{$ref}{$key}-1; ## -1 to remove reference file
    #print "n: $key, c $count\n";
    #print Dumper(\$name{$ref});
    if ($verbose == 1) {
        my $count = keys %{ $kw{$key} };
        printf "=> '%s' appears in %2d file%s: '%s'\n",
            $key, $count, $count > 1 ? 's' : ' ',
            join(', ', sort keys %{$kw{$key}});
        if ($count == 2) {
            my $mylist;
            foreach my $k (keys %{$kw{$key}}) {
                if (!($k eq $ref)) {
                    $mylist = $k;
                }
            }
            $counting_list{$mylist}++;
        }
        elsif ($count == 3) {
            $counting_list{'2lists'}++;
        }
        elsif ($count == 4) {
            $counting_list{'3lists'}++;
        }
        elsif ($count > 4) {
            $counting_list{'4more'}++;
        }
    }
}
} ## extra closing brace as posted; presumably closes an enclosing block not shown

## summary output
my $max = keys (%filelist);
$filelist{ $max } = '2lists';
$filelist{ $max+1 } = '3lists';
$filelist{ $max+2 } = '4more';
foreach my $list (keys %filelist) {
    #print "nolist:X list1a:X list1b:X list2+:X list3+:X\n";
    if ($counting_list{ $filelist{$list} }) {
        print "$filelist{$list}:$counting_list{ $filelist{$list} } ";
    }
    else {
        print "$filelist{$list}:0 ";
    }
}
print "\n";

Re^3: compare a list against multiple lists
by rjt (Curate) on Mar 21, 2013 at 23:28 UTC

    Glad to help. As for your next question, is this what you're after?

    abel appears 0 times in 0 files (not counting al.txt)
    baker appears 1 times in 1 files (not counting al.txt)
    camera appears 1 times in 1 files (not counting al.txt)
    delta appears 2 times in 1 files (not counting al.txt)
    edward appears 1 times in 1 files (not counting al.txt)
    fargo appears 4 times in 4 files (not counting al.txt)
    golfer appears 2 times in 2 files (not counting al.txt)
    jerky appears 3 times in 3 files (not counting al.txt)

    If so, try the following code somewhere below the call to read_words():

    my $name = 'al';
    for my $kw (sort keys %{ $name{$name} }) {
        my $files = -1 + keys %{ $kw{$kw} }; # Do not include original file
        my $count = -$kw{$kw}->{$name};
        $count += $kw{$kw}->{$_} for keys %{ $kw{$kw} };
        printf "%10s appears %2d times in %2d files (not counting %s.txt)\n",
            $kw, $count, $files, $name;
    }
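For anyone who wants to try the idea without the rest of the script, here is a minimal self-contained sketch. It assumes, based only on how the snippets in this thread index them, that %kw maps word => file => occurrences and %name maps file => word => occurrences. The sample data, the %files hash, and the helper count_elsewhere() are hypothetical names made up for illustration, and read_words() is not reproduced. Instead of subtracting the reference file's counts, this variant skips the reference file with grep, which should give the same numbers for any word that appears in the reference file.

```perl
use strict;
use warnings;

# Hypothetical sample data: file name => list of words found in it.
my %files = (
    al    => [qw(abel baker delta)],
    bob   => [qw(baker delta delta)],
    carol => [qw(delta echo)],
);

# Assumed layout (inferred from the thread, not confirmed):
#   $kw{word}{file}   = number of occurrences of word in file
#   $name{file}{word} = the same count, indexed the other way round
my (%kw, %name);
for my $file (keys %files) {
    for my $word (@{ $files{$file} }) {
        $kw{$word}{$file}++;
        $name{$file}{$word}++;
    }
}

# Returns (occurrences, number of files) for $word, ignoring file $ref.
sub count_elsewhere {
    my ($word, $ref) = @_;
    my $seen   = $kw{$word} || {};
    my @others = grep { $_ ne $ref } keys %$seen;
    my $count  = 0;
    $count += $seen->{$_} for @others;
    return ($count, scalar @others);
}

my $ref = 'al';
for my $word (sort keys %{ $name{$ref} }) {
    my ($count, $nfiles) = count_elsewhere($word, $ref);
    printf "%10s appears %2d times in %2d files (not counting %s.txt)\n",
        $word, $count, $nfiles, $ref;
}
```

With this sample data, abel is found nowhere else, baker once in one other file, and delta three times across two other files. Writing keys %{ ... } rather than keys on a bare hash reference keeps the loop working on Perl versions where the latter is not accepted.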

    I confess I really don't see what you're trying to do with the $filelist{$max+1} stuff. If I'm off the mark, above, might I suggest you post some specific sample output you'd like to see, given the original inputs. Pseudo code is fine, but not if we don't know what it's supposed to look like. :-)