http://www.perlmonks.org?node_id=1002036


in reply to compare most recent file with second most recent file and output difference

Here's one option:

use strict;
use warnings;
use File::Slurp qw/read_file/;

my ( %file1Hash, %file2Hash, %mergedHash );

my @files = sort { -M $a <=> -M $b } <"*.txt">;

do { chomp; $file1Hash{$_}++ } for read_file $files[0];
$mergedHash{$_}++ for keys %file1Hash;

do { chomp; $file2Hash{$_}++ } for read_file $files[1];
$mergedHash{$_}++ for keys %file2Hash;

print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;

What it does:

do { chomp; $file1Hash{$_}++ } for read_file $files[0];
     ^      ^                  ^   ^
     |      |                  |   |
     |      |                  |   + - Read the file, returning a list
     |      |                  + - Do this for each line
     |      + - Make the line a hash key and increment the associated value
     + - Remove the newline

$mergedHash{$_}++ for keys %file1Hash;
^                 ^
|                 |
|                 + - Do this for each key
+ - Make the line a hash key and increment the associated value

print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;
^                ^
|                |
|                + - If it only appears once (in either file, but not both)
+ - Print the line

This tallies, for each distinct line, how many of the two files contain it, then shows only those keys (lines) whose value is 1, i.e., lines that appear in one file but not the other.

Hope this helps!

Update: Now uses three hashes: one for each file, and one merging the two. This covers the case where a line is repeated within one file and is absent from the other; with a single shared hash, that line's count would reach 2 and it would not be printed.
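
To see why the per-file hashes matter, here is a minimal sketch of the same tallying applied to two in-memory lists instead of files. The sub name lines_in_one_only and the sample data are made up for illustration; they are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines present in exactly one of two lists.
sub lines_in_one_only {
    my ( $lines1, $lines2 ) = @_;
    my ( %seen1, %seen2, %merged );

    # Per-list tallies: a line repeated within one list is still one key.
    $seen1{$_}++ for @$lines1;
    $seen2{$_}++ for @$lines2;

    # Each list contributes at most 1 to a line's merged count.
    $merged{$_}++ for keys %seen1;
    $merged{$_}++ for keys %seen2;

    # A merged count of 1 means the line is in one list but not the other.
    my @only = grep { $merged{$_} == 1 } keys %merged;
    return sort @only;
}

# Made-up sample data: 'beta' is repeated within the first list only.
my @older = ( 'alpha', 'beta', 'beta' );
my @newer = ( 'alpha', 'gamma' );

print "$_\n" for lines_in_one_only( \@older, \@newer );
```

This prints beta and gamma. Tallying both lists into one shared hash would give beta a count of 2 and drop it from the output, even though it never appears in the second list.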