Re: compare most recent file with second most recent file and output difference

by Kenosis (Priest)
on Nov 02, 2012 at 19:46 UTC ( #1002036 )


in reply to compare most recent file with second most recent file and output difference

Here's one option:

    use strict;
    use warnings;
    use File::Slurp qw/read_file/;

    my ( %file1Hash, %file2Hash, %mergedHash );

    my @files = sort { -M $a <=> -M $b } <"*.txt">;

    do { chomp; $file1Hash{$_}++ } for read_file $files[0];
    $mergedHash{$_}++ for keys %file1Hash;

    do { chomp; $file2Hash{$_}++ } for read_file $files[1];
    $mergedHash{$_}++ for keys %file2Hash;

    print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;

What it does:

    do { chomp; $file1Hash{$_}++ } for read_file $files[0];
         ^      ^                  ^   ^
         |      |                  |   |
         |      |                  |   + - Read the file, returning a list
         |      |                  + - Do this for each line
         |      + - Make the line a hash key and increment the associated value
         + - Remove the newline

    $mergedHash{$_}++ for keys %file1Hash;
    ^                 ^
    |                 |
    |                 + - Do this for each key
    + - Make the line a hash key and increment the associated value

    print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;
    ^                ^
    |                |
    |                + - If it only appears once (in either file, but not both)
    + - Print the line

This uses a hash to tally identical lines, then shows only those keys (lines) whose value is 1, i.e., lines which only appear once in either file.
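The tally-and-filter idea can be seen on inline data. This is a minimal sketch; the two arrays and their contents are made up for illustration and stand in for the line lists read from the two newest files:

```perl
use strict;
use warnings;

# Hypothetical line sets standing in for the two files' contents
my @file1 = qw/apple banana cherry/;
my @file2 = qw/banana cherry date/;

# One hash per file collapses any in-file repeats to a single key
my %h1; $h1{$_}++ for @file1;
my %h2; $h2{$_}++ for @file2;

# Merge the key sets: a count of 1 means "in exactly one file"
my %merged;
$merged{$_}++ for keys %h1;
$merged{$_}++ for keys %h2;

my @unique = sort grep { $merged{$_} == 1 } keys %merged;
print "$_\n" for @unique;    # apple, date
```

Here "apple" (only in the first list) and "date" (only in the second) survive the filter, while "banana" and "cherry" tally to 2 and are dropped.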

Hope this helps!

Update: Switched to three hashes: one per file, plus one that merges them. This handles the case where the same line appears more than once within a single file (and only in that file): deduplicating each file's lines into its own hash first keeps the merged count at 1, so such a line is still reported as a difference.
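A quick demonstration of why the per-file hashes matter. This is a sketch with made-up data; it contrasts a naive single-hash tally with the three-hash approach from the update:

```perl
use strict;
use warnings;

# "dup" appears twice in file 1 only; it should be reported as a difference
my @file1 = ( 'dup', 'dup', 'shared' );
my @file2 = ('shared');

# Naive single-hash tally: "dup" counts to 2, so the == 1 filter misses it
my %naive;
$naive{$_}++ for @file1, @file2;
my @naive_diff = grep { $naive{$_} == 1 } keys %naive;
print "naive: @naive_diff\n";    # prints nothing useful: list is empty

# Per-file hashes collapse in-file repeats to one key before merging
my %h1; $h1{$_}++ for @file1;
my %h2; $h2{$_}++ for @file2;
my %merged;
$merged{$_}++ for keys %h1;
$merged{$_}++ for keys %h2;
my @diff = grep { $merged{$_} == 1 } keys %merged;
print "fixed: @diff\n";          # dup
```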


Re^2: compare most recent file with second most recent file and output difference
by jjoseph8008 (Initiate) on Nov 05, 2012 at 20:57 UTC
    Thanks...I was trying out the solution Kenosis provided. I installed File-Slurp in the package manager and have my two files in c:\test\. Slurp.pm is in c:\perl64\site\lib\File. I ran the Perl script from c:\test\ and got the error below:

        Use of uninitialized value $file_name in -e at C:/Perl64/site/lib/File/Slurp.pm line 116.
        Use of uninitialized value $file_name in sysopen at C:/Perl64/site/lib/File/Slurp.pm line 193.
        Use of uninitialized value $file_name in concatenation (.) or string at C:/Perl64/site/lib/File/Slurp.pm line 194.
        read_file '' - sysopen: No such file or directory at c:\test\test_perl.pl line 9.

    I am more of a batch script person and not much into Perl. Please advise.

      Hi, jjoseph8008

      I'm not sure what's causing the problem within File::Slurp, so please try the following, which doesn't use that module:

      use strict;
      use warnings;

      my ( %file1Hash, %file2Hash, %mergedHash );

      my @files = sort { -M $a <=> -M $b } <"*.txt">;

      do { chomp; $file1Hash{$_}++ } for getFileLines( $files[0] );
      $mergedHash{$_}++ for keys %file1Hash;

      do { chomp; $file2Hash{$_}++ } for getFileLines( $files[1] );
      $mergedHash{$_}++ for keys %file2Hash;

      print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;

      sub getFileLines {
          open my $fh, '<', $_[0] or die $!;
          return <$fh>;
      }
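The "read_file '' - sysopen" error above is what happens when the `<"*.txt">` glob finds fewer than two matching files, so `$files[0]` or `$files[1]` is undef. A sketch of a guard that fails early with a readable message instead; the sub name and the message wording are mine, not from the original post:

```perl
use strict;
use warnings;

# Hypothetical guard: die before the undef filename ever reaches open/read_file
sub newest_two {
    my @found = sort { -M $a <=> -M $b } @_;
    die "Need at least two .txt files, found " . scalar(@found) . "\n"
        if @found < 2;
    return @found[ 0, 1 ];    # newest first, second-newest next
}

my ( $newest, $second ) = eval { newest_two( glob '*.txt' ) };
print defined $newest
    ? "comparing $newest against $second\n"
    : $@;
```

With the guard in place, running the script in an empty directory reports the file count instead of a string of "uninitialized value" warnings.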
