PerlMonks
Re: compare most recent file with second most recent file and output difference

by Kenosis (Priest)
on Nov 02, 2012 at 19:46 UTC ( #1002036 )


in reply to compare most recent file with second most recent file and output difference

Here's one option:

    use strict;
    use warnings;
    use File::Slurp qw/read_file/;

    my ( %file1Hash, %file2Hash, %mergedHash );

    my @files = sort { -M $a <=> -M $b } <"*.txt">;

    do { chomp; $file1Hash{$_}++ } for read_file $files[0];
    $mergedHash{$_}++ for keys %file1Hash;

    do { chomp; $file2Hash{$_}++ } for read_file $files[1];
    $mergedHash{$_}++ for keys %file2Hash;

    print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;

What it does:

    do { chomp; $file1Hash{$_}++ } for read_file $files[0];
        - read_file reads the file, returning a list of lines
        - for each line, chomp removes the newline
        - the line becomes a hash key and its associated value is incremented

    $mergedHash{$_}++ for keys %file1Hash;
        - for each key of %file1Hash, make it a key of %mergedHash and increment the associated value

    print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;
        - print each line whose merged tally is 1, i.e., it appears in one file but not both

This uses hashes to tally identical lines, then prints only those keys (lines) whose merged value is 1, i.e., lines that appear in only one of the two files.
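As a quick illustration of the tallying idea, here's the same technique run on made-up in-memory lists (hypothetical data, standing in for the two files' lines):

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two files' lines
my @file1 = ( "apple", "banana", "cherry" );
my @file2 = ( "banana", "cherry", "date" );

my ( %file1Hash, %file2Hash, %mergedHash );
$file1Hash{$_}++ for @file1;
$file2Hash{$_}++ for @file2;

# Merge the keys: a key present in only one file ends up with value 1
$mergedHash{$_}++ for keys %file1Hash;
$mergedHash{$_}++ for keys %file2Hash;

my @diff = sort grep { $mergedHash{$_} == 1 } keys %mergedHash;
print "@diff\n";    # apple date
```

"banana" and "cherry" occur in both lists, so their merged tally is 2 and they are filtered out; only the lines unique to one side survive.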

Hope this helps!

Update: Switched to three hashes: one per file, and one merging the two, so that a line repeated within a single file (but absent from the other) still counts once per file and is correctly reported.
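To see why the per-file hashes matter, consider a hypothetical file whose unique line appears twice. Tallying raw lines into a single hash gives that line a count of 2 and wrongly suppresses it; merging the *keys* of the per-file hashes first collapses in-file duplicates to one count per file:

```perl
use strict;
use warnings;

# Hypothetical lines: "alpha" occurs twice, but only in file 1
my @file1 = ( "alpha", "alpha", "beta" );
my @file2 = ( "beta" );

# Naive single-hash tally: "alpha" gets a count of 2 and is missed
my %naive;
$naive{$_}++ for @file1, @file2;
print "naive:  @{[ sort grep { $naive{$_} == 1 } keys %naive ]}\n";    # (nothing)

# Per-file hashes first, then merge the keys: "alpha" counts once
my ( %f1, %f2, %merged );
$f1{$_}++ for @file1;
$f2{$_}++ for @file2;
$merged{$_}++ for keys %f1;
$merged{$_}++ for keys %f2;
print "merged: @{[ sort grep { $merged{$_} == 1 } keys %merged ]}\n";  # alpha
```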


Re^2: compare most recent file with second most recent file and output difference
by jjoseph8008 (Initiate) on Nov 05, 2012 at 20:57 UTC
    Thanks...I was trying out the solution Kenosis provided. I installed File-Slurp in the package manager and have my two files in c:\test\. Slurp.pm is in c:\perl64\site\lib\File. I ran the Perl script from c:\test\ and got the error below:

        Use of uninitialized value $file_name in -e at C:/Perl64/site/lib/File/Slurp.pm line 116.
        Use of uninitialized value $file_name in sysopen at C:/Perl64/site/lib/File/Slurp.pm line 193.
        Use of uninitialized value $file_name in concatenation (.) or string at C:/Perl64/site/lib/File/Slurp.pm line 194.
        read_file '' - sysopen: No such file or directory at c:\test\test_perl.pl line 9.

    I am more of a batch script person and not much into Perl. Please advise.
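One likely cause (an assumption on my part, since only the error text is shown): the glob `<"*.txt">` matched fewer than two .txt files in the directory the script was run from, leaving `$files[0]` or `$files[1]` undef, which File::Slurp then reports as an uninitialized `$file_name`. A minimal guard makes that failure explicit:

```perl
use strict;
use warnings;

# Same glob as the original script: newest file first
my @files = sort { -M $a <=> -M $b } <"*.txt">;

# Fail early with a clear message instead of handing undef to read_file
die "Need at least two .txt files here, found " . @files . "\n" if @files < 2;
```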

      Hi, jjoseph8008

      I'm not sure what's causing the problem within File::Slurp, so please try the following, which doesn't use that module:

      use strict;
      use warnings;

      my ( %file1Hash, %file2Hash, %mergedHash );

      my @files = sort { -M $a <=> -M $b } <"*.txt">;

      do { chomp; $file1Hash{$_}++ } for getFileLines( $files[0] );
      $mergedHash{$_}++ for keys %file1Hash;

      do { chomp; $file2Hash{$_}++ } for getFileLines( $files[1] );
      $mergedHash{$_}++ for keys %file2Hash;

      print "$_\n" for grep $mergedHash{$_} == 1, keys %mergedHash;

      sub getFileLines {
          open my $fh, '<', $_[0] or die $!;
          return <$fh>;
      }
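A self-contained way to try this out (a demo sketch I've assumed, writing two scratch files with File::Temp rather than relying on whatever .txt files are in the current directory):

```perl
use strict;
use warnings;
use File::Temp qw/tempdir/;

my $dir = tempdir( CLEANUP => 1 );

# Write two hypothetical files to diff
open my $fh1, '>', "$dir/old.txt" or die $!;
print $fh1 "red\ngreen\n";
close $fh1;
open my $fh2, '>', "$dir/new.txt" or die $!;
print $fh2 "green\nblue\n";
close $fh2;

sub getFileLines {
    open my $fh, '<', $_[0] or die $!;
    return <$fh>;
}

my ( %f1, %f2, %merged );
do { chomp; $f1{$_}++ } for getFileLines("$dir/new.txt");
do { chomp; $f2{$_}++ } for getFileLines("$dir/old.txt");
$merged{$_}++ for keys %f1;
$merged{$_}++ for keys %f2;

# "green" is in both files; only the unique lines remain
print "$_\n" for sort grep { $merged{$_} == 1 } keys %merged;
```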
