
If the files are not huge (i.e., at least one of them will fit in memory), I would go with the approach specified by hippo: read file #1 into a hash keyed on the record key, then look up each line of file #2 in that hash.

If the files are too big for this, a slight modification: read file #1 and store each line's file offset (as reported by tell) in the hash instead of the data. Then, for each line in file #2, use the hash as a shortcut: seek straight to the matching line in file #1, reread it, and compare.
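A minimal sketch of this second option follows. It is untested against large real-world files, and the helper names build_offset_index and compare_with_index are made up for illustration. It assumes pipe-delimited records with the key in the first field, as in the sample data.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: map each key in file 1 to the byte offset of its line.
sub build_offset_index
{
    my ($fh) = @_;
    my %offsetOf;
    while (1)
    {
        my $pos  = tell $fh;        # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;
        chomp $line;
        my ($key) = split /\|/, $line, 2;
        $offsetOf{$key} = $pos if defined $key;   # skip blank lines
    }
    return \%offsetOf;
}

# Hypothetical helper: reread file 1 lines on demand via seek and compare.
sub compare_with_index
{
    my ($fh1, $offsetOf, $fh2) = @_;
    my (@report, %seen);
    while (defined(my $line2 = <$fh2>))
    {
        chomp $line2;
        my ($key) = split /\|/, $line2, 2;
        next unless defined $key;
        if (!exists $offsetOf->{$key})
        {
            push @report, "$key not found in file 1";
            next;
        }
        $seen{$key} = 1;
        seek($fh1, $offsetOf->{$key}, 0) or die "seek failed: $!\n";
        my $line1 = <$fh1>;
        chomp $line1;
        push @report, $line1 eq $line2 ? "$key - OK" : "$key data does not match";
    }
    # Keys indexed from file 1 that never appeared in file 2
    push @report, map { "$_ not found in file 2" }
                  grep { !$seen{$_} } sort keys %$offsetOf;
    return @report;
}

if (@ARGV == 2)
{
    my ($inputFilename1, $inputFilename2) = @ARGV;
    open my $fh1, '<', $inputFilename1 or die "Cannot open $inputFilename1: $!\n";
    open my $fh2, '<', $inputFilename2 or die "Cannot open $inputFilename2: $!\n";
    my $offsetOf = build_offset_index($fh1);
    print "$_\n" for compare_with_index($fh1, $offsetOf, $fh2);
    close $fh2;
    close $fh1;
}
```

The hash still holds one entry per key, so memory use grows with the number of keys rather than with the size of the records. The price is one seek and one extra read per matched line.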

Update: Sample of first option:

#!/usr/bin/perl

use strict;
use warnings;

my %FileInfo1 = ();

my ($inputFilename1, $inputFilename2, @otherParameters) = @ARGV;

# Read File 1 to Hash
open INPUT_FILE1, '<', $inputFilename1
    or die "Cannot open $inputFilename1: $!\n";
while (my $inputBuffer1 = <INPUT_FILE1>)
{
    chomp $inputBuffer1;
    my ($key, $data) = split /\|/, $inputBuffer1, 2;
    $FileInfo1{$key} = $data;
}
close INPUT_FILE1;

# Read File 2 and Compare
open INPUT_FILE2, '<', $inputFilename2
    or die "Cannot open $inputFilename2: $!\n";
while (my $inputBuffer2 = <INPUT_FILE2>)
{
    chomp $inputBuffer2;
    my ($key, $data) = split /\|/, $inputBuffer2, 2;
    # exists, not defined: a key whose line has no data field is still present
    if (!exists $FileInfo1{$key})
    {
        print "$key not found in $inputFilename1\n";
    }
    elsif ($FileInfo1{$key} ne $data)
    {
        print "$key data does not match\n";
        delete $FileInfo1{$key};
    }
    else
    {
        print "$key - OK\n";
        delete $FileInfo1{$key};
    }
}
close INPUT_FILE2;
foreach my $leftoverKey (keys %FileInfo1)
{
    print "$leftoverKey not found in $inputFilename2\n";
}

exit;

__END__

C:\Steve\Dev\PerlMonks\P-2013-10-08@1245-TwoFile-Keyed-Compare>type test*.dat

test1.dat


A001|Steve|45
A002|George|32
A003|Alice|24

test2.dat


A001|Steve|45
A003|Alice|23
A004|Mike|48

C:\Steve\Dev\PerlMonks\P-2013-10-08@1245-TwoFile-Keyed-Compare>perl cmpfiles.pl test1.dat test2.dat
A001 - OK
A003 data does not match
A004 not found in test1.dat
A002 not found in test2.dat

In reply to Re: Comparing strings from different files by marinersk
in thread Comparing strings from different files by Jalcock501
