The solution you really want is a database. You can get a very lightweight one via the DBD::SQLite module, which is a driver for DBI (so you'll need DBI installed as well).
You'll want to read your files in and store them in a database. I see that you have tab-separated files -- you would probably save yourself a lot of work by using Text::CSV_XS to parse those instead of doing it yourself.
Then, a simple query to the database will find mismatches.
Here's a general (not debugged) example:
use strict; use warnings;
use DBI;
use DBD::SQLite;
use IO::File;
use Text::CSV_XS;
my $db_file = 'ref_compare.db';
my $csv = Text::CSV_XS->new({sep_char=>"\t"});
## remove the db file if it exists
unlink $db_file if -f $db_file;
my $dbh = DBI->connect("dbi:SQLite:dbname=$db_file",'','',{ RaiseError => 1 }); # die on any DB error
## create two tables.
## 1: For brd_sym_pn
$dbh->do(q'
CREATE TABLE brd_sym_pn ( refdes TEXT, pnum TEXT, pkgtype TEXT )
');
## 2: For sym_text_latest
$dbh->do(q'
CREATE TABLE sym_text_latest (
logpnpkg TEXT, logpnum TEXT, logpkgtype TEXT
)
');
## ok, now load brd_sym_pn
my $sth = $dbh->prepare(q'
INSERT INTO brd_sym_pn (refdes,pnum,pkgtype)
VALUES (?,?,?)
');
my $brd_sym_pn_io = IO::File->new('brd_sym_pn.txt', 'r')
    or die "Cannot open brd_sym_pn.txt: $!";
## use $brd_sym_pn_io->getline to skip any "header" rows
until ( $brd_sym_pn_io->eof ) {
    my $values = $csv->getline( $brd_sym_pn_io ); # parse data line
    for ( @$values ) { s/^\s+|\s+$//g } # trim lead/trail whitespace
    $sth->execute( @$values ); # inserts row into DB table
}
## ok, now load sym_text_latest
$sth = $dbh->prepare(q'
INSERT INTO sym_text_latest (logpnpkg,logpnum,logpkgtype)
VALUES (?,?,?)
');
my $sym_text_latest = IO::File->new('sym_text_latest.txt', 'r')
    or die "Cannot open sym_text_latest.txt: $!";
## use $sym_text_latest->getline to skip any "header" rows
until ( $sym_text_latest->eof ) {
    my $values = $csv->getline( $sym_text_latest ); # parse data line
    for ( @$values ) { s/^\s+|\s+$//g } # trim lead/trail whitespace
    $sth->execute( @$values ); # inserts row into DB table
}
## now you can use any query you want, even in other scripts
## let's find everything where pnums match, but pkgtypes don't:
$sth = $dbh->prepare(q'
SELECT refdes, pnum, pkgtype, logpnum, logpkgtype
FROM brd_sym_pn, sym_text_latest
WHERE brd_sym_pn.pnum = sym_text_latest.logpnum
AND brd_sym_pn.pkgtype != sym_text_latest.logpkgtype
');
$sth->execute();
# print the results out.
print join( "\t", qw/refdes pnum pkgtype logpnum logpkgtype/ ), "\n";
while ( my @row = $sth->fetchrow_array ) {
    print join( "\t", @row ), "\n"; # one result row per line
}
Of course, you could also simply store your first file in a hash, using partnums as keys -- that's just less flexible in terms of answering other questions about your data.
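For comparison, here is a rough sketch of that hash approach. It is only an illustration under some assumptions: the sample rows are made up, the column order is taken from the table definitions above, and the helper names (parse_line, find_mismatches) are hypothetical. It uses a plain split on tabs to stay free of non-core modules; Text::CSV_XS would be the safer parser for real files.

```perl
use strict;
use warnings;

# Hypothetical sample rows standing in for the two tab-separated files.
my $brd_data = "R1\t100-001\t0402\n"
             . "R2\t100-001\t0603\n"
             . "C1\t200-005\t0805\n";
my $sym_data = "100-001_0402\t100-001\t0402\n"
             . "200-005_0805\t200-005\t0805\n";

# Split one tab-separated line into fields, trimming each field.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;
    s/^\s+|\s+$//g for @fields;
    return @fields;
}

# Return [refdes, pnum, pkgtype, logpnum, logpkgtype] rows where the
# part numbers match but the package types differ.
sub find_mismatches {
    my ($brd_fh, $sym_fh) = @_;

    # Index the first file by part number; several refdes may share one.
    my %by_pnum;
    while ( my $line = <$brd_fh> ) {
        my ($refdes, $pnum, $pkgtype) = parse_line($line);
        push @{ $by_pnum{$pnum} }, [ $refdes, $pkgtype ];
    }

    # Stream the second file and look each part number up in the hash.
    my @mismatches;
    while ( my $line = <$sym_fh> ) {
        my (undef, $logpnum, $logpkgtype) = parse_line($line);
        for my $entry ( @{ $by_pnum{$logpnum} || [] } ) {
            my ($refdes, $pkgtype) = @$entry;
            push @mismatches,
                [ $refdes, $logpnum, $pkgtype, $logpnum, $logpkgtype ]
                if $pkgtype ne $logpkgtype;
        }
    }
    return @mismatches;
}

# Open the sample strings as in-memory filehandles; swap in real files as needed.
open my $brd_fh, '<', \$brd_data or die "Cannot open sample board data: $!";
open my $sym_fh, '<', \$sym_data or die "Cannot open sample symbol data: $!";

print join( "\t", @$_ ), "\n" for find_mismatches( $brd_fh, $sym_fh );
```

The hash holds a list per part number because more than one refdes can use the same part. Any new question about the data means writing another loop like this one, which is where the database version pays off.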
That should give you a fair number of ideas.
<–radiant.matrix–>