I don't know if one UR key may appear many times in one input file, each time with the same or different CI keys, or vice versa.
If so, the hash key must be the whole (chomped) record from one of the input files. Then proceed just as grizzley said, but reverse the fields of each record from the second file before testing for existence in the loop.
Also, if there are many records with exactly the same keys in the second file, you can delete the full key from the hash each time you match it, so that the output file contains only unique records.
Bonus track (not tested):
#!perl
use strict;
use warnings;
# ...
# filehandles must be declared with "my" under "use strict"
open my $IN1, "<", $infile1 or die "cannot open $infile1: $!\n";
open my $IN2, "<", $infile2 or die "cannot open $infile2: $!\n";
open my $OUT, ">", $outfile or die "cannot open $outfile: $!\n";
my %pair = ();
while (<$IN1>) {
    chomp;
    s/^(\w+)\s+(\w+)$/$1 $2/; # just one space between keys
    $pair{$_} = 1;
}
while (<$IN2>) {
    chomp;
    s/^(\w+)\s+(\w+)$/$2 $1/; # swap keys
    if (exists $pair{$_}) {
        print $OUT "$_\n";
        delete $pair{$_};
    }
}
close $IN1;
close $IN2;
close $OUT;
Update: Initialized and renamed the hash, and added a newline to the output records. Still not tested!