The hash you show would be relevant if you were trying to match col2 in one file to col2 in another, but that's NOT what your narrative specifies. In fact, your textual problem description seems to suggest one kind of data while your code seems to deal with another (for which a hash could be appropriate; see the comments after the script). So, for the sake of argument, and using only my often-fallible crystal ball, let's assume data files like these:
file one, aka 1140904.txt:
test.txt foo bar baz
texty_as_all_beat_hell.tst foo bar baz
ffoo.txt test bar baz
1140904.txt bar baz bat
and...
file two aka 1140904a.txt:
1140904a.txt foo bar baz
texty_as_all_beat_hell.tst foo bar baz
ffoo.txt test bar baz
baz.test bar blivitz
The code below illustrates (very verbosely, with excessive detail; NO, this is NOT the way it should look for production use) one answer to your explicit question (how to distinguish the first file from the second, a purpose stevieb's approach in the first reply also serves satisfactorily) and a tactic for distinguishing the matches from the non-matches:
#!/usr/bin/perl
use 5.018;
use strict; # LET PERL HELP YOU (id typos, etc)
use warnings; # LET PERL HELP YOU (id typos, etc): strict and warnings, always!
use Data::Dumper;
# print "Enter name for first file: ";
my $file1 = 'C:\_ww\1140904.txt';
# print "Enter name for second file: ";
my $file2 = '1140904a.txt';
my (@fileONE, $fileONE);
open(my $FH, "<", $file1) || die "Can't open $file1: $!\n";
for my $line(<$FH>) {
chomp $line;
say "DEBUG Ln 18: \$line is: $line";
my ($col1) = split(/ /, $line, 2);
push @fileONE, $col1;
next;
}
say Dumper @fileONE;
say "\n\t --------";
my (@fileTWO, $FH2);
open($FH2, "<", $file2) || die "Can't open $file2: $!\n";
for my $line2(<$FH2>) {
chomp $line2;
say "DEBUG Ln 34: \$line2 is: $line2";
my ($col1_2) = split(/ /, $line2);
push @fileTWO, $col1_2;
next;
}
say "DEBUG Ln 37 - reached Ln 37";
say Dumper @fileTWO;
say "\n **************";
my ($fileTWO, $i);
for ($i = 0; $i < @fileONE; ++$i) {
my $BASEname1 = 'fileONE[$';
my $Fname1 = '$' . $BASEname1 . "$i" . ']';
my $BASENAME2 = 'fileTWO[$';
my $Fname2 = '$' . $BASENAME2 . $i . ']';
my $content1 = $fileONE[$i];
my $content2 = $fileTWO[$i];
# say "DEBUG Ln50: content of \$Fname1:\t $fileONE[$i] \n\t and content of \$Fname2: $fileTWO[$i] \n";
if ($content1 eq $content2 ) {
say "Exclude because it's a match: |--> $content2 <--| \n";
} else {
say "\t Content of $Fname1 and $Fname2 does not match.\n";
}
}
If this way of sorting the matches from the non-matches is irrelevant to your real problem, ignore the above... but please note that you did NOT include sample data, something you should do in cases such as this, where the narrative and the code may not agree.
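As for where a hash would fit: if the real task is to find lines in the second file whose first column appears nowhere in the first file, regardless of line position, a lookup hash is the usual tool. What follows is only a minimal sketch under that assumption. The sample lines are copied from the data above rather than read from disk, and first_column and unmatched_in_second are hypothetical helper names of my own, not anything from your code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines copied from the two data files shown above; in real use
# these would be read from disk instead.
my @file_one = (
    'test.txt foo bar baz',
    'texty_as_all_beat_hell.tst foo bar baz',
    'ffoo.txt test bar baz',
    '1140904.txt bar baz bat',
);
my @file_two = (
    '1140904a.txt foo bar baz',
    'texty_as_all_beat_hell.tst foo bar baz',
    'ffoo.txt test bar baz',
    'baz.test bar blivitz',
);

# Hypothetical helper: return the first whitespace-separated field of a line.
sub first_column {
    my ($line) = @_;
    my ($col1) = split ' ', $line, 2;
    return $col1;
}

# Hypothetical helper: return the lines of the second list whose first
# column does not occur as a first column anywhere in the first list.
sub unmatched_in_second {
    my ($first, $second) = @_;
    my %seen;
    $seen{ first_column($_) } = 1 for @$first;    # build the lookup hash
    return grep { !$seen{ first_column($_) } } @$second;
}

print "$_\n" for unmatched_in_second(\@file_one, \@file_two);
```

Unlike the index-by-index comparison above, this treats a name as matched wherever it appears in the first file, so it answers a different question. Pick whichever one your data actually calls for.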
But really, what I think you were truly asking was for someone to learn Perl for you ("found this bit of code"), which is not widely approved Monk-ish conduct. Please see On asking for help, How do I post a question effectively? and I know what I mean. Why don't you?. IOW, we're here to help you learn or to solve specific coding problems, NOT to be a script-writing service.
OUTPUT (less the DEBUG output, which is left as a learning aid):
Content of $fileONE[$0] and $fileTWO[$0] does not match.
Exclude because it's a match: |--> texty_as_all_beat_hell.tst <--|
Exclude because it's a match: |--> ffoo.txt <--|
Content of $fileONE[$3] and $fileTWO[$3] does not match.
And why did I leave the DEBUG lines in the code? Because a major goal here is "helping people learn."
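For contrast, a leaner version of the same position-by-position comparison might look something like the sketch below. It is an illustration, not a drop-in replacement. The helper names first_columns and compare_by_position are hypothetical, the report wording is my own, and the sample data is fed through in-memory filehandles (open on a reference to a string) so the example runs without any files on disk. Pass a real filename instead of a string reference to read from disk.

```perl
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: read a source (a filename, or a reference to a
# string for an in-memory filehandle) and return the first column of
# every line.
sub first_columns {
    my ($source) = @_;
    open(my $fh, '<', $source) or die "Can't open $source: $!\n";
    my @cols;
    while (my $line = <$fh>) {
        chomp $line;
        my ($col1) = split ' ', $line, 2;
        push @cols, $col1 // '';    # a blank line yields an empty name
    }
    close $fh;
    return @cols;
}

# Hypothetical helper: compare two lists of names index by index and
# return one report line per entry of the first list.
sub compare_by_position {
    my ($one, $two) = @_;
    my @report;
    for my $i (0 .. $#$one) {
        my $name1 = $one->[$i];
        my $name2 = $two->[$i] // '';    # the second list may be shorter
        push @report, $name1 eq $name2
            ? "match: $name1"
            : "differ at line " . ($i + 1) . ": $name1 vs $name2";
    }
    return @report;
}

# Sample data copied from the two files shown earlier.
my $data_one = <<'END';
test.txt foo bar baz
texty_as_all_beat_hell.tst foo bar baz
ffoo.txt test bar baz
1140904.txt bar baz bat
END

my $data_two = <<'END';
1140904a.txt foo bar baz
texty_as_all_beat_hell.tst foo bar baz
ffoo.txt test bar baz
baz.test bar blivitz
END

say for compare_by_position(
    [ first_columns(\$data_one) ],
    [ first_columns(\$data_two) ],
);
```

Putting the read loop in one sub removes the duplicated per-file code, and the `// ''` default keeps the comparison from warning when the second file has fewer lines than the first.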