Greetings to you, oh great Monks
I would like to compare two arrays and keep only those elements of the first that PARTIALLY match an entry in the second, i.e. whose first field matches. For example, let the two arrays be:
@arr1 = ("0000007 | John | ABC.txt | 42", "0000014 | Jane | XYZ.txt | 34", "0000017 | Jessica | GHI.txt | 21");  # etc.
@arr2 = ("0000007", "0000014");  # quoted: a bare 0000014 would be read as an octal literal (decimal 12)
This should result in a text file arr3.dat that contains something like
0000007 | John | ABC.txt | 42
0000014 | Jane | XYZ.txt | 34
I know I should use hashes and know how to split element-by-element but can't figure this one out.
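A minimal sketch of that hash-based approach, assuming the ID is always the first '|'-separated field and that the spaces around it should be ignored. The helper name filter_by_id is hypothetical, not something from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): keep the records in
# @$records whose first '|'-separated field is one of the IDs in @$ids.
sub filter_by_id {
    my ($records, $ids) = @_;

    # Build a lookup hash so each membership test is a single hash access.
    my %wanted = map { $_ => 1 } @$ids;

    return grep {
        my ($id) = split /\|/, $_, 2;   # only the first field is needed
        $id =~ s/^\s+|\s+$//g;          # trim the padding around the ID
        exists $wanted{$id};
    } @$records;
}

my @arr1 = (
    "0000007 | John | ABC.txt | 42",
    "0000014 | Jane | XYZ.txt | 34",
    "0000017 | Jessica | GHI.txt | 21",
);
my @arr2 = ("0000007", "0000014");   # quoted, so the leading zeros survive

my @arr3 = filter_by_id(\@arr1, \@arr2);
print "$_\n" for @arr3;
```

The IDs are compared as strings here, so "0000007" and "7" would not match; padding with sprintf("%07d", ...) before the lookup is one way to normalise them if the files differ in that respect.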
EDIT: CODE ADDED, SORRY FOR BEING UNSPECIFIC. THE FINAL ROW SUGGESTS THAT THE HASHING GOES WRONG...
EDIT AGAIN: SOLUTION BY nemesdani WORKS FINE (BUT SEE MY REPLY).
Thanks!
use feature ':5.10';
open (KEYFILE, "arr2.txt");
@arr2 = <KEYFILE>;
close(KEYFILE);
# Declare and fill up the hash
my %elements;
foreach (@arr2) {
    $elements{$_} = 1;
};
# I'm repeating this for several quarterly files:
for($year=2006; $year<2012; $year=$year+1){
    for($i=1; $i<5; $i=$i+1){
        # Load each quarterly file
        $filea = $year . "QTR" . $i . "arr1.txt";
        open(MYINFILE, $filea);
        @arr1 = <MYINFILE>;
        close(MYINFILE);
        $sizegn = @arr1;
        # For each entry in the quarterly file
        for($j=0; $j<$sizegn; $j++){
            # Pick only lines that contain the string "txt"
            if($arr1[$j] =~ m/txt/){
                # Split the elements of @arr1 and format the first element as a 7-digit number:
                @arraydata = split(/\|/, $arr1[$j]);
                my $element = sprintf("%07d", $arraydata[0]);
                # Now check if $element is in the hash:
                if (exists $elements{$element}){
                    open(MYOUTFILE, ">>Arr3.dat");
                    print MYOUTFILE $arr1[$j];
                    close(MYOUTFILE);
                }
                print "D'oh - '$element' not found\n" unless (exists $elements{$element});
            }
        }
    }
}
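
One likely cause of the "not found" message, assuming arr2.txt holds one ID per line: lines read with <KEYFILE> keep their trailing newline, so the hash keys look like "0000007\n" while the sprintf-formatted $element is "0000007", and the lookup never matches. A minimal sketch of cleaning the keys before hashing them follows. The helper load_ids and the in-memory sample data are hypothetical stand-ins, not part of the original code:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read one ID per line
# from a filehandle and return a lookup hash with clean keys.
sub load_ids {
    my ($fh) = @_;
    my %ids;
    while (my $line = <$fh>) {
        $line =~ s/\s+$//;          # strip the newline and any trailing whitespace
        next unless length $line;   # skip blank lines
        $ids{$line} = 1;
    }
    return %ids;
}

# An in-memory filehandle stands in for arr2.txt so the sketch runs on its own.
my $key_data = "0000007\n0000014\n";
open(my $key_fh, '<', \$key_data) or die "Cannot open in-memory key data: $!";
my %elements = load_ids($key_fh);
close($key_fh);

# Made-up sample records; the first uses an unpadded ID on purpose.
for my $record ("7 | John | ABC.txt | 42", "0000017 | Jessica | GHI.txt | 21") {
    my ($first) = split /\|/, $record, 2;
    $first =~ s/^\s+|\s+$//g;                # trim spaces around the ID
    my $element = sprintf("%07d", $first);   # pad to 7 digits, as in the code above
    if (exists $elements{$element}) {
        print "found: $record\n";
    }
    else {
        print "not found: $element\n";
    }
}
```

In the original script, a plain chomp(@arr2) right after reading the key file would have a similar effect, provided the file has no trailing spaces or Windows line endings.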