Just some ideas on how to handle this:
sort primary file (100s of millions)
split file into ~10 million record sub files
named for the last key contained in each file
sort secondary file -> ssf1
you may sort multiple secondary files (ssf2, ssf3 etc)
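
The split step could look something like this. It is only a sketch under assumptions that are not in the outline above: one record per line, the key in the first tab-separated field, keys that are unique and safe to use as file names, and input already sorted in plain string order (for example with LC_ALL=C sort). The sub name split_sorted_file and the scratch file name chunk.tmp are made up for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Split an already sorted file into chunks of at most $chunk_size records.
# Each chunk is written to a scratch file and then renamed to the key of
# its last record. Returns the list of chunk names in the order written.
# Assumed (hypothetical) layout: one record per line, key = first tab field.
sub split_sorted_file {
    my ($in_path, $out_dir, $chunk_size) = @_;
    my $tmp = File::Spec->catfile($out_dir, 'chunk.tmp');
    my @names;
    my $count = 0;
    my ($out, $last_key);

    # Close the current chunk and rename it after its last key.
    my $finish = sub {
        return unless $out;
        close $out or die "Cannot close $tmp: $!";
        rename $tmp, File::Spec->catfile($out_dir, $last_key)
            or die "Cannot rename $tmp: $!";
        push @names, $last_key;
        undef $out;
        $count = 0;
    };

    open my $in, '<', $in_path or die "Cannot open $in_path: $!";
    while (my $line = <$in>) {
        chomp $line;
        unless ($out) {
            open $out, '>', $tmp or die "Cannot open $tmp: $!";
        }
        ($last_key) = split /\t/, $line, 2;
        print {$out} "$line\n";
        $finish->() if ++$count >= $chunk_size;
    }
    $finish->();    # flush a final, partially filled chunk
    close $in;
    return @names;
}

# Example call (paths are hypothetical):
# my @pfn = split_sorted_file('primary.sorted', 'chunks', 10_000_000);
```

If the same key can end one chunk and start the next, the rename would overwrite a chunk, so real data with duplicate keys would need the split point moved to a key boundary.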
perl program
read directory containing primary files by name
create an array containing the file names (@pfn) (last key in the array)
open ssf1
ssf2
ssf3
ssf4
$ssf1_record=" "; # or a value lower than lowest value
$ssf2_record=" "; # or a value lower than lowest value
$ssf3_record=" "; # or a value lower than lowest value
$ssf4_record=" "; # or a value lower than lowest value
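foreach $pf (@pfn){
    open primary file ($pf)
    read $pf records into a hash ($pfrh), emptying the hash first
    # $pf is the last key in this file, so the test must include it:
    # <= for numeric keys, le for string keys
    while (not at end of ssf1 and $ssf1_record <= $pf){
        compare to hash
        if found{
            action
        }
        read $ssf1 record;
    }
    while (not at end of ssf2 and $ssf2_record <= $pf){
        compare to hash
        if found{
            action
        }
        read $ssf2 record;
    }
    # same loop again for ssf3 and ssf4
}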
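
For what it's worth, here is a rough runnable version of the loop above. It is a sketch, not a finished program, and it assumes things the outline does not state: one record per line, the key in the first tab-separated field, string keys sorted in plain string order (so the comparison is le; numeric keys would need <= and a numeric sort of the file names), and a directory that holds only the primary sub files. The names next_record, match_secondaries and the $on_match callback are made up for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Read the next record from a secondary file handle.
# Returns (key, line), or an empty list at end of file.
sub next_record {
    my ($fh) = @_;
    my $line = <$fh>;
    return unless defined $line;
    chomp $line;
    my ($key) = split /\t/, $line, 2;
    return ($key, $line);
}

# Walk the primary sub files in key order and match every sorted secondary
# file against one sub file at a time. For each hit, calls
# $on_match->($secondary_path, $key, $primary_line, $secondary_line).
sub match_secondaries {
    my ($chunk_dir, $secondary_paths, $on_match) = @_;

    # Sub files are named for their last key, so sorted names are in key order.
    opendir my $dh, $chunk_dir or die "Cannot open $chunk_dir: $!";
    my @pfn = grep { -f File::Spec->catfile($chunk_dir, $_) } readdir $dh;
    closedir $dh;
    @pfn = sort @pfn;

    # One open handle and one look-ahead record per secondary file.
    my @ssf;
    for my $path (@$secondary_paths) {
        open my $fh, '<', $path or die "Cannot open $path: $!";
        my ($key, $line) = next_record($fh);
        push @ssf, { path => $path, fh => $fh, key => $key, line => $line };
    }

    for my $pf (@pfn) {
        # Load one primary sub file into a fresh hash keyed by record key.
        my %pfrh;
        my $chunk_path = File::Spec->catfile($chunk_dir, $pf);
        open my $in, '<', $chunk_path or die "Cannot open $chunk_path: $!";
        while (my $line = <$in>) {
            chomp $line;
            my ($key) = split /\t/, $line, 2;
            $pfrh{$key} = $line;
        }
        close $in;

        # Consume every secondary record whose key falls in this sub file.
        # "le" rather than "lt" because $pf is the last key the file contains.
        for my $s (@ssf) {
            while (defined $s->{key} && $s->{key} le $pf) {
                $on_match->($s->{path}, $s->{key}, $pfrh{ $s->{key} }, $s->{line})
                    if exists $pfrh{ $s->{key} };
                @$s{qw(key line)} = next_record($s->{fh});
            }
        }
    }
    close $_->{fh} for @ssf;
    return;
}
```

Only one sub file's worth of primary records is held in memory at a time, and each secondary file is read exactly once from start to finish. Secondary records with keys beyond the last primary key are simply never consumed.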