My dear monks:
Thank you so much for all the input and opinions. I'm learning more every day! Since reading all these comments, I've rewritten my script to parse one record at a time from each file ("paragraph mode", as suggested by Splinky) and then discard that record once it's been printed to STDOUT. I've also created a smaller hash to store only the few bits of info I need. I've also taken the suggestion to simply reassign the SERIAL numbers as I encounter each record, starting from zero and incrementing once per loop, since they don't need to be sorted in any fashion (why didn't I think of that before?).
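For anyone following along, here is roughly what that loop looks like. It's only a minimal sketch: the "SERIAL: n" line format, the sample data, and the renumber_records name are made up for illustration and aren't my real file layout.

```perl
use strict;
use warnings;

# Hypothetical helper: reads blank-line-separated records from $in,
# overwrites an assumed "SERIAL: n" line with a running counter,
# and prints each record to $out before moving on to the next one.
sub renumber_records {
    my ($in, $out) = @_;
    local $/ = "";      # paragraph mode: each read returns one blank-line-delimited record
    my $serial = 0;     # serials are reassigned starting from zero
    while (my $record = <$in>) {
        $record =~ s/^SERIAL:.*$/SERIAL: $serial/m;
        print {$out} $record;   # nothing is kept in memory after this
        $serial++;
    }
    return $serial;     # number of records processed
}

# Example run on an in-memory "file" with two made-up records.
my $data = "SERIAL: 17\nPHONE: 555-0100\n\nSERIAL: 4\nPHONE: 555-0199\n";
open my $in, '<', \$data or die "Cannot open in-memory input: $!";
renumber_records($in, \*STDOUT);
close $in;
```

Because each record goes out of scope at the end of the loop body, memory use should stay roughly flat no matter how big the input files are.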
I think I'm also going to use multiple hashes keyed by phone, as suggested above -- this should help things along as well. The script is still slow despite the changes I've made, so I'll make some more modifications and perhaps even write to a hash tied to a DBM file, as suggested by lhoward. Thanks again, most wise monks!