Summing repeated counts for items stored in separate file
by dmorgo (Pilgrim) on Jul 27, 2007 at 22:21 UTC
dmorgo has asked for the wisdom of the Perl Monks concerning the following question:
I have two files, keys.txt and values.txt, in this form:
There is a 1-to-1 mapping between lines in the first file and the second file. Let's say these are amounts of tips earned. If a name appears twice, the amounts accumulate. So these files indicate that Joe made $5, Bob made $4, Sally made $7, and Fred made $1.
In the real world there could be many millions of lines, but only a few hundred thousand keys (in this example, the names are the keys).
One can assume the two files have the same number of lines and are always in sync.
What is an elegant way to read these (big) files and print out the total amount, the average amount, the max amount, and the min amount for each key?
The obvious answer is to open each file and read the lines into a hash:
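A minimal sketch of that kind of lockstep read, for illustration only (this is not the code from the original post). The subroutine name `summarize`, the layout of the stats hash, and the sample data are all assumptions; the sample lines were made up to agree with the totals described above (Joe 5, Bob 4, Sally 7, Fred 1). In-memory filehandles stand in for keys.txt and values.txt so the snippet runs on its own.

```perl
use strict;
use warnings;

# Read two filehandles in lockstep and accumulate per-key statistics.
# summarize() and the hash layout are illustrative names, not from the post.
sub summarize {
    my ($keys_fh, $values_fh) = @_;
    my %stats;
    while (defined(my $key = <$keys_fh>)) {
        my $value = <$values_fh>;
        die "values file ended before keys file\n" unless defined $value;
        chomp($key, $value);
        # First sighting of a key seeds min and max with its first value.
        my $s = $stats{$key} ||= { total => 0, count => 0, min => $value, max => $value };
        $s->{total} += $value;
        $s->{count}++;
        $s->{min} = $value if $value < $s->{min};
        $s->{max} = $value if $value > $s->{max};
    }
    return \%stats;
}

# Made-up sample data (an assumption), chosen to match the totals in the post.
my $keys   = "Joe\nBob\nSally\nJoe\nFred\n";
my $values = "2\n4\n7\n3\n1\n";
open my $kfh, '<', \$keys   or die "cannot open in-memory keys: $!";
open my $vfh, '<', \$values or die "cannot open in-memory values: $!";

my $stats = summarize($kfh, $vfh);
for my $name (sort keys %$stats) {
    my $s = $stats->{$name};
    printf "%s: total=%g avg=%.2f max=%g min=%g\n",
        $name, $s->{total}, $s->{total} / $s->{count}, $s->{max}, $s->{min};
}
```

Only one small hash entry per distinct key is kept, so memory grows with the number of keys (a few hundred thousand) rather than with the number of lines (many millions). The average is derived at print time from total and count.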
Maybe a better way to do it would be to create an object for each key (the flyweight pattern?) but my question is as much about the way to read the files. Is there a more elegant way than this to read two files in lockstep? Is this a job for tie? (which I haven't used much if ever, so forgive me if that's a stupid question).
I know one answer would be to use the UNIX command line utility, paste, like this:
and then read the one file and do a split. Very simple. But I can't do that in this case and am looking for the best way to do it in Perl.
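One way the paste-then-split idea might be mimicked without leaving Perl is a small iterator that joins one line from each filehandle with a tab. This is only a sketch under assumptions: `make_paster` is a hypothetical helper name, the sample data is made up to match the totals described earlier, and in-memory filehandles stand in for the real files.

```perl
use strict;
use warnings;

# Return an iterator that yields "left<TAB>right" on each call,
# roughly one line of paste-style output. make_paster() is a hypothetical name.
sub make_paster {
    my ($left_fh, $right_fh) = @_;
    return sub {
        my $left  = <$left_fh>;
        my $right = <$right_fh>;
        # Stop as soon as either file runs out of lines.
        return unless defined $left && defined $right;
        chomp($left, $right);
        return "$left\t$right";
    };
}

# Made-up sample data (an assumption), standing in for keys.txt and values.txt.
my $keys   = "Joe\nBob\nSally\nJoe\nFred\n";
my $values = "2\n4\n7\n3\n1\n";
open my $kfh, '<', \$keys   or die "cannot open in-memory keys: $!";
open my $vfh, '<', \$values or die "cannot open in-memory values: $!";

my $next_line = make_paster($kfh, $vfh);
my %total;
while (defined(my $line = $next_line->())) {
    # Same split one would do on real paste output.
    my ($name, $amount) = split /\t/, $line, 2;
    $total{$name} += $amount;
}
print "$_\t$total{$_}\n" for sort keys %total;
```

The closure hides the two-handle bookkeeping, so the consuming loop looks the same as it would when reading a single pasted file.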