Re: Reduce CPU utilization time in reading file using perl

by Corion (Pope)
on Sep 30, 2013 at 08:13 UTC

in reply to Reduce CPU utilization time in reading file using perl

If you are comparing two files for common or differing keys, and both files are about the same (huge) size, you will have to get smarter than keeping all the information in memory, because you don't have enough memory for that.

If you can make an educated guess as to where in a file a key is likely to be found, you could use seek to jump to that position and look for the key there. This is horribly slow compared to an in-memory lookup, but likely still faster than swapping memory. If you want to be fancy, you can cache parts of the file in memory.

If you cannot make an educated guess, it will probably pay off to convert at least one file into a file holding all your keys in fixed-width records, sorted by key. With fixed-width records, the byte offset of record n is simply n times the record length, so you can, for example, binary-search the file with seek to find a given key. If you convert both files to that structure, you can easily find the keys missing from either file by reading through the two sorted key files in parallel, line by line. This approach will roughly double your disk requirements.
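
A rough sketch of both ideas, untested against real data. The record layout is my assumption, not something from your post: each key is space-padded to 16 bytes and followed by a newline, keys are unique within a file, and the file is sorted by plain string comparison. The names find_key, diff_sorted and the $report callback are made up for illustration.

```perl
use strict;
use warnings;

# Assumed layout: key padded with spaces to $KEY_WIDTH bytes, then "\n".
# Keys must not be longer than $KEY_WIDTH and must sort above a space.
my $KEY_WIDTH = 16;
my $REC_LEN   = $KEY_WIDTH + 1;

# Binary search over a sorted fixed-width key file.
# Returns the zero-based record number, or -1 if the key is absent.
sub find_key {
    my ($fh, $key) = @_;
    my $want = sprintf "%-${KEY_WIDTH}s", $key;
    my $lo   = 0;
    my $hi   = int((-s $fh) / $REC_LEN) - 1;
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        # Record $mid starts at byte $mid * $REC_LEN.
        seek($fh, $mid * $REC_LEN, 0) or die "seek failed: $!";
        my $got = read($fh, my $rec, $KEY_WIDTH);
        die "short read at record $mid" unless defined $got && $got == $KEY_WIDTH;
        my $cmp = $rec cmp $want;
        return $mid if $cmp == 0;
        if ($cmp < 0) { $lo = $mid + 1 } else { $hi = $mid - 1 }
    }
    return -1;
}

# Strip the padding and newline from a record.
sub trim_key {
    my ($rec) = @_;
    $rec =~ s/\s+\z//;
    return $rec;
}

# Walk two sorted key files in step and call $report->($which, $key)
# for every key found in only one of them. Only one record per file
# is held in memory at any time.
sub diff_sorted {
    my ($fh_a, $fh_b, $report) = @_;
    my $rec_a = <$fh_a>;
    my $rec_b = <$fh_b>;
    while (defined $rec_a and defined $rec_b) {
        my $cmp = $rec_a cmp $rec_b;
        if ($cmp < 0) {
            $report->('A', trim_key($rec_a));
            $rec_a = <$fh_a>;
        }
        elsif ($cmp > 0) {
            $report->('B', trim_key($rec_b));
            $rec_b = <$fh_b>;
        }
        else {
            # Key is in both files; advance both.
            $rec_a = <$fh_a>;
            $rec_b = <$fh_b>;
        }
    }
    # Whatever is left in either file has no partner in the other.
    while (defined $rec_a) { $report->('A', trim_key($rec_a)); $rec_a = <$fh_a> }
    while (defined $rec_b) { $report->('B', trim_key($rec_b)); $rec_b = <$fh_b> }
    return;
}
```

You would call it along the lines of diff_sorted($fh_a, $fh_b, sub { print "only in $_[0]: $_[1]\n" }). Producing the sorted files in the first place is a separate problem; for files that do not fit in memory, an external sort (for example the system sort utility) is probably the easiest route.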

