PerlMonks  

Re^10: write to Disk instead of RAM without using modules

by Laurent_R (Canon)
on Oct 25, 2016 at 15:55 UTC


in reply to Re^9: write to Disk instead of RAM without using modules
in thread write to Disk instead of RAM without using modules

The problem with your code is that you are loading all your files into memory (the %seen hash).

What I am suggesting is to load only the first file into such a hash, then read all the other files line by line and, for each record, check whether you have already seen it in the first file. If you haven't, you know that record cannot be in all the files, since it wasn't in the first one, so there is no point in adding it to your hash. If you have, just increment its value in the hash.

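At the end, your hash is in fact a set of counters telling you, for each record of the first file, how many times it has been seen. You just need to keep the records whose counter is equal to the number of files. (This assumes a given record occurs at most once per file; repeated lines within one file would inflate the counter.)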
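To make the idea concrete, here is a minimal sketch of the approach, using no modules (only the strict and warnings pragmas). The sub name common_records and the choice of one record per line are my assumptions, not something taken from your code. One small deviation from a plain increment: a counter is only bumped when it equals the number of files processed so far, so a line repeated inside a single file is counted once for that file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the sorted list of records (lines)
# that occur in every one of the files given as arguments.
sub common_records {
    my @files = @_;
    return () unless @files;

    # Only the first file is loaded into memory.
    my %seen;
    my $first = shift @files;
    open my $fh, '<', $first or die "Cannot open $first: $!";
    while (my $rec = <$fh>) {
        chomp $rec;
        $seen{$rec} = 1;    # seen in file number 1
    }
    close $fh;

    # All other files are read line by line and never stored.
    my $file_no = 1;
    for my $file (@files) {
        $file_no++;
        open my $in, '<', $file or die "Cannot open $file: $!";
        while (my $rec = <$in>) {
            chomp $rec;
            # Not in the first file: cannot be common to all files.
            next unless exists $seen{$rec};
            # Count at most once per file, and only if the record
            # was present in every previous file.
            $seen{$rec} = $file_no if $seen{$rec} == $file_no - 1;
        }
        close $in;
    }

    # Keep the records whose counter equals the number of files.
    my @common = sort grep { $seen{$_} == $file_no } keys %seen;
    return @common;
}

# Usage: perl common.pl file1 file2 file3 ...
print "$_\n" for common_records(@ARGV);
```

Memory use is bounded by the number of distinct records in the first file, so it may pay to pass the smallest file first.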

Update: please look at my other post in reply to your original question (at the bottom of the thread) for a detailed algorithm.
