PerlMonks  

Re^3: Processing ~1 Trillion records

by Jenda (Abbot)
on Oct 25, 2012 at 12:08 UTC (#1000838)


in reply to Re^2: Processing ~1 Trillion records
in thread Processing ~1 Trillion records

You seem to be accumulating a lot of data in the hashes. Are you sure it all fits in memory? As soon as you force the computer to swap memory pages to disk, the processing time grows enormously!

It might help to tie the hashes to a DBM file (DB_File, MLDBM, ...) or to use SQLite or some other database to hold the temporary data. Doing as much of the work as you can up front in the Oracle database would most probably be even better, though. Sometimes a use DB_File; tie %data, 'DB_File', 'filename.db'; is all you need to change something from unacceptably slow to just fine.
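To make the tie suggestion concrete, here is a minimal sketch of a counting hash that lives in a DB_File file instead of in RAM. The file name counts.db and the sample keys are made up for illustration, and the flags, mode and $DB_HASH are just DB_File's defaults written out explicitly; treat it as a starting point, not as a drop-in for the original script.

```perl
use strict;
use warnings;
use Fcntl;      # O_RDWR, O_CREAT
use DB_File;

# Hypothetical file name; use a path on a disk with enough free space.
my $file = 'counts.db';
unlink $file;   # start from an empty database on every run

# From here on %count is stored in the file, not in process memory.
tie my %count, 'DB_File', $file, O_RDWR | O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

# Stand-in for the real record loop; the keys are invented sample data.
for my $key (qw(apple pear apple plum apple pear)) {
    $count{$key}++;     # same syntax as an ordinary hash
}

# Iterate with each() so the full key list is never built in memory.
while ( my ($key, $n) = each %count ) {
    print "$key => $n\n";
}

untie %count;   # flush and close the database file
```

The rest of the code keeps using %count as before. Each access now goes through the DBM library, so it is slower than an in-memory hash that fits in RAM, but it should degrade far more gracefully than a process that has started swapping.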

Jenda
Enoch was right!
Enjoy the last years of Rome.


Re^4: Processing ~1 Trillion records
by aossama (Acolyte) on Oct 25, 2012 at 12:36 UTC
    Is this like using Redis to store/retrieve the key-value?

      Yes. You can use Redis itself; it seems to have a Perl binding. The whole point is to make sure the process fits in memory and that the data that has to be moved to disk is accessed and updated efficiently.

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.
