in reply to Re: Problems with SDBM
in thread Problems with SDBM
Hi, thanks everyone for the answers already provided.
The main reason to tie is resource limits: the input has about 30 million records (slightly less than 2 GB), which is just too large for an ordinary, untied hash. Persistence would also be a bonus, because later processes use the same data and would not have to load it again, but it is not the primary reason for using tied hashes.
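For reference, here is a minimal sketch of the kind of tie I mean, showing both the on-disk storage and the persistence bonus. The file name and the sample key and value are made up for illustration (my real code reads the records from the input file), and a temporary directory is used only so the sketch can be run anywhere.

```perl
use strict;
use warnings;
use Fcntl;                        # exports O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical location; SDBM creates "$base.pag" and "$base.dir".
my $dir  = tempdir(CLEANUP => 1);
my $base = "$dir/records";

# Load phase: entries go to disk instead of living in process memory.
tie(my %data, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666)
    or die "Cannot tie $base: $!";
$data{key1} = 'value1';           # made-up sample record
untie %data;

# A later process could reopen the same files without reloading the input.
tie(my %again, 'SDBM_File', $base, O_RDONLY, 0666)
    or die "Cannot reopen $base: $!";
print "$again{key1}\n";
untie %again;
```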
I am not too concerned with speed at this point (although it might become important later, given the large data volume). My concern is that the process fails when I have loaded only about half of the data (15.8 million records), presumably because of the sheer volume. I could use several tied hashes to get around this limit, but that would be awkward and unwieldy (and not very scalable).
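To make the "several tied hashes" workaround concrete, this is roughly what I have in mind. It is only a sketch: the shard count, the file names, the store/fetch helpers and the sample records are all made up, and I have not checked whether splitting the data this way actually avoids the failure.

```perl
use strict;
use warnings;
use Fcntl;                        # exports O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

my $NUM_SHARDS = 4;               # hypothetical; would need tuning
my $dir = tempdir(CLEANUP => 1);  # temporary location, for the sketch only

# One tied hash per shard, each backed by its own pair of SDBM files.
my @shards;
for my $i (0 .. $NUM_SHARDS - 1) {
    my %h;
    tie(%h, 'SDBM_File', "$dir/shard$i", O_RDWR | O_CREAT, 0666)
        or die "Cannot tie shard $i: $!";
    push @shards, \%h;
}

# Pick a shard from a simple byte checksum of the key.
sub shard_for {
    my ($key) = @_;
    return $shards[ unpack('%32C*', $key) % $NUM_SHARDS ];
}

sub store { my ($key, $value) = @_; shard_for($key)->{$key} = $value; }
sub fetch { my ($key) = @_; return shard_for($key)->{$key}; }

# Made-up sample records standing in for the real input.
store("rec$_", "payload $_") for 1 .. 1000;
print fetch('rec42'), "\n";       # prints "payload 42"

untie %$_ for @shards;
```

Every read and write has to go through the helper that picks the shard, and changing the number of shards later means rebuilding all the files, which is why I call it unwieldy.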
Berkeley DB does not seem to be available on our system, so it will probably not be an option.