Firstly, performance wasn't really the issue at hand. The question was about disk storage rather than performance. The reason for highlighting the time taken was to excuse my having based my conclusions upon a miserly 1% of a complete test rather than having sat around for 2 days x N tests :)

I'm not really sure what you mean by 'array' in the context of using DB_File.

No array of 512_000_000 elements is ever generated. It's doubtful whether most 32-bit machines could access that much memory.

The test program just looped 512_000_000 times (or would have if I had let it), generating a random fileno and data value at each iteration. These were then used to fill in the values of a tied hash backed by a disk-based btree DB file.
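For anyone who wants to picture the shape of that loop, here is a rough sketch rather than the actual test program. The file name, the iteration count (scaled way down), the range of fileno values, and the way fileno and data are combined into key and value are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Fcntl;      # O_RDWR, O_CREAT
use DB_File;    # exports $DB_BTREE

# Assumed values for illustration only; the real run was 512_000_000 iterations.
my $iterations = 1_000;
my $db_path    = 'btree_sketch.db';    # hypothetical file name
unlink $db_path;                       # start from an empty file

# Tie a hash to a disk-based btree; nothing is held in a big in-memory array.
tie my %db, 'DB_File', $db_path, O_RDWR | O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie $db_path: $!";

for ( 1 .. $iterations ) {
    my $fileno = int rand 1_000;       # assumed range of file numbers
    my $data   = int rand 2**31;       # assumed form of the data value
    $db{$fileno} = $data;              # assumed key/value layout
}

my $count = scalar keys %db;           # number of distinct keys stored
untie %db;

printf "%d keys, %d bytes on disk\n", $count, -s $db_path;
```

Each store goes through the tie to the btree file on disk, so the memory footprint stays small however many iterations are run; only the file grows.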

