http://www.perlmonks.org?node_id=377135


in reply to Re^2: Combining Ultra-Dynamic Files to Avoid Clustering (Ideas?)
in thread Combining Ultra-Dynamic Files to Avoid Clustering (Ideas?)

If you want to know how a database would tackle a problem like this one, mapping IDs to arbitrary information, read this article on BTrees. Then do as perrin said and use BerkeleyDB, which solves this problem in a highly optimized way, in C.

If the dataset is too large to fit in RAM, you probably want to ask BerkeleyDB to build you a BTree rather than a hash. A hash is the better choice when all of the data fits in RAM.
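
For what it's worth, here is a minimal sketch of asking for a BTree from Perl. It assumes the DB_File module (one of the Perl interfaces to Berkeley DB) is installed; the file location, the zero-padded key format, and the record contents are made up for illustration and are not from the original question.

```perl
use strict;
use warnings;
use Fcntl;                    # O_RDWR, O_CREAT
use DB_File;                  # exports $DB_BTREE and $DB_HASH
use File::Temp qw(tempdir);

# Hypothetical location for the database file, for illustration only.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/ids.db";

# Ask Berkeley DB for a BTree; pass $DB_HASH instead to get a hash.
my %info;
tie %info, 'DB_File', $file, O_RDWR | O_CREAT, 0666, $DB_BTREE
    or die "Cannot open $file: $!";

# BTree keys are compared as strings by default, so zero-pad
# numeric IDs to keep them stored in numeric order.
$info{ sprintf '%010d', $_ } = "record for id $_" for 1 .. 1000;

# Look up one ID through the tied hash.
my $record = $info{ sprintf '%010d', 42 };
print "$record\n";

untie %info;
```

Switching between the two layouts is a one-word change ($DB_BTREE versus $DB_HASH), so it is cheap to try both against your real data before committing to one.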
