|Perl: the Markov chain saw|
Heap structure for lookup?by BrowserUk (Pope)
|on May 27, 2015 at 08:13 UTC||Need Help??|
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:
If you have 150e6 64-bit values to store for efficient lookup, perl's hashes(~14GB) and arrays(~5GB) are very expensive of memory, and a bitmap is out of the question.
The array is feasible, but to do lookups (binary search) requires it be sorted, and that's quite expensive for this size of array, even when using one of salva's in-place, XS modules.
The values are generated at runtime, and discarded at program end, so DBs are pointless. Even an in-memory sqlite DB which stores numbers as strings is off the cards.
I'm going to have to drop into Inline::C for this for both space and performance reasons.
A straight C array of 64-bit ints is ~1.2GB which is fine; but again sorting it so I can to O(logN) lookups is expensive.
I keep thinking about heaps (or Beaps or B-heaps or other variations), structures that "order" the values as they are inserted; but once built, can any of them be used for efficient (O(logN) or better) lookup?
Wikipedia isn't giving me much on the subject of lookup/searches.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked