|The stupid question is the question not asked|
Re^2: Range overlap (with a lot of ranges)by BrowserUk (Pope)
|on May 24, 2010 at 15:02 UTC||Need Help??|
Building a flat list of 10e6 spans requires 1.9 GB as an AoAs; or 1.1 GB as an array of strings. So trying to build a tree structure using Perl's built-in data structures, or any tree or graph module that uses them is likely to run you out of memory unless you are using 64-bit and have gobs of ram.
With a 2.5e9 range and 10e6 ranges, you're even more out of luck using a linear stick approach.
A potentially viable solution would be to use a hash of (partitioned) strings and then a split and simple search. Storing the same 10e6 ranges this way only requires 55 MB
As you can see, partitioning the dataset this way allows you to zero in on a very small subset of ranges by a simple integer division and hash lookup. After that, you have only half a dozen or so ranges to consider which should be fast enough that you don't need to do anything sophisticated.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.