Clear questions and runnable code get the best and fastest answer |
|
PerlMonks |
searching a vector arrayby baxy77bax (Deacon) |
on Jun 09, 2011 at 16:13 UTC ( [id://908952]=perlquestion: print w/replies, xml ) | Need Help?? |
baxy77bax has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
Well i have a question for you guys. so what i need is, i need to do some interval searching. Example:
I have a set of random (randomly distributed) intervals (all related in the following way a<=b, where a and b are start and stop positions of intervals) my coverage of the 1-10 would be : the naive solution to do this would be to create an array and then for every interval position ++ the value indexed by the position. BUT this is not an option since i don't have enough memory on my machine (the intervals are as long ad 10^9 positions). Now what i could do is something like this: so, create a vector (or should i say several of them) and fill it with subinterval positions... Oh by the way every set of subintervals contains only non-overlapping intervals, meaning there cannot be a case as stated above , but the subintervals from different sets CAN overlap. so in the above case one subinterval would be from one set and the other from a different set(The number of intervals in both subinterval sets can be as high as interval/avg(length of subinterval)). So this approach is very cheap when it comes to memory and computation and is very practical since i can pinpoint the location of two intersecting sets,but how to do a multi intersection and to do it efficiently, meaning no pairwise comparisons(it would PROBABLY be unacceptable since the number of subsets can rise to the size of the underlying interval), is is a completely different thing. So this is the question i cannot answer or am i just too stupid and tired to see the solution (I had a feeling that, that last beer was a little too much, dammit :)). Summary: Does someone has a suggestion , tip, alternative approach ?? Cheers
UPDATE:
Thanks ppl, these three replays gave me quite a lot to think about. so, to give a quick global thank you and a few answers to monks that have replied so far. zek152 : This is a nice idea and as soon as you posted it i started to work on it, actually i knew i read about it somewhere before (never had a need to actually do any implementation though) so when i saw your post i started working on it straight away. but my original idea was the same as the one the BrowserUk posted, though i didn't realised that it can be achieved it the way he did it. so the answer to BrowserUk's question is : yes. the thing i had in mind when i was writing this post was that i have to do a pair-wise comparison for every two possible pairs of vectors then we are talking about "for within a for loop" and that is O(x^2) but as you pointed out looping the intersect through vectors does the same thing essentially (meaning useful to me) but in the O(x) time. so once more a fresh pair of eyes see better then sleepy old ones. Of course searching through start and stop indexes is a good solution but unfortunately my intervals are rather short (sorry for not stating that) which means too large index arrays... But thanx. So thank you guys once more and this is definitely going to be a good reference post for me , of course if anyone else has an alternative strategy he/she is more then welcome to post it :) Cheers
Back to
Seekers of Perl Wisdom
|
|